From 2703c51aed8b727654d10f48d523fe2a753fc45b Mon Sep 17 00:00:00 2001 From: GitHub Action Date: Tue, 24 Dec 2024 05:10:14 +0000 Subject: [PATCH] Build 2024-12-24 05:10:14. 16ef7fc1. Event: schedule. --- 404.html | 2 +- 404/index.html | 2 +- .../vJuFcpWvbN6zbAub4gcIZ/_buildManifest.js | 1 + .../vJuFcpWvbN6zbAub4gcIZ/_ssgManifest.js | 1 + .../augmentations/crops/transforms/index.html | 4 +- .../domain_adaptation/transforms/index.html | 6 +- .../dropout/xy_masking/index.html | 208 +++---- .../geometric/functional/index.html | 417 ++++++++------ .../augmentations/geometric/rotate/index.html | 6 +- .../geometric/transforms/index.html | 407 +++++++------- .../transforms3d/functional/index.html | 129 +++-- .../transforms3d/transforms/index.html | 249 +++++---- docs/api_reference/core/bbox_utils/index.html | 182 ++++--- .../api_reference/core/composition/index.html | 46 +- .../core/keypoints_utils/index.html | 509 +++++++++++------- .../core/transforms_interface/index.html | 76 +-- docs/api_reference/full_reference/index.html | 2 +- docs/search/search_index.json | 2 +- docs/sitemap.xml | 182 +++---- docs/sitemap.xml.gz | Bin 967 -> 967 bytes index.html | 4 +- index.txt | 4 +- people/index.html | 2 +- people/index.txt | 2 +- sitemap.xml | 188 +++---- testimonials/index.html | 2 +- testimonials/index.txt | 2 +- 27 files changed, 1471 insertions(+), 1164 deletions(-) create mode 100755 _next/static/vJuFcpWvbN6zbAub4gcIZ/_buildManifest.js create mode 100755 _next/static/vJuFcpWvbN6zbAub4gcIZ/_ssgManifest.js diff --git a/404.html b/404.html index f50ae9e9..15116c8b 100755 --- a/404.html +++ b/404.html @@ -1 +1 @@ -404: This page could not be found.Albumentations: fast and flexible image augmentations

404

This page could not be found.

\ No newline at end of file +404: This page could not be found.Albumentations: fast and flexible image augmentations

404

This page could not be found.

\ No newline at end of file diff --git a/404/index.html b/404/index.html index f50ae9e9..15116c8b 100755 --- a/404/index.html +++ b/404/index.html @@ -1 +1 @@ -404: This page could not be found.Albumentations: fast and flexible image augmentations

404

This page could not be found.

\ No newline at end of file +404: This page could not be found.Albumentations: fast and flexible image augmentations

404

This page could not be found.

\ No newline at end of file diff --git a/_next/static/vJuFcpWvbN6zbAub4gcIZ/_buildManifest.js b/_next/static/vJuFcpWvbN6zbAub4gcIZ/_buildManifest.js new file mode 100755 index 00000000..961beef3 --- /dev/null +++ b/_next/static/vJuFcpWvbN6zbAub4gcIZ/_buildManifest.js @@ -0,0 +1 @@ +self.__BUILD_MANIFEST=function(e,r,t){return{__rewrites:{afterFiles:[],beforeFiles:[{has:void 0,source:"//_next/:path+",destination:"/_next/:path+"}],fallback:[]},__routerFilterStatic:{numItems:5,errorRate:1e-4,numBits:96,numHashes:14,bitArray:[1,0,1,0,1,0,e,e,e,e,r,r,r,r,r,e,r,r,r,r,e,r,r,r,e,e,r,r,r,r,e,r,e,e,r,r,e,e,e,r,r,e,e,e,e,e,e,r,e,e,r,e,e,r,r,e,r,e,r,r,r,r,r,r,e,r,e,e,r,r,e,r,e,e,r,r,e,e,r,e,e,r,r,e,e,e,e,r,e,e,e,e,e,r,r,r]},__routerFilterDynamic:{numItems:e,errorRate:1e-4,numBits:e,numHashes:null,bitArray:[]},"/_error":["static/chunks/pages/_error-9b7125ad1a1e68fa.js"],sortedPages:["/_app","/_error"]}}(0,1,0),self.__BUILD_MANIFEST_CB&&self.__BUILD_MANIFEST_CB(); \ No newline at end of file diff --git a/_next/static/vJuFcpWvbN6zbAub4gcIZ/_ssgManifest.js b/_next/static/vJuFcpWvbN6zbAub4gcIZ/_ssgManifest.js new file mode 100755 index 00000000..5b3ff592 --- /dev/null +++ b/_next/static/vJuFcpWvbN6zbAub4gcIZ/_ssgManifest.js @@ -0,0 +1 @@ +self.__SSG_MANIFEST=new Set([]);self.__SSG_MANIFEST_CB&&self.__SSG_MANIFEST_CB() \ No newline at end of file diff --git a/docs/api_reference/augmentations/crops/transforms/index.html b/docs/api_reference/augmentations/crops/transforms/index.html index a40cd5c6..3c8a45d2 100644 --- a/docs/api_reference/augmentations/crops/transforms/index.html +++ b/docs/api_reference/augmentations/crops/transforms/index.html @@ -61,7 +61,7 @@ if len(bboxes) > 0: # Pick a bbox amongst all possible as our reference bbox. - bboxes = denormalize_bboxes(bboxes, image_shape=(image_height, image_width)) + bboxes = denormalize_bboxes(bboxes, shape=(image_height, image_width)) bbox = self.py_random.choice(bboxes) x1, y1, x2, y2 = bbox[:4] @@ -226,7 +226,7 @@ def get_transform_init_args_names(self) -> tuple[str, ...]: return ("erosion_rate",) -

class BaseCrop [view source on GitHub] ¶

Base class for transforms that only perform cropping.

Source code in albumentations/augmentations/crops/transforms.py
Python
class BaseCrop(DualTransform):
+

class BaseCrop [view source on GitHub] ¶

Base class for transforms that only perform cropping.

Source code in albumentations/augmentations/crops/transforms.py
Python
class BaseCrop(DualTransform):
     """Base class for transforms that only perform cropping."""
 
     _targets = ALL_TARGETS
diff --git a/docs/api_reference/augmentations/domain_adaptation/transforms/index.html b/docs/api_reference/augmentations/domain_adaptation/transforms/index.html
index 0f2801ff..969eb35a 100644
--- a/docs/api_reference/augmentations/domain_adaptation/transforms/index.html
+++ b/docs/api_reference/augmentations/domain_adaptation/transforms/index.html
@@ -6,7 +6,7 @@
   .jupyter-wrapper .jp-MarkdownOutput.jp-RenderedHTMLCommon {
     font-size: 0.8rem;
   }
-    

Domain Adaptation transforms (augmentations.domain_adaptation.transforms)

class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub] ¶

Fourier Domain Adaptation (FDA) for simple "style transfer" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.

Parameters:

Name Type Description
reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the swapping extent of frequency components. If one value is provided beta will be sampled from uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import numpy as np
+    

Domain Adaptation transforms (augmentations.domain_adaptation.transforms)

class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub] ¶

Fourier Domain Adaptation (FDA) for simple "style transfer" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has been shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.

Parameters:

Name Type Description
reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the swapping extent of frequency components. If a single value is provided, beta will be sampled from the uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
 >>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
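For completeness, a minimal end-to-end usage sketch built from the parameters documented above (passing the reference image in memory and using read_fn=lambda x: x is an illustrative choice, not a requirement):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> # Reference images are passed as in-memory arrays, so read_fn is the identity.
>>> aug = A.Compose([A.FDA(reference_images=[target_image], beta_limit=(0, 0.1), read_fn=lambda x: x, p=1)])
>>> adapted = aug(image=image)["image"]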
@@ -111,7 +111,7 @@
     def to_dict_private(self) -> dict[str, Any]:
         msg = "FDA can not be serialized."
         raise NotImplementedError(msg)
-

class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub] ¶

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for: - Normalizing images from different sources or captured under varying conditions. - Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important. - Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description
reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where: - 0 means no blending (original image is returned) - 1 means full histogram matching A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np
+

class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub] ¶

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for: - Normalizing images from different sources or captured under varying conditions. - Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important. - Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description
reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where 0 means no blending (original image is returned) and 1 means full histogram matching. A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
 >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
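A minimal usage sketch along the same lines (the in-memory reference image and identity read_fn are illustrative assumptions, not part of the documented example above):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> transform = A.HistogramMatching(reference_images=[reference_image], blend_ratio=(0.5, 1.0), read_fn=lambda x: x, p=1)
>>> matched = transform(image=image)["image"]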
@@ -224,7 +224,7 @@
     def to_dict_private(self) -> dict[str, Any]:
         msg = "HistogramMatching can not be serialized."
         raise NotImplementedError(msg)
-

class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub] ¶

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps: 1. Adjusting the original image to a standard distribution space using a selected transform. 2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Parameters:

Name Type Description
reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal["pca", "standard", "minmax"]

Specifies the type of statistical transformation to apply. - "pca": Principal Component Analysis - "standard": StandardScaler (zero mean and unit variance) - "minmax": MinMaxScaler (scales to a fixed range, usually [0, 1]) Default: "pca"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np
+

class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub] ¶

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps: 1. Adjusting the original image to a standard distribution space using a selected transform. 2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Parameters:

Name Type Description
reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal["pca", "standard", "minmax"]

Specifies the type of statistical transformation to apply. - "pca": Principal Component Analysis - "standard": StandardScaler (zero mean and unit variance) - "minmax": MinMaxScaler (scales to a fixed range, usually [0, 1]) Default: "pca"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
 >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
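A minimal usage sketch based on the parameters documented above (the "standard" transform_type, in-memory reference image, and identity read_fn are illustrative choices):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> transform = A.PixelDistributionAdaptation(
...     reference_images=[reference_image],
...     blend_ratio=(0.25, 1.0),
...     read_fn=lambda x: x,
...     transform_type="standard",
...     p=1,
... )
>>> adapted = transform(image=image)["image"]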
diff --git a/docs/api_reference/augmentations/dropout/xy_masking/index.html b/docs/api_reference/augmentations/dropout/xy_masking/index.html
index 01609ce4..d656e8e7 100644
--- a/docs/api_reference/augmentations/dropout/xy_masking/index.html
+++ b/docs/api_reference/augmentations/dropout/xy_masking/index.html
@@ -56,8 +56,8 @@
         mask_x_length: NonNegativeIntRangeType
         mask_y_length: NonNegativeIntRangeType
 
-        fill_value: DropoutFillValue | None = Field(deprecated="Deprecated use fill instead")
-        mask_fill_value: ColorType | None = Field(deprecated="Deprecated use fill_mask instead")
+        fill_value: DropoutFillValue | None
+        mask_fill_value: ColorType | None
 
         fill: DropoutFillValue
         fill_mask: ColorType | None
@@ -74,110 +74,112 @@
                 raise ValueError(msg)
 
             if self.fill_value is not None:
-                self.fill = self.fill_value
-
-            if self.mask_fill_value is not None:
-                self.fill_mask = self.mask_fill_value
-
-            return self
+                warn("fill_value is deprecated, use fill instead", DeprecationWarning, stacklevel=2)
+                self.fill = self.fill_value
+
+            if self.mask_fill_value is not None:
+                warn("mask_fill_value is deprecated, use fill_mask instead", DeprecationWarning, stacklevel=2)
+                self.fill_mask = self.mask_fill_value
 
-    def __init__(
-        self,
-        num_masks_x: ScaleIntType = 0,
-        num_masks_y: ScaleIntType = 0,
-        mask_x_length: ScaleIntType = 0,
-        mask_y_length: ScaleIntType = 0,
-        fill_value: DropoutFillValue | None = None,
-        mask_fill_value: ColorType | None = None,
-        fill: DropoutFillValue = 0,
-        fill_mask: ColorType | None = None,
-        p: float = 0.5,
-        always_apply: bool | None = None,
-    ):
-        super().__init__(p=p, fill=fill, fill_mask=fill_mask)
-        self.num_masks_x = cast(tuple[int, int], num_masks_x)
-        self.num_masks_y = cast(tuple[int, int], num_masks_y)
-
-        self.mask_x_length = cast(tuple[int, int], mask_x_length)
-        self.mask_y_length = cast(tuple[int, int], mask_y_length)
-
-    def validate_mask_length(
-        self,
-        mask_length: tuple[int, int] | None,
-        dimension_size: int,
-        dimension_name: str,
-    ) -> None:
-        """Validate the mask length against the corresponding image dimension size."""
-        if mask_length is not None:
-            if isinstance(mask_length, (tuple, list)):
-                if mask_length[0] < 0 or mask_length[1] > dimension_size:
-                    raise ValueError(
-                        f"{dimension_name} range {mask_length} is out of valid range [0, {dimension_size}]",
-                    )
-            elif mask_length < 0 or mask_length > dimension_size:
-                raise ValueError(f"{dimension_name} {mask_length} exceeds image {dimension_name} {dimension_size}")
-
-    def get_params_dependent_on_data(
-        self,
-        params: dict[str, Any],
-        data: dict[str, Any],
-    ) -> dict[str, np.ndarray]:
-        image_shape = params["shape"][:2]
-
-        height, width = image_shape
+            return self
+
+    def __init__(
+        self,
+        num_masks_x: ScaleIntType = 0,
+        num_masks_y: ScaleIntType = 0,
+        mask_x_length: ScaleIntType = 0,
+        mask_y_length: ScaleIntType = 0,
+        fill_value: DropoutFillValue | None = None,
+        mask_fill_value: ColorType | None = None,
+        fill: DropoutFillValue = 0,
+        fill_mask: ColorType | None = None,
+        p: float = 0.5,
+        always_apply: bool | None = None,
+    ):
+        super().__init__(p=p, fill=fill, fill_mask=fill_mask)
+        self.num_masks_x = cast(tuple[int, int], num_masks_x)
+        self.num_masks_y = cast(tuple[int, int], num_masks_y)
+
+        self.mask_x_length = cast(tuple[int, int], mask_x_length)
+        self.mask_y_length = cast(tuple[int, int], mask_y_length)
+
+    def validate_mask_length(
+        self,
+        mask_length: tuple[int, int] | None,
+        dimension_size: int,
+        dimension_name: str,
+    ) -> None:
+        """Validate the mask length against the corresponding image dimension size."""
+        if mask_length is not None:
+            if isinstance(mask_length, (tuple, list)):
+                if mask_length[0] < 0 or mask_length[1] > dimension_size:
+                    raise ValueError(
+                        f"{dimension_name} range {mask_length} is out of valid range [0, {dimension_size}]",
+                    )
+            elif mask_length < 0 or mask_length > dimension_size:
+                raise ValueError(f"{dimension_name} {mask_length} exceeds image {dimension_name} {dimension_size}")
+
+    def get_params_dependent_on_data(
+        self,
+        params: dict[str, Any],
+        data: dict[str, Any],
+    ) -> dict[str, np.ndarray]:
+        image_shape = params["shape"][:2]
 
-        self.validate_mask_length(self.mask_x_length, width, "mask_x_length")
-        self.validate_mask_length(self.mask_y_length, height, "mask_y_length")
-
-        masks_x = self.generate_masks(self.num_masks_x, image_shape, self.mask_x_length, axis="x")
-        masks_y = self.generate_masks(self.num_masks_y, image_shape, self.mask_y_length, axis="y")
-
-        holes = np.array(masks_x + masks_y)
+        height, width = image_shape
+
+        self.validate_mask_length(self.mask_x_length, width, "mask_x_length")
+        self.validate_mask_length(self.mask_y_length, height, "mask_y_length")
+
+        masks_x = self.generate_masks(self.num_masks_x, image_shape, self.mask_x_length, axis="x")
+        masks_y = self.generate_masks(self.num_masks_y, image_shape, self.mask_y_length, axis="y")
 
-        return {"holes": holes, "seed": self.random_generator.integers(0, 2**32 - 1)}
+        holes = np.array(masks_x + masks_y)
 
-    def generate_mask_size(self, mask_length: tuple[int, int]) -> int:
-        return self.py_random.randint(*mask_length)
-
-    def generate_masks(
-        self,
-        num_masks: tuple[int, int],
-        image_shape: tuple[int, int],
-        max_length: tuple[int, int] | None,
-        axis: str,
-    ) -> list[tuple[int, int, int, int]]:
-        if max_length is None or max_length == 0 or (isinstance(num_masks, (int, float)) and num_masks == 0):
-            return []
-
-        masks = []
-        num_masks_integer = (
-            num_masks if isinstance(num_masks, int) else self.py_random.randint(num_masks[0], num_masks[1])
-        )
-
-        height, width = image_shape
+        return {"holes": holes, "seed": self.random_generator.integers(0, 2**32 - 1)}
+
+    def generate_mask_size(self, mask_length: tuple[int, int]) -> int:
+        return self.py_random.randint(*mask_length)
+
+    def generate_masks(
+        self,
+        num_masks: tuple[int, int],
+        image_shape: tuple[int, int],
+        max_length: tuple[int, int] | None,
+        axis: str,
+    ) -> list[tuple[int, int, int, int]]:
+        if max_length is None or max_length == 0 or (isinstance(num_masks, (int, float)) and num_masks == 0):
+            return []
+
+        masks = []
+        num_masks_integer = (
+            num_masks if isinstance(num_masks, int) else self.py_random.randint(num_masks[0], num_masks[1])
+        )
 
-        for _ in range(num_masks_integer):
-            length = self.generate_mask_size(max_length)
-
-            if axis == "x":
-                x_min = self.py_random.randint(0, width - length)
-                y_min = 0
-                x_max, y_max = x_min + length, height
-            else:  # axis == 'y'
-                y_min = self.py_random.randint(0, height - length)
-                x_min = 0
-                x_max, y_max = width, y_min + length
-
-            masks.append((x_min, y_min, x_max, y_max))
-        return masks
-
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return (
-            "num_masks_x",
-            "num_masks_y",
-            "mask_x_length",
-            "mask_y_length",
-            "fill",
-            "fill_mask",
-        )
+        height, width = image_shape
+
+        for _ in range(num_masks_integer):
+            length = self.generate_mask_size(max_length)
+
+            if axis == "x":
+                x_min = self.py_random.randint(0, width - length)
+                y_min = 0
+                x_max, y_max = x_min + length, height
+            else:  # axis == 'y'
+                y_min = self.py_random.randint(0, height - length)
+                x_min = 0
+                x_max, y_max = width, y_min + length
+
+            masks.append((x_min, y_min, x_max, y_max))
+        return masks
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return (
+            "num_masks_x",
+            "num_masks_y",
+            "mask_x_length",
+            "mask_y_length",
+            "fill",
+            "fill_mask",
+        )
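Based on the constructor shown above, a minimal usage sketch (the image size and the chosen count and length ranges are illustrative assumptions):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [256, 256, 3], dtype=np.uint8)
>>> transform = A.XYMasking(
...     num_masks_x=(1, 3),      # up to three vertical strips
...     num_masks_y=(1, 3),      # up to three horizontal strips
...     mask_x_length=(10, 20),  # strip widths in pixels
...     mask_y_length=(10, 20),  # strip heights in pixels
...     fill=0,
...     fill_mask=0,
...     p=1,
... )
>>> masked = transform(image=image)["image"]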
 
\ No newline at end of file
diff --git a/docs/api_reference/augmentations/geometric/functional/index.html b/docs/api_reference/augmentations/geometric/functional/index.html
index 2368df08..92de180f 100644
--- a/docs/api_reference/augmentations/geometric/functional/index.html
+++ b/docs/api_reference/augmentations/geometric/functional/index.html
@@ -6,7 +6,7 @@
   .jupyter-wrapper .jp-MarkdownOutput.jp-RenderedHTMLCommon {
     font-size: 0.8rem;
   }
-

Geometric functional transforms (augmentations.geometric.functional)

def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]¶

Adjust padding values based on desired position.

Source code in albumentations/augmentations/geometric/functional.py
Python
def adjust_padding_by_position(
+    

Geometric functional transforms (augmentations.geometric.functional)

def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]¶

Adjust padding values based on desired position.

Source code in albumentations/augmentations/geometric/functional.py
Python
def adjust_padding_by_position(
     h_top: int,
     h_bottom: int,
     w_left: int,
@@ -40,7 +40,7 @@
         return h_top, h_bottom, w_left, w_right
 
     raise ValueError(f"Unknown position: {position}")
-

def almost_equal_intervals (n, parts) [view source on GitHub]¶

Generates an array of nearly equal integer intervals that sum up to n.

This function divides the number n into parts nearly equal parts. It ensures that the sum of all parts equals n, and the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

Name Type Description
n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

Type Description
np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)
+

def almost_equal_intervals (n, parts) [view source on GitHub]¶

Generates an array of nearly equal integer intervals that sum up to n.

This function divides the number n into parts nearly equal parts. It ensures that the sum of all parts equals n, and the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

Name Type Description
n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

Type Description
np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)
 array([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6
 >>> almost_equal_intervals(16, 4)
 array([4, 4, 4, 4])  # Splits 16 into four equal parts
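The split is plain integer arithmetic; a standalone sketch of the same idea (not the library function itself):

Python
>>> import numpy as np
>>> def split_nearly_equal(n: int, parts: int) -> np.ndarray:
...     part_size, remainder = divmod(n, parts)
...     # The first `remainder` parts receive one extra unit so the sizes sum to n.
...     return np.array([part_size + 1 if i < remainder else part_size for i in range(parts)])
...
>>> split_nearly_equal(20, 3)
array([7, 7, 6])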
@@ -69,7 +69,7 @@
     return np.array(
         [part_size + 1 if i < remainder else part_size for i in range(parts)],
     )
-

def apply_affine_to_points (points, matrix) [view source on GitHub]¶

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

Name Type Description
points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

Type Description
np.ndarray

Transformed points with shape (N, 2).

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("points")
+

def apply_affine_to_points (points, matrix) [view source on GitHub]¶

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

Name Type Description
points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

Type Description
np.ndarray

Transformed points with shape (N, 2).

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("points")
 def apply_affine_to_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:
     """Apply affine transformation to a set of points.
 
@@ -95,7 +95,7 @@
     )
 
     return transformed_points[:, :2] / transformed_points[:, 2:]
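The homogeneous-coordinate arithmetic can be verified with plain NumPy (a standalone sketch of the same computation, not the decorated library function; the sample points and matrix are illustrative):

Python
>>> import numpy as np
>>> points = np.array([[10.0, 20.0], [30.0, 40.0]])
>>> matrix = np.array([[2.0, 0.0, 5.0], [0.0, 2.0, 5.0], [0.0, 0.0, 1.0]])  # scale by 2, translate by (5, 5)
>>> homogeneous = np.column_stack([points, np.ones(len(points))])
>>> transformed = homogeneous @ matrix.T
>>> transformed[:, :2] / transformed[:, 2:]
array([[25., 45.],
       [65., 85.]])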
-

def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]¶

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function: 1. Calculates necessary padding to avoid information loss 2. Applies padding to the bounding boxes 3. Adjusts the transformation matrix to account for padding 4. Applies the affine transformation 5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

Name Type Description
bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

Type Description
np.ndarray

Transformed and normalized bounding boxes

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]¶

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function: 1. Calculates necessary padding to avoid information loss 2. Applies padding to the bounding boxes 3. Adjusts the transformation matrix to account for padding 4. Applies the affine transformation 5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

Name Type Description
bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

Type Description
np.ndarray

Transformed and normalized bounding boxes

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_affine(
     bboxes: np.ndarray,
     matrix: np.ndarray,
@@ -163,7 +163,7 @@
     validated_bboxes = validate_bboxes(transformed_bboxes, output_shape)
 
     return normalize_bboxes(validated_bboxes, output_shape)
-

def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]¶

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

Name Type Description
bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description
np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]¶

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

Name Type Description
bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description
np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_affine_ellipse(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:
     """Apply an affine transformation to bounding boxes using an ellipse approximation method.
 
@@ -218,7 +218,7 @@
     new_y_max = np.max(transformed_points[:, :, 1], axis=1)
 
     return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])
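The ellipse approximation can be sketched for a single box with plain NumPy (an illustration of the idea, not the vectorized library code; the box and the 90-degree rotation matrix are illustrative assumptions):

Python
>>> import numpy as np
>>> x_min, y_min, x_max, y_max = 10.0, 10.0, 30.0, 20.0
>>> cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2   # ellipse center
>>> a, b = (x_max - x_min) / 2, (y_max - y_min) / 2     # semi-axes
>>> angles = np.deg2rad(np.arange(360))
>>> pts = np.column_stack([cx + a * np.cos(angles), cy + b * np.sin(angles), np.ones(360)])
>>> matrix = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # 90-degree rotation
>>> t = (pts @ matrix.T)[:, :2]
>>> new_box = [t[:, 0].min(), t[:, 1].min(), t[:, 0].max(), t[:, 1].max()]  # enclosing box of the rotated ellipse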
-

def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]¶

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

Name Type Description
bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description
np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called "largest box" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels
+

def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]¶

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

Name Type Description
bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description
np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called "largest box" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels
 >>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)
 >>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)
 >>> print(transformed_bboxes)
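As a hand-check of this example (worked by hand, not the captured output): the matrix maps a corner (x, y) to (2x + 5, 2y + 5), so the corners of the first box land between (25, 25) and (45, 45) and those of the second between (65, 65) and (85, 85), giving [25, 25, 45, 45, 1] and [65, 65, 85, 85, 2] with the class labels preserved.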
@@ -276,7 +276,7 @@
     new_y_max = np.max(transformed_corners[:, :, 1], axis=1)
 
     return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])
-

def bboxes_d4 (bboxes, group_member) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to a bounding box.

The function transforms a bounding box according to the specified group member from the D_4 group. These transformations include rotations and reflections, specified to work on an image's bounding box given its dimensions.

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • BoxInternalType: The transformed bounding box.
  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bbox_d4((10, 20, 110, 120), 'r90') This would rotate the bounding box 90 degrees within a 100x100 image.
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def bboxes_d4 (bboxes, group_member) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to a bounding box.

The function transforms a bounding box according to the specified group member from the D_4 group. These transformations include rotations and reflections, specified to work on an image's bounding box given its dimensions.

Parameters:

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.

Returns:

  • np.ndarray: The transformed bounding boxes.

Exceptions:

  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bbox_d4((10, 20, 110, 120), 'r90') This would rotate the bounding box 90 degrees within a 100x100 image.
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_d4(
     bboxes: np.ndarray,
     group_member: D4Type,
@@ -322,7 +322,7 @@
         return transformations[group_member](bboxes)
 
     raise ValueError(f"Invalid group member: {group_member}")
-

def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]¶

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

Type Description
np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles
+

def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]¶

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

Type Description
np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles
 >>> tiles = np.array([
 ...     [0, 0, 50, 50],    # top-left tile
 ...     [0, 50, 50, 100],  # top-right tile
@@ -443,7 +443,7 @@
         return np.zeros((0, bboxes.shape[1]), dtype=bboxes.dtype)
 
     return shuffled_bboxes
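A minimal call sketch under the documented signature (the 2x2 grid of a 100x100 image, the swap mapping, and the disabled thresholds are illustrative assumptions, not the example above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_grid_shuffle
>>> bboxes = np.array([[10, 10, 90, 90]])       # one box spanning all four tiles
>>> tiles = np.array([
...     [0, 0, 50, 50],       # top-left tile
...     [0, 50, 50, 100],     # top-right tile
...     [50, 0, 100, 50],     # bottom-left tile
...     [50, 50, 100, 100],   # bottom-right tile
... ])
>>> mapping = [1, 0, 3, 2]                      # swap the tiles within each row
>>> shuffled = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), None, None)  # None disables area/visibility filtering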
-

def bboxes_hflip (bboxes) [view source on GitHub]¶

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def bboxes_hflip (bboxes) [view source on GitHub]¶

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_hflip(bboxes: np.ndarray) -> np.ndarray:
     """Flip bounding boxes horizontally around the y-axis.
 
@@ -459,8 +459,8 @@
     flipped_bboxes[:, 2] = 1 - bboxes[:, 0]  # new x_max = 1 - x_min
 
     return flipped_bboxes
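A quick sanity check for a normalized box (the import path follows the source path given above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_hflip
>>> bboxes_hflip(np.array([[0.1, 0.2, 0.4, 0.5]]))  # new x_min = 1 - x_max, new x_max = 1 - x_min
array([[0.6, 0.2, 0.9, 0.5]])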
-

def bboxes_rot90 (bboxes, factor) [view source on GitHub]¶

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor int

Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.

Returns:

Type Description
np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.

Exceptions:

Type Description
ValueError

If factor is not in set {0, 1, 2, 3}.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
-def bboxes_rot90(bboxes: np.ndarray, factor: int) -> np.ndarray:
+

def bboxes_rot90 (bboxes, factor) [view source on GitHub]¶

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor Literal[0, 1, 2, 3]

Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.

Returns:

Type Description
np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+def bboxes_rot90(bboxes: np.ndarray, factor: Literal[0, 1, 2, 3]) -> np.ndarray:
     """Rotates bounding boxes by 90 degrees CCW (see np.rot90)
 
     Args:
@@ -470,37 +470,31 @@
 
     Returns:
         np.ndarray: A numpy array of rotated bounding boxes with the same shape as input.
-
-    Raises:
-        ValueError: If factor is not in set {0, 1, 2, 3}.
-    """
-    if factor not in {0, 1, 2, 3}:
-        raise ValueError("Parameter factor must be in set {0, 1, 2, 3}")
+    """
+    if factor == 0:
+        return bboxes
+
+    rotated_bboxes = bboxes.copy()
+    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
 
-    if factor == 0:
-        return bboxes
-
-    rotated_bboxes = bboxes.copy()
-    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
-
-    if factor == 1:
-        rotated_bboxes[:, 0] = y_min
-        rotated_bboxes[:, 1] = 1 - x_max
-        rotated_bboxes[:, 2] = y_max
-        rotated_bboxes[:, 3] = 1 - x_min
-    elif factor == ROT90_180_FACTOR:
-        rotated_bboxes[:, 0] = 1 - x_max
-        rotated_bboxes[:, 1] = 1 - y_max
-        rotated_bboxes[:, 2] = 1 - x_min
-        rotated_bboxes[:, 3] = 1 - y_min
-    elif factor == ROT90_270_FACTOR:
-        rotated_bboxes[:, 0] = 1 - y_max
-        rotated_bboxes[:, 1] = x_min
-        rotated_bboxes[:, 2] = 1 - y_min
-        rotated_bboxes[:, 3] = x_max
-
-    return rotated_bboxes
-

def bboxes_transpose (bboxes) [view source on GitHub]¶

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+    if factor == 1:
+        rotated_bboxes[:, 0] = y_min
+        rotated_bboxes[:, 1] = 1 - x_max
+        rotated_bboxes[:, 2] = y_max
+        rotated_bboxes[:, 3] = 1 - x_min
+    elif factor == ROT90_180_FACTOR:
+        rotated_bboxes[:, 0] = 1 - x_max
+        rotated_bboxes[:, 1] = 1 - y_max
+        rotated_bboxes[:, 2] = 1 - x_min
+        rotated_bboxes[:, 3] = 1 - y_min
+    elif factor == ROT90_270_FACTOR:
+        rotated_bboxes[:, 0] = 1 - y_max
+        rotated_bboxes[:, 1] = x_min
+        rotated_bboxes[:, 2] = 1 - y_min
+        rotated_bboxes[:, 3] = x_max
+
+    return rotated_bboxes
+
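A quick check of the factor == 1 branch above (the import path follows the source path given on this page):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_rot90
>>> bboxes_rot90(np.array([[0.1, 0.2, 0.4, 0.5]]), 1)  # [y_min, 1 - x_max, y_max, 1 - x_min]
array([[0.2, 0.6, 0.5, 0.9]])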

def bboxes_transpose (bboxes) [view source on GitHub]¶

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_transpose(bboxes: np.ndarray) -> np.ndarray:
     """Transpose bounding boxes by swapping x and y coordinates.
 
@@ -515,7 +509,7 @@
     transposed_bboxes[:, [0, 1, 2, 3]] = bboxes[:, [1, 0, 3, 2]]
 
     return transposed_bboxes
-

def bboxes_vflip (bboxes) [view source on GitHub]¶

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def bboxes_vflip (bboxes) [view source on GitHub]¶

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description
np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def bboxes_vflip(bboxes: np.ndarray) -> np.ndarray:
     """Flip bounding boxes vertically around the x-axis.
 
@@ -531,7 +525,7 @@
     flipped_bboxes[:, 3] = 1 - bboxes[:, 1]  # new y_max = 1 - y_min
 
     return flipped_bboxes
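A minimal sketch (added for illustration) showing the vertical flip on one normalized box, following the index arithmetic in the listing above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_vflip
>>> bboxes = np.array([[0.1, 0.2, 0.4, 0.5]])
>>> bboxes_vflip(bboxes)  # y_min, y_max become 1 - y_max, 1 - y_min -> [[0.1, 0.5, 0.4, 0.8]]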
-

def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]¶

Calculate the necessary padding for an affine transformation to avoid empty spaces.

Source code in albumentations/augmentations/geometric/functional.py
Python
def calculate_affine_transform_padding(
+

def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]¶

Calculate the necessary padding for an affine transformation to avoid empty spaces.

Source code in albumentations/augmentations/geometric/functional.py
Python
def calculate_affine_transform_padding(
     matrix: np.ndarray,
     image_shape: tuple[int, int],
 ) -> tuple[int, int, int, int]:
@@ -577,7 +571,7 @@
     pad_bottom = max(0, math.ceil(max_y - height))
 
     return pad_left, pad_right, pad_top, pad_bottom
-

def center (image_shape) [view source on GitHub]¶

Calculate the center coordinates of the image. Used by images, masks and keypoints.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description
tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py
Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:
+

def center (image_shape) [view source on GitHub]¶

Calculate the center coordinates of the image. Used by images, masks and keypoints.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description
tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py
Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:
     """Calculate the center coordinates if image. Used by images, masks and keypoints.
 
     Args:
@@ -588,7 +582,7 @@
     """
     height, width = image_shape[:2]
     return width / 2 - 0.5, height / 2 - 0.5
-

def center_bbox (image_shape) [view source on GitHub]¶

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description
tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py
Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:
+

def center_bbox (image_shape) [view source on GitHub]¶

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description
tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py
Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:
     """Calculate the center coordinates for of image for bounding boxes.
 
     Args:
@@ -599,7 +593,7 @@
     """
     height, width = image_shape[:2]
     return width / 2, height / 2
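To make the half-pixel difference between the two helpers concrete, a small sketch (added for illustration; the values follow directly from the two listings above):

Python
>>> from albumentations.augmentations.geometric.functional import center, center_bbox
>>> center((100, 200))       # (width / 2 - 0.5, height / 2 - 0.5)
(99.5, 49.5)
>>> center_bbox((100, 200))  # (width / 2, height / 2)
(100.0, 50.0)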
-

def compute_tps_weights (src_points, dst_points) [view source on GitHub]¶

Compute Thin Plate Spline weights.

Parameters:

Name Type Description
src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description
tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into: 1. Nonlinear part (controlled by kernel weights) 2. Affine part (global scaling, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py
Python
def compute_tps_weights(
+

def compute_tps_weights (src_points, dst_points) [view source on GitHub]¶

Compute Thin Plate Spline weights.

Parameters:

Name Type Description
src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description
tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into: 1. Nonlinear part (controlled by kernel weights) 2. Affine part (global scaling, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py
Python
def compute_tps_weights(
     src_points: np.ndarray,
     dst_points: np.ndarray,
 ) -> tuple[np.ndarray, np.ndarray]:
@@ -655,7 +649,7 @@
     affine_weights = all_weights[num_points:]
 
     return nonlinear_weights, affine_weights
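An illustrative sketch (added here): for a pure translation of four control points the nonlinear part should be negligible, and the returned shapes match the documentation above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import compute_tps_weights
>>> src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
>>> dst = src + 0.1  # shift every control point by the same amount
>>> nonlinear_w, affine_w = compute_tps_weights(src, dst)
>>> nonlinear_w.shape, affine_w.shape  # (4, 2) and (3, 2), as documented above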
-

def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]¶

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description
matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description
tuple[np.ndarray, np.ndarray]

A tuple containing: - min_coords: An array with the minimum x and y coordinates. - max_coords: An array with the maximum x and y coordinates.

Source code in albumentations/augmentations/geometric/functional.py
Python
def compute_transformed_image_bounds(
+

def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]¶

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description
matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description
tuple[np.ndarray, np.ndarray]

A tuple containing: - min_coords: An array with the minimum x and y coordinates. - max_coords: An array with the maximum x and y coordinates.

Source code in albumentations/augmentations/geometric/functional.py
Python
def compute_transformed_image_bounds(
     matrix: np.ndarray,
     image_shape: tuple[int, int],
 ) -> tuple[np.ndarray, np.ndarray]:
@@ -684,7 +678,7 @@
     max_coords = np.ceil(transformed_corners.max(axis=0)).astype(int)
 
     return min_coords, max_coords
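A small sketch (added here for illustration): with an identity matrix the computed bounds simply cover the original image extent:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import compute_transformed_image_bounds
>>> min_coords, max_coords = compute_transformed_image_bounds(np.eye(3), (100, 200))
>>> min_coords, max_coords  # identity transform: the bounds span the original 200x100 image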
-

def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]¶

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description
translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description
np.ndarray

The resulting 3x3 affine transformation matrix.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_affine_transformation_matrix(
+

def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]¶

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description
translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description
np.ndarray

The resulting 3x3 affine transformation matrix.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_affine_transformation_matrix(
     translate: XYInt,
     shear: XYFloat,
     scale: XYFloat,
@@ -744,7 +738,7 @@
     m[2] = [0, 0, 1]
 
     return m
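A hedged usage sketch (added here; the parameter values are illustrative and the dict/tuple layouts follow the parameter table above):

Python
>>> from albumentations.augmentations.geometric.functional import create_affine_transformation_matrix
>>> m = create_affine_transformation_matrix(
...     translate={"x": 10, "y": 20},
...     shear={"x": 0.0, "y": 0.0},
...     scale={"x": 1.0, "y": 1.0},
...     rotate=0.0,
...     shift=(0.0, 0.0),
... )
>>> m.shape  # 3x3 homogeneous matrix; the last row is [0, 0, 1]
(3, 3)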
-

def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]¶

Create maps for piecewise affine transformation using OpenCV's remap function.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_piecewise_affine_maps(
+

def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]¶

Create maps for piecewise affine transformation using OpenCV's remap function.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_piecewise_affine_maps(
     image_shape: tuple[int, int],
     grid: tuple[int, int],
     scale: float,
@@ -815,14 +809,14 @@
     map_y = np.clip(map_y, 0, height - 1, out=map_y)
 
     return map_x, map_y
-

def create_shape_groups (tiles) [view source on GitHub]¶

Groups tiles by their shape and stores the indices for each shape.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:
+

def create_shape_groups (tiles) [view source on GitHub]¶

Groups tiles by their shape and stores the indices for each shape.

Source code in albumentations/augmentations/geometric/functional.py
Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:
     """Groups tiles by their shape and stores the indices for each shape."""
     shape_groups = defaultdict(list)
     for index, (start_y, start_x, end_y, end_x) in enumerate(tiles):
         shape = (end_y - start_y, end_x - start_x)
         shape_groups[shape].append(index)
     return shape_groups
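A tiny sketch (added for illustration) of the grouping behaviour shown in the listing above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import create_shape_groups
>>> tiles = np.array([[0, 0, 2, 2], [2, 0, 4, 2], [0, 2, 2, 5]])  # (start_y, start_x, end_y, end_x)
>>> groups = create_shape_groups(tiles)  # two 2x2 tiles -> indices [0, 1]; one 2x3 tile -> index [2]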
-

def d4 (img, group_member) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
  • 'e': Identity (no transformation).
  • 'r90': Rotate 90 degrees counterclockwise.
  • 'r180': Rotate 180 degrees.
  • 'r270': Rotate 270 degrees counterclockwise.
  • 'v': Vertical flip.
  • 'hvt': Transpose over second diagonal
  • 'h': Horizontal flip.
  • 't': Transpose (reflect over the main diagonal).
  • np.ndarray: The transformed image array.
  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py
Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:
+

def d4 (img, group_member) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
  • 'e': Identity (no transformation).
  • 'r90': Rotate 90 degrees counterclockwise.
  • 'r180': Rotate 180 degrees.
  • 'r270': Rotate 270 degrees counterclockwise.
  • 'v': Vertical flip.
  • 'hvt': Transpose over second diagonal
  • 'h': Horizontal flip.
  • 't': Transpose (reflect over the main diagonal).
  • np.ndarray: The transformed image array.
  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py
Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:
     """Applies a `D_4` symmetry group transformation to an image array.
 
     This function manipulates an image using transformations such as rotations and flips,
@@ -869,7 +863,7 @@
         return transformations[group_member](img)
 
     raise ValueError(f"Invalid group member: {group_member}")
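A short sketch (added for illustration) exercising two of the group members listed above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import d4
>>> img = np.arange(12, dtype=np.uint8).reshape(3, 4)
>>> d4(img, "e").shape    # identity member leaves the image as is
(3, 4)
>>> d4(img, "r90").shape  # 90 degree CCW rotation swaps height and width
(4, 3)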
-

def distort_image (image, generated_mesh, interpolation) [view source on GitHub]¶

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

Name Type Description
image np.ndarray

The input image to be distorted. Can be a 2D grayscale image or a 3D color image.

generated_mesh np.ndarray

A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.

interpolation int

Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

Type Description
np.ndarray

The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
+

def distort_image (image, generated_mesh, interpolation) [view source on GitHub]¶

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

Name Type Description
image np.ndarray

The input image to be distorted. Can be a 2D grayscale image or a 3D color image.

generated_mesh np.ndarray

A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.

interpolation int

Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

Type Description
np.ndarray

The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
 >>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])
 >>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)
 >>> distorted.shape
@@ -949,7 +943,7 @@
         distorted_image = cv2.copyTo(warped, mask, distorted_image)
 
     return distorted_image
-

def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]¶

Determine if a valid keypoint can be found at the given position.

Source code in albumentations/augmentations/geometric/functional.py
Python
def find_keypoint(
+

def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]¶

Determine if a valid keypoint can be found at the given position.

Source code in albumentations/augmentations/geometric/functional.py
Python
def find_keypoint(
     position: tuple[int, int],
     distance_map: np.ndarray,
     threshold: float | None,
@@ -963,7 +957,7 @@
     if inverted and threshold is not None and value <= threshold:
         return None
     return float(x), float(y)
-

def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]¶

Flip bounding boxes horizontally and/or vertically.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].

flip_horizontal bool

Whether to flip horizontally.

flip_vertical bool

Whether to flip vertically.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description
np.ndarray

Flipped bounding boxes.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
+

def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]¶

Flip bounding boxes horizontally and/or vertically.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].

flip_horizontal bool

Whether to flip horizontally.

flip_vertical bool

Whether to flip vertically.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description
np.ndarray

Flipped bounding boxes.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("bboxes")
 def flip_bboxes(
     bboxes: np.ndarray,
     flip_horizontal: bool = False,
@@ -989,7 +983,7 @@
     if flip_vertical:
         flipped_bboxes[:, [1, 3]] = rows - flipped_bboxes[:, [3, 1]]
     return flipped_bboxes
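A minimal sketch (added for illustration), assuming the absolute pixel coordinates implied by the rows/cols arithmetic in the listing above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import flip_bboxes
>>> bboxes = np.array([[10.0, 20.0, 50.0, 60.0]])  # (x_min, y_min, x_max, y_max) in pixels
>>> flip_bboxes(bboxes, flip_vertical=True, image_shape=(100, 200))  # y range 20..60 -> 40..80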
-

def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]¶

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

Name Type Description
distance_maps np.ndarray

A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.

inverted bool

If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).

if_not_found_coords Sequence[int] | dict[str, Any] | None

Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: - None: Drop keypoints that are not found. - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints. - Dict with 'x' and 'y' keys: Use these values for not found keypoints. Defaults to None.

threshold float | None

A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

Type Description
np.ndarray

A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

Type Description
ValueError

If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints
+

def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]¶

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

Name Type Description
distance_maps np.ndarray

A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.

inverted bool

If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).

if_not_found_coords Sequence[int] | dict[str, Any] | None

Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: - None: Drop keypoints that are not found. - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints. - Dict with 'x' and 'y' keys: Use these values for not found keypoints. Defaults to None.

threshold float | None

A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

Type Description
np.ndarray

A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

Type Description
ValueError

If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints
 >>> inverted = True
 >>> if_not_found_coords = [0, 0]
 >>> threshold = 0.5
@@ -1089,7 +1083,7 @@
             return keypoints[valid_mask]
 
     return keypoints
-

def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]¶

Generate displacement fields for elastic transform.

Parameters:

Name Type Description
image_shape tuple[int, int]

Shape of the image (height, width)

alpha float

Scaling factor for displacement

sigma float

Standard deviation for Gaussian blur

same_dxdy bool

Whether to use same displacement field for both directions

kernel_size tuple[int, int]

Size of Gaussian blur kernel

random_generator np.random.Generator

NumPy random number generator

noise_distribution Literal['gaussian', 'uniform']

Type of noise distribution to use ("gaussian" or "uniform")

Returns:

Type Description
tuple

(dx, dy) displacement fields

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_displacement_fields(
+

def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]¶

Generate displacement fields for elastic transform.

Parameters:

Name Type Description
image_shape tuple[int, int]

Shape of the image (height, width)

alpha float

Scaling factor for displacement

sigma float

Standard deviation for Gaussian blur

same_dxdy bool

Whether to use same displacement field for both directions

kernel_size tuple[int, int]

Size of Gaussian blur kernel

random_generator np.random.Generator

NumPy random number generator

noise_distribution Literal['gaussian', 'uniform']

Type of noise distribution to use ("gaussian" or "uniform")

Returns:

Type Description
tuple

(dx, dy) displacement fields

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_displacement_fields(
     image_shape: tuple[int, int],
     alpha: float,
     sigma: float,
@@ -1132,7 +1126,7 @@
     dy = dx if same_dxdy else generate_noise_field()
 
     return dx, dy
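A hedged usage sketch (added here; the parameter values are illustrative only, the argument names follow the table above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import generate_displacement_fields
>>> rng = np.random.default_rng(0)
>>> dx, dy = generate_displacement_fields(
...     image_shape=(64, 64),
...     alpha=30.0,
...     sigma=5.0,
...     same_dxdy=False,
...     kernel_size=(17, 17),
...     random_generator=rng,
...     noise_distribution="gaussian",
... )
>>> # dx and dy are the per-pixel displacement fields for the 64x64 image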
-

def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]¶

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

Name Type Description
dimensions np.ndarray

A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.

magnitude int

Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].

random_generator np.random.Generator

A random number generator.

Returns:

Type Description
np.ndarray

A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square represents a cell, and the X marks indicate the coordinates where displacement occurs:
    +--+--+--+--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--+--+--+--+
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],
+

def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]¶

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

Name Type Description
dimensions np.ndarray

A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.

magnitude int

Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].

random_generator np.random.Generator

A random number generator.

Returns:

Type Description
np.ndarray

A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square represents a cell, and the X marks indicate the coordinates where displacement occurs:
    +--+--+--+--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--+--+--+--+
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],
 ...                        [[0, 50, 50, 100], [50, 50, 100, 100]]])
 >>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)
 >>> distorted.shape
@@ -1223,7 +1217,7 @@
             polygons[i * grid_width + j, 0:2] += [dx, dy]
 
     return polygons
-

def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]¶

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image as (height, width).

steps_x list[float]

List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.

steps_y list[float]

List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.

num_steps int

The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

Type Description
tuple[np.ndarray, np.ndarray]

A tuple containing two 2D numpy arrays: - map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid. - map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)
+

def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]¶

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image as (height, width).

steps_x list[float]

List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.

steps_y list[float]

List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.

num_steps int

The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

Type Description
tuple[np.ndarray, np.ndarray]

A tuple containing two 2D numpy arrays: - map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid. - map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)
 >>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]
 >>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]
 >>> num_steps = 5
@@ -1300,7 +1294,7 @@
         prev = cur
 
     return np.meshgrid(xx, yy)
-

def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]¶

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

Name Type Description
bboxes np.ndarray

Original bounding boxes.

grid_dims dict[str, tuple[int, int]]

Grid dimensions and original position.

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description
np.ndarray

Array of reflected and shifted bounding boxes for the entire grid.

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_reflected_bboxes(
+

def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]¶

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

Name Type Description
bboxes np.ndarray

Original bounding boxes.

grid_dims dict[str, tuple[int, int]]

Grid dimensions and original position.

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description
np.ndarray

Array of reflected and shifted bounding boxes for the entire grid.

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_reflected_bboxes(
     bboxes: np.ndarray,
     grid_dims: dict[str, tuple[int, int]],
     image_shape: tuple[int, int],
@@ -1375,7 +1369,7 @@
     result = np.vstack(new_bboxes)
 
     return shift_bboxes(result, -shift_vector) if center_in_origin else result
-

def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]¶

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

Name Type Description
keypoints np.ndarray

Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).

grid_dims dict[str, tuple[int, int]]

A dictionary containing grid dimensions and original position. It should have the following keys: - "grid_shape": tuple[int, int] representing (grid_rows, grid_cols) - "original_position": tuple[int, int] representing (original_row, original_col)

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description
np.ndarray

Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_reflected_keypoints(
+

def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]¶

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

Name Type Description
keypoints np.ndarray

Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).

grid_dims dict[str, tuple[int, int]]

A dictionary containing grid dimensions and original position. It should have the following keys: - "grid_shape": tuple[int, int] representing (grid_rows, grid_cols) - "original_position": tuple[int, int] representing (original_row, original_col)

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description
np.ndarray

Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_reflected_keypoints(
     keypoints: np.ndarray,
     grid_dims: dict[str, tuple[int, int]],
     image_shape: tuple[int, int],
@@ -1430,7 +1424,7 @@
 
     # Shift all versions to the original position
     shift_vector = np.array(
-        [original_col * cols, original_row * rows, 0, 0],
+        [original_col * cols, original_row * rows, 0, 0, 0],
     )  # Only shift x and y
     keypoints = shift_keypoints(keypoints, shift_vector)
     keypoints_hflipped = shift_keypoints(keypoints_hflipped, shift_vector)
@@ -1458,16 +1452,17 @@
                     (grid_row - original_row) * rows,
                     0,
                     0,
-                ],
-            )
-            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)
-
-            new_keypoints.append(shifted_keypoints)
-
-    result = np.vstack(new_keypoints)
-
-    return shift_keypoints(result, -shift_vector) if center_in_origin else result
-

def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]¶

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

Name Type Description
size int

Total size of the dimension (height or width).

divisions int

Number of divisions (rows or columns).

random_generator np.random.Generator | None

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description
np.ndarray

Cumulative edges of the shuffled intervals.

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_shuffled_splits(
+                    0,
+                ],
+            )
+            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)
+
+            new_keypoints.append(shifted_keypoints)
+
+    result = np.vstack(new_keypoints)
+
+    return shift_keypoints(result, -shift_vector) if center_in_origin else result
+

def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]¶

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

Name Type Description
size int

Total size of the dimension (height or width).

divisions int

Number of divisions (rows or columns).

random_generator np.random.Generator | None

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description
np.ndarray

Cumulative edges of the shuffled intervals.

Source code in albumentations/augmentations/geometric/functional.py
Python
def generate_shuffled_splits(
     size: int,
     divisions: int,
     random_generator: np.random.Generator,
@@ -1486,7 +1481,7 @@
     intervals = almost_equal_intervals(size, divisions)
     random_generator.shuffle(intervals)
     return np.insert(np.cumsum(intervals), 0, 0)
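A small sketch (added for illustration) of the cumulative-edges output described above:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import generate_shuffled_splits
>>> rng = np.random.default_rng(0)
>>> edges = generate_shuffled_splits(size=10, divisions=3, random_generator=rng)
>>> edges[0], edges[-1]  # edges always start at 0 and end at size; len(edges) == divisions + 1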
-

def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]¶

Generate distortion maps using camera matrix model.

Parameters:

Name Type Description
image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description
tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py
Python
def get_camera_matrix_distortion_maps(
+

def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]¶

Generate distortion maps using camera matrix model.

Parameters:

Name Type Description
image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description
tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py
Python
def get_camera_matrix_distortion_maps(
     image_shape: tuple[int, int],
     k: float,
     center_xy: tuple[float, float],
@@ -1516,7 +1511,7 @@
         (width, height),
         cv2.CV_32FC1,
     )
-

def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]¶

Calculate padding for a single dimension.

Parameters:

Name Type Description
current_size int

Current size of the dimension

min_size int | None

Minimum size requirement, if any

divisor int | None

Divisor for padding to make size divisible, if any

Returns:

Type Description
tuple[int, int]

(pad_before, pad_after)

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_dimension_padding(
+

def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]¶

Calculate padding for a single dimension.

Parameters:

Name Type Description
current_size int

Current size of the dimension

min_size int | None

Minimum size requirement, if any

divisor int | None

Divisor for padding to make size divisible, if any

Returns:

Type Description
tuple[int, int]

(pad_before, pad_after)

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_dimension_padding(
     current_size: int,
     min_size: int | None,
     divisor: int | None,
@@ -1545,7 +1540,7 @@
             return pad_before, pad_after
 
     return 0, 0
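Two illustrative calls (added here; the results follow from the description above, since no padding is needed when the constraint is already satisfied):

Python
>>> from albumentations.augmentations.geometric.functional import get_dimension_padding
>>> get_dimension_padding(current_size=100, min_size=64, divisor=None)  # already large enough
(0, 0)
>>> get_dimension_padding(100, min_size=128, divisor=None)  # the two pads together cover 128 - 100 = 28 pixels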
-

def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]¶

Generate distortion maps using fisheye model.

Parameters:

Name Type Description
image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description
tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py
Python
def get_fisheye_distortion_maps(
+

def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]¶

Generate distortion maps using fisheye model.

Parameters:

Name Type Description
image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description
tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py
Python
def get_fisheye_distortion_maps(
     image_shape: tuple[int, int],
     k: float,
     center_xy: tuple[float, float],
@@ -1587,7 +1582,7 @@
     map_y = r_dist * np.sin(theta) + center_y
 
     return map_x, map_y
-

def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]¶

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

Name Type Description
pad_top int

Number of pixels to pad above the image.

pad_bottom int

Number of pixels to pad below the image.

pad_left int

Number of pixels to pad to the left of the image.

pad_right int

Number of pixels to pad to the right of the image.

image_shape tuple[int, int]

Shape of the original image as (height, width).

Returns:

Type Description
dict[str, tuple[int, int]]

A dictionary containing: - 'grid_shape': A tuple (grid_rows, grid_cols) where: - grid_rows (int): Number of times the image needs to be repeated vertically. - grid_cols (int): Number of times the image needs to be repeated horizontally. - 'original_position': A tuple (original_row, original_col) where: - original_row (int): Row index of the original image in the grid. - original_col (int): Column index of the original image in the grid.

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_pad_grid_dimensions(
+

def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]¶

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

Name Type Description
pad_top int

Number of pixels to pad above the image.

pad_bottom int

Number of pixels to pad below the image.

pad_left int

Number of pixels to pad to the left of the image.

pad_right int

Number of pixels to pad to the right of the image.

image_shape tuple[int, int]

Shape of the original image as (height, width).

Returns:

Type Description
dict[str, tuple[int, int]]

A dictionary containing: - 'grid_shape': A tuple (grid_rows, grid_cols) where: - grid_rows (int): Number of times the image needs to be repeated vertically. - grid_cols (int): Number of times the image needs to be repeated horizontally. - 'original_position': A tuple (original_row, original_col) where: - original_row (int): Row index of the original image in the grid. - original_col (int): Column index of the original image in the grid.

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_pad_grid_dimensions(
     pad_top: int,
     pad_bottom: int,
     pad_left: int,
@@ -1623,7 +1618,7 @@
         "grid_shape": (grid_rows, grid_cols),
         "original_position": (original_row, original_col),
     }
-

def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]¶

Calculate padding parameters based on target dimensions.

Parameters:

Name Type Description
image_shape tuple[int, int]

(height, width) of the image

min_height int | None

Minimum height requirement, if any

min_width int | None

Minimum width requirement, if any

pad_height_divisor int | None

Divisor for height padding, if any

pad_width_divisor int | None

Divisor for width padding, if any

Returns:

Type Description
tuple[int, int, int, int]

(pad_top, pad_bottom, pad_left, pad_right)

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_padding_params(
+

def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]¶

Calculate padding parameters based on target dimensions.

Parameters:

Name Type Description
image_shape tuple[int, int]

(height, width) of the image

min_height int | None

Minimum height requirement, if any

min_width int | None

Minimum width requirement, if any

pad_height_divisor int | None

Divisor for height padding, if any

pad_width_divisor int | None

Divisor for width padding, if any

Returns:

Type Description
tuple[int, int, int, int]

(pad_top, pad_bottom, pad_left, pad_right)

Source code in albumentations/augmentations/geometric/functional.py
Python
def get_padding_params(
     image_shape: tuple[int, int],
     min_height: int | None,
     min_width: int | None,
@@ -1652,7 +1647,7 @@
     w_pad_left, w_pad_right = get_dimension_padding(cols, min_width, pad_width_divisor)
 
     return h_pad_top, h_pad_bottom, w_pad_left, w_pad_right
-

def is_identity_matrix (matrix) [view source on GitHub]¶

Check if the given matrix is an identity matrix.

Parameters:

Name Type Description
matrix np.ndarray

A 3x3 affine transformation matrix.

Returns:

Type Description
bool

True if the matrix is an identity matrix, False otherwise.

Source code in albumentations/augmentations/geometric/functional.py
Python
def is_identity_matrix(matrix: np.ndarray) -> bool:
+

def is_identity_matrix (matrix) [view source on GitHub]¶

Check if the given matrix is an identity matrix.

Parameters:

Name Type Description
matrix np.ndarray

A 3x3 affine transformation matrix.

Returns:

Type Description
bool

True if the matrix is an identity matrix, False otherwise.

Source code in albumentations/augmentations/geometric/functional.py
Python
def is_identity_matrix(matrix: np.ndarray) -> bool:
     """Check if the given matrix is an identity matrix.
 
     Args:
@@ -1662,7 +1657,7 @@
         bool: True if the matrix is an identity matrix, False otherwise.
     """
     return np.allclose(matrix, np.eye(3, dtype=matrix.dtype))
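Two quick checks (added for illustration; the behaviour follows directly from the one-line implementation above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import is_identity_matrix
>>> is_identity_matrix(np.eye(3))
True
>>> is_identity_matrix(np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
False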
-

def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]¶

Validate if a component meets the minimum requirements.

Source code in albumentations/augmentations/geometric/functional.py
Python
def is_valid_component(
+

def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]¶

Validate if a component meets the minimum requirements.

Source code in albumentations/augmentations/geometric/functional.py
Python
def is_valid_component(
     component_area: float,
     original_area: float,
     min_area: float | None,
@@ -1671,7 +1666,7 @@
     """Validate if a component meets the minimum requirements."""
     visibility = component_area / original_area
     return (min_area is None or component_area >= min_area) and (min_visibility is None or visibility >= min_visibility)
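Two quick checks (added for illustration; the results follow from the boolean expression above):

Python
>>> from albumentations.augmentations.geometric.functional import is_valid_component
>>> is_valid_component(component_area=50.0, original_area=100.0, min_area=None, min_visibility=0.4)
True
>>> is_valid_component(component_area=50.0, original_area=100.0, min_area=80.0, min_visibility=None)
False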
-

def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]¶

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...].

matrix np.ndarray

The 2x3 or 3x3 affine transformation matrix.

image_shape tuple[int, int]

Shape of the image (height, width).

scale dict[str, float]

Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.

border_mode int

Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

Type Description
np.ndarray

Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2Ï€] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 1]])
+

def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]¶

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...].

matrix np.ndarray

The 2x3 or 3x3 affine transformation matrix.

image_shape tuple[int, int]

Shape of the image (height, width).

scale dict[str, float]

Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.

border_mode int

Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

Type Description
np.ndarray

Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2Ï€] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 1]])
 >>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])
 >>> scale = {'x': 1.5, 'y': 1.2}
 >>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)
@@ -1740,7 +1735,7 @@
             center_in_origin=True,
         )
 
-    # Extract x, y coordinates
+    # Extract x, y coordinates (z is preserved)
     xy = keypoints[:, :2]
 
     # Ensure matrix is 2x3
@@ -1753,19 +1748,18 @@
     # Calculate angle adjustment
     angle_adjustment = rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=False)
 
-    # Update angles
-    keypoints[:, 2] = keypoints[:, 2] + angle_adjustment
+    # Update angles (now at index 3)
+    keypoints[:, 3] = keypoints[:, 3] + angle_adjustment
 
-    # Update scales
+    # Update scales (now at index 4)
     max_scale = max(scale["x"], scale["y"])
-
-    keypoints[:, 3] *= max_scale
-
-    # Update x, y coordinates
-    keypoints[:, :2] = xy_transformed
-
-    return keypoints
-

def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.
  • image_shape (tuple[int, int]): The shape of the image.
  • params (Any): Not used
  • KeypointInternalType: The transformed keypoint.
  • ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.

Examples:

  • Rotating a keypoint by 90 degrees in a 100x100 image: keypoint_d4((50, 30), 'r90', 100, 100) This would move the keypoint from (50, 30) to (70, 50) assuming standard coordinate transformations.
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
+    keypoints[:, 4] *= max_scale
+
+    # Update x, y coordinates and preserve z
+    keypoints[:, :2] = xy_transformed
+
+    return keypoints
+

def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]¶

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.
  • image_shape (tuple[int, int]): The shape of the image as (height, width).
  • params (Any): Not used.
  • Returns (np.ndarray): The transformed keypoints, with the same shape as the input.
  • Raises ValueError: If an invalid group member is specified, indicating that the requested transformation does not exist.

Examples:

  • Rotating a keypoint by 90 degrees in a 100x100 image, e.g. keypoints_d4(keypoints, 'r90', (100, 100)), moves a keypoint at (50, 30) to (70, 50) under the standard coordinate transformations.
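Illustrative example (a minimal sketch; assumes keypoints_d4 is importable from albumentations.augmentations.geometric.functional and that keypoints use the (x, y, z, angle, scale) layout of the updated source):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_d4
>>> keypoints = np.array([[50.0, 30.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale (assumed layout)
>>> rotated = keypoints_d4(keypoints, 'r90', (100, 100))  # 90-degree member of the D_4 group
>>> # the result has the same shape as the input: (1, 5)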
Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 def keypoints_d4(
     keypoints: np.ndarray,
     group_member: D4Type,
@@ -1814,7 +1808,7 @@
         return transformations[group_member](keypoints)
 
     raise ValueError(f"Invalid group member: {group_member}")
-

+

def keypoints_hflip (keypoints, cols) [view source on GitHub]¶

Flip keypoints horizontally around the y-axis.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

cols int

Image width.

Returns:

Type Description
np.ndarray

An array of flipped keypoints with the same shape as the input.
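Illustrative example (a minimal sketch; assumes the function is importable from albumentations.augmentations.geometric.functional and that keypoints use the (x, y, z, angle, scale) layout of the updated source, where x becomes (cols - 1) - x):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_hflip
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale (assumed layout)
>>> flipped = keypoints_hflip(keypoints, cols=100)
>>> # x becomes (100 - 1) - 10 = 89; the angle is mirrored and wrapped to [0, 2π]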

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 @angle_2pi_range
 def keypoints_hflip(keypoints: np.ndarray, cols: int) -> np.ndarray:
     """Flip keypoints horizontally around the y-axis.
@@ -1832,14 +1826,14 @@
     flipped_keypoints[:, 0] = (cols - 1) - keypoints[:, 0]
 
     # Adjust angles
-    flipped_keypoints[:, 2] = np.pi - keypoints[:, 2]
+    flipped_keypoints[:, 3] = np.pi - keypoints[:, 3]
 
     return flipped_keypoints
-

def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]¶

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

Name Type Description
keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

factor int

The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].

image_shape tuple[int, int]

The shape of the image (height, width).

Returns:

Type Description
np.ndarray

The rotated keypoints with the same shape as the input.

Exceptions:

Type Description
ValueError

If the factor is not in the set {0, 1, 2, 3}.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
+

def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]¶

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

Name Type Description
keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

factor int

The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].

image_shape tuple[int, int]

The shape of the image (height, width).

Returns:

Type Description
np.ndarray

The rotated keypoints with the same shape as the input.
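Illustrative example (a minimal sketch under the same layout assumption as the examples above; for factor=1 the code maps (x, y) to (y, width - 1 - x)):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_rot90
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale (assumed layout)
>>> rotated = keypoints_rot90(keypoints, factor=1, image_shape=(100, 200))
>>> # (10, 20) in a 100x200 image moves to (20, 200 - 1 - 10) = (20, 189)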

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 @angle_2pi_range
 def keypoints_rot90(
     keypoints: np.ndarray,
-    factor: int,
+    factor: Literal[0, 1, 2, 3],
     image_shape: tuple[int, int],
 ) -> np.ndarray:
     """Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.
@@ -1851,36 +1845,30 @@
 
     Returns:
         np.ndarray: The rotated keypoints with the same shape as the input.
-
-    Raises:
-        ValueError: If the factor is not in the set {0, 1, 2, 3}.
-    """
-    if factor not in {0, 1, 2, 3}:
-        raise ValueError("Parameter factor must be in set {0, 1, 2, 3}")
+    """
+    if factor == 0:
+        return keypoints
+
+    height, width = image_shape[:2]
+    rotated_keypoints = keypoints.copy().astype(np.float32)
 
-    if factor == 0:
-        return keypoints
-
-    height, width = image_shape[:2]
-    rotated_keypoints = keypoints.copy().astype(np.float32)
-
-    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2]
-
-    if factor == 1:
-        rotated_keypoints[:, 0] = y
-        rotated_keypoints[:, 1] = width - 1 - x
-        rotated_keypoints[:, 2] = angle - np.pi / 2
-    elif factor == ROT90_180_FACTOR:
-        rotated_keypoints[:, 0] = width - 1 - x
-        rotated_keypoints[:, 1] = height - 1 - y
-        rotated_keypoints[:, 2] = angle - np.pi
-    elif factor == ROT90_270_FACTOR:
-        rotated_keypoints[:, 0] = height - 1 - y
-        rotated_keypoints[:, 1] = x
-        rotated_keypoints[:, 2] = angle + np.pi / 2
-
-    return rotated_keypoints
-

def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]¶

Scales keypoints by scale_x and scale_y.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

scale_x float

Scale coefficient x-axis.

scale_y float

Scale coefficient y-axis.

Returns:

Type Description
np.ndarray

A numpy array of scaled keypoints with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
+    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 3]
+
+    if factor == 1:
+        rotated_keypoints[:, 0] = y
+        rotated_keypoints[:, 1] = width - 1 - x
+        rotated_keypoints[:, 3] = angle - np.pi / 2
+    elif factor == ROT90_180_FACTOR:
+        rotated_keypoints[:, 0] = width - 1 - x
+        rotated_keypoints[:, 1] = height - 1 - y
+        rotated_keypoints[:, 3] = angle - np.pi
+    elif factor == ROT90_270_FACTOR:
+        rotated_keypoints[:, 0] = height - 1 - y
+        rotated_keypoints[:, 1] = x
+        rotated_keypoints[:, 3] = angle + np.pi / 2
+
+    return rotated_keypoints
+

def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]¶

Scales keypoints by scale_x and scale_y.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of keypoints with shape (N, 5+) in the format (x, y, z, angle, scale, ...).

scale_x float

Scale coefficient x-axis.

scale_y float

Scale coefficient y-axis.

Returns:

Type Description
np.ndarray

A numpy array of scaled keypoints with the same shape as input. X and Y coordinates are scaled by their respective scale factors, Z coordinate remains unchanged, and the keypoint scale is multiplied by max(scale_x, scale_y).
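Illustrative example (a minimal sketch; assumes the function is importable from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_scale
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale
>>> scaled = keypoints_scale(keypoints, scale_x=2.0, scale_y=0.5)
>>> # x -> 20.0, y -> 10.0, z and angle unchanged, keypoint scale -> 1.0 * max(2.0, 0.5) = 2.0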

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 def keypoints_scale(
     keypoints: np.ndarray,
     scale_x: float,
@@ -1889,39 +1877,44 @@
     """Scales keypoints by scale_x and scale_y.
 
     Args:
-        keypoints: A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
-        scale_x: Scale coefficient x-axis.
-        scale_y: Scale coefficient y-axis.
-
-    Returns:
-        A numpy array of scaled keypoints with the same shape as input.
-    """
-    # Extract x, y, angle, and scale
-    x, y, angle, scale = (
-        keypoints[:, 0],
-        keypoints[:, 1],
-        keypoints[:, 2],
-        keypoints[:, 3],
-    )
-
-    # Scale x and y
-    x_scaled = x * scale_x
-    y_scaled = y * scale_y
-
-    # Scale the keypoint scale by the maximum of scale_x and scale_y
-    scale_scaled = scale * max(scale_x, scale_y)
-
-    # Create the output array
-    scaled_keypoints = np.column_stack([x_scaled, y_scaled, angle, scale_scaled])
-
-    # If there are additional columns, preserve them
-    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:
-        return np.column_stack(
-            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],
-        )
-
-    return scaled_keypoints
-

+        keypoints: A numpy array of keypoints with shape (N, 5+) in the format
+                  (x, y, z, angle, scale, ...).
+        scale_x: Scale coefficient x-axis.
+        scale_y: Scale coefficient y-axis.
+
+    Returns:
+        A numpy array of scaled keypoints with the same shape as input.
+        X and Y coordinates are scaled by their respective scale factors,
+        Z coordinate remains unchanged, and the keypoint scale is multiplied
+        by max(scale_x, scale_y).
+    """
+    # Extract x, y, z, angle, and scale
+    x, y, z, angle, scale = (
+        keypoints[:, 0],
+        keypoints[:, 1],
+        keypoints[:, 2],
+        keypoints[:, 3],
+        keypoints[:, 4],
+    )
+
+    # Scale x and y
+    x_scaled = x * scale_x
+    y_scaled = y * scale_y
+
+    # Scale the keypoint scale by the maximum of scale_x and scale_y
+    scale_scaled = scale * max(scale_x, scale_y)
+
+    # Create the output array
+    scaled_keypoints = np.column_stack([x_scaled, y_scaled, z, angle, scale_scaled])
+
+    # If there are additional columns, preserve them
+    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:
+        return np.column_stack(
+            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],
+        )
+
+    return scaled_keypoints
+

def keypoints_transpose (keypoints) [view source on GitHub]¶

Transposes keypoints along the main diagonal.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

Returns:

Type Description
np.ndarray

An array of transposed keypoints with the same shape as the input.
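Illustrative example (a minimal sketch; assumes the (x, y, z, angle, scale) layout used by the updated source):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_transpose
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale (assumed layout)
>>> transposed = keypoints_transpose(keypoints)
>>> # x and y are swapped: the keypoint moves from (10, 20) to (20, 10); the angle is mirrored about the diagonal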

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 @angle_2pi_range
 def keypoints_transpose(keypoints: np.ndarray) -> np.ndarray:
     """Transposes keypoints along the main diagonal.
@@ -1938,15 +1931,15 @@
     transposed_keypoints[:, [0, 1]] = keypoints[:, [1, 0]]
 
     # Adjust angles to reflect the coordinate swap
-    angles = keypoints[:, 2]
-    transposed_keypoints[:, 2] = np.where(
+    angles = keypoints[:, 3]
+    transposed_keypoints[:, 3] = np.where(
         angles <= np.pi,
         np.pi / 2 - angles,
         3 * np.pi / 2 - angles,
     )
 
     return transposed_keypoints
-

+

def keypoints_vflip (keypoints, rows) [view source on GitHub]¶

Flip keypoints vertically around the x-axis.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

rows int

Image height.

Returns:

Type Description
np.ndarray

An array of flipped keypoints with the same shape as the input.
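Illustrative example (a minimal sketch; assumes the (x, y, z, angle, scale) layout of the updated source, where y becomes (rows - 1) - y):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import keypoints_vflip
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale (assumed layout)
>>> flipped = keypoints_vflip(keypoints, rows=100)
>>> # y becomes (100 - 1) - 20 = 79; the angle is negated and wrapped to [0, 2π]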

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
 @angle_2pi_range
 def keypoints_vflip(keypoints: np.ndarray, rows: int) -> np.ndarray:
     """Flip keypoints vertically around the x-axis.
@@ -1964,10 +1957,10 @@
     flipped_keypoints[:, 1] = (rows - 1) - keypoints[:, 1]
 
     # Negate angles
-    flipped_keypoints[:, 2] = -keypoints[:, 2]
+    flipped_keypoints[:, 3] = -keypoints[:, 3]
 
     return flipped_keypoints
-

+

def perspective_bboxes (bboxes, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]¶

Applies perspective transformation to bounding boxes.

This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.

Parameters:

Name Type Description
bboxes np.ndarray

An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged.

image_shape tuple[int, int]

The shape of the image (height, width).

matrix np.ndarray

The perspective transformation matrix.

max_width int

The maximum width of the output image.

max_height int

The maximum height of the output image.

keep_size bool

If True, maintains the original image size after transformation.

Returns:

Type Description
np.ndarray

An array of transformed bounding boxes with the same shape as input. The first 4 columns contain the transformed coordinates, and any additional columns are preserved from the input.

Note

  • This function modifies only the coordinate columns (first 4) of the input bounding boxes.
  • Any additional attributes (columns beyond the first 4) are kept unchanged.
  • The function handles denormalization and renormalization of coordinates internally.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])
 >>> image_shape = (100, 100)
 >>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])
 >>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)
@@ -2043,7 +2036,83 @@
     transformed_bboxes[:, :4] = normalized_coords
 
     return transformed_bboxes
-

+

def perspective_keypoints (keypoints, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]¶

Apply perspective transformation to keypoints.

Parameters:

Name Type Description
keypoints np.ndarray

Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].

image_shape tuple[int, int]

Original image shape (height, width).

matrix np.ndarray

3x3 perspective transformation matrix.

max_width int

Maximum width after transformation.

max_height int

Maximum height after transformation.

keep_size bool

Whether to keep original size.

Returns:

Type Description
np.ndarray

Transformed keypoints array with same shape as input. Z coordinate remains unchanged through the transformation.
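Illustrative example (a minimal sketch; with an identity matrix and keep_size=False the keypoints pass through unchanged, as can be read off the source below):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import perspective_keypoints
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])  # x, y, z, angle, scale
>>> identity = np.eye(3)
>>> out = perspective_keypoints(keypoints, (100, 100), identity, 100, 100, False)
>>> # coordinates, z, angle and scale are all unchanged for the identity matrix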

Source code in albumentations/augmentations/geometric/functional.py
Python
@handle_empty_array("keypoints")
+@angle_2pi_range
+def perspective_keypoints(
+    keypoints: np.ndarray,
+    image_shape: tuple[int, int],
+    matrix: np.ndarray,
+    max_width: int,
+    max_height: int,
+    keep_size: bool,
+) -> np.ndarray:
+    """Apply perspective transformation to keypoints.
+
+    Args:
+        keypoints: Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].
+        image_shape: Original image shape (height, width).
+        matrix: 3x3 perspective transformation matrix.
+        max_width: Maximum width after transformation.
+        max_height: Maximum height after transformation.
+        keep_size: Whether to keep original size.
+
+    Returns:
+        Transformed keypoints array with same shape as input.
+        Z coordinate remains unchanged through the transformation.
+    """
+    keypoints = keypoints.copy().astype(np.float32)
+
+    height, width = image_shape[:2]
+
+    x, y, z, angle, scale = (
+        keypoints[:, 0],
+        keypoints[:, 1],
+        keypoints[:, 2],
+        keypoints[:, 3],
+        keypoints[:, 4],
+    )
+
+    # Reshape keypoints for perspective transform
+    keypoint_vector = np.column_stack((x, y)).astype(np.float32).reshape(-1, 1, 2)
+
+    # Apply perspective transform
+    transformed_points = cv2.perspectiveTransform(keypoint_vector, matrix).squeeze()
+
+    # Unsqueeze if we have a single keypoint
+    if transformed_points.ndim == 1:
+        transformed_points = transformed_points[np.newaxis, :]
+
+    x, y = transformed_points[:, 0], transformed_points[:, 1]
+
+    # Update angles
+    angle += rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=True)
+
+    # Calculate scale factors
+    scale_x = np.sign(matrix[0, 0]) * np.sqrt(matrix[0, 0] ** 2 + matrix[0, 1] ** 2)
+    scale_y = np.sign(matrix[1, 1]) * np.sqrt(matrix[1, 0] ** 2 + matrix[1, 1] ** 2)
+    scale *= max(scale_x, scale_y)
+
+    if keep_size:
+        scale_x = width / max_width
+        scale_y = height / max_height
+        x *= scale_x
+        y *= scale_y
+        scale *= max(scale_x, scale_y)
+
+    # Create the output array with unchanged z coordinate
+    transformed_keypoints = np.column_stack([x, y, z, angle, scale])
+
+    # If there are additional columns, preserve them
+    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:
+        return np.column_stack(
+            [
+                transformed_keypoints,
+                keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:],
+            ],
+        )
+
+    return transformed_keypoints
+

def rotation2d_matrix_to_euler_angles (matrix, y_up) [view source on GitHub]¶

matrix (np.ndarray): Rotation matrix. y_up (bool): whether the Y axis points up or down.
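Illustrative example (a minimal sketch; for a standard 2D rotation matrix by θ the function recovers θ when y_up=True):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import rotation2d_matrix_to_euler_angles
>>> theta = np.pi / 6
>>> matrix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
>>> rotation2d_matrix_to_euler_angles(matrix, y_up=True)  # approximately 0.5236 (pi / 6)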

Source code in albumentations/augmentations/geometric/functional.py
Python
def rotation2d_matrix_to_euler_angles(matrix: np.ndarray, y_up: bool) -> float:
     """Args:
     matrix (np.ndarray): Rotation matrix
    y_up (bool): whether the Y axis points up or down
@@ -2052,7 +2121,7 @@
     if y_up:
         return np.arctan2(matrix[1, 0], matrix[0, 0])
     return np.arctan2(-matrix[1, 0], matrix[0, 0])
-

+

def shift_bboxes (bboxes, shift_vector) [view source on GitHub]¶

Shift bounding boxes by a given vector.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].

shift_vector np.ndarray

Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y].

Returns:

Type Description
np.ndarray

Shifted bounding boxes with the same shape as input.
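Illustrative example (a minimal sketch; columns beyond the first four, such as labels, are preserved):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import shift_bboxes
>>> bboxes = np.array([[10, 20, 30, 40, 1]])
>>> shifted = shift_bboxes(bboxes, np.array([5, 5, 5, 5]))
>>> # the coordinate columns become [15, 25, 35, 45]; the label column stays 1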

Source code in albumentations/augmentations/geometric/functional.py
Python
def shift_bboxes(bboxes: np.ndarray, shift_vector: np.ndarray) -> np.ndarray:
     """Shift bounding boxes by a given vector.
 
     Args:
@@ -2071,7 +2140,7 @@
     shifted_bboxes[:, :4] += shift_vector
 
     return shifted_bboxes
-

+

def shuffle_tiles_within_shape_groups (shape_groups, random_generator) [view source on GitHub]¶

Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.

Parameters:

Name Type Description
shape_groups dict[tuple[int, int], list[int]]

Groups of tile indices categorized by shape.

random_generator np.random.Generator

The random generator to use for shuffling the indices. If None, a new random generator will be used.

Returns:

Type Description
list[int]

A list where each index is mapped to the new index of the tile after shuffling.
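Illustrative example (a minimal sketch; the exact permutation depends on the generator's seed):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import shuffle_tiles_within_shape_groups
>>> shape_groups = {(50, 50): [0, 1, 2, 3]}
>>> mapping = shuffle_tiles_within_shape_groups(shape_groups, np.random.default_rng(0))
>>> # mapping is a permutation of [0, 1, 2, 3]; indices are only shuffled within their shape group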

Source code in albumentations/augmentations/geometric/functional.py
Python
def shuffle_tiles_within_shape_groups(
     shape_groups: dict[tuple[int, int], list[int]],
     random_generator: np.random.Generator,
 ) -> list[int]:
@@ -2100,7 +2169,7 @@
             mapping[old] = new
 
     return mapping
-

+

def split_uniform_grid (image_shape, grid, random_generator) [view source on GitHub]¶

Splits an image shape into a uniform grid specified by the grid dimensions.

Parameters:

Name Type Description
image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns).

random_generator np.random.Generator

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description
np.ndarray

An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).

Note

The function uses generate_shuffled_splits to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.
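Illustrative example (a minimal sketch; with random_generator=None the splits are even and not shuffled, per the parameter description above):

Python
>>> from albumentations.augmentations.geometric.functional import split_uniform_grid
>>> tiles = split_uniform_grid((100, 100), (2, 2), random_generator=None)
>>> # four 50x50 tiles as (start_y, start_x, end_y, end_x): [0, 0, 50, 50], [0, 50, 50, 100], [50, 0, 100, 50], [50, 50, 100, 100]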

Source code in albumentations/augmentations/geometric/functional.py
Python
def split_uniform_grid(
     image_shape: tuple[int, int],
     grid: tuple[int, int],
     random_generator: np.random.Generator,
@@ -2141,7 +2210,7 @@
     ]
 
     return np.array(tiles, dtype=np.int16)
-

+

def swap_tiles_on_image (image, tiles, mapping=None) [view source on GitHub]¶

Swap tiles on the image according to the provided tile mapping.

Parameters:

Name Type Description
image np.ndarray

Input image.

tiles np.ndarray

Array of tiles with each tile as [start_y, start_x, end_y, end_x].

mapping list[int] | None

list of new tile indices.

Returns:

Type Description
np.ndarray

Output image with tiles swapped according to the provided mapping.
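Illustrative example (a minimal sketch; the tiles and the mapping [1, 0, 3, 2] are hypothetical and simply swap the quadrants pairwise):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import swap_tiles_on_image
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> tiles = np.array([[0, 0, 50, 50], [0, 50, 50, 100], [50, 0, 100, 50], [50, 50, 100, 100]])
>>> swapped = swap_tiles_on_image(image, tiles, mapping=[1, 0, 3, 2])
>>> # the two top quadrants trade places, as do the two bottom quadrants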

Source code in albumentations/augmentations/geometric/functional.py
Python
def swap_tiles_on_image(
     image: np.ndarray,
     tiles: np.ndarray,
     mapping: list[int] | None = None,
@@ -2172,7 +2241,7 @@
         ]
 
     return new_image
-

+

def swap_tiles_on_keypoints (keypoints, tiles, mapping) [view source on GitHub]¶

Swap the positions of keypoints based on a tile mapping.

This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.

Parameters:

Name Type Description
keypoints np.ndarray

A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

tiles np.ndarray

A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.

mapping np.ndarray

A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with.

Returns:

Type Description
np.ndarray

A 2D numpy array of the same shape as the input keypoints, containing the new positions of the keypoints after the tile swap.

Exceptions:

Type Description
RuntimeWarning

If any keypoint is not found within any tile.

Notes

  • Keypoints that do not fall within any tile will remain unchanged.
  • The function assumes that the tiles do not overlap and cover the entire image space.
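Illustrative example (a minimal sketch mirroring the hypothetical quadrant swap used for swap_tiles_on_image; keypoints are relocated together with their tiles):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import swap_tiles_on_keypoints
>>> keypoints = np.array([[10.0, 10.0], [75.0, 75.0]])
>>> tiles = np.array([[0, 0, 50, 50], [0, 50, 50, 100], [50, 0, 100, 50], [50, 50, 100, 100]])
>>> new_keypoints = swap_tiles_on_keypoints(keypoints, tiles, np.array([1, 0, 3, 2]))
>>> # each keypoint keeps its offset inside its tile but moves to the swapped tile's location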
Source code in albumentations/augmentations/geometric/functional.py
Python
def swap_tiles_on_keypoints(
     keypoints: np.ndarray,
     tiles: np.ndarray,
     mapping: np.ndarray,
@@ -2244,7 +2313,7 @@
     new_keypoints[not_in_any_tile] = keypoints[not_in_any_tile]
 
     return new_keypoints
-

+

def to_distance_maps (keypoints, image_shape, inverted=False) [view source on GitHub]¶

Generate a (H,W,N) array of distance maps for N keypoints.

The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.

This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

image_shape tuple[int, int]

Shape of the image as (height, width).

inverted bool

If True, inverted distance maps are returned where each distance value d is replaced by d/(d+1), i.e. the distance maps have values in the range (0.0, 1.0] with 1.0 denoting exactly the position of the respective keypoint.

Returns:

Type Description
np.ndarray

A float32 array of shape (H, W, N) containing N distance maps for N keypoints. Each location (y, x, n) in the array denotes the euclidean distance at (y, x) to the n-th keypoint. If inverted is True, the distance d is replaced by d/(d+1). The height and width of the array match the height and width in image_shape.
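Illustrative example (a minimal sketch; with inverted=True the map equals 1.0 exactly at the keypoint location):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import to_distance_maps
>>> keypoints = np.array([[10.0, 20.0]])  # (x, y)
>>> maps = to_distance_maps(keypoints, (100, 100), inverted=True)
>>> # maps.shape == (100, 100, 1) and maps[20, 10, 0] == 1.0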

Source code in albumentations/augmentations/geometric/functional.py
Python
def to_distance_maps(
     keypoints: np.ndarray,
     image_shape: tuple[int, int],
     inverted: bool = False,
@@ -2292,7 +2361,7 @@
     if inverted:
         return (1 / (distances + 1)).astype(np.float32)
     return distances.astype(np.float32)
-

+

def tps_transform (target_points, control_points, nonlinear_weights, affine_weights) [view source on GitHub]¶

Apply Thin Plate Spline transformation to points.

Parameters:

Name Type Description
target_points np.ndarray

Points to transform with shape (num_targets, 2)

control_points np.ndarray

Original control points with shape (num_controls, 2)

nonlinear_weights np.ndarray

TPS kernel weights with shape (num_controls, 2)

affine_weights np.ndarray

Affine transformation weights with shape (3, 2)

Returns:

Type Description
np.ndarray

Transformed points with shape (num_targets, 2)

Note

The transformation combines:
  1. Nonlinear warping based on distances to control points
  2. Global affine transformation (scale, rotation, translation)
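Shape-only sketch (illustrative; the weight arrays here are placeholders, whereas in practice they come from solving the TPS system for matched control points):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import tps_transform
>>> target_points = np.random.rand(10, 2)
>>> control_points = np.random.rand(4, 2)
>>> nonlinear_weights = np.zeros((4, 2))
>>> affine_weights = np.zeros((3, 2))
>>> warped = tps_transform(target_points, control_points, nonlinear_weights, affine_weights)
>>> # warped.shape == (10, 2)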

Source code in albumentations/augmentations/geometric/functional.py
Python
def tps_transform(
     target_points: np.ndarray,
     control_points: np.ndarray,
     nonlinear_weights: np.ndarray,
@@ -2329,7 +2398,7 @@
 
     # Combine nonlinear and affine transformations
     return kernel_matrix @ nonlinear_weights + affine_terms @ affine_weights
-

+

def transpose (img) [view source on GitHub]¶

Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.

Parameters:

Name Type Description
img np.ndarray

Input array.

Returns:

Type Description
np.ndarray

Transposed array.
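Illustrative example (a minimal sketch; the first two dimensions are swapped, channels are preserved):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import transpose
>>> img = np.random.randint(0, 256, (100, 200, 3), dtype=np.uint8)
>>> transpose(img).shape
(200, 100, 3)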

Source code in albumentations/augmentations/geometric/functional.py
Python
def transpose(img: np.ndarray) -> np.ndarray:
     """Transposes the first two dimensions of an array of any dimensionality.
     Retains the order of any additional dimensions.
 
@@ -2345,7 +2414,7 @@
 
     # Transpose the array using the new axes order
     return img.transpose(new_axes)
-

+

def validate_bboxes (bboxes, image_shape) [view source on GitHub]¶

Validate bounding boxes and remove invalid ones.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description
np.ndarray

Array of valid bounding boxes, potentially with fewer boxes than the input.

Examples:

Python
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])
 >>> valid_bboxes = validate_bboxes(bboxes, (100, 100))
 >>> print(valid_bboxes)
 [[10 20 30 40]]
@@ -2372,7 +2441,7 @@
     valid_indices = (x_max > 0) & (y_max > 0) & (x_min < cols) & (y_min < rows)
 
     return bboxes[valid_indices]
-

+

def validate_if_not_found_coords (if_not_found_coords) [view source on GitHub]¶

Validate and process if_not_found_coords parameter.

Source code in albumentations/augmentations/geometric/functional.py
Python
def validate_if_not_found_coords(
     if_not_found_coords: Sequence[int] | dict[str, Any] | None,
 ) -> tuple[bool, float, float]:
     """Validate and process `if_not_found_coords` parameter."""
@@ -2388,7 +2457,7 @@
 
     msg = "Expected if_not_found_coords to be None, tuple, list, or dict."
     raise ValueError(msg)
-

+

def validate_keypoints (keypoints, image_shape) [view source on GitHub]¶

Validate keypoints and remove those that fall outside the image boundaries.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description
np.ndarray

Array of valid keypoints that fall within the image boundaries.

Note

This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.
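Illustrative example (a minimal sketch; keypoints whose (x, y) falls outside the image are dropped, extra columns are kept for the survivors):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import validate_keypoints
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0], [150.0, 50.0, 0.0, 0.0, 1.0]])
>>> valid = validate_keypoints(keypoints, (100, 100))
>>> # only the first keypoint remains; x=150 lies outside the 100-pixel-wide image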

Source code in albumentations/augmentations/geometric/functional.py
Python
def validate_keypoints(
     keypoints: np.ndarray,
     image_shape: tuple[int, int],
 ) -> np.ndarray:
diff --git a/docs/api_reference/augmentations/geometric/rotate/index.html b/docs/api_reference/augmentations/geometric/rotate/index.html
index 91c8c6bb..f96e35d6 100644
--- a/docs/api_reference/augmentations/geometric/rotate/index.html
+++ b/docs/api_reference/augmentations/geometric/rotate/index.html
@@ -22,7 +22,7 @@
 
     _targets = ALL_TARGETS
 
-    def apply(self, img: np.ndarray, factor: int, **params: Any) -> np.ndarray:
+    def apply(self, img: np.ndarray, factor: Literal[0, 1, 2, 3], **params: Any) -> np.ndarray:
         return fgeometric.rot90(img, factor)
 
     def get_params(self) -> dict[str, int]:
@@ -32,7 +32,7 @@
     def apply_to_bboxes(
         self,
         bboxes: np.ndarray,
-        factor: int,
+        factor: Literal[0, 1, 2, 3],
         **params: Any,
     ) -> np.ndarray:
         return fgeometric.bboxes_rot90(bboxes, factor)
@@ -40,7 +40,7 @@
     def apply_to_keypoints(
         self,
         keypoints: np.ndarray,
-        factor: int,
+        factor: Literal[0, 1, 2, 3],
         **params: Any,
     ) -> np.ndarray:
         return fgeometric.keypoints_rot90(keypoints, factor, params["shape"])
diff --git a/docs/api_reference/augmentations/geometric/transforms/index.html b/docs/api_reference/augmentations/geometric/transforms/index.html
index a0bef437..cc6284f9 100644
--- a/docs/api_reference/augmentations/geometric/transforms/index.html
+++ b/docs/api_reference/augmentations/geometric/transforms/index.html
@@ -1111,7 +1111,7 @@
 
     def get_transform_init_args_names(self) -> tuple[str, ...]:
         return "num_grid_xy", "magnitude", "interpolation", "mask_interpolation"
-

+

class HorizontalFlip [view source on GitHub] ¶

Flip the input horizontally around the y-axis.

Parameters:

Name Type Description
p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool
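A minimal pipeline sketch (illustrative; the transform is normally used inside A.Compose):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([A.HorizontalFlip(p=1.0)])
>>> flipped = transform(image=image)["image"]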

Source code in albumentations/augmentations/geometric/transforms.py
Python
class HorizontalFlip(DualTransform):
     """Flip the input horizontally around the y-axis.
 
     Args:
@@ -1679,173 +1679,172 @@
     class InitSchema(BaseTransformInitSchema):
         scale: NonNegativeFloatRangeType
         keep_size: bool
-        pad_mode: BorderModeType | None = Field(
-            deprecated="Deprecated use border_mode instead",
-        )
-        pad_val: ColorType | None = Field(deprecated="Deprecated use fill instead")
-        mask_pad_val: ColorType | None = Field(
-            deprecated="Deprecated use fill_mask instead",
-        )
-        fit_output: bool
-        interpolation: InterpolationType
-        mask_interpolation: InterpolationType
-        fill: ColorType
-        fill_mask: ColorType
-        border_mode: BorderModeType
-
-        @model_validator(mode="after")
-        def validate_deprecated_fields(self) -> Self:
-            if self.pad_mode is not None:
-                self.border_mode = self.pad_mode
-            if self.pad_val is not None:
-                self.fill = self.pad_val
-            if self.mask_pad_val is not None:
-                self.fill_mask = self.mask_pad_val
-            return self
-
-    def __init__(
-        self,
-        scale: ScaleFloatType = (0.05, 0.1),
-        keep_size: bool = True,
-        pad_mode: int | None = None,
-        pad_val: ColorType | None = None,
-        mask_pad_val: ColorType | None = None,
-        fit_output: bool = False,
-        interpolation: int = cv2.INTER_LINEAR,
-        mask_interpolation: int = cv2.INTER_NEAREST,
-        border_mode: int = cv2.BORDER_CONSTANT,
-        fill: ColorType = 0,
-        fill_mask: ColorType = 0,
-        p: float = 0.5,
-        always_apply: bool | None = None,
-    ):
-        super().__init__(p, always_apply=always_apply)
-        self.scale = cast(tuple[float, float], scale)
-        self.keep_size = keep_size
-        self.border_mode = border_mode
-        self.fill = fill
-        self.fill_mask = fill_mask
-        self.fit_output = fit_output
-        self.interpolation = interpolation
-        self.mask_interpolation = mask_interpolation
-
-    def apply(
-        self,
-        img: np.ndarray,
-        matrix: np.ndarray,
-        max_height: int,
-        max_width: int,
-        **params: Any,
-    ) -> np.ndarray:
-        return fgeometric.perspective(
-            img,
-            matrix,
-            max_width,
-            max_height,
-            self.fill,
-            self.border_mode,
-            self.keep_size,
-            self.interpolation,
-        )
-
-    def apply_to_mask(
-        self,
-        mask: np.ndarray,
-        matrix: np.ndarray,
-        max_height: int,
-        max_width: int,
-        **params: Any,
-    ) -> np.ndarray:
-        return fgeometric.perspective(
-            mask,
-            matrix,
-            max_width,
-            max_height,
-            self.fill_mask,
-            self.border_mode,
-            self.keep_size,
-            self.mask_interpolation,
-        )
-
-    def apply_to_bboxes(
-        self,
-        bboxes: np.ndarray,
-        matrix_bbox: np.ndarray,
-        max_height: int,
-        max_width: int,
-        **params: Any,
-    ) -> np.ndarray:
-        return fgeometric.perspective_bboxes(
-            bboxes,
-            params["shape"],
-            matrix_bbox,
-            max_width,
-            max_height,
-            self.keep_size,
-        )
-
-    def apply_to_keypoints(
-        self,
-        keypoints: np.ndarray,
-        matrix: np.ndarray,
-        max_height: int,
-        max_width: int,
-        **params: Any,
-    ) -> np.ndarray:
-        return fgeometric.perspective_keypoints(
-            keypoints,
-            params["shape"],
-            matrix,
-            max_width,
-            max_height,
-            self.keep_size,
-        )
-
-    def get_params_dependent_on_data(
-        self,
-        params: dict[str, Any],
-        data: dict[str, Any],
-    ) -> dict[str, Any]:
-        image_shape = params["shape"][:2]
-
-        scale = self.py_random.uniform(*self.scale)
-
-        points = fgeometric.generate_perspective_points(
-            image_shape,
-            scale,
-            self.random_generator,
-        )
-        points = fgeometric.order_points(points)
-
-        matrix, max_width, max_height = fgeometric.compute_perspective_params(
-            points,
-            image_shape,
-        )
-
-        if self.fit_output:
-            matrix, max_width, max_height = fgeometric.expand_transform(
-                matrix,
-                image_shape,
-            )
-
-        return {
-            "matrix": matrix,
-            "max_height": max_height,
-            "max_width": max_width,
-            "matrix_bbox": matrix,
-        }
-
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return (
-            "scale",
-            "keep_size",
-            "border_mode",
-            "fill",
-            "fill_mask",
-            "fit_output",
-            "interpolation",
-            "mask_interpolation",
-        )
+        pad_mode: BorderModeType | None
+        pad_val: ColorType | None
+        mask_pad_val: ColorType | None
+        fit_output: bool
+        interpolation: InterpolationType
+        mask_interpolation: InterpolationType
+        fill: ColorType
+        fill_mask: ColorType
+        border_mode: BorderModeType
+
+        @model_validator(mode="after")
+        def validate_deprecated_fields(self) -> Self:
+            if self.pad_mode is not None:
+                warn("pad_mode is deprecated, use border_mode instead", DeprecationWarning, stacklevel=2)
+                self.border_mode = self.pad_mode
+            if self.pad_val is not None:
+                warn("pad_val is deprecated, use fill instead", DeprecationWarning, stacklevel=2)
+                self.fill = self.pad_val
+            if self.mask_pad_val is not None:
+                warn("mask_pad_val is deprecated, use fill_mask instead", DeprecationWarning, stacklevel=2)
+                self.fill_mask = self.mask_pad_val
+            return self
+
+    def __init__(
+        self,
+        scale: ScaleFloatType = (0.05, 0.1),
+        keep_size: bool = True,
+        pad_mode: int | None = None,
+        pad_val: ColorType | None = None,
+        mask_pad_val: ColorType | None = None,
+        fit_output: bool = False,
+        interpolation: int = cv2.INTER_LINEAR,
+        mask_interpolation: int = cv2.INTER_NEAREST,
+        border_mode: int = cv2.BORDER_CONSTANT,
+        fill: ColorType = 0,
+        fill_mask: ColorType = 0,
+        p: float = 0.5,
+        always_apply: bool | None = None,
+    ):
+        super().__init__(p, always_apply=always_apply)
+        self.scale = cast(tuple[float, float], scale)
+        self.keep_size = keep_size
+        self.border_mode = border_mode
+        self.fill = fill
+        self.fill_mask = fill_mask
+        self.fit_output = fit_output
+        self.interpolation = interpolation
+        self.mask_interpolation = mask_interpolation
+
+    def apply(
+        self,
+        img: np.ndarray,
+        matrix: np.ndarray,
+        max_height: int,
+        max_width: int,
+        **params: Any,
+    ) -> np.ndarray:
+        return fgeometric.perspective(
+            img,
+            matrix,
+            max_width,
+            max_height,
+            self.fill,
+            self.border_mode,
+            self.keep_size,
+            self.interpolation,
+        )
+
+    def apply_to_mask(
+        self,
+        mask: np.ndarray,
+        matrix: np.ndarray,
+        max_height: int,
+        max_width: int,
+        **params: Any,
+    ) -> np.ndarray:
+        return fgeometric.perspective(
+            mask,
+            matrix,
+            max_width,
+            max_height,
+            self.fill_mask,
+            self.border_mode,
+            self.keep_size,
+            self.mask_interpolation,
+        )
+
+    def apply_to_bboxes(
+        self,
+        bboxes: np.ndarray,
+        matrix_bbox: np.ndarray,
+        max_height: int,
+        max_width: int,
+        **params: Any,
+    ) -> np.ndarray:
+        return fgeometric.perspective_bboxes(
+            bboxes,
+            params["shape"],
+            matrix_bbox,
+            max_width,
+            max_height,
+            self.keep_size,
+        )
+
+    def apply_to_keypoints(
+        self,
+        keypoints: np.ndarray,
+        matrix: np.ndarray,
+        max_height: int,
+        max_width: int,
+        **params: Any,
+    ) -> np.ndarray:
+        return fgeometric.perspective_keypoints(
+            keypoints,
+            params["shape"],
+            matrix,
+            max_width,
+            max_height,
+            self.keep_size,
+        )
+
+    def get_params_dependent_on_data(
+        self,
+        params: dict[str, Any],
+        data: dict[str, Any],
+    ) -> dict[str, Any]:
+        image_shape = params["shape"][:2]
+
+        scale = self.py_random.uniform(*self.scale)
+
+        points = fgeometric.generate_perspective_points(
+            image_shape,
+            scale,
+            self.random_generator,
+        )
+        points = fgeometric.order_points(points)
+
+        matrix, max_width, max_height = fgeometric.compute_perspective_params(
+            points,
+            image_shape,
+        )
+
+        if self.fit_output:
+            matrix, max_width, max_height = fgeometric.expand_transform(
+                matrix,
+                image_shape,
+            )
+
+        return {
+            "matrix": matrix,
+            "max_height": max_height,
+            "max_width": max_width,
+            "matrix_bbox": matrix,
+        }
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return (
+            "scale",
+            "keep_size",
+            "border_mode",
+            "fill",
+            "fill_mask",
+            "fit_output",
+            "interpolation",
+            "mask_interpolation",
+        )
 

class PiecewiseAffine (scale=(0.03, 0.05), nb_rows=(4, 4), nb_cols=(4, 4), interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, absolute_scale=False, p=0.5, always_apply=None, keypoints_threshold=0.01) [view source on GitHub] ¶

Apply piecewise affine transformations to the input image.

This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points around via affine transformations. This leads to local distortions in the image.

Parameters:

Name Type Description
scale tuple[float, float] | float

Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Recommended values are in the range (0.01, 0.05) for small distortions, and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).

nb_rows tuple[int, int] | int

Number of rows of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of rows. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

nb_cols tuple[int, int] | int

Number of columns of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of columns. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

absolute_scale bool

If set to True, the value of the scale parameter will be treated as an absolute pixel value. If set to False, it will be treated as a fraction of the image height and width. Default: False.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This augmentation is very slow. Consider using ElasticTransform instead, which is at least 10x faster.
  • The augmentation may not always produce visible effects, especially with small scale values.
  • For keypoints and bounding boxes, the transformation might move them outside the image boundaries. In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
@@ -2209,41 +2208,41 @@
     _targets = ALL_TARGETS
 
     class InitSchema(BaseTransformInitSchema):
-        shift_limit: SymmetricRangeType = (-0.0625, 0.0625)
-        scale_limit: SymmetricRangeType = (-0.1, 0.1)
-        rotate_limit: SymmetricRangeType = (-45, 45)
-        interpolation: InterpolationType = cv2.INTER_LINEAR
-        border_mode: BorderModeType = cv2.BORDER_REFLECT_101
+        shift_limit: SymmetricRangeType
+        scale_limit: SymmetricRangeType
+        rotate_limit: SymmetricRangeType
+        interpolation: InterpolationType
+        border_mode: BorderModeType
 
-        value: ColorType | None = Field(
-            default=None,
-            deprecated="Deprecated. Use fill instead.",
-        )
-        mask_value: ColorType | None = Field(
-            default=None,
-            deprecated="Deprecated. Use fill_mask instead.",
-        )
-
-        fill: ColorType = 0
-        fill_mask: ColorType = 0
-
-        shift_limit_x: ScaleFloatType | None = Field(default=None)
-        shift_limit_y: ScaleFloatType | None = Field(default=None)
-        rotate_method: Literal["largest_box", "ellipse"] = "largest_box"
-        mask_interpolation: InterpolationType
-
-        @model_validator(mode="after")
-        def check_shift_limit(self) -> Self:
-            bounds = -1, 1
-            self.shift_limit_x = to_tuple(
-                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,
-            )
-            check_range(self.shift_limit_x, *bounds, "shift_limit_x")
-            self.shift_limit_y = to_tuple(
-                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,
-            )
-            check_range(self.shift_limit_y, *bounds, "shift_limit_y")
-
+        value: ColorType | None
+        mask_value: ColorType | None
+
+        fill: ColorType = 0
+        fill_mask: ColorType = 0
+
+        shift_limit_x: ScaleFloatType | None
+        shift_limit_y: ScaleFloatType | None
+        rotate_method: Literal["largest_box", "ellipse"]
+        mask_interpolation: InterpolationType
+
+        @model_validator(mode="after")
+        def check_shift_limit(self) -> Self:
+            bounds = -1, 1
+            self.shift_limit_x = to_tuple(
+                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,
+            )
+            check_range(self.shift_limit_x, *bounds, "shift_limit_x")
+            self.shift_limit_y = to_tuple(
+                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,
+            )
+            check_range(self.shift_limit_y, *bounds, "shift_limit_y")
+
+            if self.value is not None:
+                warn("value is deprecated, use fill instead", DeprecationWarning, stacklevel=2)
+                self.fill = self.value
+            if self.mask_value is not None:
+                warn("mask_value is deprecated, use fill_mask instead", DeprecationWarning, stacklevel=2)
+                self.fill_mask = self.mask_value
             return self
 
         @field_validator("scale_limit")
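The validator above maps the deprecated value / mask_value arguments onto fill / fill_mask. A minimal migration sketch, assuming the enclosing class is A.ShiftScaleRotate (the class header sits outside this hunk, so treat the class name as an assumption inferred from the shift_limit / scale_limit / rotate_limit fields):

Python
import albumentations as A

# Deprecated spelling: still accepted, but rewritten by the validator above and
# reported with a DeprecationWarning.
legacy = A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45,
                            value=0, mask_value=0, p=0.5)

# Preferred spelling after the deprecation.
current = A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45,
                             fill=0, fill_mask=0, p=0.5)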
@@ -2489,7 +2488,7 @@
             "num_control_points",
             *super().get_transform_init_args_names(),
         )
-

class Transpose [view source on GitHub] ¶

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details: 1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1] 2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max) 3. For keypoints with coordinates (x, y): new_keypoint = (y, x)

Examples:

Python
>>> import numpy as np
+

class Transpose [view source on GitHub] ¶

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details: 1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1] 2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max) 3. For keypoints with coordinates (x, y): new_keypoint = (y, x)
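The bookkeeping in the Mathematical Details above can be checked with plain NumPy (an independent sketch of the stated formulas, not the library's implementation):

Python
import numpy as np

image = np.zeros((100, 200, 3), dtype=np.uint8)        # (H, W, C)
transposed = np.transpose(image, (1, 0, 2))            # O[i, j, k] = I[j, i, k]
assert transposed.shape == (200, 100, 3)

bbox = (10, 20, 30, 40)                                # (x_min, y_min, x_max, y_max)
new_bbox = (bbox[1], bbox[0], bbox[3], bbox[2])        # (y_min, x_min, y_max, x_max)

keypoint = (15, 25)                                    # (x, y)
new_keypoint = (keypoint[1], keypoint[0])              # (y, x)
print(new_bbox, new_keypoint)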

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.array([
 ...     [[1, 2, 3], [4, 5, 6]],
@@ -2567,7 +2566,7 @@
 
     def get_transform_init_args_names(self) -> tuple[()]:
         return ()
-

class VerticalFlip [view source on GitHub] ¶

Flip the input vertically around the x-axis.

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details: 1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1] 2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min) 3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y) where H is the height of the image.

Examples:

Python
>>> import numpy as np
+

class VerticalFlip [view source on GitHub] ¶

Flip the input vertically around the x-axis.

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details: 1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1] 2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min) 3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y) where H is the height of the image.
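Likewise, the VerticalFlip formulas can be verified with a short NumPy sketch (independent of the library code):

Python
import numpy as np

H = 100
image = np.arange(H * 4 * 3, dtype=np.uint8).reshape(H, 4, 3)  # (H, W, C)
flipped = image[::-1, :, :]                                    # O[i, j, k] = I[H-1-i, j, k]
assert np.array_equal(flipped[0], image[H - 1])

x_min, y_min, x_max, y_max = 1, 10, 3, 30
new_bbox = (x_min, H - y_max, x_max, H - y_min)                # (1, 70, 3, 90)

x, y = 2, 10
new_keypoint = (x, H - y)                                      # (2, 90)
print(new_bbox, new_keypoint)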

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> image = np.array([
 ...     [[1, 2, 3], [4, 5, 6]],
diff --git a/docs/api_reference/augmentations/transforms3d/functional/index.html b/docs/api_reference/augmentations/transforms3d/functional/index.html
index 286e1146..f98d18e3 100644
--- a/docs/api_reference/augmentations/transforms3d/functional/index.html
+++ b/docs/api_reference/augmentations/transforms3d/functional/index.html
@@ -6,7 +6,7 @@
   .jupyter-wrapper .jp-MarkdownOutput.jp-RenderedHTMLCommon {
     font-size: 0.8rem;
   }
-    

3D (Volumetric) functional transforms (augmentations.transforms3d.functional)

def adjust_padding_by_position3d (paddings, position, py_random) [view source on GitHub]¶

Adjust padding values based on desired position for 3D data.

Parameters:

Name Type Description
paddings list[tuple[int, int]]

List of tuples containing padding pairs for each dimension [(d_front, d_back), (h_top, h_bottom), (w_left, w_right)]

position Literal['center', 'random']

Position of the image after padding. Either 'center' or 'random'

py_random Random

Random number generator

Returns:

Type Description
tuple[int, int, int, int, int, int]

Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def adjust_padding_by_position3d(
+    

3D (Volumetric) functional transforms (augmentations.transforms3d.functional)

def adjust_padding_by_position3d (paddings, position, py_random) [view source on GitHub]¶

Adjust padding values based on desired position for 3D data.

Parameters:

Name Type Description
paddings list[tuple[int, int]]

List of tuples containing padding pairs for each dimension [(d_front, d_back), (h_top, h_bottom), (w_left, w_right)]

position Literal['center', 'random']

Position of the image after padding. Either 'center' or 'random'

py_random Random

Random number generator

Returns:

Type Description
tuple[int, int, int, int, int, int]

Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def adjust_padding_by_position3d(
     paddings: list[tuple[int, int]],  # [(front, back), (top, bottom), (left, right)]
     position: Literal["center", "random"],
     py_random: random.Random,
@@ -44,7 +44,7 @@
         py_random.randint(0, w_pad),  # w_left
         w_pad - py_random.randint(0, w_pad),  # w_right
     )
-

def crop3d (volume, crop_coords) [view source on GitHub]¶

Crop 3D volume using coordinates.

Parameters:

Name Type Description
volume ndarray

Input volume with shape (z, y, x) or (z, y, x, channels)

crop_coords tuple[int, int, int, int, int, int]

Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping

Returns:

Type Description
ndarray

Cropped volume with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def crop3d(
+

def crop3d (volume, crop_coords) [view source on GitHub]¶

Crop 3D volume using coordinates.

Parameters:

Name Type Description
volume ndarray

Input volume with shape (z, y, x) or (z, y, x, channels)

crop_coords tuple[int, int, int, int, int, int]

Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping

Returns:

Type Description
ndarray

Cropped volume with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def crop3d(
     volume: np.ndarray,
     crop_coords: tuple[int, int, int, int, int, int],
 ) -> np.ndarray:
@@ -60,51 +60,110 @@
     z_min, z_max, y_min, y_max, x_min, x_max = crop_coords
 
     return volume[z_min:z_max, y_min:y_max, x_min:x_max]
-

def cutout3d (volume, holes, fill_value) [view source on GitHub]¶

Cut out holes in 3D volume and fill them with a given value.

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def cutout3d(volume: np.ndarray, holes: np.ndarray, fill_value: ColorType) -> np.ndarray:
+

def cutout3d (volume, holes, fill_value) [view source on GitHub]¶

Cut out holes in 3D volume and fill them with a given value.

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def cutout3d(volume: np.ndarray, holes: np.ndarray, fill_value: ColorType) -> np.ndarray:
     """Cut out holes in 3D volume and fill them with a given value."""
     volume = volume.copy()
     for z1, y1, x1, z2, y2, x2 in holes:
         volume[z1:z2, y1:y2, x1:x2] = fill_value
     return volume
-

def pad_3d_with_params (volume, padding, value) [view source on GitHub]¶

Pad 3D image with given parameters.

Parameters:

Name Type Description
volume ndarray

Input volume with shape (depth, height, width) or (depth, height, width, channels)

padding tuple[int, int, int, int, int, int]

Padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)

value Union[float, collections.abc.Sequence[float]]

Padding value

Returns:

Type Description
ndarray

Padded image with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def pad_3d_with_params(
+

def filter_keypoints_in_holes3d (keypoints, holes) [view source on GitHub]¶

Filter out keypoints that are inside any of the 3D holes.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (num_keypoints, 3+). The first three columns are x, y, z coordinates.

holes np.ndarray

Array of holes with shape (num_holes, 6). Each hole is represented as [z1, y1, x1, z2, y2, x2].

Returns:

Type Description
np.ndarray

Array of keypoints that are not inside any hole.

Source code in albumentations/augmentations/transforms3d/functional.py
Python
@handle_empty_array("keypoints")
+def filter_keypoints_in_holes3d(keypoints: np.ndarray, holes: np.ndarray) -> np.ndarray:
+    """Filter out keypoints that are inside any of the 3D holes.
+
+    Args:
+        keypoints (np.ndarray): Array of keypoints with shape (num_keypoints, 3+).
+                               The first three columns are x, y, z coordinates.
+        holes (np.ndarray): Array of holes with shape (num_holes, 6).
+                           Each hole is represented as [z1, y1, x1, z2, y2, x2].
+
+    Returns:
+        np.ndarray: Array of keypoints that are not inside any hole.
+    """
+    if holes.size == 0:
+        return keypoints
+
+    # Broadcast keypoints and holes for vectorized comparison
+    # Convert keypoints from XYZ to ZYX for comparison with holes
+    kp_z = keypoints[:, 2][:, np.newaxis]  # Shape: (num_keypoints, 1)
+    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)
+    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)
+
+    # Extract hole coordinates (in ZYX order)
+    hole_z1 = holes[:, 0]  # Shape: (num_holes,)
+    hole_y1 = holes[:, 1]
+    hole_x1 = holes[:, 2]
+    hole_z2 = holes[:, 3]
+    hole_y2 = holes[:, 4]
+    hole_x2 = holes[:, 5]
+
+    # Check if each keypoint is inside each hole
+    inside_hole = (
+        (kp_z >= hole_z1)
+        & (kp_z < hole_z2)
+        & (kp_y >= hole_y1)
+        & (kp_y < hole_y2)
+        & (kp_x >= hole_x1)
+        & (kp_x < hole_x2)
+    )
+
+    # A keypoint is valid if it's not inside any hole
+    valid_keypoints = ~np.any(inside_hole, axis=1)
+
+    # Return filtered keypoints with same dtype as input
+    result = keypoints[valid_keypoints]
+    if len(result) == 0:
+        # Ensure empty result has correct shape and dtype
+        return np.array([], dtype=keypoints.dtype).reshape(0, keypoints.shape[1])
+    return result
+
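A short usage sketch for the new helper (the import path is taken from the "Source code in albumentations/augmentations/transforms3d/functional.py" note above; numbers are illustrative):

Python
import numpy as np
from albumentations.augmentations.transforms3d import functional as f3d

# Keypoints are (x, y, z); holes are (z1, y1, x1, z2, y2, x2).
keypoints = np.array([[5.0, 5.0, 5.0],      # inside the hole -> removed
                      [50.0, 50.0, 5.0]])   # outside in y and x -> kept
holes = np.array([[0, 0, 0, 10, 10, 10]])

kept = f3d.filter_keypoints_in_holes3d(keypoints, holes)
print(kept)  # approximately [[50. 50.  5.]]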

def pad_3d_with_params (volume, padding, value) [view source on GitHub]¶

Pad 3D volume with given parameters.

Parameters:

Name Type Description
volume ndarray

Input volume with shape (depth, height, width) or (depth, height, width, channels)

padding tuple[int, int, int, int, int, int]

Padding values in format: (depth_front, depth_back, height_top, height_bottom, width_left, width_right) where: - depth_front/back: padding at start/end of depth axis (z) - height_top/bottom: padding at start/end of height axis (y) - width_left/right: padding at start/end of width axis (x)

value Union[float, collections.abc.Sequence[float]]

Value to fill the padding

Returns:

Type Description
ndarray

Padded volume with same number of dimensions as input

Note

The padding order matches the volume dimensions (depth, height, width). For each dimension, the first value is padding at the start (smaller indices), and the second value is padding at the end (larger indices).

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def pad_3d_with_params(
     volume: np.ndarray,
-    padding: tuple[int, int, int, int, int, int],  # (d_front, d_back, h_top, h_bottom, w_left, w_right)
+    padding: tuple[int, int, int, int, int, int],
     value: ColorType,
 ) -> np.ndarray:
-    """Pad 3D image with given parameters.
+    """Pad 3D volume with given parameters.
 
     Args:
         volume: Input volume with shape (depth, height, width) or (depth, height, width, channels)
-        padding: Padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)
-        value: Padding value
-
-    Returns:
-        Padded image with same number of dimensions as input
-    """
-    d_front, d_back, h_top, h_bottom, w_left, w_right = padding
+        padding: Padding values in format:
+            (depth_front, depth_back, height_top, height_bottom, width_left, width_right)
+            where:
+            - depth_front/back: padding at start/end of depth axis (z)
+            - height_top/bottom: padding at start/end of height axis (y)
+            - width_left/right: padding at start/end of width axis (x)
+        value: Value to fill the padding
 
-    # Skip if no padding is needed
-    if d_front == d_back == h_top == h_bottom == w_left == w_right == 0:
-        return volume
-
-    # Handle both 3D and 4D arrays
-    pad_width = [
-        (d_front, d_back),  # depth padding
-        (h_top, h_bottom),  # height padding
-        (w_left, w_right),  # width padding
-    ]
-
-    # Add channel padding if 4D array
-    if volume.ndim == NUM_VOLUME_DIMENSIONS:
-        pad_width.append((0, 0))  # no padding for channels
-
-    return np.pad(
-        volume,
-        pad_width=pad_width,
-        mode="constant",
-        constant_values=value,
-    )
-

def transform_cube (cube, index) [view source on GitHub]¶

Transform cube by index (0-47)

Parameters:

Name Type Description
cube ndarray

Input array with shape (D, H, W) or (D, H, W, C)

index int

Integer from 0 to 47 specifying which transformation to apply

Returns:

Type Description
ndarray

Transformed cube with same shape as input

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def transform_cube(cube: np.ndarray, index: int) -> np.ndarray:
+    Returns:
+        Padded volume with same number of dimensions as input
+
+    Note:
+        The padding order matches the volume dimensions (depth, height, width).
+        For each dimension, the first value is padding at the start (smaller indices),
+        and the second value is padding at the end (larger indices).
+    """
+    depth_front, depth_back, height_top, height_bottom, width_left, width_right = padding
+
+    # Skip if no padding is needed
+    if all(p == 0 for p in padding):
+        return volume
+
+    # Handle both 3D and 4D arrays
+    pad_width = [
+        (depth_front, depth_back),  # depth (z) padding
+        (height_top, height_bottom),  # height (y) padding
+        (width_left, width_right),  # width (x) padding
+    ]
+
+    # Add channel padding if 4D array
+    if volume.ndim == NUM_VOLUME_DIMENSIONS:
+        pad_width.append((0, 0))  # no padding for channels
+
+    return np.pad(
+        volume,
+        pad_width=pad_width,
+        mode="constant",
+        constant_values=value,
+    )
+
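And a matching sketch for pad_3d_with_params, showing how the six padding values map onto the output shape (same module as above; values are illustrative):

Python
import numpy as np
from albumentations.augmentations.transforms3d import functional as f3d

volume = np.zeros((2, 3, 4), dtype=np.uint8)  # (depth, height, width)
padded = f3d.pad_3d_with_params(
    volume,
    padding=(1, 1, 0, 0, 2, 2),  # (d_front, d_back, h_top, h_bottom, w_left, w_right)
    value=0,
)
print(padded.shape)  # (4, 3, 8): depth 2+1+1, height 3+0+0, width 4+2+2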

def transform_cube (cube, index) [view source on GitHub]¶

Transform cube by index (0-47)

Parameters:

Name Type Description
cube ndarray

Input array with shape (D, H, W) or (D, H, W, C)

index int

Integer from 0 to 47 specifying which transformation to apply

Returns:

Type Description
ndarray

Transformed cube with same shape as input

Source code in albumentations/augmentations/transforms3d/functional.py
Python
def transform_cube(cube: np.ndarray, index: int) -> np.ndarray:
     """Transform cube by index (0-47)
 
     Args:
diff --git a/docs/api_reference/augmentations/transforms3d/transforms/index.html b/docs/api_reference/augmentations/transforms3d/transforms/index.html
index 8d935c19..c1666bae 100644
--- a/docs/api_reference/augmentations/transforms3d/transforms/index.html
+++ b/docs/api_reference/augmentations/transforms3d/transforms/index.html
@@ -9,7 +9,7 @@
     

3D (Volumetric) transforms (augmentations.transforms3d.transforms)

class BaseCropAndPad3D (pad_if_needed, fill, fill_mask, pad_position, p=1.0, always_apply=None) [view source on GitHub] ¶

Base class for 3D transforms that need both cropping and padding.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class BaseCropAndPad3D(Transform3D):
     """Base class for 3D transforms that need both cropping and padding."""
 
-    _targets = (Targets.VOLUME, Targets.MASK3D)
+    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)
 
     class InitSchema(Transform3D.InitSchema):
         pad_if_needed: bool
@@ -142,10 +142,42 @@
             )
 
         return cropped
+
+    def apply_to_keypoints(
+        self,
+        keypoints: np.ndarray,
+        crop_coords: tuple[int, int, int, int, int, int],
+        pad_params: dict[str, int] | None,
+        **params: Any,
+    ) -> np.ndarray:
+        # Extract crop start coordinates (z1,y1,x1)
+        crop_z1, _, crop_y1, _, crop_x1, _ = crop_coords
+
+        # Initialize shift vector with negative crop coordinates
+        shift = np.array(
+            [
+                -crop_x1,  # X shift
+                -crop_y1,  # Y shift
+                -crop_z1,  # Z shift
+            ],
+        )
+
+        # Add padding shift if needed
+        if pad_params is not None:
+            shift += np.array(
+                [
+                    pad_params["pad_left"],  # X shift
+                    pad_params["pad_top"],  # Y shift
+                    pad_params["pad_front"],  # Z shift
+                ],
+            )
+
+        # Apply combined shift
+        return fgeometric.shift_keypoints(keypoints, shift)
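The combined crop-then-pad adjustment above amounts to adding a single (x, y, z) offset to every keypoint. A plain-NumPy illustration with hypothetical numbers (this only demonstrates the arithmetic; it does not call the library):

Python
import numpy as np

# Hypothetical crop start (z1, y1, x1) = (2, 4, 6) followed by padding
# (front, top, left) = (1, 0, 3).
shift = np.array([-6 + 3, -4 + 0, -2 + 1])        # (x, y, z) offset = [-3, -4, -1]
keypoints_xyz = np.array([[10.0, 10.0, 10.0]])
print(keypoints_xyz + shift)                      # [[7. 6. 9.]]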
 

class BasePad3D (fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Base class for 3D padding transforms.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class BasePad3D(Transform3D):
     """Base class for 3D padding transforms."""
 
-    _targets = (Targets.VOLUME, Targets.MASK3D)
+    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)
 
     class InitSchema(Transform3D.InitSchema):
         fill: ColorType
@@ -189,7 +221,12 @@
             padding=padding,
             value=cast(ColorType, self.fill_mask),
         )
-

class CenterCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Crop the center of 3D volume.

Parameters:

Name Type Description
size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

If you want to perform cropping only in the XY plane while preserving all slices along the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class CenterCrop3D(BaseCropAndPad3D):
+
+    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:
+        padding = params["padding"]
+        shift_vector = np.array([padding[4], padding[2], padding[0]])
+        return fgeometric.shift_keypoints(keypoints, shift_vector)
+

class CenterCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Crop the center of 3D volume.

Parameters:

Name Type Description
size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

If you want to perform cropping only in the XY plane while preserving all slices along the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class CenterCrop3D(BaseCropAndPad3D):
     """Crop the center of 3D volume.
 
     Args:
@@ -200,7 +237,7 @@
         p (float): probability of applying the transform. Default: 1.0
 
     Targets:
-        volume, mask3d
+        volume, mask3d, keypoints
 
     Image types:
         uint8, float32
@@ -285,7 +322,7 @@
 
     def get_transform_init_args_names(self) -> tuple[str, ...]:
         return "size", "pad_if_needed", "fill", "fill_mask"
-

class CoarseDropout3D (num_holes_range=(1, 1), hole_depth_range=(0.1, 0.2), hole_height_range=(0.1, 0.2), hole_width_range=(0.1, 0.2), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub] ¶

CoarseDropout3D randomly drops out cuboid regions from a 3D volume and optionally, the corresponding regions in an associated 3D mask, to simulate occlusion and varied object sizes found in real-world volumetric data.

Parameters:

Name Type Description
num_holes_range tuple[int, int]

Range (min, max) for the number of cuboid regions to drop out. Default: (1, 1)

hole_depth_range tuple[float, float]

Range (min, max) for the depth of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)

hole_height_range tuple[float, float]

Range (min, max) for the height of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)

hole_width_range tuple[float, float]

Range (min, max) for the width of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)

fill ColorType

Value for the dropped voxels. Can be: - int or float: all channels are filled with this value - tuple: tuple of values for each channel Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the 3D mask. If None, mask regions corresponding to volume dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

volume, mask3d

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges.
  • All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.
  • If you want to apply dropout only in the XY plane while preserving the full depth dimension, consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout to each slice independently, effectively creating cylindrical dropout regions that extend through the entire depth of the volume.

Examples:

Python
>>> import numpy as np
+

class CoarseDropout3D (num_holes_range=(1, 1), hole_depth_range=(0.1, 0.2), hole_height_range=(0.1, 0.2), hole_width_range=(0.1, 0.2), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub] ¶

CoarseDropout3D randomly drops out cuboid regions from a 3D volume and optionally, the corresponding regions in an associated 3D mask, to simulate occlusion and varied object sizes found in real-world volumetric data.

Parameters:

Name Type Description
num_holes_range tuple[int, int]

Range (min, max) for the number of cuboid regions to drop out. Default: (1, 1)

hole_depth_range tuple[float, float]

Range (min, max) for the depth of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)

hole_height_range tuple[float, float]

Range (min, max) for the height of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)

hole_width_range tuple[float, float]

Range (min, max) for the width of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)

fill ColorType

Value for the dropped voxels. Can be: - int or float: all channels are filled with this value - tuple: tuple of values for each channel Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the 3D mask. If None, mask regions corresponding to volume dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges.
  • All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.
  • If you want to apply dropout only in the XY plane while preserving the full depth dimension, consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout to each slice independently, effectively creating cylindrical dropout regions that extend through the entire depth of the volume.

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)
 >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)
@@ -322,7 +359,7 @@
         p (float): Probability of applying the transform. Default: 0.5
 
     Targets:
-        volume, mask3d
+        volume, mask3d, keypoints
 
     Image types:
         uint8, float32
@@ -352,7 +389,7 @@
         >>> transformed_volume, transformed_mask3d = transformed["volume"], transformed["mask3d"]
     """
 
-    _targets = (Targets.VOLUME, Targets.MASK3D)
+    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)
 
     class InitSchema(Transform3D.InitSchema):
         num_holes_range: Annotated[
@@ -469,16 +506,31 @@
 
         return f3d.cutout3d(mask, holes, cast(ColorType, self.fill_mask))
 
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return (
-            "num_holes_range",
-            "hole_depth_range",
-            "hole_height_range",
-            "hole_width_range",
-            "fill",
-            "fill_mask",
-        )
-

class CubicSymmetry (p=1.0, always_apply=None) [view source on GitHub] ¶

Applies a random cubic symmetry transformation to a 3D volume.

This transform is a 3D extension of D4. While D4 handles the 8 symmetries of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube. Like D4, this transform does not create any interpolation artifacts as it only remaps voxels from one position to another without any interpolation.

The 48 transformations consist of: - 24 rotations (orientation-preserving): * the identity, 9 face-axis rotations, 8 vertex-diagonal rotations, and 6 edge-axis rotations (1 + 9 + 8 + 6 = 24) - 24 rotoreflections (orientation-reversing): * Reflection through a plane followed by any of the 24 rotations

For a cube, these transformations preserve: - All face centers (6) - All vertex positions (8) - All edge centers (12)

Works with 3D volumes and masks of shape (D, H, W) or (D, H, W, C).

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for data augmentation in 3D medical imaging, crystallography, and voxel-based 3D modeling where the object's orientation is arbitrary.
  • All transformations preserve the object's chirality (handedness) when using pure rotations (indices 0-23) and invert it when using rotoreflections (indices 24-47).

Examples:

Python
>>> import numpy as np
+    def apply_to_keypoints(
+        self,
+        keypoints: np.ndarray,
+        holes: np.ndarray,
+        **params: Any,
+    ) -> np.ndarray:
+        """Remove keypoints that fall within dropout regions."""
+        if holes.size == 0:
+            return keypoints
+        processor = cast(KeypointsProcessor, self.get_processor("keypoints"))
+
+        if processor is None or not processor.params.remove_invisible:
+            return keypoints
+        return f3d.filter_keypoints_in_holes3d(keypoints, holes)
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return (
+            "num_holes_range",
+            "hole_depth_range",
+            "hole_height_range",
+            "hole_width_range",
+            "fill",
+            "fill_mask",
+        )
+

class CubicSymmetry (p=1.0, always_apply=None) [view source on GitHub] ¶

Applies a random cubic symmetry transformation to a 3D volume.

This transform is a 3D extension of D4. While D4 handles the 8 symmetries of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube. Like D4, this transform does not create any interpolation artifacts as it only remaps voxels from one position to another without any interpolation.

The 48 transformations consist of: - 24 rotations (orientation-preserving): * the identity, 9 face-axis rotations, 8 vertex-diagonal rotations, and 6 edge-axis rotations (1 + 9 + 8 + 6 = 24) - 24 rotoreflections (orientation-reversing): * Reflection through a plane followed by any of the 24 rotations

For a cube, these transformations preserve: - All face centers (6) - All vertex positions (8) - All edge centers (12)

Works with 3D volumes and masks of shape (D, H, W) or (D, H, W, C).

Parameters:

Name Type Description
p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

  • This transform is particularly useful for data augmentation in 3D medical imaging, crystallography, and voxel-based 3D modeling where the object's orientation is arbitrary.
  • All transformations preserve the object's chirality (handedness) when using pure rotations (indices 0-23) and invert it when using rotoreflections (indices 24-47).

Examples:

Python
>>> import numpy as np
 >>> import albumentations as A
 >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)
 >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)
@@ -511,7 +563,7 @@
         p (float): Probability of applying the transform. Default: 1.0
 
     Targets:
-        volume, mask3d
+        volume, mask3d, keypoints
 
     Image types:
         uint8, float32
@@ -538,7 +590,7 @@
         - D4: The 2D version that handles the 8 symmetries of a square
     """
 
-    _targets = (Targets.VOLUME, Targets.MASK3D)
+    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)
 
     def __init__(
         self,
@@ -553,87 +605,88 @@
         data: dict[str, Any],
     ) -> dict[str, Any]:
         # Randomly select one of 48 possible transformations
-        return {"index": self.py_random.randint(0, 47)}
-
-    def apply_to_volume(self, volume: np.ndarray, index: int, **params: Any) -> np.ndarray:
-        return f3d.transform_cube(volume, index)
-
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return ()
-

class Pad3D (padding, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Pad the sides of a 3D volume by specified number of voxels.

Parameters:

Name Type Description
padding int, tuple[int, int, int] or tuple[int, int, int, int, int, int]

Padding values. Can be: * int - pad all sides by this value * tuple[int, int, int] - symmetric padding (pad_z, pad_y, pad_x) where: - pad_z: padding for depth/z-axis (front/back) - pad_y: padding for height/y-axis (top/bottom) - pad_x: padding for width/x-axis (left/right) * tuple[int, int, int, int, int, int] - explicit padding per side in order: (front, top, left, back, bottom, right) where: - front/back: padding along z-axis (depth) - top/bottom: padding along y-axis (height) - left/right: padding along x-axis (width)

fill ColorType

Padding value for image

fill_mask ColorType

Padding value for mask

p float

probability of applying the transform. Default: 1.0.

Targets

volume, mask3d

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class Pad3D(BasePad3D):
+
+        volume_shape = data["volume"].shape
+        return {"index": self.py_random.randint(0, 47), "volume_shape": volume_shape}
+
+    def apply_to_volume(self, volume: np.ndarray, index: int, **params: Any) -> np.ndarray:
+        return f3d.transform_cube(volume, index)
+
+    def apply_to_keypoints(self, keypoints: np.ndarray, index: int, **params: Any) -> np.ndarray:
+        return f3d.transform_cube_keypoints(keypoints, index, volume_shape=params["volume_shape"])
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return ()
+

class Pad3D (padding, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Pad the sides of a 3D volume by specified number of voxels.

Parameters:

Name Type Description
padding int, tuple[int, int, int] or tuple[int, int, int, int, int, int]

Padding values. Can be: * int - pad all sides by this value * tuple[int, int, int] - symmetric padding (depth, height, width) where each value is applied to both sides of the corresponding dimension * tuple[int, int, int, int, int, int] - explicit padding per side in order: (depth_front, depth_back, height_top, height_bottom, width_left, width_right)

fill ColorType

Padding value for image

fill_mask ColorType

Padding value for mask

p float

probability of applying the transform. Default: 1.0.

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class Pad3D(BasePad3D):
     """Pad the sides of a 3D volume by specified number of voxels.
 
     Args:
         padding (int, tuple[int, int, int] or tuple[int, int, int, int, int, int]): Padding values. Can be:
             * int - pad all sides by this value
-            * tuple[int, int, int] - symmetric padding (pad_z, pad_y, pad_x) where:
-                - pad_z: padding for depth/z-axis (front/back)
-                - pad_y: padding for height/y-axis (top/bottom)
-                - pad_x: padding for width/x-axis (left/right)
-            * tuple[int, int, int, int, int, int] - explicit padding per side in order:
-                (front, top, left, back, bottom, right) where:
-                - front/back: padding along z-axis (depth)
-                - top/bottom: padding along y-axis (height)
-                - left/right: padding along x-axis (width)
-        fill (ColorType): Padding value for image
-        fill_mask (ColorType): Padding value for mask
-        p (float): probability of applying the transform. Default: 1.0.
-
-    Targets:
-        volume, mask3d
-
-    Image types:
-        uint8, float32
-
-    Note:
-        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),
-        with optional channel dimension as the last axis.
-    """
-
-    class InitSchema(BasePad3D.InitSchema):
-        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int]
-
-        @field_validator("padding")
-        @classmethod
-        def validate_padding(
-            cls,
-            v: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],
-        ) -> int | tuple[int, int, int] | tuple[int, int, int, int, int, int]:
-            if isinstance(v, int) and v < 0:
-                raise ValueError("Padding value must be non-negative")
-            if isinstance(v, tuple) and not all(isinstance(i, int) and i >= 0 for i in v):
-                raise ValueError("Padding tuple must contain non-negative integers")
-
-            return v
-
-    def __init__(
-        self,
-        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],
-        fill: ColorType = 0,
-        fill_mask: ColorType = 0,
-        p: float = 1.0,
-        always_apply: bool | None = None,
-    ):
-        super().__init__(fill=fill, fill_mask=fill_mask, p=p)
-        self.padding = padding
-        self.fill = fill
-        self.fill_mask = fill_mask
-
-    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
-        if isinstance(self.padding, int):
-            pad_d = pad_h = pad_w = self.padding
-            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)
-        elif len(self.padding) == NUM_DIMENSIONS:
-            pad_d, pad_h, pad_w = self.padding  # type: ignore[misc]
-            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)
-        else:
-            padding = self.padding  # type: ignore[assignment]
-
-        return {"padding": padding}
-
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return "padding", "fill", "fill_mask"
-

class PadIfNeeded3D (min_zyx=None, pad_divisor_zyx=None, position='center', fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions. If the pad_divisor_zyx is specified, the function additionally ensures that the volume dimensions are divisible by these values.

Parameters:

Name Type Description
min_zyx tuple[int, int, int] | None

Minimum desired size as (depth, height, width). Ensures volume dimensions are at least these values. If not specified, pad_divisor_zyx must be provided.

pad_divisor_zyx tuple[int, int, int] | None

If set, pads each dimension to make it divisible by corresponding value in format (depth_div, height_div, width_div). If not specified, min_zyx must be provided.

position Literal["center", "random"]

Position where the volume is to be placed after padding. Default is 'center'.

fill ColorType

Value to fill the border voxels for volume. Default: 0

fill_mask ColorType

Value to fill the border voxels for masks. Default: 0

p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class PadIfNeeded3D(BasePad3D):
+            * tuple[int, int, int] - symmetric padding (depth, height, width) where each value
+              is applied to both sides of the corresponding dimension
+            * tuple[int, int, int, int, int, int] - explicit padding per side in order:
+              (depth_front, depth_back, height_top, height_bottom, width_left, width_right)
+
+        fill (ColorType): Padding value for image
+        fill_mask (ColorType): Padding value for mask
+        p (float): probability of applying the transform. Default: 1.0.
+
+    Targets:
+        volume, mask3d, keypoints
+
+    Image types:
+        uint8, float32
+
+    Note:
+        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),
+        with optional channel dimension as the last axis.
+    """
+
+    class InitSchema(BasePad3D.InitSchema):
+        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int]
+
+        @field_validator("padding")
+        @classmethod
+        def validate_padding(
+            cls,
+            v: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],
+        ) -> int | tuple[int, int, int] | tuple[int, int, int, int, int, int]:
+            if isinstance(v, int) and v < 0:
+                raise ValueError("Padding value must be non-negative")
+            if isinstance(v, tuple) and not all(isinstance(i, int) and i >= 0 for i in v):
+                raise ValueError("Padding tuple must contain non-negative integers")
+
+            return v
+
+    def __init__(
+        self,
+        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],
+        fill: ColorType = 0,
+        fill_mask: ColorType = 0,
+        p: float = 1.0,
+        always_apply: bool | None = None,
+    ):
+        super().__init__(fill=fill, fill_mask=fill_mask, p=p)
+        self.padding = padding
+        self.fill = fill
+        self.fill_mask = fill_mask
+
+    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:
+        if isinstance(self.padding, int):
+            pad_d = pad_h = pad_w = self.padding
+            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)
+        elif len(self.padding) == NUM_DIMENSIONS:
+            pad_d, pad_h, pad_w = self.padding  # type: ignore[misc]
+            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)
+        else:
+            padding = self.padding  # type: ignore[assignment]
+
+        return {"padding": padding}
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return "padding", "fill", "fill_mask"
+
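The three accepted padding forms normalize to the same six-value tuple, mirroring get_params_dependent_on_data above. A plain-Python sketch of that expansion (not a call into the library):

Python
def expand_padding(padding):
    """Normalize Pad3D-style padding to (d_front, d_back, h_top, h_bottom, w_left, w_right)."""
    if isinstance(padding, int):
        return (padding,) * 6
    if len(padding) == 3:                      # symmetric (depth, height, width)
        pad_d, pad_h, pad_w = padding
        return (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)
    return tuple(padding)                      # already explicit per side

print(expand_padding(3))                   # (3, 3, 3, 3, 3, 3)
print(expand_padding((1, 2, 3)))           # (1, 1, 2, 2, 3, 3)
print(expand_padding((1, 0, 2, 0, 3, 0)))  # unchanged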

class PadIfNeeded3D (min_zyx=None, pad_divisor_zyx=None, position='center', fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions. If the pad_divisor_zyx is specified, the function additionally ensures that the volume dimensions are divisible by these values.

Parameters:

Name Type Description
min_zyx tuple[int, int, int] | None

Minimum desired size as (depth, height, width). Ensures volume dimensions are at least these values. If not specified, pad_divisor_zyx must be provided.

pad_divisor_zyx tuple[int, int, int] | None

If set, pads each dimension to make it divisible by corresponding value in format (depth_div, height_div, width_div). If not specified, min_zyx must be provided.

position Literal["center", "random"]

Position where the volume is to be placed after padding. Default is 'center'.

fill ColorType

Value to fill the border voxels for volume. Default: 0

fill_mask ColorType

Value to fill the border voxels for masks. Default: 0

p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class PadIfNeeded3D(BasePad3D):
     """Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions.
     If the pad_divisor_zyx is specified, the function additionally ensures that the volume
     dimensions are divisible by these values.
@@ -652,7 +705,7 @@
         p (float): Probability of applying the transform. Default: 1.0
 
     Targets:
-        volume, mask3d
+        volume, mask3d, keypoints
 
     Image types:
         uint8, float32
@@ -722,7 +775,7 @@
             "fill",
             "fill_mask",
         )
-

class RandomCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Crop random part of 3D volume.

Parameters:

Name Type Description
size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

If you want to perform random cropping only in the XY plane while preserving all slices along the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class RandomCrop3D(BaseCropAndPad3D):
+

class RandomCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub] ¶

Crop random part of 3D volume.

Parameters:

Name Type Description
size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

If you want to perform random cropping only in the XY plane while preserving all slices along the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py
Python
class RandomCrop3D(BaseCropAndPad3D):
     """Crop random part of 3D volume.
 
     Args:
@@ -733,7 +786,7 @@
         p (float): probability of applying the transform. Default: 1.0
 
     Targets:
-        volume, mask3d
+        volume, mask3d, keypoints
 
     Image types:
         uint8, float32
diff --git a/docs/api_reference/core/bbox_utils/index.html b/docs/api_reference/core/bbox_utils/index.html
index 020ab807..1e927848 100644
--- a/docs/api_reference/core/bbox_utils/index.html
+++ b/docs/api_reference/core/bbox_utils/index.html
@@ -85,7 +85,7 @@
             f" min_visibility={self.min_visibility}, min_width={self.min_width}, min_height={self.min_height},"
             f" check_each_transform={self.check_each_transform}, clip={self.clip})"
         )
-

def bboxes_from_masks (masks) [view source on GitHub]¶

Create bounding boxes from binary masks (fast version)

Parameters:

Name Type Description
masks np.ndarray

Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask.

Returns:

Type Description
np.ndarray

An array of bounding boxes with shape (N, 4), where each row is (x_min, y_min, x_max, y_max).

Source code in albumentations/core/bbox_utils.py
Python
def bboxes_from_masks(masks: np.ndarray) -> np.ndarray:
+

def bboxes_from_masks (masks) [view source on GitHub]¶

Create bounding boxes from binary masks (fast version)

Parameters:

Name Type Description
masks np.ndarray

Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask.

Returns:

Type Description
np.ndarray

An array of bounding boxes with shape (N, 4), where each row is (x_min, y_min, x_max, y_max).

Source code in albumentations/core/bbox_utils.py
Python
def bboxes_from_masks(masks: np.ndarray) -> np.ndarray:
     """Create bounding boxes from binary masks (fast version)
 
     Args:
@@ -114,12 +114,12 @@
             bboxes[i] = [x_min, y_min, x_max + 1, y_max + 1]
 
     return bboxes
-

def calculate_bbox_areas_in_pixels (bboxes, image_shape) [view source on GitHub]¶

Calculate areas for multiple bounding boxes.

This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored.

image_shape tuple[int, int]

A tuple containing the height and width of the image (height, width).

Returns:

Type Description
np.ndarray

A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels. Returns an empty array if the input bboxes is empty.

Note

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])
+

def calculate_bbox_areas_in_pixels (bboxes, shape) [view source on GitHub]¶

Calculate areas for multiple bounding boxes.

This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored.

shape ShapeType

The image shape, providing the height and width of the image.

Returns:

Type Description
np.ndarray

A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels. Returns an empty array if the input bboxes is empty.

Note

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])
 >>> shape = {"height": 100, "width": 100}
 >>> areas = calculate_bbox_areas_in_pixels(bboxes, shape)
 >>> print(areas)
 [1600. 3600.]
-
Source code in albumentations/core/bbox_utils.py
Python
def calculate_bbox_areas_in_pixels(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:
+
Source code in albumentations/core/bbox_utils.py
Python
def calculate_bbox_areas_in_pixels(bboxes: np.ndarray, shape: ShapeType) -> np.ndarray:
     """Calculate areas for multiple bounding boxes.
 
     This function computes the areas of bounding boxes given their normalized coordinates
@@ -130,7 +130,7 @@
         bboxes (np.ndarray): A numpy array of shape (N, 4+) where N is the number of bounding boxes.
                              Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates.
                              Additional columns beyond the first 4 are ignored.
-        image_shape (tuple[int, int]): A tuple containing the height and width of the image (height, width).
+        shape (ShapeType): A tuple containing the height and width of the image (height, width).
 
     Returns:
         np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.
@@ -152,12 +152,12 @@
     if len(bboxes) == 0:
         return np.array([], dtype=np.float32)
 
-    height, width = image_shape
+    height, width = shape["height"], shape["width"]
     bboxes_denorm = bboxes.copy()
     bboxes_denorm[:, [0, 2]] *= width
     bboxes_denorm[:, [1, 3]] *= height
     return (bboxes_denorm[:, 2] - bboxes_denorm[:, 0]) * (bboxes_denorm[:, 3] - bboxes_denorm[:, 1])
-

def check_bboxes (bboxes) [view source on GitHub]¶

Check if bboxes boundaries are in range [0, 1] and minimums are less than maximums.

Parameters:

Name Type Description
bboxes np.ndarray

numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.

Exceptions:

Type Description
ValueError

If any bbox is invalid.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+

def check_bboxes (bboxes) [view source on GitHub]¶

Check if bboxes boundaries are in range 0, 1 and minimums are lesser than maximums.

Parameters:

Name Type Description
bboxes np.ndarray

numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.

Exceptions:

Type Description
ValueError

If any bbox is invalid.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
 def check_bboxes(bboxes: np.ndarray) -> None:
     """Check if bboxes boundaries are in range 0, 1 and minimums are lesser than maximums.
 
@@ -192,8 +192,8 @@
             raise ValueError(f"x_max is less than or equal to x_min for bbox {invalid_bbox}.")
 
         raise ValueError(f"y_max is less than or equal to y_min for bbox {invalid_bbox}.")
-

def clip_bboxes (bboxes, image_shape) [view source on GitHub]¶

Clips the bounding box coordinates to ensure they fit within the boundaries of an image.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (num_boxes, 4+) in normalized format. The first 4 columns are [x_min, y_min, x_max, y_max].

image_shape Tuple[int, int]

Image shape (height, width).

Returns:

Type Description
np.ndarray

The clipped bounding boxes, normalized to the image dimensions.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
-def clip_bboxes(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:
+

def clip_bboxes (bboxes, shape) [view source on GitHub]¶

Clips the bounding box coordinates to ensure they fit within the boundaries of an image.

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (num_boxes, 4+) in normalized format. The first 4 columns are [x_min, y_min, x_max, y_max].

shape ShapeType

Image shape (height, width).

Returns:

Type Description
np.ndarray

The clipped bounding boxes, normalized to the image dimensions.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+def clip_bboxes(bboxes: np.ndarray, shape: ShapeType) -> np.ndarray:
     """Clips the bounding box coordinates to ensure they fit within the boundaries of an image.
 
     Parameters:
@@ -205,10 +205,10 @@
         np.ndarray: The clipped bounding boxes, normalized to the image dimensions.
 
     """
-    height, width = image_shape[:2]
+    height, width = shape["height"], shape["width"]
 
     # Denormalize bboxes
-    denorm_bboxes = denormalize_bboxes(bboxes, image_shape)
+    denorm_bboxes = denormalize_bboxes(bboxes, shape)
 
     ## Note:
     # It could be tempting to use cols - 1 and rows - 1 as the upper bounds for the clipping
@@ -231,12 +231,12 @@
     denorm_bboxes[:, [1, 3]] = np.clip(denorm_bboxes[:, [1, 3]], 0, height, out=denorm_bboxes[:, [1, 3]])
 
     # Normalize clipped bboxes
-    return normalize_bboxes(denorm_bboxes, image_shape)
-

def convert_bboxes_from_albumentations (bboxes, target_format, image_shape, check_validity=False) [view source on GitHub]¶

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

target_format Literal['coco', 'pascal_voc', 'yolo']

Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.

image_shape tuple[int, int]

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description
np.ndarray

An array of bounding boxes in the target format with shape (num_bboxes, 4+).

Exceptions:

Type Description
ValueError

If target_format is not 'coco', 'pascal_voc' or 'yolo'.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+    return normalize_bboxes(denorm_bboxes, shape)
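A minimal sketch of clip_bboxes using the shape dict from the new signature (import path assumed from the source note above); a normalized box that spills past the right and bottom edges is clipped back to the image and re-normalized:

Python
import numpy as np
from albumentations.core.bbox_utils import clip_bboxes

bboxes = np.array([[0.5, 0.5, 1.2, 1.3]])  # extends past the image on both axes
clipped = clip_bboxes(bboxes, {"height": 100, "width": 100})
print(clipped)  # [[0.5 0.5 1.  1. ]]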
+

def convert_bboxes_from_albumentations (bboxes, target_format, shape, check_validity=False) [view source on GitHub]¶

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

target_format Literal['coco', 'pascal_voc', 'yolo']

Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.

shape ShapeType

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description
np.ndarray

An array of bounding boxes in the target format with shape (num_bboxes, 4+).

Exceptions:

Type Description
ValueError

If target_format is not 'coco', 'pascal_voc' or 'yolo'.

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
 def convert_bboxes_from_albumentations(
     bboxes: np.ndarray,
     target_format: Literal["coco", "pascal_voc", "yolo"],
-    image_shape: tuple[int, int],
+    shape: ShapeType,
     check_validity: bool = False,
 ) -> np.ndarray:
     """Convert bounding boxes from the format used by albumentations to a specified format.
@@ -245,7 +245,7 @@
         bboxes: A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+).
                 The first 4 columns are [x_min, y_min, x_max, y_max].
         target_format: Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.
-        image_shape: Image shape (height, width).
+        shape: Image shape (height, width).
         check_validity: Check if all boxes are valid boxes.
 
     Returns:
@@ -265,7 +265,7 @@
     converted_bboxes = np.zeros_like(bboxes)
     converted_bboxes[:, 4:] = bboxes[:, 4:]  # Preserve additional columns
 
-    denormalized_bboxes = denormalize_bboxes(bboxes[:, :4], image_shape) if target_format != "yolo" else bboxes[:, :4]
+    denormalized_bboxes = denormalize_bboxes(bboxes[:, :4], shape) if target_format != "yolo" else bboxes[:, :4]
 
     if target_format == "coco":
         converted_bboxes[:, 0] = denormalized_bboxes[:, 0]  # x_min
@@ -281,11 +281,11 @@
         converted_bboxes[:, :4] = denormalized_bboxes
 
     return converted_bboxes
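A minimal sketch of convert_bboxes_from_albumentations for the two pixel-based target formats, under the same import assumption; on a 100x200 image the normalized box (0.1, 0.2, 0.5, 0.6) maps to (20, 20, 100, 60) in pascal_voc and (20, 20, 80, 40) in coco:

Python
import numpy as np
from albumentations.core.bbox_utils import convert_bboxes_from_albumentations

bboxes = np.array([[0.1, 0.2, 0.5, 0.6]])  # albumentations (normalized) format
shape = {"height": 100, "width": 200}

print(convert_bboxes_from_albumentations(bboxes, "pascal_voc", shape))  # [[ 20.  20. 100.  60.]]
print(convert_bboxes_from_albumentations(bboxes, "coco", shape))        # [[20. 20. 80. 40.]]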
-

def convert_bboxes_to_albumentations (bboxes, source_format, image_shape, check_validity=False) [view source on GitHub]¶

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of (x_min, y_min, x_max, y_max) e.g. (0.15, 0.27, 0.67, 0.5).

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+).

source_format Literal['coco', 'pascal_voc', 'yolo']

Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.

image_shape tuple[int, int]

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description
np.ndarray

An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

Exceptions:

Type Description
ValueError

If source_format is not 'coco', 'pascal_voc', or 'yolo'.

ValueError

If in YOLO format, any coordinates are not in the range (0, 1].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+

def convert_bboxes_to_albumentations (bboxes, source_format, shape, check_validity=False) [view source on GitHub]¶

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of (x_min, y_min, x_max, y_max) e.g. (0.15, 0.27, 0.67, 0.5).

Parameters:

Name Type Description
bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+).

source_format Literal['coco', 'pascal_voc', 'yolo']

Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.

shape ShapeType

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description
np.ndarray

An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

Exceptions:

Type Description
ValueError

If source_format is not 'coco', 'pascal_voc', or 'yolo'.

ValueError

If in YOLO format, any coordinates are not in the range (0, 1].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
 def convert_bboxes_to_albumentations(
     bboxes: np.ndarray,
     source_format: Literal["coco", "pascal_voc", "yolo"],
-    image_shape: tuple[int, int],
+    shape: ShapeType,
     check_validity: bool = False,
 ) -> np.ndarray:
     """Convert bounding boxes from a specified format to the format used by albumentations:
@@ -295,7 +295,7 @@
     Args:
         bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).
         source_format: Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.
-        image_shape: Image shape (height, width).
+        shape: Image shape (height, width).
         check_validity: Check if all boxes are valid boxes.
 
     Returns:
@@ -332,36 +332,39 @@
         converted_bboxes[:, :4] = bboxes[:, :4]
 
     if source_format != "yolo":
-        converted_bboxes[:, :4] = normalize_bboxes(converted_bboxes[:, :4], image_shape)
+        converted_bboxes[:, :4] = normalize_bboxes(converted_bboxes[:, :4], shape)
 
     if check_validity:
         check_bboxes(converted_bboxes)
 
     return converted_bboxes
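The inverse direction, sketched under the same assumptions; float inputs are used so the intermediate array keeps a floating-point dtype:

Python
import numpy as np
from albumentations.core.bbox_utils import convert_bboxes_to_albumentations

shape = {"height": 100, "width": 200}
coco = np.array([[20.0, 20.0, 80.0, 40.0]])  # x_min, y_min, width, height in pixels
albu = convert_bboxes_to_albumentations(coco, "coco", shape, check_validity=True)
print(albu)  # [[0.1 0.2 0.5 0.6]]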
-

def denormalize_bboxes (bboxes, image_shape) [view source on GitHub]¶

Denormalize array of bounding boxes.

Parameters:

Name Type Description
bboxes np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

image_shape tuple[int, int]

Image shape (height, width).

Returns:

Type Description
np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+

def denormalize_bboxes (bboxes, shape) [view source on GitHub]¶

Denormalize array of bounding boxes.

Parameters:

Name Type Description
bboxes np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

shape ShapeType | tuple[int, int]

Image shape (height, width).

Returns:

Type Description
np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
 def denormalize_bboxes(
     bboxes: np.ndarray,
-    image_shape: tuple[int, int],
+    shape: ShapeType | tuple[int, int],
 ) -> np.ndarray:
     """Denormalize  array of bounding boxes.
 
     Args:
         bboxes: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
-        image_shape: Image shape `(height, width)`.
+        shape: Image shape `(height, width)`.
 
     Returns:
         Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
 
     """
-    rows, cols = image_shape[:2]
-
-    denormalized = bboxes.copy().astype(float)
-    denormalized[:, [0, 2]] *= cols
-    denormalized[:, [1, 3]] *= rows
-    return denormalized
-

def filter_bboxes (bboxes, image_shape, min_area=0.0, min_visibility=0.0, min_width=1.0, min_height=1.0) [view source on GitHub]¶

Remove bounding boxes that either lie outside of the visible area by more than min_visibility or whose area in pixels is under the threshold set by min_area. Also crops boxes to final image size.

Parameters:

Name Type Description
bboxes np.ndarray

numpy array of bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Image shape (height, width).

min_area float

Minimum area of a bounding box in pixels. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels. Default: 0.0.

min_height float

Minimum height of a bounding box in pixels. Default: 0.0.

Returns:

Type Description
np.ndarray

numpy array of filtered bounding boxes.

Source code in albumentations/core/bbox_utils.py
Python
def filter_bboxes(
+    if isinstance(shape, tuple):
+        rows, cols = shape[:2]
+    else:
+        rows, cols = shape["height"], shape["width"]
+
+    denormalized = bboxes.copy().astype(float)
+    denormalized[:, [0, 2]] *= cols
+    denormalized[:, [1, 3]] *= rows
+    return denormalized
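A minimal sketch showing that, per the signature above, denormalize_bboxes accepts either the shape dict or a plain (height, width) tuple (import path assumed):

Python
import numpy as np
from albumentations.core.bbox_utils import denormalize_bboxes

bboxes = np.array([[0.1, 0.2, 0.5, 0.6]])
print(denormalize_bboxes(bboxes, {"height": 100, "width": 200}))  # [[ 20.  20. 100.  60.]]
print(denormalize_bboxes(bboxes, (100, 200)))                     # identical result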
+

def filter_bboxes (bboxes, shape, min_area=0.0, min_visibility=0.0, min_width=1.0, min_height=1.0) [view source on GitHub]¶

Remove bounding boxes that either lie outside of the visible area by more than min_visibility or whose area in pixels is under the threshold set by min_area. Also crops boxes to final image size.

Parameters:

Name Type Description
bboxes np.ndarray

numpy array of bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

shape dict[str, int]

The shape of the image/volume: - For 2D: {'height': int, 'width': int} - For 3D: {'height': int, 'width': int, 'depth': int}

min_area float

Minimum area of a bounding box in pixels. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels. Default: 1.0.

min_height float

Minimum height of a bounding box in pixels. Default: 1.0.

Returns:

Type Description
np.ndarray

numpy array of filtered bounding boxes.

Source code in albumentations/core/bbox_utils.py
Python
def filter_bboxes(
     bboxes: np.ndarray,
-    image_shape: tuple[int, int],
+    shape: ShapeType,
     min_area: float = 0.0,
     min_visibility: float = 0.0,
     min_width: float = 1.0,
@@ -373,85 +376,96 @@
     Args:
         bboxes: numpy array of bounding boxes with shape (num_bboxes, 4+).
                 The first 4 columns are [x_min, y_min, x_max, y_max].
-        image_shape: Image shape (height, width).
-        min_area: Minimum area of a bounding box in pixels. Default: 0.0.
-        min_visibility: Minimum fraction of area for a bounding box to remain. Default: 0.0.
-        min_width: Minimum width of a bounding box in pixels. Default: 0.0.
-        min_height: Minimum height of a bounding box in pixels. Default: 0.0.
-
-    Returns:
-        numpy array of filtered bounding boxes.
-    """
-    epsilon = 1e-7
-
-    if len(bboxes) == 0:
-        return np.array([], dtype=np.float32).reshape(0, 4)
+        shape (dict[str, int]): The shape of the image/volume:
+                               - For 2D: {'height': int, 'width': int}
+                               - For 3D: {'height': int, 'width': int, 'depth': int}
+
+        min_area: Minimum area of a bounding box in pixels. Default: 0.0.
+        min_visibility: Minimum fraction of area for a bounding box to remain. Default: 0.0.
+        min_width: Minimum width of a bounding box in pixels. Default: 0.0.
+        min_height: Minimum height of a bounding box in pixels. Default: 0.0.
+
+    Returns:
+        numpy array of filtered bounding boxes.
+    """
+    epsilon = 1e-7
 
-    # Calculate areas of bounding boxes before clipping in pixels
-    denormalized_box_areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)
+    if len(bboxes) == 0:
+        return np.array([], dtype=np.float32).reshape(0, 4)
 
-    # Clip bounding boxes in ratio
-    clipped_bboxes = clip_bboxes(bboxes, image_shape)
+    # Calculate areas of bounding boxes before clipping in pixels
+    denormalized_box_areas = calculate_bbox_areas_in_pixels(bboxes, shape)
 
-    # Calculate areas of clipped bounding boxes in pixels
-    clipped_box_areas = calculate_bbox_areas_in_pixels(clipped_bboxes, image_shape)
+    # Clip bounding boxes in ratio
+    clipped_bboxes = clip_bboxes(bboxes, shape)
 
-    # Calculate width and height of the clipped bounding boxes
-    denormalized_bboxes = denormalize_bboxes(clipped_bboxes[:, :4], image_shape)
+    # Calculate areas of clipped bounding boxes in pixels
+    clipped_box_areas = calculate_bbox_areas_in_pixels(clipped_bboxes, shape)
 
-    clipped_widths = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]
-    clipped_heights = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]
+    # Calculate width and height of the clipped bounding boxes
+    denormalized_bboxes = denormalize_bboxes(clipped_bboxes[:, :4], shape)
 
-    # Create a mask for bboxes that meet all criteria
-    mask = (
-        (denormalized_box_areas >= epsilon)
-        & (clipped_box_areas >= min_area - epsilon)
-        & (clipped_box_areas / denormalized_box_areas >= min_visibility - epsilon)
-        & (clipped_widths >= min_width - epsilon)
-        & (clipped_heights >= min_height - epsilon)
-    )
-
-    # Apply the mask to get the filtered bboxes
-    filtered_bboxes = clipped_bboxes[mask]
+    clipped_widths = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]
+    clipped_heights = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]
+
+    # Create a mask for bboxes that meet all criteria
+    mask = (
+        (denormalized_box_areas >= epsilon)
+        & (clipped_box_areas >= min_area - epsilon)
+        & (clipped_box_areas / denormalized_box_areas >= min_visibility - epsilon)
+        & (clipped_widths >= min_width - epsilon)
+        & (clipped_heights >= min_height - epsilon)
+    )
 
-    return np.array([], dtype=np.float32).reshape(0, 4) if len(filtered_bboxes) == 0 else filtered_bboxes
-

def masks_from_bboxes (bboxes, img_shape) [view source on GitHub]¶

Create binary masks from multiple bounding boxes

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (N, 4), where N is the number of boxes

img_shape tuple[int, int]

Image shape (height, width)

Returns:

Type Description
masks

Array of binary masks with shape (N, height, width)

Source code in albumentations/core/bbox_utils.py
Python
def masks_from_bboxes(bboxes: np.ndarray, img_shape: tuple[int, int]) -> np.ndarray:
+    # Apply the mask to get the filtered bboxes
+    filtered_bboxes = clipped_bboxes[mask]
+
+    return np.array([], dtype=np.float32).reshape(0, 4) if len(filtered_bboxes) == 0 else filtered_bboxes
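A minimal sketch of filter_bboxes under the same import assumption: the first box is fully visible and kept, while the second keeps only about 4% of its area inside the image and is dropped by the min_visibility threshold:

Python
import numpy as np
from albumentations.core.bbox_utils import filter_bboxes

shape = {"height": 100, "width": 100}
bboxes = np.array([
    [0.1, 0.1, 0.5, 0.5],   # fully inside the image
    [0.9, 0.9, 1.4, 1.4],   # only a small corner survives clipping
])
print(filter_bboxes(bboxes, shape, min_visibility=0.3))  # [[0.1 0.1 0.5 0.5]]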
+

def masks_from_bboxes (bboxes, shape) [view source on GitHub]¶

Create binary masks from multiple bounding boxes

Parameters:

Name Type Description
bboxes np.ndarray

Array of bounding boxes with shape (N, 4), where N is the number of boxes

shape ShapeType | tuple[int, int]

{"height": int, "width": int} or tuple[int, int]

Returns:

Type Description
masks

Array of binary masks with shape (N, height, width)

Source code in albumentations/core/bbox_utils.py
Python
def masks_from_bboxes(bboxes: np.ndarray, shape: ShapeType | tuple[int, int]) -> np.ndarray:
     """Create binary masks from multiple bounding boxes
 
     Args:
         bboxes: Array of bounding boxes with shape (N, 4), where N is the number of boxes
-        img_shape: Image shape (height, width)
+        shape: {"height": int, "width": int} or tuple[int, int]
 
     Returns:
         masks: Array of binary masks with shape (N, height, width)
 
     """
-    height, width = img_shape[:2]
-    masks = np.zeros((len(bboxes), height, width), dtype=np.uint8)
-    y, x = np.ogrid[:height, :width]
-
-    for i, (x_min, y_min, x_max, y_max) in enumerate(bboxes[:, :4].astype(int)):
-        masks[i] = (x_min <= x) & (x < x_max) & (y_min <= y) & (y < y_max)
-
-    return masks
-

def normalize_bboxes (bboxes, image_shape) [view source on GitHub]¶

Normalize array of bounding boxes.

Parameters:

Name Type Description
bboxes np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

image_shape tuple[int, int]

Image shape (height, width).

Returns:

Type Description
np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
-def normalize_bboxes(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:
+    if isinstance(shape, dict):
+        height, width = shape["height"], shape["width"]
+    else:
+        height, width = shape[:2]
+
+    masks = np.zeros((len(bboxes), height, width), dtype=np.uint8)
+    y, x = np.ogrid[:height, :width]
+
+    for i, (x_min, y_min, x_max, y_max) in enumerate(bboxes[:, :4].astype(int)):
+        masks[i] = (x_min <= x) & (x < x_max) & (y_min <= y) & (y < y_max)
+
+    return masks
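A minimal sketch of masks_from_bboxes (import path assumed); unlike the normalized-coordinate helpers above, it appears to expect pixel-coordinate boxes, since the first four columns are cast to int:

Python
import numpy as np
from albumentations.core.bbox_utils import masks_from_bboxes

bboxes = np.array([[2, 3, 6, 8], [0, 0, 4, 4]])  # x_min, y_min, x_max, y_max in pixels
masks = masks_from_bboxes(bboxes, {"height": 10, "width": 10})
print(masks.shape)     # (2, 10, 10)
print(masks[0].sum())  # 20 -> a 4x5 block of ones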
+

def normalize_bboxes (bboxes, shape) [view source on GitHub]¶

Normalize array of bounding boxes.

Parameters:

Name Type Description
bboxes np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

shape ShapeType | tuple[int, int]

Image shape (height, width).

Returns:

Type Description
np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py
Python
@handle_empty_array("bboxes")
+def normalize_bboxes(bboxes: np.ndarray, shape: ShapeType | tuple[int, int]) -> np.ndarray:
     """Normalize array of bounding boxes.
 
     Args:
         bboxes: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
-        image_shape: Image shape `(height, width)`.
+        shape: Image shape `(height, width)`.
 
     Returns:
         Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
 
     """
-    rows, cols = image_shape[:2]
-    normalized = bboxes.copy().astype(float)
-    normalized[:, [0, 2]] /= cols
-    normalized[:, [1, 3]] /= rows
-    return normalized
-

def union_of_bboxes (bboxes, erosion_rate) [view source on GitHub]¶

Calculate union of bounding boxes. Boxes could be in albumentations or Pascal Voc format.

Parameters:

Name Type Description
bboxes np.ndarray

List of bounding boxes

erosion_rate float

How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume.

Returns:

Type Description
np.ndarray | None

A bounding box (x_min, y_min, x_max, y_max) or None if no bboxes are given or if the bounding boxes become invalid after erosion.

Source code in albumentations/core/bbox_utils.py
Python
def union_of_bboxes(bboxes: np.ndarray, erosion_rate: float) -> np.ndarray | None:
+    if isinstance(shape, tuple):
+        rows, cols = shape[:2]
+    else:
+        rows, cols = shape["height"], shape["width"]
+
+    normalized = bboxes.copy().astype(float)
+    normalized[:, [0, 2]] /= cols
+    normalized[:, [1, 3]] /= rows
+    return normalized
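The normalization counterpart, sketched the same way; as with denormalize_bboxes, both shape forms from the signature are accepted:

Python
import numpy as np
from albumentations.core.bbox_utils import normalize_bboxes

pixel_bboxes = np.array([[20, 20, 100, 60]])
print(normalize_bboxes(pixel_bboxes, {"height": 100, "width": 200}))  # [[0.1 0.2 0.5 0.6]]
print(normalize_bboxes(pixel_bboxes, (100, 200)))                     # identical result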
+

def union_of_bboxes (bboxes, erosion_rate) [view source on GitHub]¶

Calculate union of bounding boxes. Boxes could be in albumentations or Pascal Voc format.

Parameters:

Name Type Description
bboxes np.ndarray

List of bounding boxes

erosion_rate float

How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume.

Returns:

Type Description
np.ndarray | None

A bounding box (x_min, y_min, x_max, y_max) or None if no bboxes are given or if the bounding boxes become invalid after erosion.

Source code in albumentations/core/bbox_utils.py
Python
def union_of_bboxes(bboxes: np.ndarray, erosion_rate: float) -> np.ndarray | None:
     """Calculate union of bounding boxes. Boxes could be in albumentations or Pascal Voc format.
 
     Args:
diff --git a/docs/api_reference/core/composition/index.html b/docs/api_reference/core/composition/index.html
index 429e67a4..201bfdbf 100644
--- a/docs/api_reference/core/composition/index.html
+++ b/docs/api_reference/core/composition/index.html
@@ -215,19 +215,27 @@
         for t in self.transforms:
             t.set_deterministic(flag, save_key)
 
-    def check_data_post_transform(self, data: Any) -> dict[str, Any]:
-        if self.check_each_transform:
-            image_shape = get_shape(data["image"])
-
-            for proc in self.check_each_transform:
-                for data_name in data:
-                    if data_name in proc.data_fields or (
-                        data_name in self._additional_targets
-                        and self._additional_targets[data_name] in proc.data_fields
-                    ):
-                        data[data_name] = proc.filter(data[data_name], image_shape)
-        return data
-

class Compose (transforms, bbox_params=None, keypoint_params=None, additional_targets=None, p=1.0, is_check_shapes=True, strict=True, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub] ¶

Compose multiple transforms together and apply them sequentially to input data.

This class allows you to chain multiple image augmentation transforms and apply them in a specified order. It also handles bounding box and keypoint transformations if the appropriate parameters are provided.

Parameters:

Name Type Description
transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to apply.

bbox_params Union[dict, BboxParams, None]

Parameters for bounding box transforms. Can be a dict of params or a BboxParams object. Default is None.

keypoint_params Union[dict, KeypointParams, None]

Parameters for keypoint transforms. Can be a dict of params or a KeypointParams object. Default is None.

additional_targets Dict[str, str]

A dictionary mapping additional target names to their types. For example, {'image2': 'image'}. Default is None.

p float

Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.

is_check_shapes bool

If True, checks consistency of shapes for image/mask/masks on each call. Disable only if you are sure about your data consistency. Default is True.

strict bool

If True, raises an error on unknown input keys. If False, ignores them. Default is True.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

seed int

Random seed. Default is None.

save_applied_params bool

If True, saves the applied parameters of each transform. Default is False.

Examples:

Python
>>> import albumentations as A
+    def check_data_post_transform(self, data: dict[str, Any]) -> dict[str, Any]:
+        """Check and filter data after transformation.
+
+        Args:
+            data: Dictionary containing transformed data
+
+        Returns:
+            Filtered data dictionary
+        """
+        if self.check_each_transform:
+            shape = get_shape(data)
+
+            for proc in self.check_each_transform:
+                for data_name, data_value in data.items():
+                    if data_name in proc.data_fields or (
+                        data_name in self._additional_targets
+                        and self._additional_targets[data_name] in proc.data_fields
+                    ):
+                        data[data_name] = proc.filter(data_value, shape)
+        return data
+

class Compose (transforms, bbox_params=None, keypoint_params=None, additional_targets=None, p=1.0, is_check_shapes=True, strict=True, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub] ¶

Compose multiple transforms together and apply them sequentially to input data.

This class allows you to chain multiple image augmentation transforms and apply them in a specified order. It also handles bounding box and keypoint transformations if the appropriate parameters are provided.

Parameters:

Name Type Description
transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to apply.

bbox_params Union[dict, BboxParams, None]

Parameters for bounding box transforms. Can be a dict of params or a BboxParams object. Default is None.

keypoint_params Union[dict, KeypointParams, None]

Parameters for keypoint transforms. Can be a dict of params or a KeypointParams object. Default is None.

additional_targets Dict[str, str]

A dictionary mapping additional target names to their types. For example, {'image2': 'image'}. Default is None.

p float

Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.

is_check_shapes bool

If True, checks consistency of shapes for image/mask/masks on each call. Disable only if you are sure about your data consistency. Default is True.

strict bool

If True, raises an error on unknown input keys. If False, ignores them. Default is True.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

seed int

Random seed. Default is None.

save_applied_params bool

If True, saves the applied parameters of each transform. Default is False.

Examples:

Python
>>> import albumentations as A
 >>> transform = A.Compose([
 ...     A.RandomCrop(width=256, height=256),
 ...     A.HorizontalFlip(p=0.5),
@@ -666,7 +674,7 @@
         if data.ndim not in [4, 5]:  # (N,D,H,W) or (N,D,H,W,C)
             raise TypeError(f"{data_name} must be 4D or 5D array")
         return data.shape[1:4]  # Return (D,H,W)
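A minimal sketch of how bbox_params hooks BboxParams-driven conversion and filtering into Compose; the format, label_fields, and min_visibility values are illustrative, and the exact boxes returned depend on the random crop:

Python
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.RandomCrop(width=256, height=256), A.HorizontalFlip(p=0.5)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"], min_visibility=0.3),
)
image = np.random.randint(0, 256, (300, 400, 3), dtype=np.uint8)
out = transform(image=image, bboxes=[(50, 60, 200, 220)], labels=["cat"])
print(out["bboxes"], out["labels"])  # boxes clipped/filtered to the 256x256 crop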
-

class OneOf (transforms, p=0.5) [view source on GitHub] ¶

Select one of transforms to apply. Selected transform will be called with force_apply=True. Transforms probabilities will be normalized to one 1, so in this case transforms probabilities works as weights.

Parameters:

Name Type Description
transforms list

list of transformations to compose.

p float

probability of applying selected transform. Default: 0.5.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class OneOf(BaseCompose):
+

class OneOf (transforms, p=0.5) [view source on GitHub] ¶

Select one of the transforms to apply. The selected transform will be called with force_apply=True. Transform probabilities will be normalized to sum to 1, so in this case they work as weights.

Parameters:

Name Type Description
transforms list

list of transformations to compose.

p float

probability of applying selected transform. Default: 0.5.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class OneOf(BaseCompose):
     """Select one of transforms to apply. Selected transform will be called with `force_apply=True`.
     Transforms probabilities will be normalized to one 1, so in this case transforms probabilities works as weights.
 
@@ -694,7 +702,7 @@
             data = t(force_apply=True, **data)
             self._track_transform_params(t, data)
         return data
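A minimal sketch of OneOf inside Compose; the per-transform p values act as the sampling weights described above, and OneOf itself fires with probability 0.9:

Python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.OneOf([
        A.HorizontalFlip(p=0.8),  # relative weight 0.8
        A.VerticalFlip(p=0.2),    # relative weight 0.2
    ], p=0.9),
])
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]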
-

class OneOrOther (first=None, second=None, transforms=None, p=0.5) [view source on GitHub] ¶

Select one or another transform to apply. Selected transform will be called with force_apply=True.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class OneOrOther(BaseCompose):
+

class OneOrOther (first=None, second=None, transforms=None, p=0.5) [view source on GitHub] ¶

Select one or another transform to apply. Selected transform will be called with force_apply=True.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class OneOrOther(BaseCompose):
     """Select one or another transform to apply. Selected transform will be called with `force_apply=True`."""
 
     def __init__(
@@ -724,7 +732,7 @@
             return self.transforms[0](force_apply=True, **data)
 
         return self.transforms[-1](force_apply=True, **data)
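A minimal sketch of OneOrOther; here p is assumed to control which of the two branches is taken on a given call:

Python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.OneOrOther(
        first=A.HorizontalFlip(p=1),
        second=A.VerticalFlip(p=1),
        p=0.5,
    ),
])
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]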
-

class RandomOrder (transforms, n=1, replace=False, p=1) [view source on GitHub] ¶

Apply a random subset of transforms from the given list in a random order.

The RandomOrder class allows you to select a specified number of transforms from a list and apply them to the input data in a random order. This is useful for creating more diverse augmentation pipelines where the order of transformations can vary, potentially leading to different results.

Attributes:

Name Type Description
transforms TransformsSeqType

A list of transformations to choose from.

n int

The number of transforms to apply. If n is greater than the number of available transforms and replace is False, n will be set to the number of available transforms.

replace bool

Whether to sample transforms with replacement. If True, the same transform can be selected multiple times. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

Examples:

Python
>>> import albumentations as A
+

class RandomOrder (transforms, n=1, replace=False, p=1) [view source on GitHub] ¶

Apply a random subset of transforms from the given list in a random order.

The RandomOrder class allows you to select a specified number of transforms from a list and apply them to the input data in a random order. This is useful for creating more diverse augmentation pipelines where the order of transformations can vary, potentially leading to different results.

Attributes:

Name Type Description
transforms TransformsSeqType

A list of transformations to choose from.

n int

The number of transforms to apply. If n is greater than the number of available transforms and replace is False, n will be set to the number of available transforms.

replace bool

Whether to sample transforms with replacement. If True, the same transform can be selected multiple times. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

Examples:

Python
>>> import albumentations as A
 >>> transform = A.RandomOrder([
 ...     A.HorizontalFlip(p=1),
 ...     A.VerticalFlip(p=1),
@@ -771,7 +779,7 @@
             replace=self.replace,
             p=self.transforms_ps,
         )
-

class SelectiveChannelTransform (transforms, channels=(0, 1, 2), p=1.0) [view source on GitHub] ¶

A transformation class to apply specified transforms to selected channels of an image.

This class extends BaseCompose to allow selective application of transformations to specified image channels. It extracts the selected channels, applies the transformations, and then reinserts the transformed channels back into their original positions in the image.

Parameters:

Name Type Description
transforms TransformsSeqType

A sequence of transformations (from Albumentations) to be applied to the specified channels.

channels Sequence[int]

A sequence of integers specifying the indices of the channels to which the transforms should be applied.

p float

Probability that the transform will be applied; the default is 1.0 (always apply).

Methods

call(args, *kwargs): Applies the transforms to the image according to the specified channels. The input data should include 'image' key with the image array.

Returns:

Type Description
dict[str, Any]

The transformed data dictionary, which includes the transformed 'image' key.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class SelectiveChannelTransform(BaseCompose):
+

class SelectiveChannelTransform (transforms, channels=(0, 1, 2), p=1.0) [view source on GitHub] ¶

A transformation class to apply specified transforms to selected channels of an image.

This class extends BaseCompose to allow selective application of transformations to specified image channels. It extracts the selected channels, applies the transformations, and then reinserts the transformed channels back into their original positions in the image.

Parameters:

Name Type Description
transforms TransformsSeqType

A sequence of transformations (from Albumentations) to be applied to the specified channels.

channels Sequence[int]

A sequence of integers specifying the indices of the channels to which the transforms should be applied.

p float

Probability that the transform will be applied; the default is 1.0 (always apply).

Methods

__call__(*args, **kwargs): Applies the transforms to the image according to the specified channels. The input data should include an 'image' key with the image array.

Returns:

Type Description
dict[str, Any]

The transformed data dictionary, which includes the transformed 'image' key.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py
Python
class SelectiveChannelTransform(BaseCompose):
     """A transformation class to apply specified transforms to selected channels of an image.
 
     This class extends BaseCompose to allow selective application of transformations to
@@ -824,7 +832,7 @@
             data["image"] = np.ascontiguousarray(output_img)
 
         return data
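A minimal sketch of SelectiveChannelTransform applying a pixel-level transform to only the first two channels; InvertImg is chosen because it works on any channel count, and the class is assumed to be exported at the package top level like the other compositions (otherwise import it from albumentations.core.composition):

Python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.SelectiveChannelTransform(
        transforms=[A.InvertImg(p=1.0)],
        channels=[0, 1],
        p=1.0,
    ),
])
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]  # channels 0 and 1 inverted, channel 2 untouched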
-

class Sequential (transforms, p=0.5) [view source on GitHub] ¶

Sequentially applies all transforms to targets.

Note

This transform is not intended to be a replacement for Compose. Instead, it should be used inside Compose the same way OneOf or OneOrOther are used. For instance, you can combine OneOf with Sequential to create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly chose sequence to input data (see the Example section for an example definition of such pipeline).

Examples:

Python
>>> import albumentations as A
+

class Sequential (transforms, p=0.5) [view source on GitHub] ¶

Sequentially applies all transforms to targets.

Note

This transform is not intended to be a replacement for Compose. Instead, it should be used inside Compose the same way OneOf or OneOrOther are used. For instance, you can combine OneOf with Sequential to create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly chosen sequence to input data (see the Example section for an example definition of such a pipeline).

Examples:

Python
>>> import albumentations as A
 >>> transform = A.Compose([
 >>>    A.OneOf([
 >>>        A.Sequential([
@@ -873,7 +881,7 @@
                 self._track_transform_params(t, data)
                 data = self.check_data_post_transform(data)
         return data
-

class SomeOf (transforms, n=1, replace=False, p=1) [view source on GitHub] ¶

Apply a random subset of transforms from the given list.

This class selects a specified number of transforms from the provided list and applies them to the input data. The selection can be done with or without replacement, allowing for the same transform to be potentially applied multiple times.

Parameters:

Name Type Description
transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to choose from.

n int

The number of transforms to apply. If greater than the number of transforms and replace=False, it will be set to the number of transforms.

replace bool

Whether to sample transforms with replacement. Default is True.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

Note

  • If n is greater than the number of transforms and replace is False, n will be set to the number of transforms with a warning.
  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.

Examples:

Python
>>> import albumentations as A
+

class SomeOf (transforms, n=1, replace=False, p=1) [view source on GitHub] ¶

Apply a random subset of transforms from the given list.

This class selects a specified number of transforms from the provided list and applies them to the input data. The selection can be done with or without replacement, allowing for the same transform to be potentially applied multiple times.

Parameters:

Name Type Description
transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to choose from.

n int

The number of transforms to apply. If greater than the number of transforms and replace=False, it will be set to the number of transforms.

replace bool

Whether to sample transforms with replacement. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

Note

  • If n is greater than the number of transforms and replace is False, n will be set to the number of transforms with a warning.
  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.

Examples:

Python
>>> import albumentations as A
 >>> transform = A.SomeOf([
 ...     A.HorizontalFlip(p=1),
 ...     A.VerticalFlip(p=1),
diff --git a/docs/api_reference/core/keypoints_utils/index.html b/docs/api_reference/core/keypoints_utils/index.html
index b2e3107d..6dcf281c 100644
--- a/docs/api_reference/core/keypoints_utils/index.html
+++ b/docs/api_reference/core/keypoints_utils/index.html
@@ -6,183 +6,270 @@
   .jupyter-wrapper .jp-MarkdownOutput.jp-RenderedHTMLCommon {
     font-size: 0.8rem;
   }
-    

Helper functions for working with keypoints (augmentations.core.keypoints_utils)

class KeypointParams (format, label_fields=None, remove_invisible=True, angle_in_degrees=True, check_each_transform=True) [view source on GitHub] ¶

Parameters of keypoints

Parameters:

Name Type Description
format str

format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa'.

x - X coordinate,

y - Y coordinate

s - Keypoint scale

a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)

label_fields list

list of fields that are joined with keypoints, e.g labels. Should be same type as keypoints.

remove_invisible bool

to remove invisible points after transform or not

angle_in_degrees bool

angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints

check_each_transform bool

if True, then keypoints will be checked after each dual transform. Default: True

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/keypoints_utils.py
Python
class KeypointParams(Params):
+    

Helper functions for working with keypoints (augmentations.core.keypoints_utils)

class KeypointParams (format, label_fields=None, remove_invisible=True, angle_in_degrees=True, check_each_transform=True) [view source on GitHub] ¶

Parameters of keypoints

Parameters:

Name Type Description
format str

format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa', 'xyz'.

x - X coordinate,

y - Y coordinate

z - Z coordinate (for 3D keypoints)

s - Keypoint scale

a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)

label_fields list

list of fields that are joined with keypoints, e.g. labels. Should be the same type as keypoints.

remove_invisible bool

to remove invisible points after transform or not

angle_in_degrees bool

angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints

check_each_transform bool

if True, then keypoints will be checked after each dual transform. Default: True

Note

The internal Albumentations format is [x, y, z, angle, scale]. For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0. For formats without angle or scale, these values are set to 0.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/keypoints_utils.py
Python
class KeypointParams(Params):
     """Parameters of keypoints
 
     Args:
-        format (str): format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa'.
+        format (str): format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa', 'xyz'.
 
             x - X coordinate,
 
             y - Y coordinate
 
-            s - Keypoint scale
+            z - Z coordinate (for 3D keypoints)
 
-            a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)
-        label_fields (list): list of fields that are joined with keypoints, e.g labels.
-            Should be same type as keypoints.
-        remove_invisible (bool): to remove invisible points after transform or not
-        angle_in_degrees (bool): angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints
-        check_each_transform (bool): if `True`, then keypoints will be checked after each dual transform.
-            Default: `True`
+            s - Keypoint scale
+
+            a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)
+
+        label_fields (list): list of fields that are joined with keypoints, e.g labels.
+            Should be same type as keypoints.
+        remove_invisible (bool): to remove invisible points after transform or not
+        angle_in_degrees (bool): angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints
+        check_each_transform (bool): if `True`, then keypoints will be checked after each dual transform.
+            Default: `True`
+
+    Note:
+        The internal Albumentations format is [x, y, z, angle, scale]. For 2D formats (xy, yx, xya, xys, xyas, xysa),
+        z coordinate is set to 0. For formats without angle or scale, these values are set to 0.
+    """
+
+    def __init__(
+        self,
+        format: str,  # noqa: A002
+        label_fields: Sequence[str] | None = None,
+        remove_invisible: bool = True,
+        angle_in_degrees: bool = True,
+        check_each_transform: bool = True,
+    ):
+        super().__init__(format, label_fields)
+        self.remove_invisible = remove_invisible
+        self.angle_in_degrees = angle_in_degrees
+        self.check_each_transform = check_each_transform
+
+    def to_dict_private(self) -> dict[str, Any]:
+        data = super().to_dict_private()
+        data.update(
+            {
+                "remove_invisible": self.remove_invisible,
+                "angle_in_degrees": self.angle_in_degrees,
+                "check_each_transform": self.check_each_transform,
+            },
+        )
+        return data
+
+    @classmethod
+    def is_serializable(cls) -> bool:
+        return True
+
+    @classmethod
+    def get_class_fullname(cls) -> str:
+        return "KeypointParams"
+
+    def __repr__(self) -> str:
+        return (
+            f"KeypointParams(format={self.format}, label_fields={self.label_fields},"
+            f" remove_invisible={self.remove_invisible}, angle_in_degrees={self.angle_in_degrees},"
+            f" check_each_transform={self.check_each_transform})"
+        )
+
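A minimal sketch of KeypointParams in a pipeline, using the 2D 'xy' format and a label field; the exact output coordinates depend on the flip, so they are not hard-coded here:

Python
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.HorizontalFlip(p=1.0)],
    keypoint_params=A.KeypointParams(format="xy", label_fields=["kp_labels"], remove_invisible=True),
)
image = np.zeros((100, 200, 3), dtype=np.uint8)
out = transform(image=image, keypoints=[(20, 30), (150, 80)], kp_labels=["nose", "tail"])
print(out["keypoints"], out["kp_labels"])  # x coordinates mirrored across the image width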

class KeypointsProcessor (params, additional_targets=None) [view source on GitHub] ¶

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/keypoints_utils.py
Python
class KeypointsProcessor(DataProcessor):
+    def __init__(self, params: KeypointParams, additional_targets: dict[str, str] | None = None):
+        super().__init__(params, additional_targets)
+
+    @property
+    def default_data_name(self) -> str:
+        return "keypoints"
+
+    def ensure_data_valid(self, data: dict[str, Any]) -> None:
+        if self.params.label_fields and not all(i in data for i in self.params.label_fields):
+            msg = "Your 'label_fields' are not valid - them must have same names as params in 'keypoint_params' dict"
+            raise ValueError(msg)
+
+    def filter(
+        self,
+        data: np.ndarray,
+        shape: ShapeType,
+    ) -> np.ndarray:
+        """Filter keypoints based on visibility within given shape.
 
-    """
-
-    def __init__(
-        self,
-        format: str,  # noqa: A002
-        label_fields: Sequence[str] | None = None,
-        remove_invisible: bool = True,
-        angle_in_degrees: bool = True,
-        check_each_transform: bool = True,
-    ):
-        super().__init__(format, label_fields)
-        self.remove_invisible = remove_invisible
-        self.angle_in_degrees = angle_in_degrees
-        self.check_each_transform = check_each_transform
-
-    def to_dict_private(self) -> dict[str, Any]:
-        data = super().to_dict_private()
-        data.update(
-            {
-                "remove_invisible": self.remove_invisible,
-                "angle_in_degrees": self.angle_in_degrees,
-                "check_each_transform": self.check_each_transform,
-            },
-        )
-        return data
-
-    @classmethod
-    def is_serializable(cls) -> bool:
-        return True
+        Args:
+            data: Keypoints in [x, y, z, angle, scale] format
+            shape: Shape to check against as {'height': height, 'width': width, 'depth': depth}
+
+        Returns:
+            Filtered keypoints
+        """
+        self.params: KeypointParams
+        return filter_keypoints(data, shape, remove_invisible=self.params.remove_invisible)
+
+    def check(self, data: np.ndarray, shape: ShapeType) -> None:
+        check_keypoints(data, shape)
+
+    def convert_from_albumentations(
+        self,
+        data: np.ndarray,
+        shape: ShapeType,
+    ) -> np.ndarray:
+        if not data.size:
+            return data
+
+        params = self.params
+        return convert_keypoints_from_albumentations(
+            data,
+            params.format,
+            shape,
+            check_validity=params.remove_invisible,
+            angle_in_degrees=params.angle_in_degrees,
+        )
 
-    @classmethod
-    def get_class_fullname(cls) -> str:
-        return "KeypointParams"
-
-    def __repr__(self) -> str:
-        return (
-            f"KeypointParams(format={self.format}, label_fields={self.label_fields},"
-            f" remove_invisible={self.remove_invisible}, angle_in_degrees={self.angle_in_degrees},"
-            f" check_each_transform={self.check_each_transform})"
-        )
-

def check_keypoints (keypoints, image_shape) [view source on GitHub]¶

Check if keypoint coordinates are within valid ranges for the given image shape.

This function validates that: 1. All x-coordinates are within [0, width) 2. All y-coordinates are within [0, height) 3. If angles are present (i.e., keypoints have more than 2 columns), they are within the range [0, 2π)

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. Each row represents a keypoint with at least (x, y) coordinates. If present, the third column is assumed to be the angle.

image_shape Tuple[int, int]

The shape of the image (height, width).

Exceptions:

Type Description
ValueError

If any keypoint coordinate is outside the valid range, or if any angle is invalid. The error message will detail which keypoints are invalid and why.

Note

  • The function assumes that keypoint coordinates are in absolute pixel values, not normalized.
  • Angles, if present, are assumed to be in radians.
  • The constant PAIR should be defined elsewhere in the module, typically as 2.
Source code in albumentations/core/keypoints_utils.py
Python
def check_keypoints(keypoints: np.ndarray, image_shape: tuple[int, int]) -> None:
-    """Check if keypoint coordinates are within valid ranges for the given image shape.
+    def convert_to_albumentations(
+        self,
+        data: np.ndarray,
+        shape: ShapeType,
+    ) -> np.ndarray:
+        if not data.size:
+            return data
+        params = self.params
+        return convert_keypoints_to_albumentations(
+            data,
+            params.format,
+            shape,
+            check_validity=params.remove_invisible,
+            angle_in_degrees=params.angle_in_degrees,
+        )
+

def check_keypoints (keypoints, shape) [view source on GitHub]¶

Check if keypoint coordinates are within valid ranges for the given shape.

This function validates that: 1. All x-coordinates are within [0, width) 2. All y-coordinates are within [0, height) 3. If depth is provided in shape, z-coordinates are within [0, depth) 4. Angles are within the range [0, 2π)

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 5+), where N is the number of keypoints. - First 2 columns are always x, y - Column 3 (if present) is z - Column 4 (if present) is angle - Column 5+ (if present) are additional attributes

shape ShapeType

The shape of the image/volume: - For 2D: {'height': int, 'width': int} - For 3D: {'height': int, 'width': int, 'depth': int}

Exceptions:

Type Description
ValueError

If any keypoint coordinate is outside the valid range, or if angles are invalid. The error message will detail which keypoints are invalid and why.

Note

  • The function assumes that keypoint coordinates are in absolute pixel values, not normalized
  • Angles are in radians
  • Z-coordinates are only checked if 'depth' is present in shape
Source code in albumentations/core/keypoints_utils.py
Python
def check_keypoints(keypoints: np.ndarray, shape: ShapeType) -> None:
+    """Check if keypoint coordinates are within valid ranges for the given shape.
 
     This function validates that:
     1. All x-coordinates are within [0, width)
     2. All y-coordinates are within [0, height)
-    3. If angles are present (i.e., keypoints have more than 2 columns),
-       they are within the range [0, 2π)
+    3. If depth is provided in shape, z-coordinates are within [0, depth)
+    4. Angles are within the range [0, 2π)
 
     Args:
-        keypoints (np.ndarray): Array of keypoints with shape (N, 2+), where N is the number of keypoints.
-                                Each row represents a keypoint with at least (x, y) coordinates.
-                                If present, the third column is assumed to be the angle.
-        image_shape (Tuple[int, int]): The shape of the image (height, width).
-
-    Raises:
-        ValueError: If any keypoint coordinate is outside the valid range, or if any angle is invalid.
-                    The error message will detail which keypoints are invalid and why.
+        keypoints (np.ndarray): Array of keypoints with shape (N, 5+), where N is the number of keypoints.
+            - First 2 columns are always x, y
+            - Column 3 (if present) is z
+            - Column 4 (if present) is angle
+            - Column 5+ (if present) are additional attributes
+        shape (ShapeType): The shape of the image/volume:
+                           - For 2D: {'height': int, 'width': int}
+                           - For 3D: {'height': int, 'width': int, 'depth': int}
 
-    Note:
-        - The function assumes that keypoint coordinates are in absolute pixel values, not normalized.
-        - Angles, if present, are assumed to be in radians.
-        - The constant PAIR should be defined elsewhere in the module, typically as 2.
-    """
-    height, width = image_shape[:2]
-
-    # Check x and y coordinates
-    x, y = keypoints[:, 0], keypoints[:, 1]
-    if np.any((x < 0) | (x >= width)) or np.any((y < 0) | (y >= height)):
-        invalid_x = np.where((x < 0) | (x >= width))[0]
-        invalid_y = np.where((y < 0) | (y >= height))[0]
-
-        error_messages = []
-
-        error_messages = [
-            f"Expected {'x' if idx in invalid_x else 'y'} for keypoint {keypoints[idx]} to be "
-            f"in the range [0.0, {width if idx in invalid_x else height}], "
-            f"got {x[idx] if idx in invalid_x else y[idx]}."
-            for idx in sorted(set(invalid_x) | set(invalid_y))
-        ]
-
-        raise ValueError("\n".join(error_messages))
-
-    # Check angles
-    if keypoints.shape[1] > PAIR:
-        angles = keypoints[:, 2]
-        invalid_angles = np.where((angles < 0) | (angles >= 2 * math.pi))[0]
-        if len(invalid_angles) > 0:
-            error_messages = [
-                f"Keypoint angle must be in range [0, 2 * PI). Got: {angles[idx]} for keypoint {keypoints[idx]}"
-                for idx in invalid_angles
-            ]
-            raise ValueError("\n".join(error_messages))
-

def convert_keypoints_from_albumentations (keypoints, target_format, image_shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]¶

Convert keypoints from Albumentations format to various other formats.

This function takes keypoints in the standard Albumentations format [x, y, angle, scale] and converts them to the specified target format.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints in Albumentations format with shape (N, 4+), where N is the number of keypoints. Each row represents a keypoint [x, y, angle, scale, ...].

target_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa"]

The desired output format. - "xy": [x, y] - "yx": [y, x] - "xya": [x, y, angle] - "xys": [x, y, scale] - "xyas": [x, y, angle, scale] - "xysa": [x, y, scale, angle]

image_shape tuple[int, int]

The shape of the image (height, width).

check_validity bool

If True, check if the keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert output angles to degrees. If False, angles remain in radians. Defaults to True.

Returns:

Type Description
np.ndarray

Array of keypoints in the specified target format with shape (N, 2+). Any additional columns from the input keypoints beyond the first 4 are preserved and appended after the converted columns.

Exceptions:

Type Description
ValueError

If the target_format is not one of the supported formats.

Note

  • Input angles are assumed to be in the range [0, 2π) radians.
  • If the input keypoints have additional columns beyond the first 4, these columns are preserved in the output.
  • The constant NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS should be defined elsewhere in the module, typically as 4.
Source code in albumentations/core/keypoints_utils.py
Python
def convert_keypoints_from_albumentations(
+    Raises:
+        ValueError: If any keypoint coordinate is outside the valid range, or if angles are invalid.
+                   The error message will detail which keypoints are invalid and why.
+
+    Note:
+        - The function assumes that keypoint coordinates are in absolute pixel values, not normalized
+        - Angles are in radians
+        - Z-coordinates are only checked if 'depth' is present in shape
+    """
+    height, width = shape["height"], shape["width"]
+    has_depth = "depth" in shape
+
+    # Check x and y coordinates (always present)
+    x, y = keypoints[:, 0], keypoints[:, 1]
+    invalid_x = np.where((x < 0) | (x >= width))[0]
+    invalid_y = np.where((y < 0) | (y >= height))[0]
+
+    error_messages = []
+
+    # Handle x, y errors
+    for idx in sorted(set(invalid_x) | set(invalid_y)):
+        if idx in invalid_x:
+            error_messages.append(
+                f"Expected x for keypoint {keypoints[idx]} to be in range [0, {width}), got {x[idx]}",
+            )
+        if idx in invalid_y:
+            error_messages.append(
+                f"Expected y for keypoint {keypoints[idx]} to be in range [0, {height}), got {y[idx]}",
+            )
+
+    # Check z coordinates if depth is provided and keypoints have z
+    if has_depth and keypoints.shape[1] > 2:
+        z = keypoints[:, 2]
+        depth = shape["depth"]
+        invalid_z = np.where((z < 0) | (z >= depth))[0]
+        error_messages.extend(
+            f"Expected z for keypoint {keypoints[idx]} to be in range [0, {depth}), got {z[idx]}" for idx in invalid_z
+        )
+
+    # Check angles only if keypoints have angle column
+    if keypoints.shape[1] > 3:
+        angles = keypoints[:, 3]
+        invalid_angles = np.where((angles < 0) | (angles >= 2 * math.pi))[0]
+        error_messages.extend(
+            f"Expected angle for keypoint {keypoints[idx]} to be in range [0, 2Ï€), got {angles[idx]}"
+            for idx in invalid_angles
+        )
+
+    if error_messages:
+        raise ValueError("\n".join(error_messages))
+
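A minimal usage sketch of the validator above, with invented values; note the dict-based shape argument:

Python
import math

import numpy as np

from albumentations.core.keypoints_utils import check_keypoints

# Each row is [x, y, z, angle, scale]
keypoints = np.array(
    [
        [10.0, 20.0, 0.0, 0.0, 1.0],           # valid for a 256x256 image
        [300.0, 20.0, 0.0, math.pi / 2, 1.0],  # x is outside [0, 256)
    ],
    dtype=np.float32,
)

try:
    check_keypoints(keypoints, shape={"height": 256, "width": 256})
except ValueError as err:
    print(err)  # reports every invalid keypoint and the offending value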

def convert_keypoints_from_albumentations (keypoints, target_format, shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]¶

Convert keypoints from Albumentations format to various other formats.

This function takes keypoints in the standard Albumentations format [x, y, z, angle, scale] and converts them to the specified target format.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints in Albumentations format with shape (N, 5+), where N is the number of keypoints. Each row represents a keypoint [x, y, z, angle, scale, ...].

target_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]

The desired output format. - "xy": [x, y] - "yx": [y, x] - "xya": [x, y, angle] - "xys": [x, y, scale] - "xyas": [x, y, angle, scale] - "xysa": [x, y, scale, angle] - "xyz": [x, y, z]

shape ShapeType

The shape of the image {'height': height, 'width': width, 'depth': depth}.

check_validity bool

If True, check if the keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert output angles to degrees. If False, angles remain in radians. Defaults to True.

Returns:

Type Description
np.ndarray

Array of keypoints in the specified target format with shape (N, 2+). Any additional columns from the input keypoints beyond the first 5 are preserved and appended after the converted columns.

Exceptions:

Type Description
ValueError

If the target_format is not one of the supported formats.

Note

  • Input angles are assumed to be in the range [0, 2π) radians
  • If the input keypoints have additional columns beyond the first 5, these columns are preserved in the output
Source code in albumentations/core/keypoints_utils.py
Python
def convert_keypoints_from_albumentations(
     keypoints: np.ndarray,
-    target_format: Literal["xy", "yx", "xya", "xys", "xyas", "xysa"],
-    image_shape: tuple[int, int],
+    target_format: Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"],
+    shape: ShapeType,
     check_validity: bool = False,
     angle_in_degrees: bool = True,
 ) -> np.ndarray:
     """Convert keypoints from Albumentations format to various other formats.
 
-    This function takes keypoints in the standard Albumentations format [x, y, angle, scale]
+    This function takes keypoints in the standard Albumentations format [x, y, z, angle, scale]
     and converts them to the specified target format.
 
     Args:
-        keypoints (np.ndarray): Array of keypoints in Albumentations format with shape (N, 4+),
+        keypoints (np.ndarray): Array of keypoints in Albumentations format with shape (N, 5+),
                                 where N is the number of keypoints. Each row represents a keypoint
-                                [x, y, angle, scale, ...].
-        target_format (Literal["xy", "yx", "xya", "xys", "xyas", "xysa"]): The desired output format.
+                                [x, y, z, angle, scale, ...].
+        target_format (Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]): The desired output format.
             - "xy": [x, y]
             - "yx": [y, x]
             - "xya": [x, y, angle]
             - "xys": [x, y, scale]
             - "xyas": [x, y, angle, scale]
             - "xysa": [x, y, scale, angle]
-        image_shape (tuple[int, int]): The shape of the image (height, width).
-        check_validity (bool, optional): If True, check if the keypoints are within the image boundaries.
-                                         Defaults to False.
-        angle_in_degrees (bool, optional): If True, convert output angles to degrees.
-                                           If False, angles remain in radians.
-                                           Defaults to True.
-
-    Returns:
-        np.ndarray: Array of keypoints in the specified target format with shape (N, 2+).
-                    Any additional columns from the input keypoints beyond the first 4
-                    are preserved and appended after the converted columns.
-
-    Raises:
-        ValueError: If the target_format is not one of the supported formats.
-
-    Note:
-        - Input angles are assumed to be in the range [0, 2π) radians.
-        - If the input keypoints have additional columns beyond the first 4,
-          these columns are preserved in the output.
-        - The constant NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS should be defined
-          elsewhere in the module, typically as 4.
-    """
-    if target_format not in keypoint_formats:
-        raise ValueError(f"Unknown target_format {target_format}. Supported formats are: {keypoint_formats}")
-
-    x, y, angle, scale = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2], keypoints[:, 3]
-    angle = angle_to_2pi_range(angle)
-
-    if check_validity:
-        check_keypoints(np.column_stack((x, y, angle, scale)), image_shape)
-
-    if angle_in_degrees:
-        angle = np.degrees(angle)
-
-    format_to_columns = {
-        "xy": [x, y],
-        "yx": [y, x],
-        "xya": [x, y, angle],
-        "xys": [x, y, scale],
-        "xyas": [x, y, angle, scale],
-        "xysa": [x, y, scale, angle],
+            - "xyz": [x, y, z]
+        shape (ShapeType): The shape of the image {'height': height, 'width': width, 'depth': depth}.
+        check_validity (bool, optional): If True, check if the keypoints are within the image boundaries.
+                                         Defaults to False.
+        angle_in_degrees (bool, optional): If True, convert output angles to degrees.
+                                           If False, angles remain in radians.
+                                           Defaults to True.
+
+    Returns:
+        np.ndarray: Array of keypoints in the specified target format with shape (N, 2+).
+                    Any additional columns from the input keypoints beyond the first 5
+                    are preserved and appended after the converted columns.
+
+    Raises:
+        ValueError: If the target_format is not one of the supported formats.
+
+    Note:
+        - Input angles are assumed to be in the range [0, 2π) radians
+        - If the input keypoints have additional columns beyond the first 5,
+          these columns are preserved in the output
+    """
+    if target_format not in keypoint_formats:
+        raise ValueError(f"Unknown target_format {target_format}. Supported formats are: {keypoint_formats}")
+
+    x, y, z, angle, scale = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2], keypoints[:, 3], keypoints[:, 4]
+    angle = angle_to_2pi_range(angle)
+
+    if check_validity:
+        check_keypoints(np.column_stack((x, y, z, angle, scale)), shape)
+
+    if angle_in_degrees:
+        angle = np.degrees(angle)
+
+    format_to_columns = {
+        "xy": [x, y],
+        "yx": [y, x],
+        "xya": [x, y, angle],
+        "xys": [x, y, scale],
+        "xyas": [x, y, angle, scale],
+        "xysa": [x, y, scale, angle],
+        "xyz": [x, y, z],
     }
 
     result = np.column_stack(format_to_columns[target_format])
@@ -192,92 +279,95 @@
         return np.column_stack((result, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]))
 
     return result
-
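A rough sketch of this conversion with invented values (in a typical pipeline the conversion happens internally when keypoint parameters are configured on Compose):

Python
import numpy as np

from albumentations.core.keypoints_utils import convert_keypoints_from_albumentations

# Internal Albumentations format: [x, y, z, angle, scale]
keypoints = np.array([[50.0, 80.0, 0.0, 0.0, 1.0]], dtype=np.float32)

xy = convert_keypoints_from_albumentations(
    keypoints,
    target_format="xy",
    shape={"height": 100, "width": 100},
)
print(xy)  # [[50. 80.]]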

def convert_keypoints_to_albumentations (keypoints, source_format, image_shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]¶

Convert keypoints from various formats to the Albumentations format.

This function takes keypoints in different formats and converts them to the standard Albumentations format: [x, y, angle, scale]. If the input format doesn't include angle or scale, these values are set to 0.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. The number of columns depends on the source_format.

source_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa"]

The format of the input keypoints. - "xy": [x, y] - "yx": [y, x] - "xya": [x, y, angle] - "xys": [x, y, scale] - "xyas": [x, y, angle, scale] - "xysa": [x, y, scale, angle]

image_shape tuple[int, int]

The shape of the image (height, width).

check_validity bool

If True, check if the converted keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert input angles from degrees to radians. Defaults to True.

Returns:

Type Description
np.ndarray

Array of keypoints in Albumentations format [x, y, angle, scale] with shape (N, 4+). Any additional columns from the input keypoints are preserved and appended after the first 4 columns.

Exceptions:

Type Description
ValueError

If the source_format is not one of the supported formats.

Note

  • Angles are converted to the range [0, 2π) radians.
  • If the input keypoints have additional columns beyond what's specified in the source_format, these columns are preserved in the output.
Source code in albumentations/core/keypoints_utils.py
Python
def convert_keypoints_to_albumentations(
+

def convert_keypoints_to_albumentations (keypoints, source_format, shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]¶

Convert keypoints from various formats to the Albumentations format.

This function takes keypoints in different formats and converts them to the standard Albumentations format: [x, y, z, angle, scale]. For 2D formats, z is set to 0. For formats without angle or scale, these values are set to 0.

Parameters:

Name Type Description
keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. The number of columns depends on the source_format.

source_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]

The format of the input keypoints. - "xy": [x, y] - "yx": [y, x] - "xya": [x, y, angle] - "xys": [x, y, scale] - "xyas": [x, y, angle, scale] - "xysa": [x, y, scale, angle] - "xyz": [x, y, z]

shape ShapeType

The shape of the image {'height': height, 'width': width, 'depth': depth}.

check_validity bool

If True, check if the converted keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert input angles from degrees to radians. Defaults to True.

Returns:

Type Description
np.ndarray

Array of keypoints in Albumentations format [x, y, z, angle, scale] with shape (N, 5+). Any additional columns from the input keypoints are preserved and appended after the first 5 columns.

Exceptions:

Type Description
ValueError

If the source_format is not one of the supported formats.

Note

  • For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0
  • Angles are converted to the range [0, 2π) radians
  • If the input keypoints have additional columns beyond what's specified in the source_format, these columns are preserved in the output
Source code in albumentations/core/keypoints_utils.py
Python
def convert_keypoints_to_albumentations(
     keypoints: np.ndarray,
-    source_format: Literal["xy", "yx", "xya", "xys", "xyas", "xysa"],
-    image_shape: tuple[int, int],
+    source_format: Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"],
+    shape: ShapeType,
     check_validity: bool = False,
     angle_in_degrees: bool = True,
 ) -> np.ndarray:
     """Convert keypoints from various formats to the Albumentations format.
 
     This function takes keypoints in different formats and converts them to the standard
-    Albumentations format: [x, y, angle, scale]. If the input format doesn't include
-    angle or scale, these values are set to 0.
+    Albumentations format: [x, y, z, angle, scale]. For 2D formats, z is set to 0.
+    For formats without angle or scale, these values are set to 0.
 
     Args:
         keypoints (np.ndarray): Array of keypoints with shape (N, 2+), where N is the number of keypoints.
                                 The number of columns depends on the source_format.
-        source_format (Literal["xy", "yx", "xya", "xys", "xyas", "xysa"]): The format of the input keypoints.
+        source_format (Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]): The format of the input keypoints.
             - "xy": [x, y]
             - "yx": [y, x]
             - "xya": [x, y, angle]
             - "xys": [x, y, scale]
             - "xyas": [x, y, angle, scale]
             - "xysa": [x, y, scale, angle]
-        image_shape (tuple[int, int]): The shape of the image (height, width).
-        check_validity (bool, optional): If True, check if the converted keypoints are within the image boundaries.
-                                         Defaults to False.
-        angle_in_degrees (bool, optional): If True, convert input angles from degrees to radians.
-                                           Defaults to True.
-
-    Returns:
-        np.ndarray: Array of keypoints in Albumentations format [x, y, angle, scale] with shape (N, 4+).
-                    Any additional columns from the input keypoints are preserved and appended after the
-                    first 4 columns.
-
-    Raises:
-        ValueError: If the source_format is not one of the supported formats.
-
-    Note:
-        - Angles are converted to the range [0, 2π) radians.
-        - If the input keypoints have additional columns beyond what's specified in the source_format,
-          these columns are preserved in the output.
-    """
-    if source_format not in keypoint_formats:
-        raise ValueError(f"Unknown source_format {source_format}. Supported formats are: {keypoint_formats}")
-
-    format_to_indices: dict[str, list[int | None]] = {
-        "xy": [0, 1, None, None],
-        "yx": [1, 0, None, None],
-        "xya": [0, 1, 2, None],
-        "xys": [0, 1, None, 2],
-        "xyas": [0, 1, 2, 3],
-        "xysa": [0, 1, 3, 2],
-    }
-
-    indices: list[int | None] = format_to_indices[source_format]
-
-    processed_keypoints = np.zeros((keypoints.shape[0], NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS), dtype=np.float32)
-
-    for i, idx in enumerate(indices):
-        if idx is not None:
-            processed_keypoints[:, i] = keypoints[:, idx]
-
-    if angle_in_degrees and indices[2] is not None:
-        processed_keypoints[:, 2] = np.radians(processed_keypoints[:, 2])
+            - "xyz": [x, y, z]
+        shape (ShapeType): The shape of the image {'height': height, 'width': width, 'depth': depth}.
+        check_validity (bool, optional): If True, check if the converted keypoints are within the image boundaries.
+                                         Defaults to False.
+        angle_in_degrees (bool, optional): If True, convert input angles from degrees to radians.
+                                           Defaults to True.
+
+    Returns:
+        np.ndarray: Array of keypoints in Albumentations format [x, y, z, angle, scale] with shape (N, 5+).
+                    Any additional columns from the input keypoints are preserved and appended after the
+                    first 5 columns.
+
+    Raises:
+        ValueError: If the source_format is not one of the supported formats.
+
+    Note:
+        - For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0
+        - Angles are converted to the range [0, 2π) radians
+        - If the input keypoints have additional columns beyond what's specified in the source_format,
+          these columns are preserved in the output
+    """
+    if source_format not in keypoint_formats:
+        raise ValueError(f"Unknown source_format {source_format}. Supported formats are: {keypoint_formats}")
+
+    format_to_indices: dict[str, list[int | None]] = {
+        "xy": [0, 1, None, None, None],
+        "yx": [1, 0, None, None, None],
+        "xya": [0, 1, None, 2, None],
+        "xys": [0, 1, None, None, 2],
+        "xyas": [0, 1, None, 2, 3],
+        "xysa": [0, 1, None, 3, 2],
+        "xyz": [0, 1, 2, None, None],
+    }
+
+    indices: list[int | None] = format_to_indices[source_format]
+
+    processed_keypoints = np.zeros((keypoints.shape[0], NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS), dtype=np.float32)
+
+    for i, idx in enumerate(indices):
+        if idx is not None:
+            processed_keypoints[:, i] = keypoints[:, idx]
 
-    processed_keypoints[:, 2] = angle_to_2pi_range(processed_keypoints[:, 2])
-
-    if keypoints.shape[1] > len(source_format):
-        processed_keypoints = np.column_stack((processed_keypoints, keypoints[:, len(source_format) :]))
+    if angle_in_degrees and indices[3] is not None:  # angle is now at index 3
+        processed_keypoints[:, 3] = np.radians(processed_keypoints[:, 3])
+
+    processed_keypoints[:, 3] = angle_to_2pi_range(processed_keypoints[:, 3])  # angle is now at index 3
 
-    if check_validity:
-        check_keypoints(processed_keypoints, image_shape)
+    if keypoints.shape[1] > len(source_format):
+        processed_keypoints = np.column_stack((processed_keypoints, keypoints[:, len(source_format) :]))
 
-    return processed_keypoints
-

def filter_keypoints (keypoints, image_shape, remove_invisible) [view source on GitHub]¶

Filter keypoints to remove those outside the image boundaries.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint (x, y, ...).

image_shape tuple[int, int]

A tuple (height, width) representing the image dimensions.

remove_invisible bool

If True, remove keypoints outside the image boundaries.

Returns:

Type Description
np.ndarray

A numpy array of filtered keypoints.

Source code in albumentations/core/keypoints_utils.py
Python
def filter_keypoints(
+    if check_validity:
+        check_keypoints(processed_keypoints, shape)
+
+    return processed_keypoints
+
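A small illustrative call with invented values, converting degrees-based "xyas" keypoints into the internal [x, y, z, angle, scale] layout:

Python
import numpy as np

from albumentations.core.keypoints_utils import convert_keypoints_to_albumentations

# "xyas" keypoints: [x, y, angle in degrees, scale]
keypoints = np.array([[50.0, 80.0, 90.0, 1.5]], dtype=np.float32)

converted = convert_keypoints_to_albumentations(
    keypoints,
    source_format="xyas",
    shape={"height": 100, "width": 100},
)
print(converted)  # approximately [[50. 80. 0. 1.5708 1.5]]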

def filter_keypoints (keypoints, shape, remove_invisible) [view source on GitHub]¶

Filter keypoints to remove those outside the boundaries.

Parameters:

Name Type Description
keypoints np.ndarray

A numpy array of shape (N, 5+) where N is the number of keypoints. Each row represents a keypoint (x, y, z, angle, scale, ...).

shape ShapeType

Shape to check against as {'height': height, 'width': width, 'depth': depth}.

remove_invisible bool

If True, remove keypoints outside the boundaries.

Returns:

Type Description
np.ndarray

A numpy array of filtered keypoints.

Source code in albumentations/core/keypoints_utils.py
Python
def filter_keypoints(
     keypoints: np.ndarray,
-    image_shape: tuple[int, int],
+    shape: ShapeType,
     remove_invisible: bool,
 ) -> np.ndarray:
-    """Filter keypoints to remove those outside the image boundaries.
+    """Filter keypoints to remove those outside the boundaries.
 
     Args:
-        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.
-                   Each row represents a keypoint (x, y, ...).
-        image_shape: A tuple (height, width) representing the image dimensions.
-        remove_invisible: If True, remove keypoints outside the image boundaries.
+        keypoints: A numpy array of shape (N, 5+) where N is the number of keypoints.
+                   Each row represents a keypoint (x, y, z, angle, scale, ...).
+        shape: Shape to check against as {'height': height, 'width': width, 'depth': depth}.
+        remove_invisible: If True, remove keypoints outside the boundaries.
 
     Returns:
         A numpy array of filtered keypoints.
@@ -288,12 +378,15 @@
     if not keypoints.size:
         return keypoints
 
-    height, width = image_shape[:2]
+    height, width, depth = shape["height"], shape["width"], shape.get("depth", None)
 
     # Create boolean mask for visible keypoints
-    x, y = keypoints[:, 0], keypoints[:, 1]
+    x, y, z = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2]
     visible = (x >= 0) & (x < width) & (y >= 0) & (y < height)
 
-    # Apply the mask to filter keypoints
-    return keypoints[visible]
+    if depth is not None:
+        visible &= (z >= 0) & (z < depth)
+
+    # Apply the mask to filter keypoints
+    return keypoints[visible]
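An illustrative call with invented values, showing how out-of-bounds keypoints are dropped:

Python
import numpy as np

from albumentations.core.keypoints_utils import filter_keypoints

# Each row is [x, y, z, angle, scale]
keypoints = np.array(
    [
        [10.0, 10.0, 0.0, 0.0, 1.0],  # inside a 64x64 image
        [80.0, 10.0, 0.0, 0.0, 1.0],  # x is outside [0, 64)
    ],
    dtype=np.float32,
)

kept = filter_keypoints(keypoints, shape={"height": 64, "width": 64}, remove_invisible=True)
print(kept.shape)  # (1, 5) -- only the in-bounds keypoint remains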
 
\ No newline at end of file
diff --git a/docs/api_reference/core/transforms_interface/index.html b/docs/api_reference/core/transforms_interface/index.html
index 45f65179..2114b79e 100644
--- a/docs/api_reference/core/transforms_interface/index.html
+++ b/docs/api_reference/core/transforms_interface/index.html
@@ -589,12 +589,18 @@
    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> np.ndarray:
        return volume
-    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> np.ndarray:
-        return mask3d
+    def apply_to_volumes(self, volumes: np.ndarray, **params: Any) -> np.ndarray:
+        return volumes
-    def get_transform_init_args_names(self) -> tuple[str, ...]:
-        return ()
-

class Transform3D [view source on GitHub] ¶

Base class for all 3D transforms.

Transform3D inherits from DualTransform because 3D transforms can be applied to both volumes and masks, similar to how 2D DualTransforms work with images and masks.

Targets

volume: 3D numpy array of shape (D, H, W) or (D, H, W, C) volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C) mask: 3D numpy array of shape (D, H, W) masks: Batch of 3D arrays of shape (N, D, H, W)

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/transforms_interface.py
Python
class Transform3D(DualTransform):
+    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> np.ndarray:
+        return mask3d
+
+    def apply_to_masks3d(self, masks3d: np.ndarray, **params: Any) -> np.ndarray:
+        return masks3d
+
+    def get_transform_init_args_names(self) -> tuple[str, ...]:
+        return ()
+

class Transform3D [view source on GitHub] ¶

Base class for all 3D transforms.

Transform3D inherits from DualTransform because 3D transforms can be applied to both volumes and masks, similar to how 2D DualTransforms work with images and masks.

Targets

volume: 3D numpy array of shape (D, H, W) or (D, H, W, C) volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C) mask: 3D numpy array of shape (D, H, W) masks: Batch of 3D arrays of shape (N, D, H, W) keypoints: 3D numpy array of shape (N, 3)

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/transforms_interface.py
Python
class Transform3D(DualTransform):
     """Base class for all 3D transforms.
 
     Transform3D inherits from DualTransform because 3D transforms can be applied to both
@@ -605,33 +611,35 @@
         volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C)
         mask: 3D numpy array of shape (D, H, W)
         masks: Batch of 3D arrays of shape (N, D, H, W)
-    """
-
-    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
-        """Apply transform to single 3D volume."""
-        raise NotImplementedError
-
-    @batch_transform("spatial", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)
-    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
-        """Apply transform to batch of 3D volumes."""
-        return self.apply_to_volume(volumes, *args, **params)
-
-    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
-        """Apply transform to single 3D mask."""
-        return self.apply_to_volume(mask3d, *args, **params)
-
-    @batch_transform("spatial", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)
-    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
-        """Apply transform to batch of 3D masks."""
-        return self.apply_to_mask3d(masks3d, *args, **params)
-
-    @property
-    def targets(self) -> dict[str, Callable[..., Any]]:
-        """Define valid targets for 3D transforms."""
-        return {
-            "volume": self.apply_to_volume,
-            "volumes": self.apply_to_volumes,
-            "mask3d": self.apply_to_mask3d,
-            "masks3d": self.apply_to_masks3d,
-        }
+        keypoints: 3D numpy array of shape (N, 3)
+    """
+
+    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
+        """Apply transform to single 3D volume."""
+        raise NotImplementedError
+
+    @batch_transform("spatial", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)
+    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
+        """Apply transform to batch of 3D volumes."""
+        return self.apply_to_volume(volumes, *args, **params)
+
+    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
+        """Apply transform to single 3D mask."""
+        return self.apply_to_volume(mask3d, *args, **params)
+
+    @batch_transform("spatial", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)
+    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:
+        """Apply transform to batch of 3D masks."""
+        return self.apply_to_mask3d(masks3d, *args, **params)
+
+    @property
+    def targets(self) -> dict[str, Callable[..., Any]]:
+        """Define valid targets for 3D transforms."""
+        return {
+            "volume": self.apply_to_volume,
+            "volumes": self.apply_to_volumes,
+            "mask3d": self.apply_to_mask3d,
+            "masks3d": self.apply_to_masks3d,
+            "keypoints": self.apply_to_keypoints,
+        }
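To illustrate how these targets are used, here is a hedged sketch that applies a built-in 3D transform to a volume and its 3D mask; CenterCrop3D and its size parameter are assumed from the volumetric transforms listed in this reference, so treat the exact arguments as illustrative:

Python
import numpy as np

import albumentations as A

# A single-channel (D, H, W) volume and its matching 3D mask
volume = np.random.rand(32, 128, 128).astype(np.float32)
mask3d = np.zeros((32, 128, 128), dtype=np.uint8)

transform = A.Compose([A.CenterCrop3D(size=(16, 64, 64), p=1.0)], seed=42)

out = transform(volume=volume, mask3d=mask3d)
print(out["volume"].shape, out["mask3d"].shape)  # (16, 64, 64) (16, 64, 64)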
 
\ No newline at end of file
diff --git a/docs/api_reference/full_reference/index.html b/docs/api_reference/full_reference/index.html
index 13d0a834..6eb4e1b3 100644
--- a/docs/api_reference/full_reference/index.html
+++ b/docs/api_reference/full_reference/index.html
@@ -6,4 +6,4 @@
 .jupyter-wrapper .jp-MarkdownOutput.jp-RenderedHTMLCommon {
     font-size: 0.8rem;
 }
-

Full API Reference on a single page

Transform Types

1. Pixel-level transforms

Transforms that modify pixel values without changing spatial relationships. These can be safely applied to any target as they only affect the input image, leaving other targets (masks, bounding boxes, keypoints) unchanged.

2. Spatial-level transforms

Transforms that modify the spatial arrangement of pixels/features. Different targets have different spatial transform support - see the compatibility table below:

Transform Image Mask BBoxes Keypoints Volume Mask3D
Affine ✓ ✓ ✓ ✓ ✓ ✓
AtLeastOneBBoxRandomCrop ✓ ✓ ✓ ✓ ✓ ✓
BBoxSafeRandomCrop ✓ ✓ ✓ ✓ ✓ ✓
CenterCrop ✓ ✓ ✓ ✓ ✓ ✓
CoarseDropout ✓ ✓ ✓ ✓ ✓ ✓
Crop ✓ ✓ ✓ ✓ ✓ ✓
CropAndPad ✓ ✓ ✓ ✓ ✓ ✓
CropNonEmptyMaskIfExists ✓ ✓ ✓ ✓ ✓ ✓
D4 ✓ ✓ ✓ ✓ ✓ ✓
ElasticTransform ✓ ✓ ✓ ✓ ✓ ✓
Erasing ✓ ✓ ✓ ✓ ✓ ✓
FrequencyMasking ✓ ✓ ✓ ✓ ✓ ✓
GridDistortion ✓ ✓ ✓ ✓ ✓ ✓
GridDropout ✓ ✓ ✓ ✓ ✓ ✓
GridElasticDeform ✓ ✓ ✓ ✓ ✓ ✓
HorizontalFlip ✓ ✓ ✓ ✓ ✓ ✓
Lambda ✓ ✓ ✓ ✓ ✓ ✓
LongestMaxSize ✓ ✓ ✓ ✓ ✓ ✓
MaskDropout ✓ ✓ ✓ ✓ ✓ ✓
Morphological ✓ ✓ ✓ ✓ ✓ ✓
NoOp ✓ ✓ ✓ ✓ ✓ ✓
OpticalDistortion ✓ ✓ ✓ ✓ ✓ ✓
OverlayElements ✓ ✓
Pad ✓ ✓ ✓ ✓ ✓ ✓
PadIfNeeded ✓ ✓ ✓ ✓ ✓ ✓
Perspective ✓ ✓ ✓ ✓ ✓ ✓
PiecewiseAffine ✓ ✓ ✓ ✓ ✓ ✓
PixelDropout ✓ ✓ ✓ ✓ ✓ ✓
RandomCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomCropFromBorders ✓ ✓ ✓ ✓ ✓ ✓
RandomCropNearBBox ✓ ✓ ✓ ✓ ✓ ✓
RandomGridShuffle ✓ ✓ ✓ ✓ ✓ ✓
RandomResizedCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomRotate90 ✓ ✓ ✓ ✓ ✓ ✓
RandomScale ✓ ✓ ✓ ✓ ✓ ✓
RandomSizedBBoxSafeCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomSizedCrop ✓ ✓ ✓ ✓ ✓ ✓
Resize ✓ ✓ ✓ ✓ ✓ ✓
Rotate ✓ ✓ ✓ ✓ ✓ ✓
SafeRotate ✓ ✓ ✓ ✓ ✓ ✓
ShiftScaleRotate ✓ ✓ ✓ ✓ ✓ ✓
SmallestMaxSize ✓ ✓ ✓ ✓ ✓ ✓
ThinPlateSpline ✓ ✓ ✓ ✓ ✓ ✓
TimeMasking ✓ ✓ ✓ ✓ ✓ ✓
TimeReverse ✓ ✓ ✓ ✓ ✓ ✓
Transpose ✓ ✓ ✓ ✓ ✓ ✓
VerticalFlip ✓ ✓ ✓ ✓ ✓ ✓
XYMasking ✓ ✓ ✓ ✓ ✓ ✓

3. Volumetric (3D) transforms

Transforms designed for three-dimensional data (D, H, W). These operate on volumes and their corresponding 3D masks, supporting both single-channel and multi-channel data.

Transform Image Mask BBoxes Keypoints Volume Mask3D
CenterCrop3D ✓ ✓
CoarseDropout3D ✓ ✓
CubicSymmetry ✓ ✓
Pad3D ✓ ✓
PadIfNeeded3D ✓ ✓
RandomCrop3D ✓ ✓
\ No newline at end of file
+

Full API Reference on a single page

Transform Types

1. Pixel-level transforms

Transforms that modify pixel values without changing spatial relationships. These can be safely applied to any target as they only affect the input image, leaving other targets (masks, bounding boxes, keypoints) unchanged.

2. Spatial-level transforms

Transforms that modify the spatial arrangement of pixels/features. Different targets have different spatial transform support - see the compatibility table below:

Transform Image Mask BBoxes Keypoints Volume Mask3D
Affine ✓ ✓ ✓ ✓ ✓ ✓
AtLeastOneBBoxRandomCrop ✓ ✓ ✓ ✓ ✓ ✓
BBoxSafeRandomCrop ✓ ✓ ✓ ✓ ✓ ✓
CenterCrop ✓ ✓ ✓ ✓ ✓ ✓
CoarseDropout ✓ ✓ ✓ ✓ ✓ ✓
Crop ✓ ✓ ✓ ✓ ✓ ✓
CropAndPad ✓ ✓ ✓ ✓ ✓ ✓
CropNonEmptyMaskIfExists ✓ ✓ ✓ ✓ ✓ ✓
D4 ✓ ✓ ✓ ✓ ✓ ✓
ElasticTransform ✓ ✓ ✓ ✓ ✓ ✓
Erasing ✓ ✓ ✓ ✓ ✓ ✓
FrequencyMasking ✓ ✓ ✓ ✓ ✓ ✓
GridDistortion ✓ ✓ ✓ ✓ ✓ ✓
GridDropout ✓ ✓ ✓ ✓ ✓ ✓
GridElasticDeform ✓ ✓ ✓ ✓ ✓ ✓
HorizontalFlip ✓ ✓ ✓ ✓ ✓ ✓
Lambda ✓ ✓ ✓ ✓ ✓ ✓
LongestMaxSize ✓ ✓ ✓ ✓ ✓ ✓
MaskDropout ✓ ✓ ✓ ✓ ✓ ✓
Morphological ✓ ✓ ✓ ✓ ✓ ✓
NoOp ✓ ✓ ✓ ✓ ✓ ✓
OpticalDistortion ✓ ✓ ✓ ✓ ✓ ✓
OverlayElements ✓ ✓
Pad ✓ ✓ ✓ ✓ ✓ ✓
PadIfNeeded ✓ ✓ ✓ ✓ ✓ ✓
Perspective ✓ ✓ ✓ ✓ ✓ ✓
PiecewiseAffine ✓ ✓ ✓ ✓ ✓ ✓
PixelDropout ✓ ✓ ✓ ✓ ✓ ✓
RandomCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomCropFromBorders ✓ ✓ ✓ ✓ ✓ ✓
RandomCropNearBBox ✓ ✓ ✓ ✓ ✓ ✓
RandomGridShuffle ✓ ✓ ✓ ✓ ✓ ✓
RandomResizedCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomRotate90 ✓ ✓ ✓ ✓ ✓ ✓
RandomScale ✓ ✓ ✓ ✓ ✓ ✓
RandomSizedBBoxSafeCrop ✓ ✓ ✓ ✓ ✓ ✓
RandomSizedCrop ✓ ✓ ✓ ✓ ✓ ✓
Resize ✓ ✓ ✓ ✓ ✓ ✓
Rotate ✓ ✓ ✓ ✓ ✓ ✓
SafeRotate ✓ ✓ ✓ ✓ ✓ ✓
ShiftScaleRotate ✓ ✓ ✓ ✓ ✓ ✓
SmallestMaxSize ✓ ✓ ✓ ✓ ✓ ✓
ThinPlateSpline ✓ ✓ ✓ ✓ ✓ ✓
TimeMasking ✓ ✓ ✓ ✓ ✓ ✓
TimeReverse ✓ ✓ ✓ ✓ ✓ ✓
Transpose ✓ ✓ ✓ ✓ ✓ ✓
VerticalFlip ✓ ✓ ✓ ✓ ✓ ✓
XYMasking ✓ ✓ ✓ ✓ ✓ ✓
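Read the table above as: a spatial transform recomputes every target passed alongside the image. A short sketch with invented shapes and annotations, keeping a mask, bounding boxes, and keypoints in sync:

Python
import numpy as np

import albumentations as A

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
mask = np.zeros((256, 256), dtype=np.uint8)
bboxes = [(32, 32, 96, 96)]  # pascal_voc format: x_min, y_min, x_max, y_max
keypoints = [(64, 64)]       # xy format

transform = A.Compose(
    [A.HorizontalFlip(p=1.0), A.RandomCrop(height=224, width=224)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
    keypoint_params=A.KeypointParams(format="xy"),
    seed=42,
)

out = transform(image=image, mask=mask, bboxes=bboxes, labels=["object"], keypoints=keypoints)
print(out["bboxes"], out["keypoints"])  # annotations follow the flip and crop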

3. Volumetric (3D) transforms

Transforms designed for three-dimensional data (D, H, W). These operate on volumes and their corresponding 3D masks, supporting both single-channel and multi-channel data.

Transform Image Mask BBoxes Keypoints Volume Mask3D
CenterCrop3D ✓ ✓ ✓
CoarseDropout3D ✓ ✓ ✓
CubicSymmetry ✓ ✓ ✓
Pad3D ✓ ✓ ✓
PadIfNeeded3D ✓ ✓ ✓
RandomCrop3D ✓ ✓ ✓
\ No newline at end of file
diff --git a/docs/search/search_index.json b/docs/search/search_index.json
index dc6aedd5..7a3f2e56 100644
--- a/docs/search/search_index.json
+++ b/docs/search/search_index.json
@@ -1 +1 @@
-{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to Albumentations documentation","text":"

Albumentations is a fast and flexible image augmentation library. The library is widely used in industry, deep learning research, machine learning competitions, and open source projects. Albumentations is written in Python, and it is licensed under the MIT license. The source code is available at https://github.com/albumentations-team/albumentations.

If you are new to image augmentation, start with our \"Learning Path\" for beginners. It describes what image augmentation is, how it can boost deep neural networks' performance, and why you should use Albumentations.

For hands-on experience, check out our \"Quick Start Guide\" and \"Examples\" sections. They show how you can use the library for different computer vision tasks: image classification, semantic segmentation, instance segmentation, object detection, and keypoint detection. Each example includes a link to Google Colab, where you can run the code by yourself.

You can also visit explore.albumentations.ai to visually explore and experiment with different augmentations in your browser. This interactive tool helps you better understand how each transform affects images before implementing it in your code.

\"API Reference\" contains the description of Albumentations' methods and classes.

"},{"location":"#quick-start-guide","title":"Quick Start Guide","text":"
  • Installation
  • Frequently Asked Questions
  • Your First Augmentation Pipeline
"},{"location":"#working-with-multi-dimensional-data","title":"Working with Multi-dimensional Data","text":""},{"location":"#volumetric-data-3d","title":"Volumetric Data (3D)","text":"
  • Introduction to 3D (Volumetric) Image Augmentation
  • Available 3D Transforms
"},{"location":"#video-and-sequential-data","title":"Video and Sequential Data","text":"
  • Video Frame Augmentation
"},{"location":"#learning-path","title":"Learning Path","text":""},{"location":"#beginners","title":"Beginners","text":"
  • What is Image Augmentation?
  • Why Choose Albumentations?
  • Basic Image Classification
"},{"location":"#intermediate","title":"Intermediate","text":"
  • Semantic Segmentation
  • Object Detection
  • Keypoint Detection
  • Multi-target Augmentation
"},{"location":"#advanced","title":"Advanced","text":"
  • Pipeline Configuration
  • Debugging with ReplayCompose
  • Serialization
"},{"location":"#framework-integration","title":"Framework Integration","text":"
  • PyTorch
  • TensorFlow
  • HuggingFace
  • Roboflow
  • Voxel51
"},{"location":"#library-comparisons","title":"Library Comparisons","text":"
  • Transform Library Comparison - Find equivalent transforms between Albumentations and other libraries (torchvision, Kornia)
  • Migration from torchvision - Step-by-step migration guide
"},{"location":"#examples","title":"Examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Using Albumentations for a semantic segmentation task
  • Using Albumentations to augment keypoints
  • Applying the same augmentation with the same parameters to multiple images, masks, bounding boxes, or keypoints
  • Weather augmentations in Albumentations
  • Example of applying XYMasking transform
  • Example of applying ChromaticAberration transform
  • Example of applying Morphological transform
  • Example of applying D4 transform
  • Example of applying RandomGridShuffle transform
  • Example of applying OverlayElements transform
  • Example of applying TextImage transform
  • Debugging an augmentation pipeline with ReplayCompose
  • How to save and load parameters of an augmentation pipeline
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
  • How to save and load transforms to HuggingFace Hub.
"},{"location":"#examples-of-how-to-use-albumentations-with-different-deep-learning-frameworks","title":"Examples of how to use Albumentations with different deep learning frameworks","text":"
  • PyTorch and Albumentations for image classification
  • PyTorch and Albumentations for semantic segmentation
  • Using Albumentations with Tensorflow
"},{"location":"#external-resources","title":"External resources","text":"
  • Blog posts, podcasts, talks, and videos about Albumentations
  • Books that mention Albumentations
  • Online courses that cover Albumentations
"},{"location":"#other-topics","title":"Other topics","text":"
  • Contributing
"},{"location":"#api-reference","title":"API Reference","text":"
  • Full API Reference on a single page
  • Index
  • Core API (albumentations.core)
  • Augmentations (albumentations.augmentations)
  • PyTorch Helpers (albumentations.pytorch)
"},{"location":"CONTRIBUTING/","title":"Contributing to Albumentations","text":"

Thank you for your interest in contributing to Albumentations! This guide will help you get started with contributing to our image augmentation library.

"},{"location":"CONTRIBUTING/#quick-start","title":"Quick Start","text":"

For small changes (e.g., bug fixes), feel free to submit a PR directly.

For larger changes:

  1. Create an issue outlining your proposed change
  2. Join our Discord community to discuss your idea
"},{"location":"CONTRIBUTING/#contribution-guides","title":"Contribution Guides","text":"

We've organized our contribution guidelines into focused documents:

  • Environment Setup Guide - How to set up your development environment
  • Coding Guidelines - Code style, best practices, and technical requirements
"},{"location":"CONTRIBUTING/#contribution-process","title":"Contribution Process","text":"
  1. Find an Issue: Look for open issues or propose a new one. For newcomers, look for issues labeled \"good first issue\"
  2. Set Up: Follow our Environment Setup Guide
  3. Create a Branch: git checkout -b feature/my-new-feature
  4. Make Changes: Write code following our Coding Guidelines
  5. Test: Add tests and ensure all tests pass
  6. Submit: Open a Pull Request with a clear description of your changes
"},{"location":"CONTRIBUTING/#code-review-process","title":"Code Review Process","text":"
  1. Maintainers will review your contribution
  2. Address any feedback or questions
  3. Once approved, your code will be merged
"},{"location":"CONTRIBUTING/#project-structure","title":"Project Structure","text":"
  • albumentations/ - Main source code
  • tests/ - Test suite
  • docs/ - Documentation
"},{"location":"CONTRIBUTING/#getting-help","title":"Getting Help","text":"
  • Join our Discord community
  • Open a GitHub issue
  • Ask questions in your pull request
"},{"location":"CONTRIBUTING/#license","title":"License","text":"

By contributing, you agree that your contributions will be licensed under the project's MIT License.

"},{"location":"benchmarking_results/","title":"Benchmarking results","text":""},{"location":"benchmarking_results/#benchmarking-results_1","title":"Benchmarking results","text":""},{"location":"benchmarking_results/#system-information","title":"System Information","text":"
  • Platform: macOS-15.0.1-arm64-arm-64bit
  • Processor: arm
  • CPU Count: 10
  • Python Version: 3.12.7
"},{"location":"benchmarking_results/#benchmark-parameters","title":"Benchmark Parameters","text":"
  • Number of images: 1000
  • Runs per transform: 10
  • Max warmup iterations: 1000
"},{"location":"benchmarking_results/#library-versions","title":"Library Versions","text":"
  • albumentations: 1.4.20
  • augly: 1.0.0
  • imgaug: 0.4.0
  • kornia: 0.7.3
  • torchvision: 0.20.0
"},{"location":"benchmarking_results/#performance-comparison","title":"Performance Comparison","text":"

Each number is the count of uint8 RGB images processed per second on a single CPU core. Higher is better.

Transform albumentations1.4.20 augly1.0.0 imgaug0.4.0 kornia0.7.3 torchvision0.20.0 HorizontalFlip 8618 \u00b1 1233 4807 \u00b1 818 6042 \u00b1 788 390 \u00b1 106 914 \u00b1 67 VerticalFlip 22847 \u00b1 2031 9153 \u00b1 1291 10931 \u00b1 1844 1212 \u00b1 402 3198 \u00b1 200 Rotate 1146 \u00b1 79 1119 \u00b1 41 1136 \u00b1 218 143 \u00b1 11 181 \u00b1 11 Affine 682 \u00b1 192 - 774 \u00b1 97 147 \u00b1 9 130 \u00b1 12 Equalize 892 \u00b1 61 - 581 \u00b1 54 152 \u00b1 19 479 \u00b1 12 RandomCrop80 47341 \u00b1 20523 25272 \u00b1 1822 11503 \u00b1 441 1510 \u00b1 230 32109 \u00b1 1241 ShiftRGB 2349 \u00b1 76 - 1582 \u00b1 65 - - Resize 2316 \u00b1 166 611 \u00b1 78 1806 \u00b1 63 232 \u00b1 24 195 \u00b1 4 RandomGamma 8675 \u00b1 274 - 2318 \u00b1 269 108 \u00b1 13 - Grayscale 3056 \u00b1 47 2720 \u00b1 932 1681 \u00b1 156 289 \u00b1 75 1838 \u00b1 130 RandomPerspective 412 \u00b1 38 - 554 \u00b1 22 86 \u00b1 11 96 \u00b1 5 GaussianBlur 1728 \u00b1 89 242 \u00b1 4 1090 \u00b1 65 176 \u00b1 18 79 \u00b1 3 MedianBlur 868 \u00b1 60 - 813 \u00b1 30 5 \u00b1 0 - MotionBlur 4047 \u00b1 67 - 612 \u00b1 18 73 \u00b1 2 - Posterize 9094 \u00b1 301 - 2097 \u00b1 68 430 \u00b1 49 3196 \u00b1 185 JpegCompression 918 \u00b1 23 778 \u00b1 5 459 \u00b1 35 71 \u00b1 3 625 \u00b1 17 GaussianNoise 166 \u00b1 12 67 \u00b1 2 206 \u00b1 11 75 \u00b1 1 - Elastic 201 \u00b1 5 - 235 \u00b1 20 1 \u00b1 0 2 \u00b1 0 Clahe 454 \u00b1 22 - 335 \u00b1 43 94 \u00b1 9 - CoarseDropout 13368 \u00b1 744 - 671 \u00b1 38 536 \u00b1 87 - Blur 5267 \u00b1 543 246 \u00b1 3 3807 \u00b1 325 - - ColorJitter 628 \u00b1 55 255 \u00b1 13 - 55 \u00b1 18 46 \u00b1 2 Brightness 8956 \u00b1 300 1163 \u00b1 86 - 472 \u00b1 101 429 \u00b1 20 Contrast 8879 \u00b1 1426 736 \u00b1 79 - 425 \u00b1 52 335 \u00b1 35 RandomResizedCrop 2828 \u00b1 186 - - 287 \u00b1 58 511 \u00b1 10 Normalize 1196 \u00b1 56 - - 626 \u00b1 40 519 \u00b1 12 PlankianJitter 2204 \u00b1 385 - - 813 \u00b1 211 -"},{"location":"faq/","title":"Frequently Asked Questions","text":"

This FAQ covers common questions about Albumentations, from basic setup to advanced usage. You'll find information about:

  • Installation troubleshooting and configuration
  • Working with different data formats (images, video, volumetric data)
  • Advanced usage patterns and best practices
  • Integration with other tools and migration from other libraries

If you don't find an answer to your question, please check our GitHub Issues or join our Discord community.

"},{"location":"faq/#installation","title":"Installation","text":""},{"location":"faq/#i-am-receiving-an-error-message-failed-building-wheel-for-imagecodecs-when-i-am-trying-to-install-albumentations-how-can-i-fix-the-problem","title":"I am receiving an error message Failed building wheel for imagecodecs when I am trying to install Albumentations. How can I fix the problem?","text":"

Try to update pip by running the following command:

Bash
python -m pip install --upgrade pip\n
"},{"location":"faq/#how-to-disable-automatic-checks-for-new-versions","title":"How to disable automatic checks for new versions?","text":"

To disable automatic checks for new versions, set the environment variable NO_ALBUMENTATIONS_UPDATE to 1.
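For example, set it before importing the library (a shell export works as well):

Python
import os

os.environ["NO_ALBUMENTATIONS_UPDATE"] = "1"

import albumentations as A  # the version check is skipped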

"},{"location":"faq/#how-to-make-albumentations-use-one-cpu-core","title":"How to make Albumentations use one CPU core?","text":"

Albumentations does not use multithreading by default, but libraries it depends on (such as OpenCV) may. To make Albumentations use one CPU core, you can set the following environment variables:

Python
os.environ[\"OMP_NUM_THREADS\"] = \"1\"\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"1\"\nos.environ[\"MKL_NUM_THREADS\"] = \"1\"\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"1\"\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"1\"\n
"},{"location":"faq/#data-formats-and-basic-usage","title":"Data Formats and Basic Usage","text":""},{"location":"faq/#supported-image-types","title":"Supported Image Types","text":"

Albumentations works with images of type uint8 and float32. uint8 images should be in the [0, 255] range, and float32 images should be in the [0, 1] range. If float32 images lie outside of the [0, 1] range, they will be automatically clipped to the [0, 1] range.
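A quick sketch showing both dtypes as valid inputs (random data; any pixel-level transform would do):

Python
import numpy as np

import albumentations as A

image_uint8 = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # values in [0, 255]
image_float32 = image_uint8.astype(np.float32) / 255.0                  # values in [0, 1]

transform = A.RandomBrightnessContrast(p=1.0)
out_uint8 = transform(image=image_uint8)["image"]      # stays uint8
out_float32 = transform(image=image_float32)["image"]  # stays float32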

"},{"location":"faq/#why-do-you-call-cv2cvtcolorimage-cv2color_bgr2rgb-in-your-examples","title":"Why do you call cv2.cvtColor(image, cv2.COLOR_BGR2RGB) in your examples?","text":"

For historical reasons, OpenCV reads an image in BGR format (so color channels of the image have the following order: Blue, Green, Red). Albumentations uses the most common and popular RGB image format. So when using OpenCV, we need to convert the image format to RGB explicitly.
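A typical loading snippet therefore looks like this (the file name is a placeholder):

Python
import cv2

image = cv2.imread("example.jpg")               # OpenCV loads images in BGR order
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB before augmenting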

"},{"location":"faq/#how-to-have-reproducible-augmentations","title":"How to have reproducible augmentations?","text":"

To have reproducible augmentations, set the seed parameter in your transform pipeline. This will ensure that the same random parameters are used for each augmentation, resulting in the same output for the same input.

Python
transform = A.Compose([\n    A.RandomCrop(height=256, width=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], seed=42)\n
"},{"location":"faq/#working-with-different-data-types","title":"Working with Different Data Types","text":""},{"location":"faq/#how-to-process-video-data-with-albumentations","title":"How to process video data with Albumentations?","text":"

Albumentations can process video data by treating it as a sequence of frames in numpy array format: - (N, H, W) - Grayscale video (N frames) - (N, H, W, C) - Color video (N frames)

When you pass a video array, Albumentations will apply the same transform with identical parameters to each frame, ensuring temporal consistency.

Python
video = np.random.rand(32, 256, 256, 3) # 32 RGB frames\n\ntransform = A.Compose([\n  A.RandomCrop(height=224, width=224),\n  A.HorizontalFlip(p=0.5)\n], seed=42)\n\ntransformed = transform(image=video)['image']\n

See Working with Video Data for more info.

"},{"location":"faq/#how-to-process-volumetric-data-with-albumentations","title":"How to process volumetric data with Albumentations?","text":"

Albumentations can process volumetric data by treating it as a sequence of 2D slices. When you pass volumetric data as a numpy array, Albumentations applies the same transform with identical parameters to each slice, keeping the whole volume consistent.

See Working with Volumetric Data (3D) for more info.
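A minimal sketch with an invented volume, assuming the volume target described in the volumetric guide; the same crop and flip parameters are applied to every slice:

Python
import numpy as np

import albumentations as A

volume = np.random.rand(64, 256, 256).astype(np.float32)  # (D, H, W)

transform = A.Compose([
    A.RandomCrop(height=224, width=224),
    A.HorizontalFlip(p=0.5),
], seed=42)

transformed = transform(volume=volume)["volume"]  # identical crop/flip for every slice
print(transformed.shape)  # (64, 224, 224)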

"},{"location":"faq/#my-computer-vision-pipeline-works-with-a-sequence-of-images-i-want-to-apply-the-same-augmentations-with-the-same-parameters-to-each-image-in-the-sequence-can-albumentations-do-it","title":"My computer vision pipeline works with a sequence of images. I want to apply the same augmentations with the same parameters to each image in the sequence. Can Albumentations do it?","text":"

Yes. You can define additional images, masks, bounding boxes, or keypoints through the additional_targets argument to Compose. You can then pass those additional targets to the augmentation pipeline, and Albumentations will augment them in the same way. See this example for more info.

But if you only want to augment a sequence of images, you may simply use the images target, which accepts list[numpy.ndarray] or np.ndarray with shape (N, H, W, C) / (N, H, W).

"},{"location":"faq/#advanced-usage","title":"Advanced Usage","text":""},{"location":"faq/#how-can-i-find-which-augmentations-were-applied-to-the-input-data-and-which-parameters-they-used","title":"How can I find which augmentations were applied to the input data and which parameters they used?","text":"

You may pass save_applied_params=True to Compose to save the parameters of the applied augmentations. You can access them later using applied_transforms.

Python
transform = A.Compose([\n    A.RandomCrop(256, 256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.5),\n    A.RandomGamma(p=0.5),\n    A.Normalize(),\n], save_applied_params=True, seed=42)\n\ntransformed = transform(image=image)['image']\n\nprint(transform[\"applied_transforms\"])\n
"},{"location":"faq/#how-to-perform-balanced-scaling","title":"How to perform balanced scaling?","text":"

The default scaling logic in RandomScale, ShiftScaleRotate, and Affine transformations is biased towards upscaling.

For example, if scale_limit = (0.5, 2), a user might expect that the image will be scaled down in half of the cases and scaled up in the other half. However, in reality, the image will be scaled up in about two thirds of the cases and scaled down in only one third. This is because the default behavior samples uniformly from the interval [0.5, 2], and the sub-interval [0.5, 1] is only half as long as [1, 2].

To achieve balanced scaling, you can use Affine with balanced_scale=True, which ensures that the probability of scaling up and scaling down is equal.

Python
balanced_scale_transform = A.Affine(scale=(0.5, 2), balanced_scale=True)\n

or use OneOf transform as follows:

Python
balanced_scale_transform = A.OneOf([\n  A.Affine(scale=(0.5, 1), p=0.5),\n  A.Affine(scale=(1, 2), p=0.5)])\n

This approach ensures that exactly half of the samples will be upscaled and half will be downscaled.

"},{"location":"faq/#augmentations-have-a-parameter-named-p-that-sets-the-probability-of-applying-that-augmentation-how-does-p-work-in-nested-containers","title":"Augmentations have a parameter named p that sets the probability of applying that augmentation. How does p work in nested containers?","text":"

The p parameter sets the probability of applying a specific augmentation. When augmentations are nested within a top-level container like Compose, the effective probability of each augmentation is the product of the container's probability and the augmentation's probability.

Let's look at an example when a container Compose contains one augmentation Resize:

Python
transform = A.Compose([\n    A.Resize(height=256, width=256, p=1.0),\n], p=0.9)\n

In this case, Resize has a 90% chance to be applied. This is because there is a 90% chance for Compose to be applied (p=0.9). If Compose is applied, then Resize is applied with 100% probability (p=1.0).

To visualize:

  • Probability of Compose being applied: 0.9
  • Probability of Resize being applied given Compose is applied: 1.0
  • Effective probability of Resize being applied: 0.9 * 1.0 = 0.9 (or 90%)

This means that the effective probability of Resize being applied is the product of the probabilities of Compose and Resize, which is 0.9 * 1.0 = 0.9 or 90%. This principle applies to other transformations as well, where the overall probability is the product of the individual probabilities within the transformation pipeline.

Here\u2019s another example:

Python
transform = A.Compose([\n    A.Resize(height=256, width=256, p=0.5),\n], p=0.9)\n

In this example, Resize has an effective probability of being applied as 0.9 * 0.5 = 0.45 or 45%. This is because Compose is applied 90% of the time, and within that 90%, Resize is applied 50% of the time.

"},{"location":"faq/#i-created-annotations-for-bounding-boxes-using-labeling-service-or-labeling-software-how-can-i-use-those-annotations-in-albumentations","title":"I created annotations for bounding boxes using labeling service or labeling software. How can I use those annotations in Albumentations?","text":"

You need to convert those annotations to one of the formats supported by Albumentations. For the list of formats, please refer to this article. Consult the documentation of the labeling service to see how you can export annotations in those formats.

"},{"location":"faq/#integration-and-migration","title":"Integration and Migration","text":""},{"location":"faq/#how-to-save-and-load-augmentation-transforms-to-huggingface-hub","title":"How to save and load augmentation transforms to HuggingFace Hub?","text":"Python
import albumentations as A\nimport numpy as np\n\ntransform = A.Compose([\n    A.RandomCrop(256, 256),\n    A.HorizontalFlip(),\n    A.RandomBrightnessContrast(),\n    A.RGBShift(),\n    A.Normalize(),\n])\n\ntransform.save_pretrained(\"qubvel-hf/albu\", key=\"train\")\n# The 'key' parameter specifies the context or purpose of the saved transform,\n# allowing for organized and context-specific retrieval.\n# ^ this will save the transform to a directory \"qubvel-hf/albu\" with filename \"albumentations_config_train.json\"\n\ntransform.save_pretrained(\"qubvel-hf/albu\", key=\"train\", push_to_hub=True)\n# ^ this will save the transform to a directory \"qubvel-hf/albu\" with filename \"albumentations_config_train.json\"\n# + push the transform to the Hub to the repository \"qubvel-hf/albu\"\n\ntransform.push_to_hub(\"qubvel-hf/albu\", key=\"train\")\n# Use `save_pretrained` to save the transform locally and optionally push to the Hub.\n# Use `push_to_hub` to directly push the transform to the Hub without saving it locally.\n# ^ this will push the transform to the Hub to the repository \"qubvel-hf/albu\" (without saving it locally)\n\nloaded_transform = A.Compose.from_pretrained(\"qubvel-hf/albu\", key=\"train\")\n# ^ this will load the transform from local folder if exist or from the Hub repository \"qubvel-hf/albu\"\n

See this example for more info.

"},{"location":"faq/#how-do-i-migrate-from-other-augmentation-libraries-to-albumentations","title":"How do I migrate from other augmentation libraries to Albumentations?","text":"

If you're migrating from other libraries like torchvision or Kornia, you can refer to our Library Comparison & Benchmarks guide. This guide provides:

  1. Mapping tables showing equivalent transforms between libraries
  2. Performance benchmarks demonstrating Albumentations' speed advantages
  3. Code examples for common migration scenarios
  4. Key differences in implementation and parameter handling

For a quick visual comparison of different augmentations, you can also use our interactive tool at explore.albumentations.ai to see how transforms affect images before implementing them.

For specific migration examples, see:

  • Migrating from torchvision
  • Performance comparison with other libraries
"},{"location":"frameworks_and_libraries/","title":"Frameworks and libraries that use Albumentations","text":""},{"location":"frameworks_and_libraries/#mmdetection","title":"MMDetection","text":"

https://github.com/open-mmlab/mmdetection

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

  • To install MMDetection with Albumentations follow the installation instructions.
  • MMDetection has an example config with augmentations from Albumentations.
"},{"location":"frameworks_and_libraries/#yolov5","title":"YOLOv5","text":"

https://github.com/ultralytics/yolov5

YOLOv5 \ud83d\ude80 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics' open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.

  • To use Albumentations with YOLOv5, install it with pip install -U albumentations and then update the augmentation pipeline in the Albumentations class in utils/augmentations.py as you see fit (a sketch of such a pipeline follows below). An example is available in the YOLOv5 repository.
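Below is a hedged sketch of the kind of pipeline you might plug into that class. The exact wrapper structure varies between YOLOv5 releases, and the assumption here is that boxes stay in the normalized yolo format with class labels:

Python
import albumentations as A\n\n# The kind of pipeline you might define inside YOLOv5's Albumentations wrapper;\n# boxes are assumed to stay in the normalized 'yolo' format with class labels\ntransform = A.Compose(\n    [\n        A.Blur(p=0.01),\n        A.MedianBlur(p=0.01),\n        A.ToGray(p=0.01),\n        A.RandomBrightnessContrast(p=0.2),\n    ],\n    bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']),\n)\n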
"},{"location":"frameworks_and_libraries/#other-frameworks-and-libraries","title":"Other frameworks and libraries","text":"

Other frameworks and libraries that use Albumentations can be found on GitHub.

"},{"location":"api_reference/","title":"Index","text":"
  • Full API Reference on a single page
  • Core API (albumentations.core)
    • Composition API (albumentations.core.composition)
    • Serialization API (albumentations.core.serialization)
    • Transforms Interface (albumentations.core.transforms_interface)
    • Helper functions for working with bounding boxes (albumentations.core.bbox_utils)
    • Helper functions for working with keypoints (albumentations.core.keypoints_utils)
  • Augmentations (albumentations.augmentations)
    • Transforms (albumentations.augmentations.transforms)
    • Functional transforms (albumentations.augmentations.functional)
  • PyTorch Helpers (albumentations.pytorch)
    • Transforms (albumentations.pytorch.transforms)
"},{"location":"api_reference/full_reference/","title":"Full API Reference on a single page","text":""},{"location":"api_reference/full_reference/#transform-types","title":"Transform Types","text":""},{"location":"api_reference/full_reference/#1-pixel-level-transforms","title":"1. Pixel-level transforms","text":"

Transforms that modify pixel values without changing spatial relationships. These can be safely applied to any target as they only affect the input image, leaving other targets (masks, bounding boxes, keypoints) unchanged.
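For instance, applying a pixel-level transform together with a mask leaves the mask untouched; a minimal sketch:

Python
import albumentations as A\nimport numpy as np\n\nimage = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\nmask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n\ntransform = A.Compose([A.RandomBrightnessContrast(p=1.0)])\nresult = transform(image=image, mask=mask)\n\n# The image changes, but the mask comes back unchanged\nassert np.array_equal(result['mask'], mask)\n

The available pixel-level transforms are listed below.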

  • AdditiveNoise
  • AdvancedBlur
  • AutoContrast
  • Blur
  • CLAHE
  • ChannelDropout
  • ChannelShuffle
  • ChromaticAberration
  • ColorJitter
  • Defocus
  • Downscale
  • Emboss
  • Equalize
  • FDA
  • FancyPCA
  • FromFloat
  • GaussNoise
  • GaussianBlur
  • GlassBlur
  • HistogramMatching
  • HueSaturationValue
  • ISONoise
  • Illumination
  • ImageCompression
  • InvertImg
  • MedianBlur
  • MotionBlur
  • MultiplicativeNoise
  • Normalize
  • PixelDistributionAdaptation
  • PlanckianJitter
  • PlasmaBrightnessContrast
  • PlasmaShadow
  • Posterize
  • RGBShift
  • RandomBrightnessContrast
  • RandomFog
  • RandomGamma
  • RandomGravel
  • RandomRain
  • RandomShadow
  • RandomSnow
  • RandomSunFlare
  • RandomToneCurve
  • RingingOvershoot
  • SaltAndPepper
  • Sharpen
  • ShotNoise
  • Solarize
  • Spatter
  • Superpixels
  • TemplateTransform
  • TextImage
  • ToFloat
  • ToGray
  • ToRGB
  • ToSepia
  • UnsharpMask
  • ZoomBlur
"},{"location":"api_reference/full_reference/#2-spatial-level-transforms","title":"2. Spatial-level transforms","text":"

Transforms that modify the spatial arrangement of pixels/features. Different targets have different spatial transform support - see the compatibility table below:

Transform | Image | Mask | BBoxes | Keypoints | Volume | Mask3D
--- | --- | --- | --- | --- | --- | ---
Affine | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
AtLeastOneBBoxRandomCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
BBoxSafeRandomCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
CenterCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
CoarseDropout | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Crop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
CropAndPad | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
CropNonEmptyMaskIfExists | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
D4 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
ElasticTransform | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Erasing | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
FrequencyMasking | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
GridDistortion | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
GridDropout | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
GridElasticDeform | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
HorizontalFlip | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Lambda | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
LongestMaxSize | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
MaskDropout | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Morphological | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
NoOp | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
OpticalDistortion | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
OverlayElements | \u2713 | \u2713 |  |  |  | 
Pad | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
PadIfNeeded | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Perspective | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
PiecewiseAffine | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
PixelDropout | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomCropFromBorders | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomCropNearBBox | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomGridShuffle | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomResizedCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomRotate90 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomScale | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomSizedBBoxSafeCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
RandomSizedCrop | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Resize | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Rotate | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
SafeRotate | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
ShiftScaleRotate | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
SmallestMaxSize | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
ThinPlateSpline | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
TimeMasking | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
TimeReverse | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
Transpose | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
VerticalFlip | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
XYMasking | \u2713 | \u2713 | \u2713 | \u2713 | \u2713 | \u2713
"},{"location":"api_reference/full_reference/#3-volumetric-3d-transforms","title":"3. Volumetric (3D) transforms","text":"

Transforms designed for three-dimensional data (D, H, W). These operate on volumes and their corresponding 3D masks, supporting both single-channel and multi-channel data.
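A minimal sketch of applying one of these transforms, assuming your Albumentations version accepts the volume and mask3d targets listed in the table below (the size argument of CenterCrop3D is an assumption in this sketch; check the 3D transforms reference for the exact signature):

Python
import albumentations as A\nimport numpy as np\n\n# (D, H, W) volume and a matching 3D mask (single-channel here)\nvolume = np.random.randint(0, 256, (64, 128, 128), dtype=np.uint8)\nmask3d = np.random.randint(0, 2, (64, 128, 128), dtype=np.uint8)\n\n# NOTE: the size argument of CenterCrop3D is an assumption for this sketch\ntransform = A.Compose([A.CenterCrop3D(size=(32, 64, 64), p=1.0)])\nresult = transform(volume=volume, mask3d=mask3d)\nprint(result['volume'].shape, result['mask3d'].shape)\n

The supported targets for each 3D transform are shown in the table below.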

Transform | Image | Mask | BBoxes | Keypoints | Volume | Mask3D
--- | --- | --- | --- | --- | --- | ---
CenterCrop3D |  |  |  |  | \u2713 | \u2713
CoarseDropout3D |  |  |  |  | \u2713 | \u2713
CubicSymmetry |  |  |  |  | \u2713 | \u2713
Pad3D |  |  |  |  | \u2713 | \u2713
PadIfNeeded3D |  |  |  |  | \u2713 | \u2713
RandomCrop3D |  |  |  |  | \u2713 | \u2713
"},{"location":"api_reference/augmentations/","title":"Index","text":"
  • Transforms (albumentations.augmentations.transforms)
  • Blur transforms (albumentations.augmentations.blur)
  • Crop transforms (albumentations.augmentations.crops)
  • Dropout transforms (albumentations.augmentations.dropout)
  • Geometric transforms (albumentations.augmentations.geometric)
  • Domain adaptation transforms (albumentations.augmentations.domain_adaptation)
  • Functional transforms (albumentations.augmentations.functional)
"},{"location":"api_reference/augmentations/domain_adaptation/","title":"Domain adaptation transforms (augmentations.domain_adaptation)","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional","title":"functional","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.apply_histogram","title":"def apply_histogram (img, reference_image, blend_ratio) [view source on GitHub]","text":"

Apply histogram matching to an input image using a reference image and blend the result.

This function performs histogram matching between the input image and a reference image, then blends the result with the original input image based on the specified blend ratio.

Parameters:

Name Type Description img np.ndarray

The input image to be transformed. Can be either grayscale or RGB. Supported dtypes: uint8, float32 (values should be in [0, 1] range).

reference_image np.ndarray

The reference image used for histogram matching. Should have the same number of channels as the input image. Supported dtypes: uint8, float32 (values should be in [0, 1] range).

blend_ratio float

The ratio for blending the matched image with the original image. Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.

Returns:

Type Description np.ndarray

The transformed image after histogram matching and blending. The output will have the same shape and dtype as the input image.

Supported image types:

  • Grayscale images: 2D arrays
  • RGB images: 3D arrays with 3 channels
  • Multispectral images: 3D arrays with more than 3 channels

Note

  • If the input and reference images have different sizes, the reference image will be resized to match the input image's dimensions.
  • The function uses a custom implementation of histogram matching based on OpenCV and NumPy.
  • The @clipped and @preserve_channel_dim decorators ensure the output is within the valid range and maintains the original number of dimensions.
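A hedged usage sketch of this functional form, assuming the module path shown in the source location below (inside a pipeline you would normally reach it through the HistogramMatching transform instead):

Python
import numpy as np\nfrom albumentations.augmentations.domain_adaptation.functional import apply_histogram\n\nimg = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\nreference = np.random.randint(0, 256, (80, 80, 3), dtype=np.uint8)  # will be resized to match img\n\nmatched = apply_histogram(img, reference, blend_ratio=0.7)\nassert matched.shape == img.shape and matched.dtype == img.dtype\n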
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef apply_histogram(img: np.ndarray, reference_image: np.ndarray, blend_ratio: float) -> np.ndarray:\n    \"\"\"Apply histogram matching to an input image using a reference image and blend the result.\n\n    This function performs histogram matching between the input image and a reference image,\n    then blends the result with the original input image based on the specified blend ratio.\n\n    Args:\n        img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        reference_image (np.ndarray): The reference image used for histogram matching.\n            Should have the same number of channels as the input image.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        blend_ratio (float): The ratio for blending the matched image with the original image.\n            Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.\n\n    Returns:\n        np.ndarray: The transformed image after histogram matching and blending.\n            The output will have the same shape and dtype as the input image.\n\n    Supported image types:\n        - Grayscale images: 2D arrays\n        - RGB images: 3D arrays with 3 channels\n        - Multispectral images: 3D arrays with more than 3 channels\n\n    Note:\n        - If the input and reference images have different sizes, the reference image\n          will be resized to match the input image's dimensions.\n        - The function uses a custom implementation of histogram matching based on OpenCV and NumPy.\n        - The @clipped and @preserve_channel_dim decorators ensure the output is within\n          the valid range and maintains the original number of dimensions.\n    \"\"\"\n    # Resize reference image only if necessary\n    if img.shape[:2] != reference_image.shape[:2]:\n        reference_image = cv2.resize(reference_image, dsize=(img.shape[1], img.shape[0]))\n\n    img = np.squeeze(img)\n    reference_image = np.squeeze(reference_image)\n\n    # Match histograms between the images\n    matched = match_histograms(img, reference_image)\n\n    # Blend the original image and the matched image\n    return add_weighted(matched, blend_ratio, img, 1 - blend_ratio)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.fourier_domain_adaptation","title":"def fourier_domain_adaptation (img, target_img, beta) [view source on GitHub]","text":"

Apply Fourier Domain Adaptation to the input image using a target image.

This function performs domain adaptation in the frequency domain by modifying the amplitude spectrum of the source image based on the target image's amplitude spectrum. It preserves the phase information of the source image, which helps maintain its content while adapting its style to match the target image.

Parameters:

Name Type Description img np.ndarray

The source image to be adapted. Can be grayscale or RGB.

target_img np.ndarray

The target image used as a reference for adaptation. Should have the same dimensions as the source image.

beta float

The adaptation strength, typically in the range [0, 1]. Higher values result in stronger adaptation towards the target image's style.

Returns:

Type Description np.ndarray

The adapted image with the same shape and type as the input image.

Exceptions:

Type Description ValueError

If the source and target images have different shapes.

Note

  • Both input images are converted to float32 for processing.
  • The function handles both grayscale (2D) and color (3D) images.
  • For grayscale images, an extra dimension is added to facilitate uniform processing.
  • The adaptation is performed channel-wise for color images.
  • The output is clipped to the valid range and preserves the original number of channels.

The adaptation process involves the following steps for each channel:

  1. Compute the 2D Fourier Transform of both source and target images.
  2. Shift the zero frequency component to the center of the spectrum.
  3. Extract amplitude and phase information from the source image's spectrum.
  4. Mutate the source amplitude using the target amplitude and the beta parameter.
  5. Combine the mutated amplitude with the original phase.
  6. Perform the inverse Fourier Transform to obtain the adapted channel.

The low_freq_mutate function (not shown here) is responsible for the actual amplitude mutation, focusing on low-frequency components which carry style information.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n>>> assert adapted_img.shape == source_img.shape\n

References

  • \"FDA: Fourier Domain Adaptation for Semantic Segmentation\" (Yang and Soatto, 2020, CVPR) https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef fourier_domain_adaptation(img: np.ndarray, target_img: np.ndarray, beta: float) -> np.ndarray:\n    \"\"\"Apply Fourier Domain Adaptation to the input image using a target image.\n\n    This function performs domain adaptation in the frequency domain by modifying the amplitude\n    spectrum of the source image based on the target image's amplitude spectrum. It preserves\n    the phase information of the source image, which helps maintain its content while adapting\n    its style to match the target image.\n\n    Args:\n        img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.\n        target_img (np.ndarray): The target image used as a reference for adaptation.\n            Should have the same dimensions as the source image.\n        beta (float): The adaptation strength, typically in the range [0, 1].\n            Higher values result in stronger adaptation towards the target image's style.\n\n    Returns:\n        np.ndarray: The adapted image with the same shape and type as the input image.\n\n    Raises:\n        ValueError: If the source and target images have different shapes.\n\n    Note:\n        - Both input images are converted to float32 for processing.\n        - The function handles both grayscale (2D) and color (3D) images.\n        - For grayscale images, an extra dimension is added to facilitate uniform processing.\n        - The adaptation is performed channel-wise for color images.\n        - The output is clipped to the valid range and preserves the original number of channels.\n\n    The adaptation process involves the following steps for each channel:\n    1. Compute the 2D Fourier Transform of both source and target images.\n    2. Shift the zero frequency component to the center of the spectrum.\n    3. Extract amplitude and phase information from the source image's spectrum.\n    4. Mutate the source amplitude using the target amplitude and the beta parameter.\n    5. Combine the mutated amplitude with the original phase.\n    6. 
Perform the inverse Fourier Transform to obtain the adapted channel.\n\n    The `low_freq_mutate` function (not shown here) is responsible for the actual\n    amplitude mutation, focusing on low-frequency components which carry style information.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n        >>> assert adapted_img.shape == source_img.shape\n\n    References:\n        - \"FDA: Fourier Domain Adaptation for Semantic Segmentation\"\n          (Yang and Soatto, 2020, CVPR)\n          https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n    \"\"\"\n    src_img = img.astype(np.float32)\n    trg_img = target_img.astype(np.float32)\n\n    if src_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        src_img = np.expand_dims(src_img, axis=-1)\n    if trg_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        trg_img = np.expand_dims(trg_img, axis=-1)\n\n    num_channels = src_img.shape[-1]\n\n    # Prepare container for the output image\n    src_in_trg = np.zeros_like(src_img)\n\n    for channel_id in range(num_channels):\n        # Perform FFT on each channel\n        fft_src = np.fft.fft2(src_img[:, :, channel_id])\n        fft_trg = np.fft.fft2(trg_img[:, :, channel_id])\n\n        # Shift the zero frequency component to the center\n        fft_src_shifted = np.fft.fftshift(fft_src)\n        fft_trg_shifted = np.fft.fftshift(fft_trg)\n\n        # Extract amplitude and phase\n        amp_src, pha_src = np.abs(fft_src_shifted), np.angle(fft_src_shifted)\n        amp_trg = np.abs(fft_trg_shifted)\n\n        # Mutate the amplitude part of the source with the target\n        mutated_amp = low_freq_mutate(amp_src.copy(), amp_trg, beta)\n\n        # Combine the mutated amplitude with the original phase\n        fft_src_mutated = np.fft.ifftshift(mutated_amp * np.exp(1j * pha_src))\n\n        # Perform inverse FFT\n        src_in_trg_channel = np.fft.ifft2(fft_src_mutated)\n\n        # Store the result in the corresponding channel of the output image\n        src_in_trg[:, :, channel_id] = np.real(src_in_trg_channel)\n\n    return src_in_trg\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.match_histograms","title":"def match_histograms (image, reference) [view source on GitHub]","text":"

Adjust an image so that its cumulative histogram matches that of another.

The adjustment is applied separately for each channel.

Parameters:

Name Type Description image np.ndarray

Input image. Can be gray-scale or in color.

reference np.ndarray

Image to match histogram of. Must have the same number of channels as image.

channel_axis

If None, the image is assumed to be a grayscale (single channel) image. Otherwise, this parameter indicates which axis of the array corresponds to channels.

Returns:

Type Description np.ndarray

Transformed input image.

Exceptions:

Type Description ValueError

Thrown when the number of channels in the input image and the reference differ.
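A hedged usage sketch, with the same assumed import path as the source location below:

Python
import numpy as np\nfrom albumentations.augmentations.domain_adaptation.functional import match_histograms\n\nimage = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\nreference = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\nmatched = match_histograms(image, reference)\nassert matched.shape == image.shape\n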

Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef match_histograms(image: np.ndarray, reference: np.ndarray) -> np.ndarray:\n    \"\"\"Adjust an image so that its cumulative histogram matches that of another.\n\n    The adjustment is applied separately for each channel.\n\n    Args:\n        image: Input image. Can be gray-scale or in color.\n        reference: Image to match histogram of. Must have the same number of channels as image.\n        channel_axis: If None, the image is assumed to be a grayscale (single channel) image.\n            Otherwise, this parameter indicates which axis of the array corresponds to channels.\n\n    Returns:\n        np.ndarray: Transformed input image.\n\n    Raises:\n        ValueError: Thrown when the number of channels in the input image and the reference differ.\n    \"\"\"\n    if reference.dtype != np.uint8:\n        reference = from_float(reference, np.uint8)\n\n    if image.ndim != reference.ndim:\n        raise ValueError(\"Image and reference must have the same number of dimensions.\")\n\n    # Expand dimensions for grayscale images\n    if image.ndim == 2:\n        image = np.expand_dims(image, axis=-1)\n    if reference.ndim == 2:\n        reference = np.expand_dims(reference, axis=-1)\n\n    matched = np.empty(image.shape, dtype=np.uint8)\n\n    num_channels = image.shape[-1]\n\n    for channel in range(num_channels):\n        matched_channel = _match_cumulative_cdf(image[..., channel], reference[..., channel]).astype(np.uint8)\n        matched[..., channel] = matched_channel\n\n    return matched\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms","title":"transforms","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.FDA","title":"class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has been shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.

Parameters:

Name Type Description reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the extent of frequency component swapping. If a single value is provided, beta will be sampled from the uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Reference

  • https://github.com/YanchaoYang/FDA
  • https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n>>> result = aug(image=image)\n

Note

FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target domain samples are unavailable. It enables significant improvements in model generalization by aligning the low-level statistics of source and target images through a simple yet effective Fourier-based method.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class FDA(ImageOnlyTransform):\n    \"\"\"Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation\n    (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source\n    and target datasets, effectively adapting images from one domain to closely resemble those from another without\n    altering their semantic content.\n\n    This transform is particularly beneficial in scenarios where the training (source) and testing (target) images\n    come from different distributions, such as synthetic versus real images, or day versus night scenes.\n    Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain\n    alignment by swapping low-frequency components of the Fourier transform between the source and target images.\n    This technique has shown to improve the performance of models on the target domain, particularly for tasks\n    like semantic segmentation, without additional training for domain invariance.\n\n    The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more\n    of the original image's characteristics and higher values leading to more pronounced adaptation effects.\n    It is recommended to use beta values less than 0.3 to avoid introducing artifacts.\n\n    Args:\n        reference_images (Sequence[Any]): Sequence of objects to be converted into images by `read_fn`. This typically\n            involves paths to images that serve as target domain examples for adaptation.\n        beta_limit (tuple[float, float] | float): Coefficient beta from the paper, controlling the swapping extent of\n            frequency components. If one value is provided beta will be sampled from uniform\n            distribution [0, beta_limit]. Values should be less than 0.5.\n        read_fn (Callable): User-defined function for reading images. It takes an element from `reference_images` and\n            returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a\n            numpy array.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        - https://github.com/YanchaoYang/FDA\n        - https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n        >>> result = aug(image=image)\n\n    Note:\n        FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target\n        domain samples are unavailable. 
It enables significant improvements in model generalization by aligning\n        the low-level statistics of source and target images through a simple yet effective Fourier-based method.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        read_fn: Callable[[Any], np.ndarray]\n        beta_limit: ZeroOneRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_ranges(cls, value: tuple[float, float]) -> tuple[float, float]:\n            bounds = 0, MAX_BETA_LIMIT\n            if not bounds[0] <= value[0] <= value[1] <= bounds[1]:\n                raise ValueError(f\"Values should be in the range {bounds} got {value} \")\n            return value\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        beta_limit: ScaleFloatType = (0, 0.1),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        target_image: np.ndarray,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fourier_domain_adaptation(img, target_image, beta)\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n        target_img = self.read_fn(self.py_random.choice(self.reference_images))\n        target_img = cv2.resize(target_img, dsize=(width, height))\n\n        return {\"target_image\": target_img, \"beta\": self.py_random.uniform(*self.beta_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"reference_images\", \"beta_limit\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"FDA can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.HistogramMatching","title":"class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for:

  • Normalizing images from different sources or captured under varying conditions.
  • Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important.
  • Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where:

  • 0 means no blending (original image is returned)
  • 1 means full histogram matching

A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.HistogramMatching(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     read_fn=lambda x: x,\n...     p=1\n... )\n>>> result = transform(image=image)\n>>> matched_image = result[\"image\"]\n

References

  • Histogram Matching in scikit-image: https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class HistogramMatching(ImageOnlyTransform):\n    \"\"\"Adjust the pixel values of an input image to match the histogram of a reference image.\n\n    This transform applies histogram matching, a technique that modifies the distribution of pixel\n    intensities in the input image to closely resemble that of a reference image. This process is\n    performed independently for each channel in multi-channel images, provided both the input and\n    reference images have the same number of channels.\n\n    Histogram matching is particularly useful for:\n    - Normalizing images from different sources or captured under varying conditions.\n    - Preparing images for feature matching or other computer vision tasks where consistent\n      tone and contrast are important.\n    - Simulating different lighting or camera conditions in a controlled manner.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of reference image sources. These can be\n            file paths, URLs, or any objects that can be converted to images by the `read_fn`.\n        blend_ratio (tuple[float, float]): Range for the blending factor between the original\n            and the matched image. Must be two floats between 0 and 1, where:\n            - 0 means no blending (original image is returned)\n            - 1 means full histogram matching\n            A random value within this range is chosen for each application.\n            Default: (0.5, 1.0)\n        read_fn (Callable[[Any], np.ndarray]): A function that takes an element from\n            `reference_images` and returns a numpy array representing the image.\n            Default: read_rgb_image (reads image file from disk)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform cannot be directly serialized due to its dependency on external image data.\n        - The effectiveness of the matching depends on the similarity between the input and reference images.\n        - For best results, choose reference images that represent the desired tone and contrast.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.HistogramMatching(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     read_fn=lambda x: x,\n        ...     p=1\n        ... 
)\n        >>> result = transform(image=image)\n        >>> matched_image = result[\"image\"]\n\n    References:\n        - Histogram Matching in scikit-image:\n          https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.5, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n\n    def apply(\n        self: np.ndarray,\n        img: np.ndarray,\n        reference_image: np.ndarray,\n        blend_ratio: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return apply_histogram(img, reference_image, blend_ratio)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"HistogramMatching can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.PixelDistributionAdaptation","title":"class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub]","text":"

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps:

  1. Adjusting the original image to a standard distribution space using a selected transform.
  2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal[\"pca\", \"standard\", \"minmax\"]

Specifies the type of statistical transformation to apply:

  • \"pca\": Principal Component Analysis
  • \"standard\": StandardScaler (zero mean and unit variance)
  • \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])

Default: \"pca\"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.PixelDistributionAdaptation(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     transform_type=\"standard\",\n...     read_fn=lambda x: x,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> adapted_image = result[\"image\"]\n

References

  • https://github.com/arsenyinfo/qudida
  • https://arxiv.org/abs/1911.11483

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class PixelDistributionAdaptation(ImageOnlyTransform):\n    \"\"\"Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image\n    with that of a reference image. This process involves fitting a simple statistical transformation\n    (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images,\n    transforming the original image with the transformation trained on it, and then applying the inverse\n    transformation using the transform fitted on the reference image. The result is an adapted image\n    that retains the original content while mimicking the pixel value distribution of the reference domain.\n\n    The process can be visualized as two main steps:\n    1. Adjusting the original image to a standard distribution space using a selected transform.\n    2. Moving the adjusted image into the distribution space of the reference image by applying the inverse\n       of the transform fitted on the reference image.\n\n    This technique is especially useful in scenarios where images from different domains (e.g., synthetic\n    vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in\n    image processing tasks.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of objects (typically image paths) that will be\n            converted into images by `read_fn`. These images serve as references for the domain adaptation.\n        blend_ratio (tuple[float, float]): Specifies the minimum and maximum blend ratio for mixing\n            the adapted image with the original. This enhances the diversity of the output images.\n            Values should be in the range [0, 1]. Default: (0.25, 1.0)\n        read_fn (Callable): A user-defined function for reading and converting the objects in\n            `reference_images` into numpy arrays. By default, it assumes these objects are image paths.\n        transform_type (Literal[\"pca\", \"standard\", \"minmax\"]): Specifies the type of statistical\n            transformation to apply.\n            - \"pca\": Principal Component Analysis\n            - \"standard\": StandardScaler (zero mean and unit variance)\n            - \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])\n            Default: \"pca\"\n        p (float): The probability of applying the transform to any given image. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effectiveness of the adaptation depends on the similarity between the input and reference domains.\n        - PCA transformation may alter color relationships more significantly than other methods.\n        - StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.\n        - The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.\n        - This transform cannot be directly serialized due to its dependency on external image data.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PixelDistributionAdaptation(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     
transform_type=\"standard\",\n        ...     read_fn=lambda x: x,\n        ...     p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> adapted_image = result[\"image\"]\n\n    References:\n        - https://github.com/arsenyinfo/qudida\n        - https://arxiv.org/abs/1911.11483\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.25, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"] = \"pca\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n        self.transform_type = transform_type\n\n    def apply(self, img: np.ndarray, reference_image: np.ndarray, blend_ratio: float, **params: Any) -> np.ndarray:\n        return adapt_pixel_distribution(\n            img,\n            ref=reference_image,\n            weight=blend_ratio,\n            transform_type=self.transform_type,\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\", \"transform_type\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"PixelDistributionAdaptation can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.TemplateTransform","title":"class TemplateTransform (templates, img_weight=(0.5, 0.5), template_weight=None, template_transform=None, name=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply blending of input image with specified templates.

This transform overlays one or more template images onto the input image using alpha blending. It allows for creating complex composite images or simulating various visual effects.

Parameters:

Name Type Description templates numpy array | list[np.ndarray]

Images to use as templates for the transform. If a single numpy array is provided, it will be used as the only template. If a list of numpy arrays is provided, one will be randomly chosen for each application.

img_weight tuple[float, float] | float

Weight of the original image in the blend. If a single float, that value will always be used. If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application. To use a fixed weight, use (weight, weight). Default: (0.5, 0.5).

template_transform A.Compose | None

A composition of Albumentations transforms to apply to the template before blending. This should be an instance of A.Compose containing one or more Albumentations transforms. Default: None.

name str | None

Name of the transform instance. Used for serialization purposes. Default: None.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The template(s) must have the same number of channels as the input image or be single-channel.
  • If a single-channel template is used with a multi-channel image, the template will be replicated across all channels.
  • The template(s) will be resized to match the input image size if they differ.
  • To make this transform serializable, provide a name when initializing it.

Mathematical Formulation:

Given:

  • I: Input image
  • T: Template image
  • w_i: Weight of input image (sampled from img_weight)

The blended image B is computed as:

B = w_i * I + (1 - w_i) * T

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-a-single-template","title":"Apply template transform with a single template","text":"Python
>>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-multiple-templates-and-custom-weights","title":"Apply template transform with multiple templates and custom weights","text":"Python
>>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n>>> transform = A.TemplateTransform(\n...     templates=templates,\n...     img_weight=(0.3, 0.7),\n...     name=\"multi_template_transform\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-additional-transforms-on-the-template","title":"Apply template transform with additional transforms on the template","text":"Python
>>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n>>> transform = A.TemplateTransform(\n...     templates=template,\n...     img_weight=0.6,\n...     template_transform=template_transform,\n...     name=\"transformed_template\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Image blending: https://en.wikipedia.org/wiki/Image_blending

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class TemplateTransform(ImageOnlyTransform):\n    \"\"\"Apply blending of input image with specified templates.\n\n    This transform overlays one or more template images onto the input image using alpha blending.\n    It allows for creating complex composite images or simulating various visual effects.\n\n    Args:\n        templates (numpy array | list[np.ndarray]): Images to use as templates for the transform.\n            If a single numpy array is provided, it will be used as the only template.\n            If a list of numpy arrays is provided, one will be randomly chosen for each application.\n\n        img_weight (tuple[float, float]  | float): Weight of the original image in the blend.\n            If a single float, that value will always be used.\n            If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application.\n            To use a fixed weight, use (weight, weight).\n            Default: (0.5, 0.5).\n\n        template_transform (A.Compose | None): A composition of Albumentations transforms to apply to the template\n            before blending.\n            This should be an instance of A.Compose containing one or more Albumentations transforms.\n            Default: None.\n\n        name (str | None): Name of the transform instance. Used for serialization purposes.\n            Default: None.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The template(s) must have the same number of channels as the input image or be single-channel.\n        - If a single-channel template is used with a multi-channel image, the template will be replicated across\n          all channels.\n        - The template(s) will be resized to match the input image size if they differ.\n        - To make this transform serializable, provide a name when initializing it.\n\n    Mathematical Formulation:\n        Given:\n        - I: Input image\n        - T: Template image\n        - w_i: Weight of input image (sampled from img_weight)\n\n        The blended image B is computed as:\n\n        B = w_i * I + (1 - w_i) * T\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply template transform with a single template\n        >>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with multiple templates and custom weights\n        >>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n        >>> transform = A.TemplateTransform(\n        ...     templates=templates,\n        ...     img_weight=(0.3, 0.7),\n        ...     name=\"multi_template_transform\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with additional transforms on the template\n        >>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n        >>> transform = A.TemplateTransform(\n        ...     templates=template,\n        ...     img_weight=0.6,\n        ...     template_transform=template_transform,\n        ...     
name=\"transformed_template\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Image blending: https://en.wikipedia.org/wiki/Image_blending\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        templates: np.ndarray | Sequence[np.ndarray]\n        img_weight: ZeroOneRangeType\n        template_weight: ZeroOneRangeType | None = Field(\n            deprecated=\"Template_weight is deprecated. Computed automatically as (1 - img_weight)\",\n        )\n        template_transform: Compose | BasicTransform | None = None\n        name: str | None\n\n        @field_validator(\"templates\")\n        @classmethod\n        def validate_templates(cls, v: np.ndarray | list[np.ndarray]) -> list[np.ndarray]:\n            if isinstance(v, np.ndarray):\n                return [v]\n            if isinstance(v, list):\n                if not all(isinstance(item, np.ndarray) for item in v):\n                    msg = \"All templates must be numpy arrays.\"\n                    raise ValueError(msg)\n                return v\n            msg = \"Templates must be a numpy array or a list of numpy arrays.\"\n            raise TypeError(msg)\n\n    def __init__(\n        self,\n        templates: np.ndarray | list[np.ndarray],\n        img_weight: ScaleFloatType = (0.5, 0.5),\n        template_weight: None = None,\n        template_transform: Compose | BasicTransform | None = None,\n        name: str | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.templates = templates\n        self.img_weight = cast(tuple[float, float], img_weight)\n        self.template_transform = template_transform\n        self.name = name\n\n    def apply(\n        self,\n        img: np.ndarray,\n        template: np.ndarray,\n        img_weight: float,\n        **params: Any,\n    ) -> np.ndarray:\n        if img_weight == 0:\n            return template\n        if img_weight == 1:\n            return img\n\n        return add_weighted(img, img_weight, template, 1 - img_weight)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"img_weight\": self.py_random.uniform(*self.img_weight),\n        }\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        template = self.py_random.choice(self.templates)\n\n        if self.template_transform is not None:\n            template = self.template_transform(image=template)[\"image\"]\n\n        if get_num_channels(template) not in [1, get_num_channels(image)]:\n            msg = (\n                \"Template must be a single channel or \"\n                \"has the same number of channels as input \"\n                f\"image ({get_num_channels(image)}), got {get_num_channels(template)}\"\n            )\n            raise ValueError(msg)\n\n        if template.dtype != image.dtype:\n            msg = \"Image and template must be the same image type\"\n            raise ValueError(msg)\n\n        if image.shape[:2] != template.shape[:2]:\n            template = fgeometric.resize(template, image.shape[:2], interpolation=cv2.INTER_AREA)\n\n        if get_num_channels(template) == 1 and get_num_channels(image) > 1:\n            # Replicate single 
channel template across all channels to match input image\n            template = cv2.merge([template] * get_num_channels(image))\n        # in order to support grayscale image with dummy dim\n        template = template.reshape(image.shape)\n\n        return {\"template\": template}\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a TemplateTransform serializable you should provide the `name` argument, \"\n                \"e.g. `TemplateTransform(name='my_transform', ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n
"},{"location":"api_reference/augmentations/functional/","title":"Functional transforms (augmentations.functional)","text":""},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_fog","title":"def add_fog (img, fog_intensity, alpha_coef, fog_particle_positions, fog_particle_radiuses) [view source on GitHub]","text":"

Add fog to the input image.

Parameters:

  • img (np.ndarray): Input image.
  • fog_intensity (float): Intensity of the fog effect, between 0 and 1.
  • alpha_coef (float): Base alpha (transparency) value for fog particles.
  • fog_particle_positions (list[tuple[int, int]]): List of (x, y) coordinates for fog particles.
  • fog_particle_radiuses (list[int]): List of radiuses for each fog particle.

Returns:

  • np.ndarray: Image with added fog effect.

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\n@preserve_channel_dim\ndef add_fog(\n    img: np.ndarray,\n    fog_intensity: float,\n    alpha_coef: float,\n    fog_particle_positions: list[tuple[int, int]],\n    fog_particle_radiuses: list[int],\n) -> np.ndarray:\n    \"\"\"Add fog to the input image.\n\n    Args:\n        img (np.ndarray): Input image.\n        fog_intensity (float): Intensity of the fog effect, between 0 and 1.\n        alpha_coef (float): Base alpha (transparency) value for fog particles.\n        fog_particle_positions (list[tuple[int, int]]): List of (x, y) coordinates for fog particles.\n        fog_particle_radiuses (list[int]): List of radiuses for each fog particle.\n\n    Returns:\n        np.ndarray: Image with added fog effect.\n    \"\"\"\n    height, width = img.shape[:2]\n    num_channels = get_num_channels(img)\n\n    fog_layer = np.zeros((height, width, num_channels), dtype=np.uint8)\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    for (x, y), radius in zip(fog_particle_positions, fog_particle_radiuses):\n        color = max_value if num_channels == 1 else (max_value,) * num_channels\n        cv2.circle(\n            fog_layer,\n            center=(x, y),\n            radius=radius,\n            color=color,\n            thickness=-1,\n        )\n\n    # Apply gaussian blur to the fog layer\n    fog_layer = cv2.GaussianBlur(fog_layer, (25, 25), 0)\n\n    # Blend the fog layer with the original image\n    alpha = np.mean(fog_layer, axis=2, keepdims=True) / max_value * alpha_coef * fog_intensity\n\n    result = img * (1 - alpha) + fog_layer * alpha\n\n    return clip(result, np.uint8, inplace=True)\n
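Usage sketch (not part of the library docstring; the particle positions and radiuses below are hand-picked stand-ins for the values that a transform such as A.RandomFog would normally sample):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import add_fog
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> rng = np.random.default_rng(0)
>>> positions = [(int(x), int(y)) for x, y in rng.integers(0, 100, size=(50, 2))]
>>> radiuses = [int(r) for r in rng.integers(5, 15, size=50)]
>>> foggy = add_fog(image, fog_intensity=0.5, alpha_coef=0.1, fog_particle_positions=positions, fog_particle_radiuses=radiuses)
>>> assert foggy.shape == image.shape and foggy.dtype == image.dtype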
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_rain","title":"def add_rain (img, slant, drop_length, drop_width, drop_color, blur_value, brightness_coefficient, rain_drops) [view source on GitHub]","text":"

Adds rain drops to the image.

Parameters:

  • img (np.ndarray): Input image.
  • slant (int): The angle of the rain drops.
  • drop_length (int): The length of each rain drop.
  • drop_width (int): The width of each rain drop.
  • drop_color (tuple[int, int, int]): The color of the rain drops in RGB format.
  • blur_value (int): The size of the kernel used to blur the image. Rainy views are blurry.
  • brightness_coefficient (float): Coefficient to adjust the brightness of the image. Rainy days are usually shady.
  • rain_drops (list[tuple[int, int]]): A list of tuples where each tuple represents the (x, y) coordinates of the starting point of a rain drop.

Returns:

  • np.ndarray: Image with rain effect added.

Reference

https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef add_rain(\n    img: np.ndarray,\n    slant: int,\n    drop_length: int,\n    drop_width: int,\n    drop_color: tuple[int, int, int],\n    blur_value: int,\n    brightness_coefficient: float,\n    rain_drops: list[tuple[int, int]],\n) -> np.ndarray:\n    \"\"\"Adds rain drops to the image.\n\n    Args:\n        img (np.ndarray): Input image.\n        slant (int): The angle of the rain drops.\n        drop_length (int): The length of each rain drop.\n        drop_width (int): The width of each rain drop.\n        drop_color (tuple[int, int, int]): The color of the rain drops in RGB format.\n        blur_value (int): The size of the kernel used to blur the image. Rainy views are blurry.\n        brightness_coefficient (float): Coefficient to adjust the brightness of the image. Rainy days are usually shady.\n        rain_drops (list[tuple[int, int]]): A list of tuples where each tuple represents the (x, y)\n            coordinates of the starting point of a rain drop.\n\n    Returns:\n        np.ndarray: Image with rain effect added.\n\n    Reference:\n        https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    img = img.copy()\n    for rain_drop_x0, rain_drop_y0 in rain_drops:\n        rain_drop_x1 = rain_drop_x0 + slant\n        rain_drop_y1 = rain_drop_y0 + drop_length\n\n        cv2.line(\n            img,\n            (rain_drop_x0, rain_drop_y0),\n            (rain_drop_x1, rain_drop_y1),\n            drop_color,\n            drop_width,\n        )\n\n    img = cv2.blur(img, (blur_value, blur_value))  # rainy view are blurry\n    image_hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)\n    image_hsv[:, :, 2] *= brightness_coefficient\n\n    return cv2.cvtColor(image_hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)\n
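Usage sketch (not part of the library docstring; the grid of drop start points is a hand-made stand-in for the positions that A.RandomRain would normally sample):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import add_rain
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> drops = [(x, y) for x in range(0, 100, 10) for y in range(0, 100, 10)]
>>> rainy = add_rain(image, slant=5, drop_length=8, drop_width=1, drop_color=(200, 200, 200), blur_value=3, brightness_coefficient=0.9, rain_drops=drops)
>>> assert rainy.shape == image.shape and rainy.dtype == image.dtype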
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_shadow","title":"def add_shadow (img, vertices_list, intensities) [view source on GitHub]","text":"

Add shadows to the image by reducing the intensity of the pixel values in specified regions.

Parameters:

  • img (np.ndarray): Input image. Multichannel images are supported.
  • vertices_list (list[np.ndarray]): List of vertices for shadow polygons.
  • intensities (np.ndarray): Array of shadow intensities. Range is [0, 1].

Returns:

  • np.ndarray: Image with shadows added.

Reference

https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef add_shadow(\n    img: np.ndarray,\n    vertices_list: list[np.ndarray],\n    intensities: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Add shadows to the image by reducing the intensity of the pixel values in specified regions.\n\n    Args:\n        img (np.ndarray): Input image. Multichannel images are supported.\n        vertices_list (list[np.ndarray]): List of vertices for shadow polygons.\n        intensities (np.ndarray): Array of shadow intensities. Range is [0, 1].\n\n    Returns:\n        np.ndarray: Image with shadows added.\n\n    Reference:\n        https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    num_channels = get_num_channels(img)\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    img_shadowed = img.copy()\n\n    # Iterate over the vertices and intensity list\n    for vertices, shadow_intensity in zip(vertices_list, intensities):\n        # Create mask for the current shadow polygon\n        mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)\n        cv2.fillPoly(mask, [vertices], (max_value,))\n\n        # Duplicate the mask to have the same number of channels as the image\n        mask = np.repeat(mask, num_channels, axis=2)\n\n        # Apply shadow to the channels directly\n        # It could be tempting to convert to HLS and apply the shadow to the L channel, but it creates artifacts\n        shadowed_indices = mask[:, :, 0] == max_value\n        darkness = 1 - shadow_intensity\n        img_shadowed[shadowed_indices] = clip(\n            img_shadowed[shadowed_indices] * darkness,\n            np.uint8,\n            inplace=True,\n        )\n\n    return img_shadowed\n
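Usage sketch (not part of the library docstring; a single hand-picked triangular polygon and intensity stand in for the values that A.RandomShadow would normally sample):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import add_shadow
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> vertices_list = [np.array([[10, 10], [80, 20], [40, 90]], dtype=np.int32)]  # (x, y) vertices
>>> intensities = np.array([0.6])
>>> shadowed = add_shadow(image, vertices_list=vertices_list, intensities=intensities)
>>> assert shadowed.shape == image.shape and shadowed.dtype == image.dtype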
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_snow_bleach","title":"def add_snow_bleach (img, snow_point, brightness_coeff) [view source on GitHub]","text":"

Adds a simple snow effect to the image by bleaching out pixels.

This function simulates a basic snow effect by increasing the brightness of pixels whose lightness is below a threshold derived from snow_point. It operates in the HLS color space to modify the lightness channel.

Parameters:

  • img (np.ndarray): Input image. Can be either RGB uint8 or float32.
  • snow_point (float): A float in the range [0, 1], scaled and adjusted to determine the threshold for pixel modification. Higher values raise the threshold, so more pixels are bleached and the snow effect is stronger.
  • brightness_coeff (float): Coefficient applied to increase the brightness of pixels below the snow_point threshold. Larger values lead to more pronounced snow effects. Should be greater than 1.0 for a visible effect.

Returns:

  • np.ndarray: Image with simulated snow effect. The output has the same dtype as the input.

Note

  • This function converts the image to the HLS color space to modify the lightness channel.
  • The snow effect is created by selectively increasing the brightness of pixels.
  • This method tends to create a 'bleached' look, which may not be as realistic as more advanced snow simulation techniques.
  • The function automatically handles both uint8 and float32 input images.

The snow effect is created through the following steps:

1. Convert the image from RGB to HLS color space.
2. Adjust the snow_point threshold.
3. Increase the lightness of pixels below the threshold.
4. Convert the image back to RGB.

Mathematical Formulation: Let L be the lightness channel in HLS space. For each pixel (i, j):

    If L[i, j] < snow_point: L[i, j] = L[i, j] * brightness_coeff

Examples:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.functional import add_snow_bleach\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> snowy_image = add_snow_bleach(image, snow_point=0.5, brightness_coeff=1.5)\n

References

  • HLS Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV
  • Original implementation: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef add_snow_bleach(\n    img: np.ndarray,\n    snow_point: float,\n    brightness_coeff: float,\n) -> np.ndarray:\n    \"\"\"Adds a simple snow effect to the image by bleaching out pixels.\n\n    This function simulates a basic snow effect by increasing the brightness of pixels\n    that are above a certain threshold (snow_point). It operates in the HLS color space\n    to modify the lightness channel.\n\n    Args:\n        img (np.ndarray): Input image. Can be either RGB uint8 or float32.\n        snow_point (float): A float in the range [0, 1], scaled and adjusted to determine\n            the threshold for pixel modification. Higher values result in less snow effect.\n        brightness_coeff (float): Coefficient applied to increase the brightness of pixels\n            below the snow_point threshold. Larger values lead to more pronounced snow effects.\n            Should be greater than 1.0 for a visible effect.\n\n    Returns:\n        np.ndarray: Image with simulated snow effect. The output has the same dtype as the input.\n\n    Note:\n        - This function converts the image to the HLS color space to modify the lightness channel.\n        - The snow effect is created by selectively increasing the brightness of pixels.\n        - This method tends to create a 'bleached' look, which may not be as realistic as more\n          advanced snow simulation techniques.\n        - The function automatically handles both uint8 and float32 input images.\n\n    The snow effect is created through the following steps:\n    1. Convert the image from RGB to HLS color space.\n    2. Adjust the snow_point threshold.\n    3. Increase the lightness of pixels below the threshold.\n    4. Convert the image back to RGB.\n\n    Mathematical Formulation:\n        Let L be the lightness channel in HLS space.\n        For each pixel (i, j):\n        If L[i, j] < snow_point:\n            L[i, j] = L[i, j] * brightness_coeff\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> snowy_image = A.functional.add_snow_v1(image, snow_point=0.5, brightness_coeff=1.5)\n\n    References:\n        - HLS Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV\n        - Original implementation: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    snow_point *= max_value / 2\n    snow_point += max_value / 3\n\n    image_hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)\n    image_hls = np.array(image_hls, dtype=np.float32)\n\n    image_hls[:, :, 1][image_hls[:, :, 1] < snow_point] *= brightness_coeff\n\n    image_hls[:, :, 1] = clip(image_hls[:, :, 1], np.uint8, inplace=True)\n\n    image_hls = np.array(image_hls, dtype=np.uint8)\n\n    return cv2.cvtColor(image_hls, cv2.COLOR_HLS2RGB)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_snow_texture","title":"def add_snow_texture (img, snow_point, brightness_coeff, snow_texture, sparkle_mask) [view source on GitHub]","text":"

Add a realistic snow effect to the input image.

This function simulates snowfall by applying multiple visual effects to the image, including brightness adjustment, snow texture overlay, depth simulation, and color tinting. The result is a more natural-looking snow effect compared to simple pixel bleaching methods.

Parameters:

  • img (np.ndarray): Input image in RGB format.
  • snow_point (float): Coefficient that controls the amount and intensity of snow. Should be in the range [0, 1], where 0 means no snow and 1 means maximum snow effect.
  • brightness_coeff (float): Coefficient for brightness adjustment to simulate the reflective nature of snow. Should be in the range [0, 1], where higher values result in a brighter image.
  • snow_texture (np.ndarray): Snow texture.
  • sparkle_mask (np.ndarray): Sparkle mask.

Returns:

  • np.ndarray: Image with added snow effect. The output has the same dtype as the input.

Note

  • The function first converts the image to HSV color space for better control over brightness and color adjustments.
  • A snow texture is generated using Gaussian noise and then filtered for a more natural appearance.
  • A depth effect is simulated, with more snow at the top of the image and less at the bottom.
  • A slight blue tint is added to simulate the cool color of snow.
  • Random sparkle effects are added to simulate light reflecting off snow crystals.

The snow effect is created through the following steps:

1. Brightness adjustment in HSV space
2. Generation of a snow texture using Gaussian noise
3. Application of a depth effect to the snow texture
4. Blending of the snow texture with the original image
5. Addition of a cool blue tint
6. Addition of sparkle effects

Examples:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.functional import add_snow_texture\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> snow_texture = np.random.rand(100, 100).astype(np.float32)\n>>> sparkle_mask = np.random.rand(100, 100) > 0.99\n>>> snowy_image = add_snow_texture(image, snow_point=0.5, brightness_coeff=0.2, snow_texture=snow_texture, sparkle_mask=sparkle_mask)\n

Note

This function works with both uint8 and float32 image types, automatically handling the conversion between them.

References

  • Perlin Noise: https://en.wikipedia.org/wiki/Perlin_noise
  • HSV Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV
Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef add_snow_texture(\n    img: np.ndarray,\n    snow_point: float,\n    brightness_coeff: float,\n    snow_texture: np.ndarray,\n    sparkle_mask: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Add a realistic snow effect to the input image.\n\n    This function simulates snowfall by applying multiple visual effects to the image,\n    including brightness adjustment, snow texture overlay, depth simulation, and color tinting.\n    The result is a more natural-looking snow effect compared to simple pixel bleaching methods.\n\n    Args:\n        img (np.ndarray): Input image in RGB format.\n        snow_point (float): Coefficient that controls the amount and intensity of snow.\n            Should be in the range [0, 1], where 0 means no snow and 1 means maximum snow effect.\n        brightness_coeff (float): Coefficient for brightness adjustment to simulate the\n            reflective nature of snow. Should be in the range [0, 1], where higher values\n            result in a brighter image.\n        snow_texture (np.ndarray): Snow texture.\n        sparkle_mask (np.ndarray): Sparkle mask.\n\n    Returns:\n        np.ndarray: Image with added snow effect. The output has the same dtype as the input.\n\n    Note:\n        - The function first converts the image to HSV color space for better control over\n          brightness and color adjustments.\n        - A snow texture is generated using Gaussian noise and then filtered for a more\n          natural appearance.\n        - A depth effect is simulated, with more snow at the top of the image and less at the bottom.\n        - A slight blue tint is added to simulate the cool color of snow.\n        - Random sparkle effects are added to simulate light reflecting off snow crystals.\n\n    The snow effect is created through the following steps:\n    1. Brightness adjustment in HSV space\n    2. Generation of a snow texture using Gaussian noise\n    3. Application of a depth effect to the snow texture\n    4. Blending of the snow texture with the original image\n    5. Addition of a cool blue tint\n    6. 
Addition of sparkle effects\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> snowy_image = A.functional.add_snow_v2(image, snow_coeff=0.5, brightness_coeff=0.2)\n\n    Note:\n        This function works with both uint8 and float32 image types, automatically\n        handling the conversion between them.\n\n    References:\n        - Perlin Noise: https://en.wikipedia.org/wiki/Perlin_noise\n        - HSV Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    # Convert to HSV for better color control\n    img_hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)\n\n    # Increase brightness\n    img_hsv[:, :, 2] = np.clip(\n        img_hsv[:, :, 2] * (1 + brightness_coeff * snow_point),\n        0,\n        max_value,\n    )\n\n    # Generate snow texture\n    snow_texture = cv2.GaussianBlur(snow_texture, (0, 0), sigmaX=1, sigmaY=1)\n\n    # Create depth effect for snow simulation\n    # More snow accumulates at the top of the image, gradually decreasing towards the bottom\n    # This simulates natural snow distribution on surfaces\n    # The effect is achieved using a linear gradient from 1 (full snow) to 0.2 (less snow)\n    rows = img.shape[0]\n    depth_effect = np.linspace(1, 0.2, rows)[:, np.newaxis]\n    snow_texture *= depth_effect\n\n    # Apply snow texture\n    snow_layer = (np.dstack([snow_texture] * 3) * max_value * snow_point).astype(\n        np.float32,\n    )\n\n    # Blend snow with original image\n    img_with_snow = cv2.add(img_hsv, snow_layer)\n\n    # Add a slight blue tint to simulate cool snow color\n    blue_tint = np.full_like(img_with_snow, (0.6, 0.75, 1))  # Slight blue in HSV\n\n    img_with_snow = cv2.addWeighted(\n        img_with_snow,\n        0.85,\n        blue_tint,\n        0.15 * snow_point,\n        0,\n    )\n\n    # Convert back to RGB\n    img_with_snow = cv2.cvtColor(img_with_snow.astype(np.uint8), cv2.COLOR_HSV2RGB)\n\n    # Add some sparkle effects for snow glitter\n    img_with_snow[sparkle_mask] = [max_value, max_value, max_value]\n\n    return img_with_snow\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_sun_flare_overlay","title":"def add_sun_flare_overlay (img, flare_center, src_radius, src_color, circles) [view source on GitHub]","text":"

Add a sun flare effect to an image using a simple overlay technique.

This function creates a basic sun flare effect by overlaying multiple semi-transparent circles of varying sizes and intensities on the input image. The effect simulates a simple lens flare caused by bright light sources.

Parameters:

  • img (np.ndarray): The input image.
  • flare_center (tuple[float, float]): (x, y) coordinates of the flare center in pixel coordinates.
  • src_radius (int): The radius of the main sun circle in pixels.
  • src_color (tuple[int, ...]): The color of the sun, represented as a tuple of RGB values.
  • circles (list[Any]): A list of tuples, each representing a circle that contributes to the flare effect. Each tuple contains:
    - alpha (float): The transparency of the circle (0.0 to 1.0).
    - center (tuple[int, int]): (x, y) coordinates of the circle center.
    - radius (int): The radius of the circle.
    - color (tuple[int, int, int]): RGB color of the circle.

Returns:

  • np.ndarray: The output image with the sun flare effect added.

Note

  • This function uses a simple alpha blending technique to overlay flare elements.
  • The main sun is created as a gradient circle, fading from the center outwards.
  • Additional flare circles are added along an imaginary line from the sun's position.
  • This method is computationally efficient but may produce less realistic results compared to more advanced techniques.

The flare effect is created through the following steps:

1. Create an overlay image and output image as copies of the input.
2. Add smaller flare circles to the overlay.
3. Blend the overlay with the output image using alpha compositing.
4. Add the main sun circle with a radial gradient.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> flare_center = (50, 50)\n>>> src_radius = 20\n>>> src_color = (255, 255, 200)\n>>> circles = [\n...     (0.1, (60, 60), 5, (255, 200, 200)),\n...     (0.2, (70, 70), 3, (200, 255, 200))\n... ]\n>>> flared_image = A.functional.add_sun_flare_overlay(\n...     image, flare_center, src_radius, src_color, circles\n... )\n

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\n@maybe_process_in_chunks\ndef add_sun_flare_overlay(\n    img: np.ndarray,\n    flare_center: tuple[float, float],\n    src_radius: int,\n    src_color: tuple[int, ...],\n    circles: list[Any],\n) -> np.ndarray:\n    \"\"\"Add a sun flare effect to an image using a simple overlay technique.\n\n    This function creates a basic sun flare effect by overlaying multiple semi-transparent\n    circles of varying sizes and intensities on the input image. The effect simulates\n    a simple lens flare caused by bright light sources.\n\n    Args:\n        img (np.ndarray): The input image.\n        flare_center (tuple[float, float]): (x, y) coordinates of the flare center\n            in pixel coordinates.\n        src_radius (int): The radius of the main sun circle in pixels.\n        src_color (tuple[int, ...]): The color of the sun, represented as a tuple of RGB values.\n        circles (list[Any]): A list of tuples, each representing a circle that contributes\n            to the flare effect. Each tuple contains:\n            - alpha (float): The transparency of the circle (0.0 to 1.0).\n            - center (tuple[int, int]): (x, y) coordinates of the circle center.\n            - radius (int): The radius of the circle.\n            - color (tuple[int, int, int]): RGB color of the circle.\n\n    Returns:\n        np.ndarray: The output image with the sun flare effect added.\n\n    Note:\n        - This function uses a simple alpha blending technique to overlay flare elements.\n        - The main sun is created as a gradient circle, fading from the center outwards.\n        - Additional flare circles are added along an imaginary line from the sun's position.\n        - This method is computationally efficient but may produce less realistic results\n          compared to more advanced techniques.\n\n    The flare effect is created through the following steps:\n    1. Create an overlay image and output image as copies of the input.\n    2. Add smaller flare circles to the overlay.\n    3. Blend the overlay with the output image using alpha compositing.\n    4. Add the main sun circle with a radial gradient.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> flare_center = (50, 50)\n        >>> src_radius = 20\n        >>> src_color = (255, 255, 200)\n        >>> circles = [\n        ...     (0.1, (60, 60), 5, (255, 200, 200)),\n        ...     (0.2, (70, 70), 3, (200, 255, 200))\n        ... ]\n        >>> flared_image = A.functional.add_sun_flare_overlay(\n        ...     image, flare_center, src_radius, src_color, circles\n        ... 
)\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n    \"\"\"\n    overlay = img.copy()\n    output = img.copy()\n\n    weighted_brightness = 0.0\n    total_radius_length = 0.0\n\n    for alpha, (x, y), rad3, circle_color in circles:\n        weighted_brightness += alpha * rad3\n        total_radius_length += rad3\n        cv2.circle(overlay, (x, y), rad3, circle_color, -1)\n        output = add_weighted(overlay, alpha, output, 1 - alpha)\n\n    point = [int(x) for x in flare_center]\n\n    overlay = output.copy()\n    num_times = src_radius // 10\n\n    # max_alpha is calculated using weighted_brightness and total_radii_length times 5\n    # meaning the higher the alpha with larger area, the brighter the bright spot will be\n    # for list of alphas in range [0.05, 0.2], the max_alpha should below 1\n    max_alpha = weighted_brightness / total_radius_length * 5\n    alpha = np.linspace(0.0, min(max_alpha, 1.0), num=num_times)\n\n    rad = np.linspace(1, src_radius, num=num_times)\n\n    for i in range(num_times):\n        cv2.circle(overlay, point, int(rad[i]), src_color, -1)\n        alp = alpha[num_times - i - 1] * alpha[num_times - i - 1] * alpha[num_times - i - 1]\n        output = add_weighted(overlay, alp, output, 1 - alp)\n\n    return output\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_sun_flare_physics_based","title":"def add_sun_flare_physics_based (img, flare_center, src_radius, src_color, circles) [view source on GitHub]","text":"

Add a more realistic sun flare effect to the image.

This function creates a complex sun flare effect by simulating various optical phenomena that occur in real camera lenses when capturing bright light sources. The result is a more realistic and physically plausible lens flare effect.

Parameters:

  • img (np.ndarray): Input image.
  • flare_center (tuple[int, int]): (x, y) coordinates of the sun's center in pixels.
  • src_radius (int): Radius of the main sun circle in pixels.
  • src_color (tuple[int, int, int]): Color of the sun in RGB format.
  • circles (list[Any]): List of tuples, each representing a flare circle with parameters (alpha, center, size, color):
    - alpha (float): Transparency of the circle (0.0 to 1.0).
    - center (tuple[int, int]): (x, y) coordinates of the circle center.
    - size (float): Size factor for the circle radius.
    - color (tuple[int, int, int]): RGB color of the circle.

Returns:

  • np.ndarray: Image with added sun flare effect.

Note

This function implements several techniques to create a more realistic flare:

1. Separate flare layer: Allows for complex manipulations of the flare effect.
2. Lens diffraction spikes: Simulates light diffraction in camera aperture.
3. Radial gradient mask: Creates natural fading of the flare from the center.
4. Gaussian blur: Softens the flare for a more natural glow effect.
5. Chromatic aberration: Simulates color fringing often seen in real lens flares.
6. Screen blending: Provides a more realistic blending of the flare with the image.

The flare effect is created through the following steps:

1. Create a separate flare layer.
2. Add the main sun circle and diffraction spikes to the flare layer.
3. Add additional flare circles based on the input parameters.
4. Apply Gaussian blur to soften the flare.
5. Create and apply a radial gradient mask for natural fading.
6. Simulate chromatic aberration by applying different blurs to color channels.
7. Blend the flare with the original image using screen blending mode.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n>>> flare_center = (500, 500)\n>>> src_radius = 50\n>>> src_color = (255, 255, 200)\n>>> circles = [\n...     (0.1, (550, 550), 10, (255, 200, 200)),\n...     (0.2, (600, 600), 5, (200, 255, 200))\n... ]\n>>> flared_image = A.functional.add_sun_flare_physics_based(\n...     image, flare_center, src_radius, src_color, circles\n... )\n

References

  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
  • Diffraction: https://en.wikipedia.org/wiki/Diffraction
  • Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
  • Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef add_sun_flare_physics_based(\n    img: np.ndarray,\n    flare_center: tuple[int, int],\n    src_radius: int,\n    src_color: tuple[int, int, int],\n    circles: list[Any],\n) -> np.ndarray:\n    \"\"\"Add a more realistic sun flare effect to the image.\n\n    This function creates a complex sun flare effect by simulating various optical phenomena\n    that occur in real camera lenses when capturing bright light sources. The result is a\n    more realistic and physically plausible lens flare effect.\n\n    Args:\n        img (np.ndarray): Input image.\n        flare_center (tuple[int, int]): (x, y) coordinates of the sun's center in pixels.\n        src_radius (int): Radius of the main sun circle in pixels.\n        src_color (tuple[int, int, int]): Color of the sun in RGB format.\n        circles (list[Any]): List of tuples, each representing a flare circle with parameters:\n            (alpha, center, size, color)\n            - alpha (float): Transparency of the circle (0.0 to 1.0).\n            - center (tuple[int, int]): (x, y) coordinates of the circle center.\n            - size (float): Size factor for the circle radius.\n            - color (tuple[int, int, int]): RGB color of the circle.\n\n    Returns:\n        np.ndarray: Image with added sun flare effect.\n\n    Note:\n        This function implements several techniques to create a more realistic flare:\n        1. Separate flare layer: Allows for complex manipulations of the flare effect.\n        2. Lens diffraction spikes: Simulates light diffraction in camera aperture.\n        3. Radial gradient mask: Creates natural fading of the flare from the center.\n        4. Gaussian blur: Softens the flare for a more natural glow effect.\n        5. Chromatic aberration: Simulates color fringing often seen in real lens flares.\n        6. Screen blending: Provides a more realistic blending of the flare with the image.\n\n    The flare effect is created through the following steps:\n    1. Create a separate flare layer.\n    2. Add the main sun circle and diffraction spikes to the flare layer.\n    3. Add additional flare circles based on the input parameters.\n    4. Apply Gaussian blur to soften the flare.\n    5. Create and apply a radial gradient mask for natural fading.\n    6. Simulate chromatic aberration by applying different blurs to color channels.\n    7. Blend the flare with the original image using screen blending mode.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n        >>> flare_center = (500, 500)\n        >>> src_radius = 50\n        >>> src_color = (255, 255, 200)\n        >>> circles = [\n        ...     (0.1, (550, 550), 10, (255, 200, 200)),\n        ...     (0.2, (600, 600), 5, (200, 255, 200))\n        ... ]\n        >>> flared_image = A.functional.add_sun_flare_physics_based(\n        ...     image, flare_center, src_radius, src_color, circles\n        ... 
)\n\n    References:\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n        - Diffraction: https://en.wikipedia.org/wiki/Diffraction\n        - Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration\n        - Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen\n    \"\"\"\n    output = img.copy()\n    height, width = img.shape[:2]\n\n    # Create a separate flare layer\n    flare_layer = np.zeros_like(img, dtype=np.float32)\n\n    # Add the main sun\n    cv2.circle(flare_layer, flare_center, src_radius, src_color, -1)\n\n    # Add lens diffraction spikes\n    for angle in [0, 45, 90, 135]:\n        end_point = (\n            int(flare_center[0] + np.cos(np.radians(angle)) * max(width, height)),\n            int(flare_center[1] + np.sin(np.radians(angle)) * max(width, height)),\n        )\n        cv2.line(flare_layer, flare_center, end_point, src_color, 2)\n\n    # Add flare circles\n    for _, center, size, color in circles:\n        cv2.circle(flare_layer, center, int(size**0.33), color, -1)\n\n    # Apply gaussian blur to soften the flare\n    flare_layer = cv2.GaussianBlur(flare_layer, (0, 0), sigmaX=15, sigmaY=15)\n\n    # Create a radial gradient mask\n    y, x = np.ogrid[:height, :width]\n    mask = np.sqrt((x - flare_center[0]) ** 2 + (y - flare_center[1]) ** 2)\n    mask = 1 - np.clip(mask / (max(width, height) * 0.7), 0, 1)\n    mask = np.dstack([mask] * 3)\n\n    # Apply the mask to the flare layer\n    flare_layer *= mask\n\n    # Add chromatic aberration\n    channels = list(cv2.split(flare_layer))\n    channels[0] = cv2.GaussianBlur(\n        channels[0],\n        (0, 0),\n        sigmaX=3,\n        sigmaY=3,\n    )  # Blue channel\n    channels[2] = cv2.GaussianBlur(\n        channels[2],\n        (0, 0),\n        sigmaX=5,\n        sigmaY=5,\n    )  # Red channel\n    flare_layer = cv2.merge(channels)\n\n    # Blend the flare with the original image using screen blending\n    return 255 - ((255 - output) * (255 - flare_layer) / 255)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_corner_illumination","title":"def apply_corner_illumination (img, intensity, corner) [view source on GitHub]","text":"

Apply corner-based illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_corner_illumination(\n    img: np.ndarray,\n    intensity: float,\n    corner: Literal[0, 1, 2, 3],\n) -> np.ndarray:\n    \"\"\"Apply corner-based illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create distance map coordinates\n    y, x = np.ogrid[:height, :width]\n\n    # Adjust coordinates based on corner\n    if corner == 1:  # top-right\n        x = width - 1 - x\n    elif corner == 2:  # bottom-right\n        x = width - 1 - x\n        y = height - 1 - y\n    elif corner == 3:  # bottom-left\n        y = height - 1 - y\n\n    # Calculate normalized distance\n    distance = np.sqrt(x * x + y * y) / np.sqrt(height * height + width * width)\n    pattern = 1 - distance  # Invert so corner is brightest\n\n    return apply_illumination_pattern(result, pattern, intensity)\n
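Usage sketch (not part of the library docstring; only the shape of the result is asserted here):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_corner_illumination
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> lit = apply_corner_illumination(image, intensity=0.2, corner=0)  # brighten from the top-left corner
>>> assert lit.shape == image.shape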
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_gaussian_illumination","title":"def apply_gaussian_illumination (img, intensity, center, sigma) [view source on GitHub]","text":"

Apply gaussian illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_gaussian_illumination(\n    img: np.ndarray,\n    intensity: float,\n    center: tuple[float, float],\n    sigma: float,\n) -> np.ndarray:\n    \"\"\"Apply gaussian illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create coordinate grid\n    y, x = np.ogrid[:height, :width]\n\n    # Calculate gaussian pattern\n    center_x = width * center[0]\n    center_y = height * center[1]\n    sigma_pixels = max(height, width) * sigma\n    gaussian = np.exp(\n        -((x - center_x) ** 2 + (y - center_y) ** 2) / (2 * sigma_pixels**2),\n    )\n\n    return apply_illumination_pattern(result, gaussian, intensity)\n
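Usage sketch (not part of the library docstring); center is given as (x, y) fractions of the image width and height:

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_gaussian_illumination
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> lit = apply_gaussian_illumination(image, intensity=0.2, center=(0.5, 0.5), sigma=0.3)  # bright spot at the image center
>>> assert lit.shape == image.shape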
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_illumination_pattern","title":"def apply_illumination_pattern (img, pattern, intensity) [view source on GitHub]","text":"

Apply illumination pattern to image.

Parameters:

  • img (np.ndarray): Input image
  • pattern (np.ndarray): Illumination pattern of shape (H, W)
  • intensity (float): Effect strength (-0.2 to 0.2)

Returns:

  • np.ndarray: Image with applied illumination

Source code in albumentations/augmentations/functional.py Python
def apply_illumination_pattern(\n    img: np.ndarray,\n    pattern: np.ndarray,\n    intensity: float,\n) -> np.ndarray:\n    \"\"\"Apply illumination pattern to image.\n\n    Args:\n        img: Input image\n        pattern: Illumination pattern of shape (H, W)\n        intensity: Effect strength (-0.2 to 0.2)\n\n    Returns:\n        Image with applied illumination\n    \"\"\"\n    if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n        pattern = pattern[..., np.newaxis]\n    return img * (1 + intensity * pattern)\n
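Usage sketch (not part of the library docstring) that checks the documented formula img * (1 + intensity * pattern) directly:

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_illumination_pattern
>>> image = np.random.rand(4, 4, 3).astype(np.float32)
>>> pattern = np.tile(np.linspace(0, 1, 4, dtype=np.float32), (4, 1))  # horizontal gradient in [0, 1]
>>> out = apply_illumination_pattern(image, pattern, intensity=0.2)
>>> assert np.allclose(out, image * (1 + 0.2 * pattern[..., np.newaxis]))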
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_linear_illumination","title":"def apply_linear_illumination (img, intensity, angle) [view source on GitHub]","text":"

Apply linear gradient illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_linear_illumination(\n    img: np.ndarray,\n    intensity: float,\n    angle: float,\n) -> np.ndarray:\n    \"\"\"Apply linear gradient illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create gradient coordinates\n    y, x = np.ogrid[:height, :width]\n\n    # Calculate gradient direction\n    angle_rad = np.deg2rad(angle)\n    dx, dy = np.cos(angle_rad), np.sin(angle_rad)\n\n    # Create normalized gradient\n    gradient = (x * dx + y * dy) / np.sqrt(height * height + width * width)\n    gradient = (gradient + 1) / 2  # Normalize to [0, 1]\n\n    return apply_illumination_pattern(result, gradient, intensity)\n
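Usage sketch (not part of the library docstring); angle is in degrees, so 0 gives a left-to-right gradient and 90 a top-to-bottom one:

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_linear_illumination
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> lit = apply_linear_illumination(image, intensity=0.15, angle=0)
>>> assert lit.shape == image.shape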
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_plasma_brightness_contrast","title":"def apply_plasma_brightness_contrast (img, brightness_factor, contrast_factor, plasma_pattern) [view source on GitHub]","text":"

Apply plasma-based brightness and contrast adjustments.

The plasma pattern is used to create spatially-varying adjustments:

1. Brightness is modified by adding the pattern * brightness_factor
2. Contrast is modified by interpolating between mean and original using the pattern * contrast_factor

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_plasma_brightness_contrast(\n    img: np.ndarray,\n    brightness_factor: float,\n    contrast_factor: float,\n    plasma_pattern: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply plasma-based brightness and contrast adjustments.\n\n    The plasma pattern is used to create spatially-varying adjustments:\n    1. Brightness is modified by adding the pattern * brightness_factor\n    2. Contrast is modified by interpolating between mean and original\n       using the pattern * contrast_factor\n    \"\"\"\n    result = img.copy()\n\n    max_value = MAX_VALUES_BY_DTYPE[img.dtype]\n\n    # Expand plasma pattern to match image dimensions\n    plasma_pattern = plasma_pattern[..., np.newaxis] if img.ndim > MONO_CHANNEL_DIMENSIONS else plasma_pattern\n\n    # Apply brightness adjustment\n    if brightness_factor != 0:\n        brightness_adjustment = plasma_pattern * brightness_factor * max_value\n        result = np.clip(result + brightness_adjustment, 0, max_value)\n\n    # Apply contrast adjustment\n    if contrast_factor != 0:\n        mean = result.mean()\n        contrast_weights = plasma_pattern * contrast_factor + 1\n        result = np.clip(mean + (result - mean) * contrast_weights, 0, max_value)\n\n    return result\n
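Usage sketch (not part of the library docstring; the random array below is only a stand-in for a real diamond-square plasma pattern, which the plasma-based transforms generate internally):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_plasma_brightness_contrast
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> plasma_pattern = np.random.rand(100, 100).astype(np.float32)  # stand-in pattern in [0, 1]
>>> out = apply_plasma_brightness_contrast(image, brightness_factor=0.3, contrast_factor=0.3, plasma_pattern=plasma_pattern)
>>> assert out.shape == image.shape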
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_plasma_shadow","title":"def apply_plasma_shadow (img, intensity, plasma_pattern) [view source on GitHub]","text":"

Apply plasma-based shadow effect by darkening.

Parameters:

  • img (np.ndarray): Input image
  • intensity (float): Shadow intensity in [0, 1]
  • plasma_pattern (np.ndarray): Generated plasma pattern of shape (H, W)

Returns:

  • np.ndarray: Image with applied shadow effect

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_plasma_shadow(\n    img: np.ndarray,\n    intensity: float,\n    plasma_pattern: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply plasma-based shadow effect by darkening.\n\n    Args:\n        img: Input image\n        intensity: Shadow intensity in [0, 1]\n        plasma_pattern: Generated plasma pattern of shape (H, W)\n\n    Returns:\n        Image with applied shadow effect\n    \"\"\"\n    result = img.copy()\n\n    # Expand dimensions to match image\n    plasma_pattern = plasma_pattern[..., np.newaxis] if img.ndim > MONO_CHANNEL_DIMENSIONS else plasma_pattern\n\n    # Apply shadow by darkening (multiplying by values < 1)\n    shadow_mask = 1 - plasma_pattern * intensity\n\n    return result * shadow_mask\n
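Usage sketch (not part of the library docstring; the random array is a stand-in for a real plasma pattern, and pixels where it is close to 1 are darkened the most at a given intensity):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_plasma_shadow
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> plasma_pattern = np.random.rand(100, 100).astype(np.float32)
>>> shadowed = apply_plasma_shadow(image, intensity=0.5, plasma_pattern=plasma_pattern)
>>> assert shadowed.shape == image.shape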
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_salt_and_pepper","title":"def apply_salt_and_pepper (img, salt_mask, pepper_mask) [view source on GitHub]","text":"

Apply salt and pepper noise to image using pre-computed masks.

Parameters:

  • img (np.ndarray): Input image
  • salt_mask (np.ndarray): Boolean mask for salt (white) noise
  • pepper_mask (np.ndarray): Boolean mask for pepper (black) noise

Returns:

  • np.ndarray: Image with applied salt and pepper noise

Source code in albumentations/augmentations/functional.py Python
def apply_salt_and_pepper(\n    img: np.ndarray,\n    salt_mask: np.ndarray,\n    pepper_mask: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply salt and pepper noise to image using pre-computed masks.\n\n    Args:\n        img: Input image\n        salt_mask: Boolean mask for salt (white) noise\n        pepper_mask: Boolean mask for pepper (black) noise\n\n    Returns:\n        Image with applied salt and pepper noise\n    \"\"\"\n    result = img.copy()\n\n    result[salt_mask] = MAX_VALUES_BY_DTYPE[img.dtype]\n    result[pepper_mask] = 0\n    return result\n
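Usage sketch (not part of the library docstring; per-pixel boolean masks are built by hand here, corrupting roughly 2% of the pixels with salt and another 2% with pepper):

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import apply_salt_and_pepper
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> noise = np.random.rand(100, 100)
>>> salt_mask = noise < 0.02
>>> pepper_mask = noise > 0.98
>>> noisy = apply_salt_and_pepper(image, salt_mask, pepper_mask)
>>> assert (noisy[salt_mask] == 255).all() and (noisy[pepper_mask] == 0).all()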
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.auto_contrast","title":"def auto_contrast (img) [view source on GitHub]","text":"

Apply auto contrast to the image.

Auto contrast enhances image contrast by stretching the intensity range to use the full range while preserving relative intensities.

Parameters:

  • img (np.ndarray): Input image in uint8 or float32 format.

Returns:

  • np.ndarray: Contrast-enhanced image in the same dtype as input.

Note

The function:

1. Computes histogram for each channel
2. Creates cumulative distribution
3. Normalizes to full intensity range
4. Uses lookup table for scaling

Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef auto_contrast(img: np.ndarray) -> np.ndarray:\n    \"\"\"Apply auto contrast to the image.\n\n    Auto contrast enhances image contrast by stretching the intensity range\n    to use the full range while preserving relative intensities.\n\n    Args:\n        img: Input image in uint8 or float32 format.\n\n    Returns:\n        Contrast-enhanced image in the same dtype as input.\n\n    Note:\n        The function:\n        1. Computes histogram for each channel\n        2. Creates cumulative distribution\n        3. Normalizes to full intensity range\n        4. Uses lookup table for scaling\n    \"\"\"\n    result = img.copy()\n    num_channels = get_num_channels(img)\n    max_value = MAX_VALUES_BY_DTYPE[img.dtype]\n\n    for i in range(num_channels):\n        channel = img[..., i] if img.ndim > MONO_CHANNEL_DIMENSIONS else img\n\n        # Compute histogram\n        hist = np.histogram(channel.flatten(), bins=256, range=(0, max_value))[0]\n\n        # Calculate cumulative distribution\n        cdf = hist.cumsum()\n\n        # Find the minimum and maximum non-zero values in the CDF\n        if cdf[cdf > 0].size == 0:\n            continue  # Skip if the channel is constant or empty\n\n        cdf_min = cdf[cdf > 0].min()\n        cdf_max = cdf.max()\n\n        if cdf_min == cdf_max:\n            continue\n\n        # Normalize CDF\n        cdf = (cdf - cdf_min) * max_value / (cdf_max - cdf_min)\n\n        # Create lookup table\n        lut = np.clip(np.around(cdf), 0, max_value).astype(np.uint8)\n\n        # Apply lookup table\n        if img.ndim > MONO_CHANNEL_DIMENSIONS:\n            result[..., i] = sz_lut(channel, lut)\n        else:\n            result = sz_lut(channel, lut)\n\n    return result\n
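Usage sketch (not part of the library docstring) on a deliberately low-contrast image:

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import auto_contrast
>>> image = np.random.randint(100, 150, (100, 100, 3), dtype=np.uint8)  # occupies only part of the intensity range
>>> stretched = auto_contrast(image)
>>> assert stretched.shape == image.shape and stretched.dtype == image.dtype
>>> assert stretched.max() >= image.max()  # intensities are stretched towards the full range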
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.clahe","title":"def clahe (img, clip_limit, tile_grid_size) [view source on GitHub]","text":"

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.

This function enhances the contrast of the input image using CLAHE. For color images, it converts the image to the LAB color space, applies CLAHE to the L channel, and then converts the image back to RGB.

Parameters:

  • img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).
  • clip_limit (float): Threshold for contrast limiting. Higher values give more contrast.
  • tile_grid_size (tuple[int, int]): Size of grid for histogram equalization. Width and height of the grid.

Returns:

  • np.ndarray: Image with CLAHE applied. The output has the same dtype as the input.

Note

  • If the input image is float32, it's temporarily converted to uint8 for processing and then converted back to float32.
  • For color images, CLAHE is applied only to the luminance channel in the LAB color space.

Exceptions:

  • ValueError: If the input image is not 2D or 3D.

Examples:

Python
>>> import numpy as np\n>>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> result = clahe(img, clip_limit=2.0, tile_grid_size=(8, 8))\n>>> assert result.shape == img.shape\n>>> assert result.dtype == img.dtype\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef clahe(\n    img: np.ndarray,\n    clip_limit: float,\n    tile_grid_size: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.\n\n    This function enhances the contrast of the input image using CLAHE. For color images,\n    it converts the image to the LAB color space, applies CLAHE to the L channel, and then\n    converts the image back to RGB.\n\n    Args:\n        img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).\n        clip_limit (float): Threshold for contrast limiting. Higher values give more contrast.\n        tile_grid_size (tuple[int, int]): Size of grid for histogram equalization.\n            Width and height of the grid.\n\n    Returns:\n        np.ndarray: Image with CLAHE applied. The output has the same dtype as the input.\n\n    Note:\n        - If the input image is float32, it's temporarily converted to uint8 for processing\n          and then converted back to float32.\n        - For color images, CLAHE is applied only to the luminance channel in the LAB color space.\n\n    Raises:\n        ValueError: If the input image is not 2D or 3D.\n\n    Example:\n        >>> import numpy as np\n        >>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> result = clahe(img, clip_limit=2.0, tile_grid_size=(8, 8))\n        >>> assert result.shape == img.shape\n        >>> assert result.dtype == img.dtype\n    \"\"\"\n    img = img.copy()\n    clahe_mat = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)\n\n    if is_grayscale_image(img):\n        return clahe_mat.apply(img)\n\n    img = cv2.cvtColor(img, cv2.COLOR_RGB2LAB)\n\n    img[:, :, 0] = clahe_mat.apply(img[:, :, 0])\n\n    return cv2.cvtColor(img, cv2.COLOR_LAB2RGB)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.diamond_step","title":"def diamond_step (pattern, y, x, half, grid_size, roughness, random_generator) [view source on GitHub]","text":"

Compute edge value during diamond step.

Source code in albumentations/augmentations/functional.py Python
def diamond_step(\n    pattern: np.ndarray,\n    y: int,\n    x: int,\n    half: int,\n    grid_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Compute edge value during diamond step.\"\"\"\n    points = []\n    if y >= half:\n        points.append(pattern[y - half, x])\n    if y + half <= grid_size:\n        points.append(pattern[y + half, x])\n    if x >= half:\n        points.append(pattern[y, x - half])\n    if x + half <= grid_size:\n        points.append(pattern[y, x + half])\n\n    return sum(points) / len(points) + random_offset(\n        half * 2,\n        grid_size,\n        roughness,\n        random_generator,\n    )\n
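diamond_step is a low-level helper used when generating plasma patterns (diamond-square style); the sketch below (not from the library docstring) calls it on a hand-seeded 5x5 grid to fill in the midpoint of the top edge:

Python
>>> import numpy as np
>>> from albumentations.augmentations.functional import diamond_step
>>> pattern = np.zeros((5, 5), dtype=np.float32)  # 5x5 grid, i.e. grid_size = 4
>>> pattern[0, 0], pattern[0, 4], pattern[4, 0], pattern[4, 4] = 0.1, 0.9, 0.4, 0.6
>>> rng = np.random.default_rng(0)
>>> # Average of the reachable neighbours at distance `half`, plus a random offset
>>> pattern[0, 2] = diamond_step(pattern, y=0, x=2, half=2, grid_size=4, roughness=0.5, random_generator=rng)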
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.equalize","title":"def equalize (img, mask=None, mode='cv', by_channels=True) [view source on GitHub]","text":"

Apply histogram equalization to the input image.

This function enhances the contrast of the input image by equalizing its histogram. It supports both grayscale and color images, and can operate on individual channels or on the luminance channel of the image.

Parameters:

  • img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).
  • mask (np.ndarray | None): Optional mask to apply the equalization selectively. If provided, must have the same shape as the input image. Default: None.
  • mode (ImageMode): The backend to use for equalization. Can be either "cv" for OpenCV or "pil" for Pillow-style equalization. Default: "cv".
  • by_channels (bool): If True, applies equalization to each channel independently. If False, converts the image to YCrCb color space and equalizes only the luminance channel. Only applicable to color images. Default: True.

Returns:

  • np.ndarray: Equalized image. The output has the same dtype as the input.

Exceptions:

  • ValueError: If the input image or mask have invalid shapes or types.

Note

  • If the input image is not uint8, it will be temporarily converted to uint8 for processing and then converted back to its original dtype.
  • For color images, when by_channels=False, the image is converted to YCrCb color space, equalized on the Y channel, and then converted back to RGB.
  • The function preserves the original number of channels in the image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> equalized = A.equalize(image, mode=\"cv\", by_channels=True)\n>>> assert equalized.shape == image.shape\n>>> assert equalized.dtype == image.dtype\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef equalize(\n    img: np.ndarray,\n    mask: np.ndarray | None = None,\n    mode: ImageMode = \"cv\",\n    by_channels: bool = True,\n) -> np.ndarray:\n    \"\"\"Apply histogram equalization to the input image.\n\n    This function enhances the contrast of the input image by equalizing its histogram.\n    It supports both grayscale and color images, and can operate on individual channels\n    or on the luminance channel of the image.\n\n    Args:\n        img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).\n        mask (np.ndarray | None): Optional mask to apply the equalization selectively.\n            If provided, must have the same shape as the input image. Default: None.\n        mode (ImageMode): The backend to use for equalization. Can be either \"cv\" for\n            OpenCV or \"pil\" for Pillow-style equalization. Default: \"cv\".\n        by_channels (bool): If True, applies equalization to each channel independently.\n            If False, converts the image to YCrCb color space and equalizes only the\n            luminance channel. Only applicable to color images. Default: True.\n\n    Returns:\n        np.ndarray: Equalized image. The output has the same dtype as the input.\n\n    Raises:\n        ValueError: If the input image or mask have invalid shapes or types.\n\n    Note:\n        - If the input image is not uint8, it will be temporarily converted to uint8\n          for processing and then converted back to its original dtype.\n        - For color images, when by_channels=False, the image is converted to YCrCb\n          color space, equalized on the Y channel, and then converted back to RGB.\n        - The function preserves the original number of channels in the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> equalized = A.equalize(image, mode=\"cv\", by_channels=True)\n        >>> assert equalized.shape == image.shape\n        >>> assert equalized.dtype == image.dtype\n    \"\"\"\n    _check_preconditions(img, mask, by_channels)\n\n    function = _equalize_pil if mode == \"pil\" else _equalize_cv\n\n    if is_grayscale_image(img):\n        return function(img, _handle_mask(mask))\n\n    if not by_channels:\n        result_img = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)\n        result_img[..., 0] = function(result_img[..., 0], _handle_mask(mask))\n        return cv2.cvtColor(result_img, cv2.COLOR_YCrCb2RGB)\n\n    result_img = np.empty_like(img)\n    for i in range(NUM_RGB_CHANNELS):\n        _mask = _handle_mask(mask, i)\n        result_img[..., i] = function(img[..., i], _mask)\n\n    return result_img\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.fancy_pca","title":"def fancy_pca (img, alpha_vector) [view source on GitHub]","text":"

Perform 'Fancy PCA' augmentation on an image with any number of channels.

Parameters:

  • img (np.ndarray): Input image
  • alpha_vector (np.ndarray): Vector of scale factors for each principal component. Should have the same length as the number of channels in the image.

Returns:

  • np.ndarray: Augmented image of the same shape, type, and range as the input.

Image types: uint8, float32

Number of channels: Any

Note

  • This function generalizes the Fancy PCA augmentation to work with any number of channels.
  • It preserves the original range of the image ([0, 255] for uint8, [0, 1] for float32).
  • For single-channel images, the augmentation is applied as a simple scaling of pixel intensity variation.
  • For multi-channel images, PCA is performed on the entire image, treating each pixel as a point in N-dimensional space (where N is the number of channels).
  • The augmentation preserves the correlation between channels while adding controlled noise.
  • Computation time may increase significantly for images with a large number of channels.

Reference

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
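
Example (a minimal usage sketch; the import path follows the source location shown below, and the alpha values are arbitrary illustrative choices):

Python
import numpy as np
from albumentations.augmentations.functional import fancy_pca

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# One scale factor per channel; small magnitudes give subtle color shifts.
alpha_vector = rng.normal(loc=0.0, scale=0.1, size=3)

augmented = fancy_pca(image, alpha_vector)
assert augmented.shape == image.shape and augmented.dtype == image.dtype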

Source code in albumentations/augmentations/functional.py Python
@float32_io\n@clipped\n@preserve_channel_dim\ndef fancy_pca(img: np.ndarray, alpha_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Perform 'Fancy PCA' augmentation on an image with any number of channels.\n\n    Args:\n        img (np.ndarray): Input image\n        alpha_vector (np.ndarray): Vector of scale factors for each principal component.\n                                   Should have the same length as the number of channels in the image.\n\n    Returns:\n        np.ndarray: Augmented image of the same shape, type, and range as the input.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This function generalizes the Fancy PCA augmentation to work with any number of channels.\n        - It preserves the original range of the image ([0, 255] for uint8, [0, 1] for float32).\n        - For single-channel images, the augmentation is applied as a simple scaling of pixel intensity variation.\n        - For multi-channel images, PCA is performed on the entire image, treating each pixel\n          as a point in N-dimensional space (where N is the number of channels).\n        - The augmentation preserves the correlation between channels while adding controlled noise.\n        - Computation time may increase significantly for images with a large number of channels.\n\n    Reference:\n        Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012).\n        ImageNet classification with deep convolutional neural networks.\n        In Advances in neural information processing systems (pp. 1097-1105).\n    \"\"\"\n    orig_shape = img.shape\n    num_channels = get_num_channels(img)\n\n    # Reshape image to 2D array of pixels\n    img_reshaped = img.reshape(-1, num_channels)\n\n    # Center the pixel values\n    img_mean = np.mean(img_reshaped, axis=0)\n    img_centered = img_reshaped - img_mean\n\n    if num_channels == 1:\n        # For grayscale images, apply a simple scaling\n        std_dev = np.std(img_centered)\n        noise = alpha_vector[0] * std_dev * img_centered\n    else:\n        # Compute covariance matrix\n        img_cov = np.cov(img_centered, rowvar=False)\n\n        # Compute eigenvectors & eigenvalues of the covariance matrix\n        eig_vals, eig_vecs = np.linalg.eigh(img_cov)\n\n        # Sort eigenvectors by eigenvalues in descending order\n        sort_perm = eig_vals[::-1].argsort()\n        eig_vals = eig_vals[sort_perm]\n        eig_vecs = eig_vecs[:, sort_perm]\n\n        # Create noise vector\n        noise = np.dot(\n            np.dot(eig_vecs, np.diag(alpha_vector * eig_vals)),\n            img_centered.T,\n        ).T\n\n    # Add noise to the image\n    img_pca = img_reshaped + noise\n\n    # Reshape back to original shape\n    img_pca = img_pca.reshape(orig_shape)\n\n    # Clip values to [0, 1] range\n    return np.clip(img_pca, 0, 1, out=img_pca)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_constant_noise","title":"def generate_constant_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate one value per channel.

Source code in albumentations/augmentations/functional.py Python
def generate_constant_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate one value per channel.\"\"\"\n    num_channels = shape[-1] if len(shape) > MONO_CHANNEL_DIMENSIONS else 1\n    return sample_noise(\n        noise_type,\n        (num_channels,),\n        params,\n        max_value,\n        random_generator,\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_per_pixel_noise","title":"def generate_per_pixel_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate separate noise map for each channel.

Source code in albumentations/augmentations/functional.py Python
def generate_per_pixel_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate separate noise map for each channel.\"\"\"\n    return sample_noise(noise_type, shape, params, max_value, random_generator)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_plasma_pattern","title":"def generate_plasma_pattern (target_shape, size, roughness, random_generator) [view source on GitHub]","text":"

Generate a plasma fractal pattern using the Diamond-Square algorithm.

The Diamond-Square algorithm creates a natural-looking noise pattern by recursively subdividing a grid and adding random displacements at each step. The roughness parameter controls how quickly the random displacements decrease with each iteration.

Parameters:

  • target_shape (tuple[int, int]): Final shape (height, width) of the pattern.
  • size (int): Initial size of the pattern grid. Will be rounded up to nearest power of 2. Larger values create more detailed patterns.
  • roughness (float): Controls pattern roughness. Higher values create more rough/sharp transitions. Typical values are between 1.0 and 5.0.
  • random_generator (np.random.Generator): NumPy random generator.

Returns:

  • np.ndarray: Normalized plasma pattern array of shape target_shape with values in [0, 1].
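
Example (a minimal usage sketch with illustrative parameter values, assuming the import path shown below):

Python
import numpy as np
from albumentations.augmentations.functional import generate_plasma_pattern

pattern = generate_plasma_pattern(
    target_shape=(128, 128),
    size=64,          # rounded up internally to a power of 2
    roughness=3.0,    # typical values lie between 1.0 and 5.0
    random_generator=np.random.default_rng(0),
)
assert pattern.shape == (128, 128)
assert pattern.min() >= 0.0 and pattern.max() <= 1.0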

Source code in albumentations/augmentations/functional.py Python
def generate_plasma_pattern(\n    target_shape: tuple[int, int],\n    size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a plasma fractal pattern using the Diamond-Square algorithm.\n\n    The Diamond-Square algorithm creates a natural-looking noise pattern by recursively\n    subdividing a grid and adding random displacements at each step. The roughness\n    parameter controls how quickly the random displacements decrease with each iteration.\n\n    Args:\n        target_shape: Final shape (height, width) of the pattern\n        size: Initial size of the pattern grid. Will be rounded up to nearest power of 2.\n            Larger values create more detailed patterns.\n        roughness: Controls pattern roughness. Higher values create more rough/sharp transitions.\n            Typical values are between 1.0 and 5.0.\n        random_generator: NumPy random generator.\n\n    Returns:\n        Normalized plasma pattern array of shape target_shape with values in [0, 1]\n    \"\"\"\n    # Initialize grid\n    grid_size = get_grid_size(size, target_shape)\n    pattern = initialize_grid(grid_size, random_generator)\n\n    # Diamond-Square algorithm\n    step_size = grid_size\n    while step_size > 1:\n        half_step = step_size // 2\n\n        # Square step\n        for y in range(0, grid_size, step_size):\n            for x in range(0, grid_size, step_size):\n                if half_step > 0:\n                    pattern[y + half_step, x + half_step] = square_step(\n                        pattern,\n                        y,\n                        x,\n                        step_size,\n                        half_step,\n                        roughness,\n                        random_generator,\n                    )\n\n        # Diamond step\n        for y in range(0, grid_size + 1, half_step):\n            for x in range((y + half_step) % step_size, grid_size + 1, step_size):\n                pattern[y, x] = diamond_step(\n                    pattern,\n                    y,\n                    x,\n                    half_step,\n                    grid_size,\n                    roughness,\n                    random_generator,\n                )\n\n        step_size = half_step\n\n    min_pattern = pattern.min()\n\n    # Normalize to [0, 1] range\n    pattern = (pattern - min_pattern) / (pattern.max() - min_pattern)\n\n    return (\n        fgeometric.resize(pattern, target_shape, interpolation=cv2.INTER_LINEAR)\n        if pattern.shape != target_shape\n        else pattern\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_shared_noise","title":"def generate_shared_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate one noise map and broadcast to all channels.

Parameters:

  • noise_type (Literal['uniform', 'gaussian', 'laplace', 'beta']): Type of noise distribution to use.
  • shape (tuple[int, ...]): Shape of the input image (H, W) or (H, W, C).
  • params (dict[str, Any]): Parameters for the noise distribution.
  • max_value (float): Maximum value for the noise distribution.
  • random_generator (np.random.Generator): NumPy random generator instance.

Returns:

  • np.ndarray: Noise array of shape (H, W) or (H, W, C) where the same noise pattern is shared across all channels.
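
Example (a minimal sketch; the params keys follow the sample_gaussian helper documented on this page, and all values are illustrative):

Python
import numpy as np
from albumentations.augmentations.functional import generate_shared_noise

noise = generate_shared_noise(
    noise_type="gaussian",
    shape=(32, 32, 3),
    params={"mean_range": (0.0, 0.0), "std_range": (0.05, 0.05)},
    max_value=255.0,
    random_generator=np.random.default_rng(0),
)

# The same (H, W) pattern is broadcast across every channel.
assert noise.shape == (32, 32, 3)
assert np.array_equal(noise[..., 0], noise[..., 1])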

Source code in albumentations/augmentations/functional.py Python
def generate_shared_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate one noise map and broadcast to all channels.\n\n    Args:\n        noise_type: Type of noise distribution to use\n        shape: Shape of the input image (H, W) or (H, W, C)\n        params: Parameters for the noise distribution\n        max_value: Maximum value for the noise distribution\n        random_generator: NumPy random generator instance\n\n    Returns:\n        Noise array of shape (H, W) or (H, W, C) where the same noise\n        pattern is shared across all channels\n    \"\"\"\n    # Generate noise for (H, W)\n    height, width = shape[:2]\n    noise_map = sample_noise(\n        noise_type,\n        (height, width),\n        params,\n        max_value,\n        random_generator,\n    )\n\n    # If input is multichannel, broadcast noise to all channels\n    if len(shape) > MONO_CHANNEL_DIMENSIONS:\n        return np.broadcast_to(noise_map[..., None], shape)\n    return noise_map\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_snow_textures","title":"def generate_snow_textures (img_shape, random_generator) [view source on GitHub]","text":"

Generate snow texture and sparkle mask.

Parameters:

  • img_shape (tuple[int, int]): Image shape.
  • random_generator (np.random.Generator): Random generator to use.

Returns:

  • tuple[np.ndarray, np.ndarray]: Tuple of (snow_texture, sparkle_mask) arrays.

Source code in albumentations/augmentations/functional.py Python
def generate_snow_textures(\n    img_shape: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate snow texture and sparkle mask.\n\n    Args:\n        img_shape (tuple[int, int]): Image shape.\n        random_generator (np.random.Generator): Random generator to use.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: Tuple of (snow_texture, sparkle_mask) arrays.\n    \"\"\"\n    # Generate base snow texture\n    snow_texture = random_generator.normal(size=img_shape[:2], loc=0.5, scale=0.3)\n    snow_texture = cv2.GaussianBlur(snow_texture, (0, 0), sigmaX=1, sigmaY=1)\n\n    # Generate sparkle mask\n    sparkle_mask = random_generator.random(img_shape[:2]) > 0.99\n\n    return snow_texture, sparkle_mask\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_fog_particle_radiuses","title":"def get_fog_particle_radiuses (img_shape, num_particles, fog_intensity, random_generator) [view source on GitHub]","text":"

Generate radiuses for fog particles.

Parameters:

  • img_shape (tuple[int, int]): Image shape.
  • num_particles (int): Number of fog particles.
  • fog_intensity (float): Intensity of the fog effect, between 0 and 1.
  • random_generator (np.random.Generator): Random generator to use.

Returns:

  • list[int]: List of radiuses for each fog particle.

Source code in albumentations/augmentations/functional.py Python
def get_fog_particle_radiuses(\n    img_shape: tuple[int, int],\n    num_particles: int,\n    fog_intensity: float,\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Generate radiuses for fog particles.\n\n    Args:\n        img_shape (tuple[int, int]): Image shape.\n        num_particles (int): Number of fog particles.\n        fog_intensity (float): Intensity of the fog effect, between 0 and 1.\n        random_generator (np.random.Generator): Random generator to use.\n\n    Returns:\n        list[int]: List of radiuses for each fog particle.\n    \"\"\"\n    height, width = img_shape[:2]\n    max_fog_radius = max(2, int(min(height, width) * 0.1 * fog_intensity))\n    min_radius = max(1, max_fog_radius // 2)\n\n    return [random_generator.integers(min_radius, max_fog_radius) for _ in range(num_particles)]\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_grid_size","title":"def get_grid_size (size, target_shape) [view source on GitHub]","text":"

Round up to nearest power of 2.

Source code in albumentations/augmentations/functional.py Python
def get_grid_size(size: int, target_shape: tuple[int, int]) -> int:\n    \"\"\"Round up to nearest power of 2.\"\"\"\n    return 2 ** int(np.ceil(np.log2(max(size, *target_shape))))\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_safe_brightness_contrast_params","title":"def get_safe_brightness_contrast_params (alpha, beta, max_value) [view source on GitHub]","text":"

Calculate safe alpha and beta values to prevent overflow/underflow.

For any pixel value x, we want: 0 <= alpha * x + beta <= max_value

Parameters:

  • alpha (float): Contrast factor (1 means no change).
  • beta (float): Brightness offset.
  • max_value (float): Maximum allowed value (255 for uint8, 1 for float32).

Returns:

  • tuple[float, float]: Safe (alpha, beta) values that prevent overflow/underflow.
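
Example (a short sketch of the clamping behavior for a uint8-range image; the numbers are illustrative):

Python
from albumentations.augmentations.functional import get_safe_brightness_contrast_params

# Requested values would overflow uint8: 1.5 * 255 + 40 > 255.
safe_alpha, safe_beta = get_safe_brightness_contrast_params(alpha=1.5, beta=40.0, max_value=255.0)

# The adjusted pair keeps alpha * x + beta within [0, max_value] for any x in [0, max_value].
assert 0.0 <= safe_beta <= 255.0
assert safe_alpha * 255.0 + safe_beta <= 255.0 + 1e-6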

Source code in albumentations/augmentations/functional.py Python
def get_safe_brightness_contrast_params(\n    alpha: float,\n    beta: float,\n    max_value: float,\n) -> tuple[float, float]:\n    \"\"\"Calculate safe alpha and beta values to prevent overflow/underflow.\n\n    For any pixel value x, we want: 0 <= alpha * x + beta <= max_value\n\n    Args:\n        alpha: Contrast factor (1 means no change)\n        beta: Brightness offset\n        max_value: Maximum allowed value (255 for uint8, 1 for float32)\n\n    Returns:\n        tuple[float, float]: Safe (alpha, beta) values that prevent overflow/underflow\n    \"\"\"\n    if alpha > 0:\n        # For x = max_value: alpha * max_value + beta <= max_value\n        # For x = 0: beta >= 0\n        safe_beta = np.clip(beta, 0, max_value)\n        # From alpha * max_value + safe_beta <= max_value\n        safe_alpha = min(alpha, (max_value - safe_beta) / max_value)\n    else:\n        # For x = 0: beta <= max_value\n        # For x = max_value: alpha * max_value + beta >= 0\n        safe_beta = min(beta, max_value)\n        # From alpha * max_value + safe_beta >= 0\n        safe_alpha = max(alpha, -safe_beta / max_value)\n\n    return safe_alpha, safe_beta\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.grayscale_to_multichannel","title":"def grayscale_to_multichannel (grayscale_image, num_output_channels=3) [view source on GitHub]","text":"

Convert a grayscale image to a multi-channel image.

This function takes a 2D grayscale image or a 3D image with a single channel and converts it to a multi-channel image by repeating the grayscale data across the specified number of channels.

Parameters:

  • grayscale_image (np.ndarray): Input grayscale image. Can be 2D (height, width) or 3D (height, width, 1).
  • num_output_channels (int): Number of channels in the output image. Defaults to 3.

Returns:

  • np.ndarray: Multi-channel image with shape (height, width, num_channels).
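
Example (a minimal usage sketch, assuming the import path shown below):

Python
import numpy as np
from albumentations.augmentations.functional import grayscale_to_multichannel

gray = np.random.randint(0, 256, size=(100, 100), dtype=np.uint8)
rgb_like = grayscale_to_multichannel(gray, num_output_channels=3)

assert rgb_like.shape == (100, 100, 3)
# Every output channel is a copy of the grayscale data.
assert np.array_equal(rgb_like[..., 0], rgb_like[..., 2])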

Source code in albumentations/augmentations/functional.py Python
def grayscale_to_multichannel(\n    grayscale_image: np.ndarray,\n    num_output_channels: int = 3,\n) -> np.ndarray:\n    \"\"\"Convert a grayscale image to a multi-channel image.\n\n    This function takes a 2D grayscale image or a 3D image with a single channel\n    and converts it to a multi-channel image by repeating the grayscale data\n    across the specified number of channels.\n\n    Args:\n        grayscale_image (np.ndarray): Input grayscale image. Can be 2D (height, width)\n                                      or 3D (height, width, 1).\n        num_output_channels (int, optional): Number of channels in the output image. Defaults to 3.\n\n    Returns:\n        np.ndarray: Multi-channel image with shape (height, width, num_channels)\n    \"\"\"\n    # If output should be single channel, just squeeze and return\n    if num_output_channels == 1:\n        return grayscale_image\n\n    # For multi-channel output, squeeze and stack\n    squeezed = np.squeeze(grayscale_image)\n\n    return cv2.merge([squeezed] * num_output_channels)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.image_compression","title":"def image_compression (img, quality, image_type) [view source on GitHub]","text":"

Apply compression to image.

Parameters:

  • img (np.ndarray): Input image.
  • quality (int): Compression quality (0-100).
  • image_type (Literal['.jpg', '.webp']): Type of compression ('.jpg' or '.webp').

Returns:

  • np.ndarray: Compressed image with same number of channels as input.
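
Example (a minimal usage sketch; the quality value is illustrative):

Python
import numpy as np
from albumentations.augmentations.functional import image_compression

image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Lower quality values produce stronger compression artifacts.
compressed = image_compression(image, quality=20, image_type=".jpg")
assert compressed.shape == image.shape
assert compressed.dtype == image.dtype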

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef image_compression(\n    img: np.ndarray,\n    quality: int,\n    image_type: Literal[\".jpg\", \".webp\"],\n) -> np.ndarray:\n    \"\"\"Apply compression to image.\n\n    Args:\n        img: Input image\n        quality: Compression quality (0-100)\n        image_type: Type of compression ('.jpg' or '.webp')\n\n    Returns:\n        Compressed image with same number of channels as input\n    \"\"\"\n    quality_flag = cv2.IMWRITE_JPEG_QUALITY if image_type == \".jpg\" else cv2.IMWRITE_WEBP_QUALITY\n\n    num_channels = get_num_channels(img)\n\n    if num_channels == 1:\n        # For grayscale, ensure we read back as single channel\n        _, encoded_img = cv2.imencode(image_type, img, (int(quality_flag), quality))\n        decoded = cv2.imdecode(encoded_img, cv2.IMREAD_GRAYSCALE)\n        return decoded[..., np.newaxis]  # Add channel dimension back\n\n    if num_channels == NUM_RGB_CHANNELS:\n        # Standard RGB image\n        _, encoded_img = cv2.imencode(image_type, img, (int(quality_flag), quality))\n        return cv2.imdecode(encoded_img, cv2.IMREAD_UNCHANGED)\n\n    # For 2,4 or more channels, we need to handle alpha/extra channels separately\n    if num_channels == 2:\n        # For 2 channels, pad to 3 channels and take only first 2 after compression\n        padded = np.pad(img, ((0, 0), (0, 0), (0, 1)), mode=\"constant\")\n        _, encoded_bgr = cv2.imencode(image_type, padded, (int(quality_flag), quality))\n        decoded_bgr = cv2.imdecode(encoded_bgr, cv2.IMREAD_UNCHANGED)\n        return decoded_bgr[..., :2]\n\n    # Process first 3 channels together\n    bgr = img[..., :NUM_RGB_CHANNELS]\n    _, encoded_bgr = cv2.imencode(image_type, bgr, (int(quality_flag), quality))\n    decoded_bgr = cv2.imdecode(encoded_bgr, cv2.IMREAD_UNCHANGED)\n\n    if num_channels > NUM_RGB_CHANNELS:\n        # Process additional channels one by one\n        extra_channels = []\n        for i in range(NUM_RGB_CHANNELS, num_channels):\n            channel = img[..., i]\n            _, encoded = cv2.imencode(image_type, channel, (int(quality_flag), quality))\n            decoded = cv2.imdecode(encoded, cv2.IMREAD_GRAYSCALE)\n            if len(decoded.shape) == 2:\n                decoded = decoded[..., np.newaxis]\n            extra_channels.append(decoded)\n\n        # Combine BGR with extra channels\n        return np.dstack([decoded_bgr, *extra_channels])\n\n    return decoded_bgr\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.initialize_grid","title":"def initialize_grid (grid_size, random_generator) [view source on GitHub]","text":"

Initialize grid with random corners.

Source code in albumentations/augmentations/functional.py Python
def initialize_grid(\n    grid_size: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Initialize grid with random corners.\"\"\"\n    pattern = np.zeros((grid_size + 1, grid_size + 1), dtype=np.float32)\n    for corner in [(0, 0), (0, -1), (-1, 0), (-1, -1)]:\n        pattern[corner] = random_generator.random()\n    return pattern\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.iso_noise","title":"def iso_noise (image, color_shift, intensity, random_generator) [view source on GitHub]","text":"

Apply Poisson noise to an image to simulate camera sensor noise.

Parameters:

  • image (np.ndarray): Input image. Currently, only RGB images are supported.
  • color_shift (float): The amount of color shift to apply.
  • intensity (float): Multiplication factor for noise values. Values of ~0.5 produce a noticeable, yet acceptable level of noise.
  • random_generator (np.random.Generator): Random generator used for noise generation.

Returns:

  • np.ndarray: The noised image.

Image types: uint8, float32

Number of channels: 3
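
Example (a minimal usage sketch with illustrative parameter values, assuming the import path shown below):

Python
import numpy as np
from albumentations.augmentations.functional import iso_noise

rgb = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

noised = iso_noise(
    rgb,
    color_shift=0.05,
    intensity=0.5,  # ~0.5 gives a noticeable but acceptable amount of noise
    random_generator=np.random.default_rng(0),
)
assert noised.shape == rgb.shape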

Source code in albumentations/augmentations/functional.py Python
@float32_io\n@clipped\ndef iso_noise(\n    image: np.ndarray,\n    color_shift: float,\n    intensity: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply poisson noise to an image to simulate camera sensor noise.\n\n    Args:\n        image (np.ndarray): Input image. Currently, only RGB images are supported.\n        color_shift (float): The amount of color shift to apply.\n        intensity (float): Multiplication factor for noise values. Values of ~0.5 produce a noticeable,\n                           yet acceptable level of noise.\n        random_generator (np.random.Generator): If specified, this will be random generator used\n            for noise generation.\n\n    Returns:\n        np.ndarray: The noised image.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    hls = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)\n    _, stddev = cv2.meanStdDev(hls)\n\n    luminance_noise = random_generator.poisson(\n        stddev[1] * intensity,\n        size=hls.shape[:2],\n    )\n    color_noise = random_generator.normal(\n        0,\n        color_shift * intensity,\n        size=hls.shape[:2],\n    )\n\n    hls[..., 0] += color_noise\n    hls[..., 1] = add_array(\n        hls[..., 1],\n        luminance_noise * intensity * (1.0 - hls[..., 1]),\n    )\n\n    noised_hls = cv2.cvtColor(hls, cv2.COLOR_HLS2RGB)\n    return np.clip(noised_hls, 0, 1, out=noised_hls)  # Ensure output is in [0, 1] range\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.move_tone_curve","title":"def move_tone_curve (img, low_y, high_y) [view source on GitHub]","text":"

Rescales the relationship between bright and dark areas of the image by manipulating its tone curve.

Parameters:

  • img (np.ndarray): Input image. Any number of channels.
  • low_y (float | np.ndarray): Per-channel or single y-position of a Bezier control point used to adjust the tone curve; must be in range [0, 1].
  • high_y (float | np.ndarray): Per-channel or single y-position of a Bezier control point used to adjust the tone curve; must be in range [0, 1].
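
Example (a minimal usage sketch showing both the scalar and the per-channel form; values are illustrative):

Python
import numpy as np
from albumentations.augmentations.functional import move_tone_curve

image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)

# A single pair of control points applied to every channel.
adjusted = move_tone_curve(image, low_y=0.1, high_y=0.9)

# One pair of control points per channel (array lengths must match the channel count).
per_channel = move_tone_curve(
    image,
    low_y=np.array([0.1, 0.2, 0.3]),
    high_y=np.array([0.7, 0.8, 0.9]),
)
assert adjusted.shape == image.shape and per_channel.shape == image.shape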

Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef move_tone_curve(\n    img: np.ndarray,\n    low_y: float | np.ndarray,\n    high_y: float | np.ndarray,\n) -> np.ndarray:\n    \"\"\"Rescales the relationship between bright and dark areas of the image by manipulating its tone curve.\n\n    Args:\n        img: np.ndarray. Any number of channels\n        low_y: per-channel or single y-position of a Bezier control point used\n            to adjust the tone curve, must be in range [0, 1]\n        high_y: per-channel or single y-position of a Bezier control point used\n            to adjust image tone curve, must be in range [0, 1]\n\n    \"\"\"\n    t = np.linspace(0.0, 1.0, 256)\n\n    def evaluate_bez(\n        t: np.ndarray,\n        low_y: float | np.ndarray,\n        high_y: float | np.ndarray,\n    ) -> np.ndarray:\n        one_minus_t = 1 - t\n        return (3 * one_minus_t**2 * t * low_y + 3 * one_minus_t * t**2 * high_y + t**3) * 255\n\n    num_channels = get_num_channels(img)\n\n    if np.isscalar(low_y) and np.isscalar(high_y):\n        lut = clip(np.rint(evaluate_bez(t, low_y, high_y)), np.uint8, inplace=False)\n        return sz_lut(img, lut, inplace=False)\n    if isinstance(low_y, np.ndarray) and isinstance(high_y, np.ndarray):\n        luts = clip(\n            np.rint(evaluate_bez(t[:, np.newaxis], low_y, high_y).T),\n            np.uint8,\n            inplace=False,\n        )\n        return cv2.merge(\n            [sz_lut(img[:, :, i], np.ascontiguousarray(luts[i]), inplace=False) for i in range(num_channels)],\n        )\n\n    raise TypeError(\n        f\"low_y and high_y must both be of type float or np.ndarray. Got {type(low_y)} and {type(high_y)}\",\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.posterize","title":"def posterize (img, bits) [view source on GitHub]","text":"

Reduce the number of bits for each color channel by keeping only the highest N bits.

This transform performs bit-depth reduction by masking out lower bits, effectively reducing the number of possible values per channel. This creates a posterization effect where similar colors are merged together.

Parameters:

  • img (np.ndarray): Input image. Can be single or multi-channel.
  • bits (Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]]): Number of high bits to keep. Must be in range [1, 7]. Can be either:
    • A single value to apply the same bit reduction to all channels
    • A list of values to apply different bit reduction per channel. Length of list must match number of channels in image.

Returns:

  • np.ndarray: Image with reduced bit depth. Has same shape and dtype as input.

Note

  • The transform keeps the N highest bits and sets all other bits to 0
  • For example, if bits=3:
    • Original value: 11010110 (214)
    • Keep 3 bits: 11000000 (192)
  • The number of unique colors per channel will be 2^bits
  • Higher bits values = more colors = more subtle effect
  • Lower bits values = fewer colors = more dramatic posterization

Examples:

Python
>>> import numpy as np\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Same posterization for all channels\n>>> result = posterize(image, bits=3)\n>>> # Different posterization per channel\n>>> result = posterize(image, bits=[3, 4, 5])  # RGB channels\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef posterize(img: np.ndarray, bits: Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]]) -> np.ndarray:\n    \"\"\"Reduce the number of bits for each color channel by keeping only the highest N bits.\n\n    This transform performs bit-depth reduction by masking out lower bits, effectively\n    reducing the number of possible values per channel. This creates a posterization\n    effect where similar colors are merged together.\n\n    Args:\n        img: Input image. Can be single or multi-channel.\n        bits: Number of high bits to keep. Must be in range [1, 7].\n            Can be either:\n            - A single value to apply the same bit reduction to all channels\n            - A list of values to apply different bit reduction per channel.\n              Length of list must match number of channels in image.\n\n    Returns:\n        np.ndarray: Image with reduced bit depth. Has same shape and dtype as input.\n\n    Note:\n        - The transform keeps the N highest bits and sets all other bits to 0\n        - For example, if bits=3:\n            - Original value: 11010110 (214)\n            - Keep 3 bits:   11000000 (192)\n        - The number of unique colors per channel will be 2^bits\n        - Higher bits values = more colors = more subtle effect\n        - Lower bits values = fewer colors = more dramatic posterization\n\n    Examples:\n        >>> import numpy as np\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Same posterization for all channels\n        >>> result = posterize(image, bits=3)\n        >>> # Different posterization per channel\n        >>> result = posterize(image, bits=[3, 4, 5])  # RGB channels\n    \"\"\"\n    bits_array = np.uint8(bits)\n\n    if not bits_array.shape or len(bits_array) == 1:\n        lut = np.arange(0, 256, dtype=np.uint8)\n        mask = ~np.uint8(2 ** (8 - bits_array) - 1)\n        lut &= mask\n\n        return sz_lut(img, lut, inplace=False)\n\n    result_img = np.empty_like(img)\n    for i, channel_bits in enumerate(bits_array):\n        lut = np.arange(0, 256, dtype=np.uint8)\n        mask = ~np.uint8(2 ** (8 - channel_bits) - 1)\n        lut &= mask\n\n        result_img[..., i] = sz_lut(img[..., i], lut, inplace=True)\n\n    return result_img\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.prepare_illumination_input","title":"def prepare_illumination_input (img) [view source on GitHub]","text":"

Prepare image for illumination effect.

Parameters:

  • img (np.ndarray): Input image.

Returns:

  • tuple[np.ndarray, int, int]: Tuple of (float32 image, height, width).
Source code in albumentations/augmentations/functional.py Python
def prepare_illumination_input(img: np.ndarray) -> tuple[np.ndarray, int, int]:\n    \"\"\"Prepare image for illumination effect.\n\n    Args:\n        img: Input image\n\n    Returns:\n        tuple of:\n        - float32 image\n        - height\n        - width\n    \"\"\"\n    result = img.astype(np.float32)\n    height, width = img.shape[:2]\n    return result, height, width\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.random_offset","title":"def random_offset (current_size, total_size, roughness, random_generator) [view source on GitHub]","text":"

Calculate random offset based on current grid size.

Source code in albumentations/augmentations/functional.py Python
def random_offset(\n    current_size: int,\n    total_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Calculate random offset based on current grid size.\"\"\"\n    return (random_generator.random() - 0.5) * (current_size / total_size) ** (roughness / 2)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_beta","title":"def sample_beta (size, params, random_generator) [view source on GitHub]","text":"

Sample from Beta distribution.

The Beta distribution is bounded by [0, 1] and then scaled and shifted to [-scale, scale]. Alpha and beta parameters control the shape of the distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_beta(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Beta distribution.\n\n    The Beta distribution is bounded by [0, 1] and then scaled and shifted to [-scale, scale].\n    Alpha and beta parameters control the shape of the distribution.\n    \"\"\"\n    alpha = random_generator.uniform(*params[\"alpha_range\"])\n    beta = random_generator.uniform(*params[\"beta_range\"])\n    scale = random_generator.uniform(*params[\"scale_range\"])\n\n    # Sample from Beta[0,1] and transform to [-scale,scale]\n    samples = random_generator.beta(alpha, beta, size=size)\n    return (2 * samples - 1) * scale\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_gaussian","title":"def sample_gaussian (size, params, random_generator) [view source on GitHub]","text":"

Sample from Gaussian distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_gaussian(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Gaussian distribution.\"\"\"\n    mean = random_generator.uniform(*params[\"mean_range\"])\n    std = random_generator.uniform(*params[\"std_range\"])\n    return random_generator.normal(mean, std, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_laplace","title":"def sample_laplace (size, params, random_generator) [view source on GitHub]","text":"

Sample from Laplace distribution.

The Laplace distribution is also known as the double exponential distribution. It has heavier tails than the Gaussian distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_laplace(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Laplace distribution.\n\n    The Laplace distribution is also known as the double exponential distribution.\n    It has heavier tails than the Gaussian distribution.\n    \"\"\"\n    loc = random_generator.uniform(*params[\"mean_range\"])\n    scale = random_generator.uniform(*params[\"scale_range\"])\n    return random_generator.laplace(loc=loc, scale=scale, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_noise","title":"def sample_noise (noise_type, size, params, max_value, random_generator) [view source on GitHub]","text":"

Sample from specific noise distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from specific noise distribution.\"\"\"\n    if noise_type == \"uniform\":\n        return sample_uniform(size, params, random_generator) * max_value\n    if noise_type == \"gaussian\":\n        return sample_gaussian(size, params, random_generator) * max_value\n    if noise_type == \"laplace\":\n        return sample_laplace(size, params, random_generator) * max_value\n    if noise_type == \"beta\":\n        return sample_beta(size, params, random_generator) * max_value\n\n    raise ValueError(f\"Unknown noise type: {noise_type}\")\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_uniform","title":"def sample_uniform (size, params, random_generator) [view source on GitHub]","text":"

Sample from uniform distribution.

Parameters:

  • size (tuple[int, ...]): Output shape. If length is 1, generates constant noise per channel.
  • params (dict[str, Any]): Must contain 'ranges' key with list of (min, max) tuples. If only one range is provided, it will be used for all channels.
  • random_generator (np.random.Generator): NumPy random generator instance.

Returns:

  • np.ndarray | float: Noise array of specified size. For single-channel constant mode, returns scalar instead of array with shape (1,).
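
Example (a minimal sketch of both sampling modes; the range values are illustrative):

Python
import numpy as np
from albumentations.augmentations.functional import sample_uniform

rng = np.random.default_rng(0)
params = {"ranges": [(-0.2, 0.2)]}

# Constant mode: a 1-element size yields one value per channel; a single range is reused.
per_channel = sample_uniform(size=(3,), params=params, random_generator=rng)
assert per_channel.shape == (3,)

# Spatial mode: the first range is used for the whole noise map.
noise_map = sample_uniform(size=(16, 16), params=params, random_generator=rng)
assert noise_map.shape == (16, 16)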

Source code in albumentations/augmentations/functional.py Python
def sample_uniform(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray | float:\n    \"\"\"Sample from uniform distribution.\n\n    Args:\n        size: Output shape. If length is 1, generates constant noise per channel.\n        params: Must contain 'ranges' key with list of (min, max) tuples.\n            If only one range is provided, it will be used for all channels.\n        random_generator: NumPy random generator instance\n\n    Returns:\n        Noise array of specified size. For single-channel constant mode,\n        returns scalar instead of array with shape (1,).\n    \"\"\"\n    if len(size) == 1:  # constant mode\n        ranges = params[\"ranges\"]\n        num_channels = size[0]\n\n        if len(ranges) == 1:\n            ranges = ranges * num_channels\n        elif len(ranges) < num_channels:\n            raise ValueError(\n                f\"Not enough ranges provided. Expected {num_channels}, got {len(ranges)}\",\n            )\n\n        return np.array(\n            [random_generator.uniform(low, high) for low, high in ranges[:num_channels]],\n        )\n\n    # use first range for spatial noise\n    low, high = params[\"ranges\"][0]\n    return random_generator.uniform(low, high, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sharpen_gaussian","title":"def sharpen_gaussian (img, alpha, kernel_size, sigma) [view source on GitHub]","text":"

Sharpen image using Gaussian blur.

Source code in albumentations/augmentations/functional.py Python
@clipped\n@preserve_channel_dim\ndef sharpen_gaussian(\n    img: np.ndarray,\n    alpha: float,\n    kernel_size: int,\n    sigma: float,\n) -> np.ndarray:\n    \"\"\"Sharpen image using Gaussian blur.\"\"\"\n    blurred = cv2.GaussianBlur(\n        img,\n        ksize=(kernel_size, kernel_size),\n        sigmaX=sigma,\n        sigmaY=sigma,\n    )\n    # Unsharp mask formula: original + alpha * (original - blurred)\n    # This is equivalent to: original * (1 + alpha) - alpha * blurred\n    return img + alpha * (img - blurred)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.shot_noise","title":"def shot_noise (img, scale, random_generator) [view source on GitHub]","text":"

Apply shot noise to the image by simulating photon counting in linear light space.

This function simulates photon shot noise, which occurs due to the quantum nature of light. The process:

  1. Converts image to linear light space (removes gamma correction)
  2. Scales pixel values to represent expected photon counts
  3. Samples actual photon counts from Poisson distribution
  4. Converts back to display space (reapplies gamma)

The simulation is performed in linear light space because photon shot noise is a physical process that occurs before gamma correction is applied by cameras/displays.

Parameters:

  • img (np.ndarray): Input image in range [0, 1]. Can be single or multi-channel.
  • scale (float): Reciprocal of the number of photons (noise intensity).
    • Larger values = fewer photons = more noise
    • Smaller values = more photons = less noise
    For example:
    • scale = 0.1 simulates ~100 photons per unit intensity
    • scale = 10.0 simulates ~0.1 photons per unit intensity
  • random_generator (np.random.Generator): NumPy random generator for Poisson sampling.

Returns:

  • np.ndarray: Image with shot noise applied, same shape and range [0, 1] as input. The noise characteristics will follow Poisson statistics in linear space:
    • Variance equals mean in linear space
    • More noise in brighter regions (but less relative noise)
    • Less noise in darker regions (but more relative noise)

Note

  • Uses gamma value of 2.2 for linear/display space conversion
  • Adds small constant (1e-6) to avoid issues with zero values
  • Clips final values to [0, 1] range
  • Operates on the image in-place for memory efficiency
  • Preserves float32 precision throughout calculations

References

  • https://en.wikipedia.org/wiki/Shot_noise
  • https://en.wikipedia.org/wiki/Gamma_correction
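
Example (a minimal usage sketch on a float32 image in [0, 1]; the scale value is illustrative):

Python
import numpy as np
from albumentations.augmentations.functional import shot_noise

image = np.random.rand(64, 64, 3).astype(np.float32)  # float32 image in [0, 1]

# Larger scale -> fewer simulated photons -> stronger noise.
noisy = shot_noise(image, scale=0.1, random_generator=np.random.default_rng(0))
assert noisy.shape == image.shape
assert noisy.min() >= 0.0 and noisy.max() <= 1.0
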
Source code in albumentations/augmentations/functional.py Python
@preserve_channel_dim\n@float32_io\ndef shot_noise(\n    img: np.ndarray,\n    scale: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply shot noise to the image by simulating photon counting in linear light space.\n\n    This function simulates photon shot noise, which occurs due to the quantum nature of light.\n    The process:\n    1. Converts image to linear light space (removes gamma correction)\n    2. Scales pixel values to represent expected photon counts\n    3. Samples actual photon counts from Poisson distribution\n    4. Converts back to display space (reapplies gamma)\n\n    The simulation is performed in linear light space because photon shot noise is a physical\n    process that occurs before gamma correction is applied by cameras/displays.\n\n    Args:\n        img: Input image in range [0, 1]. Can be single or multi-channel.\n        scale: Reciprocal of the number of photons (noise intensity).\n            - Larger values = fewer photons = more noise\n            - Smaller values = more photons = less noise\n            For example:\n            - scale = 0.1 simulates ~100 photons per unit intensity\n            - scale = 10.0 simulates ~0.1 photons per unit intensity\n        random_generator: NumPy random generator for Poisson sampling\n\n    Returns:\n        Image with shot noise applied, same shape and range [0, 1] as input.\n        The noise characteristics will follow Poisson statistics in linear space:\n        - Variance equals mean in linear space\n        - More noise in brighter regions (but less relative noise)\n        - Less noise in darker regions (but more relative noise)\n\n    Note:\n        - Uses gamma value of 2.2 for linear/display space conversion\n        - Adds small constant (1e-6) to avoid issues with zero values\n        - Clips final values to [0, 1] range\n        - Operates on the image in-place for memory efficiency\n        - Preserves float32 precision throughout calculations\n\n    References:\n        - https://en.wikipedia.org/wiki/Shot_noise\n        - https://en.wikipedia.org/wiki/Gamma_correction\n    \"\"\"\n    # Apply inverse gamma correction to work in linear space\n    img_linear = cv2.pow(img, 2.2)\n\n    # Scale image values and add small constant to avoid zero values\n    scaled_img = (img_linear + scale * 1e-6) / scale\n\n    # Generate Poisson noise\n    noisy_img = multiply_by_constant(\n        random_generator.poisson(scaled_img).astype(np.float32),\n        scale,\n        inplace=True,\n    )\n\n    # Scale back and apply gamma correction\n    return power(np.clip(noisy_img, 0, 1, out=noisy_img), 1 / 2.2)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.slic","title":"def slic (image, n_segments, compactness=10.0, max_iterations=10) [view source on GitHub]","text":"

Simple Linear Iterative Clustering (SLIC) superpixel segmentation using OpenCV and NumPy.

Parameters:

  • image (np.ndarray): Input image (2D or 3D numpy array).
  • n_segments (int): Approximate number of superpixels to generate.
  • compactness (float): Balance between color proximity and space proximity.
  • max_iterations (int): Maximum number of iterations for k-means.

Returns:

  • np.ndarray: Segmentation mask where each superpixel has a unique label.
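
Example (a minimal usage sketch with illustrative parameter values, assuming the import path shown below):

Python
import numpy as np
from albumentations.augmentations.functional import slic

image = np.random.randint(0, 256, size=(80, 80, 3), dtype=np.uint8)

labels = slic(image, n_segments=64, compactness=10.0, max_iterations=5)

# One integer label per pixel; each superpixel gets its own label id.
assert labels.shape == image.shape[:2]
assert labels.dtype == np.int32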

Source code in albumentations/augmentations/functional.py Python
def slic(\n    image: np.ndarray,\n    n_segments: int,\n    compactness: float = 10.0,\n    max_iterations: int = 10,\n) -> np.ndarray:\n    \"\"\"Simple Linear Iterative Clustering (SLIC) superpixel segmentation using OpenCV and NumPy.\n\n    Args:\n        image (np.ndarray): Input image (2D or 3D numpy array).\n        n_segments (int): Approximate number of superpixels to generate.\n        compactness (float): Balance between color proximity and space proximity.\n        max_iterations (int): Maximum number of iterations for k-means.\n\n    Returns:\n        np.ndarray: Segmentation mask where each superpixel has a unique label.\n    \"\"\"\n    if image.ndim == MONO_CHANNEL_DIMENSIONS:\n        image = image[..., np.newaxis]\n\n    height, width = image.shape[:2]\n    num_pixels = height * width\n\n    # Normalize image to [0, 1] range\n    image_normalized = image.astype(np.float32) / np.max(image + 1e-6)\n\n    # Initialize cluster centers\n    grid_step = int((num_pixels / n_segments) ** 0.5)\n    x_range = np.arange(grid_step // 2, width, grid_step)\n    y_range = np.arange(grid_step // 2, height, grid_step)\n    centers = np.array(\n        [(x, y) for y in y_range for x in x_range if x < width and y < height],\n    )\n\n    # Initialize labels and distances\n    labels = -1 * np.ones((height, width), dtype=np.int32)\n    distances = np.full((height, width), np.inf)\n\n    for _ in range(max_iterations):\n        for i, center in enumerate(centers):\n            y, x = int(center[1]), int(center[0])\n\n            # Define the neighborhood\n            y_low, y_high = max(0, y - grid_step), min(height, y + grid_step + 1)\n            x_low, x_high = max(0, x - grid_step), min(width, x + grid_step + 1)\n\n            # Compute distances\n            crop = image_normalized[y_low:y_high, x_low:x_high]\n            color_diff = crop - image_normalized[y, x]\n            color_distance = np.sum(color_diff**2, axis=-1)\n\n            yy, xx = np.ogrid[y_low:y_high, x_low:x_high]\n            spatial_distance = ((yy - y) ** 2 + (xx - x) ** 2) / (grid_step**2)\n\n            distance = color_distance + compactness * spatial_distance\n\n            mask = distance < distances[y_low:y_high, x_low:x_high]\n            distances[y_low:y_high, x_low:x_high][mask] = distance[mask]\n            labels[y_low:y_high, x_low:x_high][mask] = i\n\n        # Update centers\n        for i in range(len(centers)):\n            mask = labels == i\n            if np.any(mask):\n                centers[i] = np.mean(np.argwhere(mask), axis=0)[::-1]\n\n    return labels\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.solarize","title":"def solarize (img, threshold) [view source on GitHub]","text":"

Invert all pixel values above a threshold.

Parameters:

  • img (np.ndarray): The image to solarize. Can be uint8 or float32.
  • threshold (float): Normalized threshold value in range [0, 1]. For uint8 images, pixels above threshold * 255 are inverted; for float32 images, pixels above threshold are inverted.

Returns:

  • np.ndarray: Solarized image.

Note

The threshold is normalized to [0, 1] range for both uint8 and float32 images. For uint8 images, the threshold is internally scaled by 255.
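
Example (a minimal usage sketch showing the same normalized threshold applied to uint8 and float32 inputs):

Python
import numpy as np
from albumentations.augmentations.functional import solarize

image_u8 = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
image_f32 = np.random.rand(64, 64).astype(np.float32)

# The same normalized threshold works for both dtypes.
sol_u8 = solarize(image_u8, threshold=0.5)    # pixels >= 0.5 * 255 are inverted
sol_f32 = solarize(image_f32, threshold=0.5)  # pixels >= 0.5 are inverted

assert sol_u8.shape == image_u8.shape and sol_f32.shape == image_f32.shape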

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef solarize(img: np.ndarray, threshold: float) -> np.ndarray:\n    \"\"\"Invert all pixel values above a threshold.\n\n    Args:\n        img: The image to solarize. Can be uint8 or float32.\n        threshold: Normalized threshold value in range [0, 1].\n            For uint8 images: pixels above threshold * 255 are inverted\n            For float32 images: pixels above threshold are inverted\n\n    Returns:\n        Solarized image.\n\n    Note:\n        The threshold is normalized to [0, 1] range for both uint8 and float32 images.\n        For uint8 images, the threshold is internally scaled by 255.\n    \"\"\"\n    dtype = img.dtype\n    max_val = MAX_VALUES_BY_DTYPE[dtype]\n\n    if dtype == np.uint8:\n        lut = [(max_val - i if i >= threshold * max_val else i) for i in range(int(max_val) + 1)]\n\n        prev_shape = img.shape\n        img = sz_lut(img, np.array(lut, dtype=dtype), inplace=False)\n\n        return np.expand_dims(img, -1) if len(prev_shape) != img.ndim else img\n\n    img = img.copy()\n\n    cond = img >= threshold\n    img[cond] = max_val - img[cond]\n    return img\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.square_step","title":"def square_step (pattern, y, x, step, grid_size, roughness, random_generator) [view source on GitHub]","text":"

Compute center value during square step.

Source code in albumentations/augmentations/functional.py Python
def square_step(\n    pattern: np.ndarray,\n    y: int,\n    x: int,\n    step: int,\n    grid_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Compute center value during square step.\"\"\"\n    corners = [\n        pattern[y, x],  # top-left\n        pattern[y, x + step],  # top-right\n        pattern[y + step, x],  # bottom-left\n        pattern[y + step, x + step],  # bottom-right\n    ]\n    return sum(corners) / 4.0 + random_offset(\n        step,\n        grid_size,\n        roughness,\n        random_generator,\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_average","title":"def to_gray_average (img) [view source on GitHub]","text":"

Convert an image to grayscale using the average method.

This function computes the arithmetic mean across all channels for each pixel, resulting in a grayscale representation of the image.

Key aspects of this method:

  1. It treats all channels equally, regardless of their perceptual importance.
  2. Works with any number of channels, making it versatile for various image types.
  3. Simple and fast to compute, but may not accurately represent perceived brightness.
  4. For RGB images, the formula is: Gray = (R + G + B) / 3

Note: This method may produce different results compared to weighted methods (like RGB weighted average) which account for human perception of color brightness. It may also produce unexpected results for images with alpha channels or non-color data in additional channels.

Parameters:

  • img (np.ndarray): Input image as a numpy array. Can be any number of channels.

Returns:

  • np.ndarray: Grayscale image as a 2D numpy array. The output data type matches the input data type.

Image types: uint8, float32

Number of channels: any
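
Example (a minimal usage sketch, assuming the import path shown below):

Python
import numpy as np
from albumentations.augmentations.functional import to_gray_average

image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)
gray = to_gray_average(image)

# Equivalent to the per-pixel channel mean, cast back to the input dtype.
assert gray.shape == (32, 32)
assert np.array_equal(gray, np.mean(image, axis=-1).astype(image.dtype))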

Source code in albumentations/augmentations/functional.py Python
def to_gray_average(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the average method.\n\n    This function computes the arithmetic mean across all channels for each pixel,\n    resulting in a grayscale representation of the image.\n\n    Key aspects of this method:\n    1. It treats all channels equally, regardless of their perceptual importance.\n    2. Works with any number of channels, making it versatile for various image types.\n    3. Simple and fast to compute, but may not accurately represent perceived brightness.\n    4. For RGB images, the formula is: Gray = (R + G + B) / 3\n\n    Note: This method may produce different results compared to weighted methods\n    (like RGB weighted average) which account for human perception of color brightness.\n    It may also produce unexpected results for images with alpha channels or\n    non-color data in additional channels.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array. Can be any number of channels.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array. The output data type\n                    matches the input data type.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    return np.mean(img, axis=-1).astype(img.dtype)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_desaturation","title":"def to_gray_desaturation (img) [view source on GitHub]","text":"

Convert an image to grayscale using the desaturation method.

Parameters:

  • img (np.ndarray): Input image as a numpy array.

Returns:

  • np.ndarray: Grayscale image as a 2D numpy array.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef to_gray_desaturation(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the desaturation method.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    float_image = img.astype(np.float32)\n    return (np.max(float_image, axis=-1) + np.min(float_image, axis=-1)) / 2\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_from_lab","title":"def to_gray_from_lab (img) [view source on GitHub]","text":"

Convert an RGB image to grayscale using the L channel from the LAB color space.

This function converts the RGB image to the LAB color space and extracts the L channel. The LAB color space is designed to approximate human vision, where L represents lightness.

Key aspects of this method:

  1. The L channel represents the lightness of each pixel, ranging from 0 (black) to 100 (white).
  2. It's more perceptually uniform than RGB, meaning equal changes in L values correspond to roughly equal changes in perceived lightness.
  3. The L channel is independent of the color information (A and B channels), making it suitable for grayscale conversion.

This method can be particularly useful when you want a grayscale image that closely matches human perception of lightness, potentially preserving more perceived contrast than simple RGB-based methods.

Parameters:

  • img (np.ndarray): Input RGB image as a numpy array.

Returns:

  • np.ndarray: Grayscale image as a 2D numpy array, representing the L (lightness) channel. Values are scaled to match the input image's data type range.

Image types: uint8, float32

Number of channels: 3

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef to_gray_from_lab(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an RGB image to grayscale using the L channel from the LAB color space.\n\n    This function converts the RGB image to the LAB color space and extracts the L channel.\n    The LAB color space is designed to approximate human vision, where L represents lightness.\n\n    Key aspects of this method:\n    1. The L channel represents the lightness of each pixel, ranging from 0 (black) to 100 (white).\n    2. It's more perceptually uniform than RGB, meaning equal changes in L values correspond to\n       roughly equal changes in perceived lightness.\n    3. The L channel is independent of the color information (A and B channels), making it\n       suitable for grayscale conversion.\n\n    This method can be particularly useful when you want a grayscale image that closely\n    matches human perception of lightness, potentially preserving more perceived contrast\n    than simple RGB-based methods.\n\n    Args:\n        img (np.ndarray): Input RGB image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array, representing the L (lightness) channel.\n                    Values are scaled to match the input image's data type range.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    return cv2.cvtColor(img, cv2.COLOR_RGB2LAB)[..., 0]\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_max","title":"def to_gray_max (img) [view source on GitHub]","text":"

Convert an image to grayscale using the maximum channel value method.

This function takes the maximum value across all channels for each pixel, resulting in a grayscale image that preserves the brightest parts of the original image.

Key aspects of this method:

  1. Works with any number of channels, making it versatile for various image types.
  2. For 3-channel (e.g., RGB) images, this method is equivalent to extracting the V (Value) channel from the HSV color space.
  3. Preserves the brightest parts of the image but may lose some color contrast information.
  4. Simple and fast to compute.

Note:

  • This method tends to produce brighter grayscale images compared to other conversion methods, as it always selects the highest intensity value from the channels.
  • For RGB images, it may not accurately represent perceived brightness as it doesn't account for human color perception.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array. Can be any number of channels.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array. The output data type matches the input data type.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
def to_gray_max(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the maximum channel value method.\n\n    This function takes the maximum value across all channels for each pixel,\n    resulting in a grayscale image that preserves the brightest parts of the original image.\n\n    Key aspects of this method:\n    1. Works with any number of channels, making it versatile for various image types.\n    2. For 3-channel (e.g., RGB) images, this method is equivalent to extracting the V (Value)\n       channel from the HSV color space.\n    3. Preserves the brightest parts of the image but may lose some color contrast information.\n    4. Simple and fast to compute.\n\n    Note:\n    - This method tends to produce brighter grayscale images compared to other conversion methods,\n      as it always selects the highest intensity value from the channels.\n    - For RGB images, it may not accurately represent perceived brightness as it doesn't\n      account for human color perception.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array. Can be any number of channels.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array. The output data type\n                    matches the input data type.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    return np.max(img, axis=-1)\n
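The docstring's claim that, for 3-channel RGB input, this is equivalent to the HSV V channel can be checked directly. A hedged sketch, with the import path assumed from the source location above:

Python

import cv2
import numpy as np

from albumentations.augmentations.functional import to_gray_max  # assumed import path

img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
gray = to_gray_max(img)  # per-pixel max over channels

v_channel = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)[..., 2]  # V = max(R, G, B) for uint8 input
print(np.array_equal(gray, v_channel))  # True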
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_pca","title":"def to_gray_pca (img) [view source on GitHub]","text":"

Convert an image to grayscale using Principal Component Analysis (PCA).

This function applies PCA to reduce a multi-channel image to a single channel, effectively creating a grayscale representation that captures the maximum variance in the color data.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array with shape (height, width, channels).

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array with shape (height, width). If input is uint8, output is uint8 in range [0, 255]. If input is float32, output is float32 in range [0, 1].

Note

This method can potentially preserve more information from the original image compared to standard weighted average methods, as it accounts for the correlations between color channels.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef to_gray_pca(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using Principal Component Analysis (PCA).\n\n    This function applies PCA to reduce a multi-channel image to a single channel,\n    effectively creating a grayscale representation that captures the maximum variance\n    in the color data.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array with shape (height, width, channels).\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array with shape (height, width).\n                    If input is uint8, output is uint8 in range [0, 255].\n                    If input is float32, output is float32 in range [0, 1].\n\n    Note:\n        This method can potentially preserve more information from the original image\n        compared to standard weighted average methods, as it accounts for the\n        correlations between color channels.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    dtype = img.dtype\n    # Reshape the image to a 2D array of pixels\n    pixels = img.reshape(-1, img.shape[2])\n\n    # Perform PCA\n    pca = PCA(n_components=1)\n    pca_result = pca.fit_transform(pixels)\n\n    # Reshape back to image dimensions and scale to 0-255\n    grayscale = pca_result.reshape(img.shape[:2])\n    grayscale = normalize_per_image(grayscale, \"min_max\")\n\n    return from_float(grayscale, target_dtype=dtype) if dtype == np.uint8 else grayscale\n
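A minimal sketch of a direct call (the PCA helper referenced in the source must be available in your installation); the import path is assumed from the source location above.

Python

import numpy as np

from albumentations.augmentations.functional import to_gray_pca  # assumed import path

img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
gray = to_gray_pca(img)  # single channel capturing maximum color variance
print(gray.shape)  # (32, 32); uint8 output in [0, 255] per the docstring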
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_weighted_average","title":"def to_gray_weighted_average (img) [view source on GitHub]","text":"

Convert an RGB image to grayscale using the weighted average method.

This function uses OpenCV's cvtColor function with COLOR_RGB2GRAY conversion, which applies the following formula: Y = 0.299*R + 0.587*G + 0.114*B

Parameters:

Name Type Description img np.ndarray

Input RGB image as a numpy array.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array.

Image types: uint8, float32

Number of channels: 3

Source code in albumentations/augmentations/functional.py Python
def to_gray_weighted_average(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an RGB image to grayscale using the weighted average method.\n\n    This function uses OpenCV's cvtColor function with COLOR_RGB2GRAY conversion,\n    which applies the following formula:\n    Y = 0.299*R + 0.587*G + 0.114*B\n\n    Args:\n        img (np.ndarray): Input RGB image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)\n
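Since the formula uses the standard ITU-R BT.601 luma coefficients, the result can be cross-checked against a manual computation. A hedged sketch with the assumed import path:

Python

import numpy as np

from albumentations.augmentations.functional import to_gray_weighted_average  # assumed import path

img = np.random.rand(32, 32, 3).astype(np.float32)  # RGB in [0, 1]
gray = to_gray_weighted_average(img)

manual = 0.299 * img[..., 0] + 0.587 * img[..., 1] + 0.114 * img[..., 2]
print(np.allclose(gray, manual, atol=1e-5))  # True up to floating-point error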
"},{"location":"api_reference/augmentations/geometric/","title":"Geometric augmentations (augmentations.geometric)","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional","title":"functional","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.adjust_padding_by_position","title":"def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position.

Source code in albumentations/augmentations/geometric/functional.py Python
def adjust_padding_by_position(\n    h_top: int,\n    h_bottom: int,\n    w_left: int,\n    w_right: int,\n    position: PositionType,\n    py_random: np.random.RandomState,\n) -> tuple[int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position.\"\"\"\n    if position == \"center\":\n        return h_top, h_bottom, w_left, w_right\n\n    if position == \"top_left\":\n        return 0, h_top + h_bottom, 0, w_left + w_right\n\n    if position == \"top_right\":\n        return 0, h_top + h_bottom, w_left + w_right, 0\n\n    if position == \"bottom_left\":\n        return h_top + h_bottom, 0, 0, w_left + w_right\n\n    if position == \"bottom_right\":\n        return h_top + h_bottom, 0, w_left + w_right, 0\n\n    if position == \"random\":\n        h_pad = h_top + h_bottom\n        w_pad = w_left + w_right\n        h_top = py_random.randint(0, h_pad)\n        h_bottom = h_pad - h_top\n        w_left = py_random.randint(0, w_pad)\n        w_right = w_pad - w_left\n        return h_top, h_bottom, w_left, w_right\n\n    raise ValueError(f\"Unknown position: {position}\")\n
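A short sketch showing how a fixed padding budget is redistributed for a non-center position; the random generator is only consulted when position is "random". The import path is assumed from the source location above.

Python

import random

from albumentations.augmentations.geometric.functional import adjust_padding_by_position  # assumed import path

# 5 px top / 5 px bottom / 3 px left / 3 px right, pushed to the top-left corner
print(adjust_padding_by_position(5, 5, 3, 3, "top_left", random.Random(0)))
# (0, 10, 0, 6): all vertical padding moves to the bottom, all horizontal padding to the right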
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.almost_equal_intervals","title":"def almost_equal_intervals (n, parts) [view source on GitHub]","text":"

Generates an array of nearly equal integer intervals that sum up to n.

This function divides the number n into the given number of parts, each nearly equal in size. It ensures that the sum of all parts equals n, and the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

Name Type Description n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

Type Description np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)\narray([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n>>> almost_equal_intervals(16, 4)\narray([4, 4, 4, 4])  # Splits 16 into four equal parts\n
Source code in albumentations/augmentations/geometric/functional.py Python
def almost_equal_intervals(n: int, parts: int) -> np.ndarray:\n    \"\"\"Generates an array of nearly equal integer intervals that sum up to `n`.\n\n    This function divides the number `n` into `parts` nearly equal parts. It ensures that\n    the sum of all parts equals `n`, and the difference between any two parts is at most one.\n    This is useful for distributing a total amount into nearly equal discrete parts.\n\n    Args:\n        n (int): The total value to be split.\n        parts (int): The number of parts to split into.\n\n    Returns:\n        np.ndarray: An array of integers where each integer represents the size of a part.\n\n    Example:\n        >>> almost_equal_intervals(20, 3)\n        array([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n        >>> almost_equal_intervals(16, 4)\n        array([4, 4, 4, 4])  # Splits 16 into four equal parts\n    \"\"\"\n    part_size, remainder = divmod(n, parts)\n    # Create an array with the base part size and adjust the first `remainder` parts by adding 1\n    return np.array(\n        [part_size + 1 if i < remainder else part_size for i in range(parts)],\n    )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.apply_affine_to_points","title":"def apply_affine_to_points (points, matrix) [view source on GitHub]","text":"

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

Name Type Description points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

Type Description np.ndarray

Transformed points with shape (N, 2).

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"points\")\ndef apply_affine_to_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply affine transformation to a set of points.\n\n    This function handles potential division by zero by replacing zero values\n    in the homogeneous coordinate with a small epsilon value.\n\n    Args:\n        points (np.ndarray): Array of points with shape (N, 2).\n        matrix (np.ndarray): 3x3 affine transformation matrix.\n\n    Returns:\n        np.ndarray: Transformed points with shape (N, 2).\n    \"\"\"\n    homogeneous_points = np.column_stack([points, np.ones(points.shape[0])])\n    transformed_points = homogeneous_points @ matrix.T\n\n    # Handle potential division by zero\n    epsilon = np.finfo(transformed_points.dtype).eps\n    transformed_points[:, 2] = np.where(\n        np.abs(transformed_points[:, 2]) < epsilon,\n        np.sign(transformed_points[:, 2]) * epsilon,\n        transformed_points[:, 2],\n    )\n\n    return transformed_points[:, :2] / transformed_points[:, 2:]\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine","title":"def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:

  1. Calculates necessary padding to avoid information loss
  2. Applies padding to the bounding boxes
  3. Adjusts the transformation matrix to account for padding
  4. Applies the affine transformation
  5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

Name Type Description bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

Type Description np.ndarray

Transformed and normalized bounding boxes

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine(\n    bboxes: np.ndarray,\n    matrix: np.ndarray,\n    rotate_method: Literal[\"largest_box\", \"ellipse\"],\n    image_shape: tuple[int, int],\n    border_mode: int,\n    output_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes.\n\n    For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:\n    1. Calculates necessary padding to avoid information loss\n    2. Applies padding to the bounding boxes\n    3. Adjusts the transformation matrix to account for padding\n    4. Applies the affine transformation\n    5. Validates the transformed bounding boxes\n\n    For other border modes, it directly applies the affine transformation without padding.\n\n    Args:\n        bboxes (np.ndarray): Input bounding boxes\n        matrix (np.ndarray): Affine transformation matrix\n        rotate_method (str): Method for rotating bounding boxes ('largest_box' or 'ellipse')\n        image_shape (Sequence[int]): Shape of the input image\n        border_mode (int): OpenCV border mode\n        output_shape (Sequence[int]): Shape of the output image\n\n    Returns:\n        np.ndarray: Transformed and normalized bounding boxes\n    \"\"\"\n    if is_identity_matrix(matrix):\n        return bboxes\n\n    bboxes = denormalize_bboxes(bboxes, image_shape)\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        bboxes = generate_reflected_bboxes(\n            bboxes,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Apply affine transform\n    if rotate_method == \"largest_box\":\n        transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n    elif rotate_method == \"ellipse\":\n        transformed_bboxes = bboxes_affine_ellipse(bboxes, matrix)\n    else:\n        raise ValueError(f\"Method {rotate_method} is not a valid rotation method.\")\n\n    # Validate and normalize bboxes\n    validated_bboxes = validate_bboxes(transformed_bboxes, output_shape)\n\n    return normalize_bboxes(validated_bboxes, output_shape)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine_ellipse","title":"def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_ellipse(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes using an ellipse approximation method.\n\n    This function transforms bounding boxes by approximating each box with an ellipse,\n    transforming points along the ellipse's circumference, and then computing the\n    new bounding box that encloses the transformed ellipse.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The ellipse approximation method can provide a tighter bounding box compared to the\n          largest box method, especially for rotations.\n        - 360 points are used to approximate each ellipse, which provides a good balance between\n          accuracy and computational efficiency.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method may be more suitable for objects that are roughly elliptical in shape.\n    \"\"\"\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n    bbox_width = (x_max - x_min) / 2\n    bbox_height = (y_max - y_min) / 2\n    center_x = x_min + bbox_width\n    center_y = y_min + bbox_height\n\n    angles = np.arange(0, 360, dtype=np.float32)\n    cos_angles = np.cos(np.radians(angles))\n    sin_angles = np.sin(np.radians(angles))\n\n    # Generate points for all ellipses at once\n    x = bbox_width[:, np.newaxis] * sin_angles + center_x[:, np.newaxis]\n    y = bbox_height[:, np.newaxis] * cos_angles + center_y[:, np.newaxis]\n    points = np.stack([x, y], axis=-1).reshape(-1, 2)\n\n    # Transform all points at once using the helper function\n    transformed_points = apply_affine_to_points(points, matrix)\n\n    transformed_points = transformed_points.reshape(len(bboxes), -1, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_points[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_points[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_points[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_points[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine_largest_box","title":"def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called \"largest box\" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n>>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n>>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n>>> print(transformed_bboxes)\n[[ 25.  25.  45.  45.   1.]\n [ 65.  65.  85.  85.   2.]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_largest_box(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes and return the largest enclosing boxes.\n\n    This function transforms each corner of every bounding box using the given affine transformation\n    matrix, then computes the new bounding boxes that fully enclose the transformed corners.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The resulting bounding boxes are the smallest axis-aligned boxes that completely\n          enclose the transformed original boxes. They may be larger than the minimal possible\n          bounding box if the original box becomes rotated.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method is called \"largest box\" because it returns the largest axis-aligned box\n          that encloses all corners of the transformed bounding box.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n        >>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n        >>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n        >>> print(transformed_bboxes)\n        [[ 25.  25.  45.  45.   1.]\n         [ 65.  65.  85.  85.   2.]]\n    \"\"\"\n    # Extract corners of all bboxes\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    corners = (\n        np.array([[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]).transpose(2, 0, 1).reshape(-1, 2)\n    )\n\n    # Transform all corners at once\n    transformed_corners = apply_affine_to_points(corners, matrix).reshape(-1, 4, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_corners[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_corners[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_corners[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_corners[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_d4","title":"def bboxes_d4 (bboxes, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a bounding box.

The function transforms a bounding box according to the specified group member from the D_4 group. These transformations include rotations and reflections, specified to work on an image's bounding box given its dimensions.

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • BoxInternalType: The transformed bounding box.
  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bbox_d4((10, 20, 110, 120), 'r90') This would rotate the bounding box 90 degrees within a 100x100 image.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_d4(\n    bboxes: np.ndarray,\n    group_member: D4Type,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a bounding box.\n\n    The function transforms a bounding box according to the specified group member from the `D_4` group.\n    These transformations include rotations and reflections, specified to work on an image's bounding box given\n    its dimensions.\n\n    Parameters:\n    -  bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n    - group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.\n\n    Returns:\n    - BoxInternalType: The transformed bounding box.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Applying a 90-degree rotation:\n      `bbox_d4((10, 20, 110, 120), 'r90')`\n      This would rotate the bounding box 90 degrees within a 100x100 image.\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: bboxes_rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: bboxes_rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: bboxes_rot90(x, 3),  # Rotate 270 degrees\n        \"v\": lambda x: bboxes_vflip(x),  # Vertical flip\n        \"hvt\": lambda x: bboxes_transpose(\n            bboxes_rot90(x, 2),\n        ),  # Reflect over anti-diagonal\n        \"h\": lambda x: bboxes_hflip(x),  # Horizontal flip\n        \"t\": lambda x: bboxes_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](bboxes)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_grid_shuffle","title":"def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

Type Description np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n>>> tiles = np.array([\n...     [0, 0, 50, 50],    # top-left tile\n...     [0, 50, 50, 100],  # top-right tile\n...     [50, 0, 100, 50],  # bottom-left tile\n...     [50, 50, 100, 100] # bottom-right tile\n... ])\n>>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n>>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n>>> # Result may contain multiple boxes if the original box was split\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_grid_shuffle(\n    bboxes: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int],\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Apply grid shuffle transformation to bounding boxes.\n\n    This function transforms bounding boxes according to a grid shuffle operation. It handles cases\n    where bounding boxes may be split into multiple components after shuffling and applies\n    filtering based on minimum area and visibility requirements.\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4+) where N is the number of boxes.\n               Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents\n               optional additional fields (e.g., class_id, score).\n        tiles: Array of tile coordinates with shape (M, 4) where M is the number of tiles.\n               Each tile is in format [start_y, start_x, end_y, end_x].\n        mapping: List of indices defining how tiles should be rearranged. Each index i in the list\n                contains the index of the tile that should be moved to position i.\n        image_shape: Shape of the image as (height, width).\n        min_area: Minimum area threshold in pixels. If a component's area after shuffling is\n                 smaller than this value, it will be filtered out. If None, no area filtering\n                 is applied.\n        min_visibility: Minimum visibility ratio threshold in range [0, 1]. Calculated as\n                       (component_area / original_area). If a component's visibility is lower\n                       than this value, it will be filtered out. If None, no visibility\n                       filtering is applied.\n\n    Returns:\n        np.ndarray: Array of transformed bounding boxes with shape (K, 4+) where K is the\n                   number of valid components after shuffling and filtering. The format of\n                   each box matches the input format, preserving any additional fields.\n                   If no valid components remain after filtering, returns an empty array\n                   with shape (0, C) where C matches the input column count.\n\n    Note:\n        - The function converts bboxes to masks before applying the transformation to handle\n          cases where boxes may be split into multiple components.\n        - After shuffling, each component is validated against min_area and min_visibility\n          requirements independently.\n        - Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and\n          copied to all components derived from the same original bbox.\n        - Empty input arrays are handled gracefully and return empty arrays of the\n          appropriate shape.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n        >>> tiles = np.array([\n        ...     [0, 0, 50, 50],    # top-left tile\n        ...     [0, 50, 50, 100],  # top-right tile\n        ...     [50, 0, 100, 50],  # bottom-left tile\n        ...     [50, 50, 100, 100] # bottom-right tile\n        ... 
])\n        >>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n        >>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n        >>> # Result may contain multiple boxes if the original box was split\n    \"\"\"\n    # Convert bboxes to masks\n    masks = masks_from_bboxes(bboxes, image_shape)\n\n    # Apply grid shuffle to each mask and handle split components\n    all_component_masks = []\n    extra_bbox_data = []  # Store additional bbox data for each component\n\n    for idx, mask in enumerate(masks):\n        original_area = np.sum(mask)  # Get original mask area\n\n        # Shuffle the mask\n        shuffled_mask = swap_tiles_on_image(mask, tiles, mapping)\n\n        # Find connected components\n        num_components, components = cv2.connectedComponents(\n            shuffled_mask.astype(np.uint8),\n        )\n\n        # For each component, create a separate binary mask\n        for comp_idx in range(1, num_components):  # Skip background (0)\n            component_mask = (components == comp_idx).astype(np.uint8)\n\n            # Calculate area and visibility ratio\n            component_area = np.sum(component_mask)\n            # Check if component meets minimum requirements\n            if is_valid_component(\n                component_area,\n                original_area,\n                min_area,\n                min_visibility,\n            ):\n                all_component_masks.append(component_mask)\n                # Append additional bbox data for this component\n                if bboxes.shape[1] > NUM_BBOXES_COLUMNS_IN_ALBUMENTATIONS:\n                    extra_bbox_data.append(bboxes[idx, 4:])\n\n    # Convert all component masks to bboxes\n    if all_component_masks:\n        all_component_masks = np.array(all_component_masks)\n        shuffled_bboxes = bboxes_from_masks(all_component_masks)\n\n        # Add back additional bbox data if present\n        if extra_bbox_data:\n            extra_bbox_data = np.array(extra_bbox_data)\n            return np.column_stack([shuffled_bboxes, extra_bbox_data])\n    else:\n        # Handle case where no valid components were found\n        return np.zeros((0, bboxes.shape[1]), dtype=bboxes.dtype)\n\n    return shuffled_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_hflip","title":"def bboxes_hflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_hflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally around the y-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of horizontally flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 0] = 1 - bboxes[:, 2]  # new x_min = 1 - x_max\n    flipped_bboxes[:, 2] = 1 - bboxes[:, 0]  # new x_max = 1 - x_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_rot90","title":"def bboxes_rot90 (bboxes, factor) [view source on GitHub]","text":"

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor int

Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.

Returns:

Type Description np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.

Exceptions:

Type Description ValueError

If factor is not in set {0, 1, 2, 3}.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_rot90(bboxes: np.ndarray, factor: int) -> np.ndarray:\n    \"\"\"Rotates bounding boxes by 90 degrees CCW (see np.rot90)\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n        factor: Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.\n\n    Returns:\n        np.ndarray: A numpy array of rotated bounding boxes with the same shape as input.\n\n    Raises:\n        ValueError: If factor is not in set {0, 1, 2, 3}.\n    \"\"\"\n    if factor not in {0, 1, 2, 3}:\n        raise ValueError(\"Parameter factor must be in set {0, 1, 2, 3}\")\n\n    if factor == 0:\n        return bboxes\n\n    rotated_bboxes = bboxes.copy()\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    if factor == 1:\n        rotated_bboxes[:, 0] = y_min\n        rotated_bboxes[:, 1] = 1 - x_max\n        rotated_bboxes[:, 2] = y_max\n        rotated_bboxes[:, 3] = 1 - x_min\n    elif factor == ROT90_180_FACTOR:\n        rotated_bboxes[:, 0] = 1 - x_max\n        rotated_bboxes[:, 1] = 1 - y_max\n        rotated_bboxes[:, 2] = 1 - x_min\n        rotated_bboxes[:, 3] = 1 - y_min\n    elif factor == ROT90_270_FACTOR:\n        rotated_bboxes[:, 0] = 1 - y_max\n        rotated_bboxes[:, 1] = x_min\n        rotated_bboxes[:, 2] = 1 - y_min\n        rotated_bboxes[:, 3] = x_max\n\n    return rotated_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_transpose","title":"def bboxes_transpose (bboxes) [view source on GitHub]","text":"

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_transpose(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Transpose bounding boxes by swapping x and y coordinates.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of transposed bounding boxes with the same shape as input.\n    \"\"\"\n    transposed_bboxes = bboxes.copy()\n    transposed_bboxes[:, [0, 1, 2, 3]] = bboxes[:, [1, 0, 3, 2]]\n\n    return transposed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_vflip","title":"def bboxes_vflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_vflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes vertically around the x-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of vertically flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 1] = 1 - bboxes[:, 3]  # new y_min = 1 - y_max\n    flipped_bboxes[:, 3] = 1 - bboxes[:, 1]  # new y_max = 1 - y_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.calculate_affine_transform_padding","title":"def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]","text":"

Calculate the necessary padding for an affine transformation to avoid empty spaces.

Source code in albumentations/augmentations/geometric/functional.py Python
def calculate_affine_transform_padding(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate the necessary padding for an affine transformation to avoid empty spaces.\"\"\"\n    height, width = image_shape[:2]\n\n    # Check for identity transform\n    if is_identity_matrix(matrix):\n        return (0, 0, 0, 0)\n\n    # Original corners\n    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]])\n\n    # Transform corners\n    transformed_corners = apply_affine_to_points(corners, matrix)\n\n    # Ensure transformed_corners is 2D\n    transformed_corners = transformed_corners.reshape(-1, 2)\n\n    # Find box that includes both original and transformed corners\n    all_corners = np.vstack((corners, transformed_corners))\n    min_x, min_y = all_corners.min(axis=0)\n    max_x, max_y = all_corners.max(axis=0)\n\n    # Compute the inverse transform\n    inverse_matrix = np.linalg.inv(matrix)\n\n    # Apply inverse transform to all corners of the bounding box\n    bbox_corners = np.array(\n        [[min_x, min_y], [max_x, min_y], [max_x, max_y], [min_x, max_y]],\n    )\n    inverse_corners = apply_affine_to_points(bbox_corners, inverse_matrix).reshape(\n        -1,\n        2,\n    )\n\n    min_x, min_y = inverse_corners.min(axis=0)\n    max_x, max_y = inverse_corners.max(axis=0)\n\n    pad_left = max(0, math.ceil(0 - min_x))\n    pad_right = max(0, math.ceil(max_x - width))\n    pad_top = max(0, math.ceil(0 - min_y))\n    pad_bottom = max(0, math.ceil(max_y - height))\n\n    return pad_left, pad_right, pad_top, pad_bottom\n
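A worked sketch: a pure +30 px translation along x should require 30 px of left padding and nothing else (import path assumed from the source location above).

Python

import numpy as np

from albumentations.augmentations.geometric.functional import calculate_affine_transform_padding  # assumed import path

matrix = np.array([[1.0, 0.0, 30.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])

print(calculate_affine_transform_padding(matrix, (100, 100)))
# (30, 0, 0, 0) -> (pad_left, pad_right, pad_top, pad_bottom)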
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.center","title":"def center (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image. Used by images, masks and keypoints.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates if image. Used by images, masks and keypoints.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2 - 0.5, height / 2 - 0.5\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.center_bbox","title":"def center_bbox (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates for of image for bounding boxes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2, height / 2\n
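The half-pixel difference between the two centering conventions is easy to see side by side (import paths assumed from the source location above):

Python

from albumentations.augmentations.geometric.functional import center, center_bbox  # assumed import path

image_shape = (100, 200)  # (height, width)

print(center(image_shape))       # (99.5, 49.5)  - pixel-grid center used for images, masks and keypoints
print(center_bbox(image_shape))  # (100.0, 50.0) - center used for bounding boxes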
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.compute_tps_weights","title":"def compute_tps_weights (src_points, dst_points) [view source on GitHub]","text":"

Compute Thin Plate Spline weights.

Parameters:

Name Type Description src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into:

  1. Nonlinear part (controlled by kernel weights)
  2. Affine part (global scaling, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_tps_weights(\n    src_points: np.ndarray,\n    dst_points: np.ndarray,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute Thin Plate Spline weights.\n\n    Args:\n        src_points: Source control points with shape (num_points, 2)\n        dst_points: Destination control points with shape (num_points, 2)\n\n    Returns:\n        tuple of:\n        - nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)\n        - affine_weights: Weights for affine transformation (3, 2)\n            [constant term, x scale/shear, y scale/shear]\n\n    Note:\n        The TPS interpolation is decomposed into:\n        1. Nonlinear part (controlled by kernel weights)\n        2. Affine part (global scaling, rotation, translation)\n    \"\"\"\n    num_points = src_points.shape[0]\n\n    # Compute pairwise distances\n    distances = np.linalg.norm(src_points[:, None] - src_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    # Add small epsilon to avoid log(0)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Construct affine terms matrix [1, x, y]\n    affine_terms = np.ones((num_points, 3))\n    affine_terms[:, 1:] = src_points\n\n    # Build system matrix\n    system_matrix = np.zeros((num_points + 3, num_points + 3))\n    system_matrix[:num_points, :num_points] = kernel_matrix\n    system_matrix[:num_points, num_points:] = affine_terms\n    system_matrix[num_points:, :num_points] = affine_terms.T\n\n    # Right-hand side of the system\n    target_coords = np.zeros((num_points + 3, 2))\n    target_coords[:num_points] = dst_points\n\n    # Solve the system for both x and y coordinates\n    all_weights = np.linalg.solve(system_matrix, target_coords)\n\n    # Split weights into nonlinear and affine components\n    nonlinear_weights = all_weights[:num_points]\n    affine_weights = all_weights[num_points:]\n\n    return nonlinear_weights, affine_weights\n
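A hedged sanity-check sketch: mapping control points onto themselves should leave essentially no nonlinear deformation, with the affine part carrying the identity (import path assumed from the source location above).

Python

import numpy as np

from albumentations.augmentations.geometric.functional import compute_tps_weights  # assumed import path

# Four non-collinear control points mapped onto themselves
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src.copy()

nonlinear_weights, affine_weights = compute_tps_weights(src, dst)
print(nonlinear_weights.shape, affine_weights.shape)  # (4, 2) (3, 2)
print(np.allclose(nonlinear_weights, 0.0, atol=1e-6))  # True: no bending needed for an identity mapping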
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.compute_transformed_image_bounds","title":"def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]","text":"

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing:

  • min_coords: An array with the minimum x and y coordinates.
  • max_coords: An array with the maximum x and y coordinates.

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_transformed_image_bounds(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute the bounds of an image after applying an affine transformation.\n\n    Args:\n        matrix (np.ndarray): The 3x3 affine transformation matrix.\n        image_shape (Tuple[int, int]): The shape of the image as (height, width).\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing:\n            - min_coords: An array with the minimum x and y coordinates.\n            - max_coords: An array with the maximum x and y coordinates.\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Define the corners of the image\n    corners = np.array([[0, 0, 1], [width, 0, 1], [width, height, 1], [0, height, 1]])\n\n    # Transform the corners\n    transformed_corners = corners @ matrix.T\n    transformed_corners = transformed_corners[:, :2] / transformed_corners[:, 2:]\n\n    # Calculate the bounding box of the transformed corners\n    min_coords = np.floor(transformed_corners.min(axis=0)).astype(int)\n    max_coords = np.ceil(transformed_corners.max(axis=0)).astype(int)\n\n    return min_coords, max_coords\n
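A worked sketch with a translation-only matrix (import path assumed from the source location above):

Python

import numpy as np

from albumentations.augmentations.geometric.functional import compute_transformed_image_bounds  # assumed import path

# Shift the image content by (+10, +20)
matrix = np.array([[1.0, 0.0, 10.0],
                   [0.0, 1.0, 20.0],
                   [0.0, 0.0, 1.0]])

min_coords, max_coords = compute_transformed_image_bounds(matrix, (100, 100))
print(min_coords, max_coords)  # [10 20] [110 120]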
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_affine_transformation_matrix","title":"def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]","text":"

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description np.ndarray

The resulting 3x3 affine transformation matrix.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_affine_transformation_matrix(\n    translate: XYInt,\n    shear: XYFloat,\n    scale: XYFloat,\n    rotate: float,\n    shift: tuple[float, float],\n) -> np.ndarray:\n    \"\"\"Create an affine transformation matrix combining translation, shear, scale, and rotation.\n\n    Args:\n        translate (dict[str, float]): Translation in x and y directions.\n        shear (dict[str, float]): Shear in x and y directions (in degrees).\n        scale (dict[str, float]): Scale factors for x and y directions.\n        rotate (float): Rotation angle in degrees.\n        shift (tuple[float, float]): Shift to apply before and after transformations.\n\n    Returns:\n        np.ndarray: The resulting 3x3 affine transformation matrix.\n    \"\"\"\n    # Convert angles to radians\n    rotate_rad = np.deg2rad(rotate % 360)\n\n    shear_x_rad = np.deg2rad(shear[\"x\"])\n    shear_y_rad = np.deg2rad(shear[\"y\"])\n\n    # Create individual transformation matrices\n    # 1. Shift to top-left\n    m_shift_topleft = np.array([[1, 0, -shift[0]], [0, 1, -shift[1]], [0, 0, 1]])\n\n    # 2. Scale\n    m_scale = np.array([[scale[\"x\"], 0, 0], [0, scale[\"y\"], 0], [0, 0, 1]])\n\n    # 3. Rotation\n    m_rotate = np.array(\n        [\n            [np.cos(rotate_rad), np.sin(rotate_rad), 0],\n            [-np.sin(rotate_rad), np.cos(rotate_rad), 0],\n            [0, 0, 1],\n        ],\n    )\n\n    # 4. Shear\n    m_shear = np.array(\n        [[1, np.tan(shear_x_rad), 0], [np.tan(shear_y_rad), 1, 0], [0, 0, 1]],\n    )\n\n    # 5. Translation\n    m_translate = np.array([[1, 0, translate[\"x\"]], [0, 1, translate[\"y\"]], [0, 0, 1]])\n\n    # 6. Shift back to center\n    m_shift_center = np.array([[1, 0, shift[0]], [0, 1, shift[1]], [0, 0, 1]])\n\n    # Combine all transformations\n    # The order is important: transformations are applied from right to left\n    m = m_shift_center @ m_translate @ m_shear @ m_rotate @ m_scale @ m_shift_topleft\n\n    # Ensure the last row is exactly [0, 0, 1]\n    m[2] = [0, 0, 1]\n\n    return m\n
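A hedged sketch: with neutral parameters the composition should collapse to the identity matrix. The import path and the x/y dictionary format are assumed from the source above.

Python

import numpy as np

from albumentations.augmentations.geometric.functional import create_affine_transformation_matrix  # assumed import path

matrix = create_affine_transformation_matrix(
    translate={"x": 0, "y": 0},
    shear={"x": 0.0, "y": 0.0},
    scale={"x": 1.0, "y": 1.0},
    rotate=0.0,
    shift=(0.0, 0.0),
)

print(np.round(matrix, 6))
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]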
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_piecewise_affine_maps","title":"def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]","text":"

Create maps for piecewise affine transformation using OpenCV's remap function.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_piecewise_affine_maps(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    scale: float,\n    absolute_scale: bool,\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray | None, np.ndarray | None]:\n    \"\"\"Create maps for piecewise affine transformation using OpenCV's remap function.\"\"\"\n    height, width = image_shape[:2]\n    nb_rows, nb_cols = grid\n\n    # Input validation\n    if height <= 0 or width <= 0 or nb_rows <= 0 or nb_cols <= 0:\n        raise ValueError(\"Dimensions must be positive\")\n    if scale <= 0:\n        return None, None\n\n    # Create source points grid\n    y = np.linspace(0, height - 1, nb_rows, dtype=np.float32)\n    x = np.linspace(0, width - 1, nb_cols, dtype=np.float32)\n    xx_src, yy_src = np.meshgrid(x, y)\n\n    # Initialize destination maps at full resolution\n    map_x = np.zeros((height, width), dtype=np.float32)\n    map_y = np.zeros((height, width), dtype=np.float32)\n\n    # Generate jitter for control points\n    jitter_scale = scale / 3 if absolute_scale else scale * min(width, height) / 3\n\n    jitter = random_generator.normal(0, jitter_scale, (nb_rows, nb_cols, 2)).astype(\n        np.float32,\n    )\n\n    # Create control points with jitter\n    control_points = np.zeros((nb_rows * nb_cols, 4), dtype=np.float32)\n    for i in range(nb_rows):\n        for j in range(nb_cols):\n            idx = i * nb_cols + j\n            # Source points\n            control_points[idx, 0] = xx_src[i, j]\n            control_points[idx, 1] = yy_src[i, j]\n            # Destination points with jitter\n            control_points[idx, 2] = np.clip(\n                xx_src[i, j] + jitter[i, j, 1],\n                0,\n                width - 1,\n            )\n            control_points[idx, 3] = np.clip(\n                yy_src[i, j] + jitter[i, j, 0],\n                0,\n                height - 1,\n            )\n\n    # Create full resolution maps\n    for i in range(height):\n        for j in range(width):\n            # Find nearest control points and interpolate\n            dx = j - control_points[:, 0]\n            dy = i - control_points[:, 1]\n            dist = dx * dx + dy * dy\n            weights = 1 / (dist + 1e-8)\n            weights = weights / np.sum(weights)\n\n            map_x[i, j] = np.sum(weights * control_points[:, 2])\n            map_y[i, j] = np.sum(weights * control_points[:, 3])\n\n    # Ensure output is within bounds\n    map_x = np.clip(map_x, 0, width - 1, out=map_x)\n    map_y = np.clip(map_y, 0, height - 1, out=map_y)\n\n    return map_x, map_y\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_shape_groups","title":"def create_shape_groups (tiles) [view source on GitHub]","text":"

Groups tiles by their shape and stores the indices for each shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:\n    \"\"\"Groups tiles by their shape and stores the indices for each shape.\"\"\"\n    shape_groups = defaultdict(list)\n    for index, (start_y, start_x, end_y, end_x) in enumerate(tiles):\n        shape = (end_y - start_y, end_x - start_x)\n        shape_groups[shape].append(index)\n    return shape_groups\n
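A small illustrative sketch (tile coordinates are made up; the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> tiles = np.array([[0, 0, 32, 32], [0, 32, 32, 64], [32, 0, 96, 64]])\n>>> groups = fgeometric.create_shape_groups(tiles)\n>>> len(groups)\n2\n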
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.d4","title":"def d4 (img, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

Parameters:

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
    • 'e': Identity (no transformation).
    • 'r90': Rotate 90 degrees counterclockwise.
    • 'r180': Rotate 180 degrees.
    • 'r270': Rotate 270 degrees counterclockwise.
    • 'v': Vertical flip.
    • 'hvt': Transpose over the second diagonal (anti-diagonal).
    • 'h': Horizontal flip.
    • 't': Transpose (reflect over the main diagonal).

Returns:

  • np.ndarray: The transformed image array.

Exceptions:

  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to an image array.\n\n    This function manipulates an image using transformations such as rotations and flips,\n    corresponding to the `D_4` dihedral group symmetry operations.\n    Each transformation is identified by a unique group member code.\n\n    Parameters:\n    - img (np.ndarray): The input image array to transform.\n    - group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:\n      - 'e': Identity (no transformation).\n      - 'r90': Rotate 90 degrees counterclockwise.\n      - 'r180': Rotate 180 degrees.\n      - 'r270': Rotate 270 degrees counterclockwise.\n      - 'v': Vertical flip.\n      - 'hvt': Transpose over second diagonal\n      - 'h': Horizontal flip.\n      - 't': Transpose (reflect over the main diagonal).\n\n    Returns:\n    - np.ndarray: The transformed image array.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Rotating an image by 90 degrees:\n      `transformed_image = d4(original_image, 'r90')`\n    - Applying a horizontal flip to an image:\n      `transformed_image = d4(original_image, 'h')`\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: rot90(x, 3),  # Rotate 270 degrees\n        \"v\": vflip,  # Vertical flip\n        \"hvt\": lambda x: transpose(rot90(x, 2)),  # Reflect over anti-diagonal\n        \"h\": hflip,  # Horizontal flip\n        \"t\": transpose,  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](img)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.distort_image","title":"def distort_image (image, generated_mesh, interpolation) [view source on GitHub]","text":"

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

  • image (np.ndarray): The input image to be distorted. Can be a 2D grayscale image or a 3D color image.
  • generated_mesh (np.ndarray): A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.
  • interpolation (int): Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

  • np.ndarray: The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n>>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n>>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n>>> distorted.shape\n(100, 100, 3)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@preserve_channel_dim\ndef distort_image(\n    image: np.ndarray,\n    generated_mesh: np.ndarray,\n    interpolation: int,\n) -> np.ndarray:\n    \"\"\"Apply perspective distortion to an image based on a generated mesh.\n\n    This function applies a perspective transformation to each cell of the image defined by the\n    generated mesh. The distortion is applied using OpenCV's perspective transformation and\n    blending techniques.\n\n    Args:\n        image (np.ndarray): The input image to be distorted. Can be a 2D grayscale image or a\n                            3D color image.\n        generated_mesh (np.ndarray): A 2D array where each row represents a quadrilateral cell\n                                    as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4].\n                                    The first four values define the source rectangle, and the last eight values\n                                    define the destination quadrilateral.\n        interpolation (int): Interpolation method to be used in the perspective transformation.\n                             Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).\n\n    Returns:\n        np.ndarray: The distorted image with the same shape and dtype as the input image.\n\n    Note:\n        - The function preserves the channel dimension of the input image.\n        - Each cell of the generated mesh is transformed independently and then blended into the output image.\n        - The distortion is applied using perspective transformation, which allows for more complex\n          distortions compared to affine transformations.\n\n    Example:\n        >>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n        >>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n        >>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n        >>> distorted.shape\n        (100, 100, 3)\n    \"\"\"\n    distorted_image = np.zeros_like(image)\n\n    for mesh in generated_mesh:\n        # Extract source rectangle and destination quadrilateral\n        x1, y1, x2, y2 = mesh[:4]  # Source rectangle\n        dst_quad = mesh[4:].reshape(4, 2)  # Destination quadrilateral\n\n        # Convert source rectangle to quadrilateral\n        src_quad = np.array(\n            [\n                [x1, y1],  # Top-left\n                [x2, y1],  # Top-right\n                [x2, y2],  # Bottom-right\n                [x1, y2],  # Bottom-left\n            ],\n            dtype=np.float32,\n        )\n\n        # Calculate Perspective transformation matrix\n        perspective_mat = cv2.getPerspectiveTransform(src_quad, dst_quad)\n\n        # Apply Perspective transformation\n        warped = cv2.warpPerspective(\n            image,\n            perspective_mat,\n            (image.shape[1], image.shape[0]),\n            flags=interpolation,\n        )\n\n        # Create mask for the transformed region\n        mask = np.zeros(image.shape[:2], dtype=np.uint8)\n        cv2.fillConvexPoly(mask, np.int32(dst_quad), 255)\n\n        # Copy only the warped quadrilateral area to the output image\n        distorted_image = cv2.copyTo(warped, mask, distorted_image)\n\n    return distorted_image\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.find_keypoint","title":"def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]","text":"

Determine if a valid keypoint can be found at the given position.

Source code in albumentations/augmentations/geometric/functional.py Python
def find_keypoint(\n    position: tuple[int, int],\n    distance_map: np.ndarray,\n    threshold: float | None,\n    inverted: bool,\n) -> tuple[float, float] | None:\n    \"\"\"Determine if a valid keypoint can be found at the given position.\"\"\"\n    y, x = position\n    value = distance_map[y, x]\n    if not inverted and threshold is not None and value >= threshold:\n        return None\n    if inverted and threshold is not None and value <= threshold:\n        return None\n    return float(x), float(y)\n
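An illustrative sketch of the threshold check (values are made up; the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> distance_map = np.array([[0.0, 1.0], [2.0, 3.0]])\n>>> fgeometric.find_keypoint((1, 0), distance_map, threshold=2.5, inverted=False)\n(0.0, 1.0)\n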
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.flip_bboxes","title":"def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]","text":"

Flip bounding boxes horizontally and/or vertically.

Parameters:

  • bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].
  • flip_horizontal (bool): Whether to flip horizontally.
  • flip_vertical (bool): Whether to flip vertically.
  • image_shape (tuple[int, int]): Shape of the image as (height, width).

Returns:

  • np.ndarray: Flipped bounding boxes.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef flip_bboxes(\n    bboxes: np.ndarray,\n    flip_horizontal: bool = False,\n    flip_vertical: bool = False,\n    image_shape: tuple[int, int] = (0, 0),\n) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally and/or vertically.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where each row is\n            [x_min, y_min, x_max, y_max, ...].\n        flip_horizontal (bool): Whether to flip horizontally.\n        flip_vertical (bool): Whether to flip vertically.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Flipped bounding boxes.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    flipped_bboxes = bboxes.copy()\n    if flip_horizontal:\n        flipped_bboxes[:, [0, 2]] = cols - flipped_bboxes[:, [2, 0]]\n    if flip_vertical:\n        flipped_bboxes[:, [1, 3]] = rows - flipped_bboxes[:, [3, 1]]\n    return flipped_bboxes\n
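A minimal sketch with illustrative pixel coordinates (the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> bboxes = np.array([[10.0, 20.0, 40.0, 60.0]])\n>>> fgeometric.flip_bboxes(bboxes, flip_horizontal=True, image_shape=(100, 200)).tolist()\n[[160.0, 20.0, 190.0, 60.0]]\n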
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.from_distance_maps","title":"def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]","text":"

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

  • distance_maps (np.ndarray): A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.
  • inverted (bool): If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).
  • if_not_found_coords (Sequence[int] | dict[str, Any] | None): Coordinates to use for keypoints that are not found or fall outside the threshold. Can be:
    • None: Drop keypoints that are not found.
    • Sequence of two integers: Use these as (x, y) coordinates for not-found keypoints.
    • Dict with 'x' and 'y' keys: Use these values for not-found keypoints.
    Defaults to None.
  • threshold (float | None): A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

  • np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

  • ValueError: If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n>>> inverted = True\n>>> if_not_found_coords = [0, 0]\n>>> threshold = 0.5\n>>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n>>> print(keypoints.shape)\n(3, 2)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def from_distance_maps(\n    distance_maps: np.ndarray,\n    inverted: bool,\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None = None,\n    threshold: float | None = None,\n) -> np.ndarray:\n    \"\"\"Convert distance maps back to keypoints coordinates.\n\n    This function is the inverse of `to_distance_maps`. It takes distance maps generated for a set of keypoints\n    and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps,\n    and can handle cases where keypoints are not found or fall outside a specified threshold.\n\n    Args:\n        distance_maps (np.ndarray): A 3D numpy array of shape (height, width, nb_keypoints) containing\n            distance maps for each keypoint. Each channel represents the distance map for one keypoint.\n        inverted (bool): If True, treats the distance maps as inverted (where higher values indicate\n            closer proximity to keypoints). If False, treats them as regular distance maps (where lower\n            values indicate closer proximity).\n        if_not_found_coords (Sequence[int] | dict[str, Any] | None, optional): Coordinates to use for\n            keypoints that are not found or fall outside the threshold. Can be:\n            - None: Drop keypoints that are not found.\n            - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints.\n            - Dict with 'x' and 'y' keys: Use these values for not found keypoints.\n            Defaults to None.\n        threshold (float | None, optional): A threshold value to determine valid keypoints. For inverted\n            maps, values >= threshold are considered valid. For regular maps, values <= threshold are\n            considered valid. If None, all keypoints are considered valid. Defaults to None.\n\n    Returns:\n        np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates\n        of the reconstructed keypoints. 
If `drop_if_not_found` is True (derived from if_not_found_coords),\n        the output may have fewer rows than input keypoints.\n\n    Raises:\n        ValueError: If the input `distance_maps` is not a 3D array.\n\n    Notes:\n        - The function uses vectorized operations for improved performance, especially with large numbers of keypoints.\n        - When `threshold` is None, all keypoints are considered valid, and `if_not_found_coords` is not used.\n        - The function assumes that the input distance maps are properly normalized and scaled according to the\n          original image dimensions.\n\n    Example:\n        >>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n        >>> inverted = True\n        >>> if_not_found_coords = [0, 0]\n        >>> threshold = 0.5\n        >>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n        >>> print(keypoints.shape)\n        (3, 2)\n    \"\"\"\n    if distance_maps.ndim != NUM_MULTI_CHANNEL_DIMENSIONS:\n        msg = f\"Expected three-dimensional input, got {distance_maps.ndim} dimensions and shape {distance_maps.shape}.\"\n        raise ValueError(msg)\n    height, width, nb_keypoints = distance_maps.shape\n\n    drop_if_not_found, if_not_found_x, if_not_found_y = validate_if_not_found_coords(\n        if_not_found_coords,\n    )\n\n    # Find the indices of max/min values for all keypoints at once\n    if inverted:\n        hitidx_flat = np.argmax(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n    else:\n        hitidx_flat = np.argmin(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n\n    # Convert flat indices to 2D coordinates\n    hitidx_y, hitidx_x = np.unravel_index(hitidx_flat, (height, width))\n\n    # Create keypoints array\n    keypoints = np.column_stack((hitidx_x, hitidx_y)).astype(float)\n\n    if threshold is not None:\n        # Check threshold condition\n        if inverted:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] >= threshold\n        else:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] <= threshold\n\n        if not drop_if_not_found:\n            # Replace invalid keypoints with if_not_found_coords\n            keypoints[~valid_mask] = [if_not_found_x, if_not_found_y]\n        else:\n            # Keep only valid keypoints\n            return keypoints[valid_mask]\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_displacement_fields","title":"def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]","text":"

Generate displacement fields for elastic transform.

Parameters:

  • image_shape (tuple[int, int]): Shape of the image (height, width).
  • alpha (float): Scaling factor for displacement.
  • sigma (float): Standard deviation for Gaussian blur.
  • same_dxdy (bool): Whether to use the same displacement field for both directions.
  • kernel_size (tuple[int, int]): Size of the Gaussian blur kernel.
  • random_generator (np.random.Generator): NumPy random number generator.
  • noise_distribution (Literal['gaussian', 'uniform']): Type of noise distribution to use ("gaussian" or "uniform").

Returns:

  • tuple: (dx, dy) displacement fields.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_displacement_fields(\n    image_shape: tuple[int, int],\n    alpha: float,\n    sigma: float,\n    same_dxdy: bool,\n    kernel_size: tuple[int, int],\n    random_generator: np.random.Generator,\n    noise_distribution: Literal[\"gaussian\", \"uniform\"],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate displacement fields for elastic transform.\n\n    Args:\n        image_shape: Shape of the image (height, width)\n        alpha: Scaling factor for displacement\n        sigma: Standard deviation for Gaussian blur\n        same_dxdy: Whether to use same displacement field for both directions\n        kernel_size: Size of Gaussian blur kernel\n        random_generator: NumPy random number generator\n        noise_distribution: Type of noise distribution to use (\"gaussian\" or \"uniform\")\n\n    Returns:\n        tuple: (dx, dy) displacement fields\n    \"\"\"\n\n    def generate_noise_field() -> np.ndarray:\n        # Generate noise based on distribution type\n        if noise_distribution == \"gaussian\":\n            field = random_generator.standard_normal(size=image_shape[:2])\n        else:  # uniform\n            field = random_generator.uniform(low=-1, high=1, size=image_shape[:2])\n\n        # Common operations for both distributions\n        field = field.astype(np.float32)\n        cv2.GaussianBlur(field, kernel_size, sigma, dst=field)\n        return field * alpha\n\n    # Generate first displacement field\n    dx = generate_noise_field()\n\n    # Generate or copy second displacement field\n    dy = dx if same_dxdy else generate_noise_field()\n\n    return dx, dy\n
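A hedged usage sketch (parameter values are illustrative; the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> rng = np.random.default_rng(0)\n>>> dx, dy = fgeometric.generate_displacement_fields(\n...     image_shape=(64, 64), alpha=30.0, sigma=5.0, same_dxdy=False,\n...     kernel_size=(17, 17), random_generator=rng, noise_distribution=\"gaussian\",\n... )\n>>> dx.shape, dy.shape\n((64, 64), (64, 64))\n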
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_distorted_grid_polygons","title":"def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]","text":"

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

  • dimensions (np.ndarray): A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.
  • magnitude (int): Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].
  • random_generator (np.random.Generator): A random number generator.

Returns:

  • np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square represents a cell, and the X marks indicate the coordinates where displacement occurs:

        +--+--+--+--+
        |  |  |  |  |
        +--X--X--X--+
        |  |  |  |  |
        +--X--X--X--+
        |  |  |  |  |
        +--X--X--X--+
        |  |  |  |  |
        +--+--+--+--+
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n...                        [[0, 50, 50, 100], [50, 50, 100, 100]]])\n>>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n>>> distorted.shape\n(4, 8)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_distorted_grid_polygons(\n    dimensions: np.ndarray,\n    magnitude: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate distorted grid polygons based on input dimensions and magnitude.\n\n    This function creates a grid of polygons and applies random distortions to the internal vertices,\n    while keeping the boundary vertices fixed. The distortion is applied consistently across shared\n    vertices to avoid gaps or overlaps in the resulting grid.\n\n    Args:\n        dimensions (np.ndarray): A 3D array of shape (grid_height, grid_width, 4) where each element\n                                 is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.\n        magnitude (int): Maximum pixel-wise displacement for distortion. The actual displacement\n                         will be randomly chosen in the range [-magnitude, magnitude].\n        random_generator (np.random.Generator): A random number generator.\n\n    Returns:\n        np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon\n                    as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.\n\n    Note:\n        - Only internal grid points are distorted; boundary points remain fixed.\n        - The function ensures consistent distortion across shared vertices of adjacent cells.\n        - The distortion is applied to the following points of each internal cell:\n            * Bottom-right of the cell above and to the left\n            * Bottom-left of the cell above\n            * Top-right of the cell to the left\n            * Top-left of the current cell\n        - Each square represents a cell, and the X marks indicate the coordinates where displacement occurs.\n            +--+--+--+--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--+--+--+--+\n        - For each X, the coordinates of the left, right, top, and bottom edges\n          in the four adjacent cells are displaced.\n\n    Example:\n        >>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n        ...                        
[[0, 50, 50, 100], [50, 50, 100, 100]]])\n        >>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n        >>> distorted.shape\n        (4, 8)\n    \"\"\"\n    grid_height, grid_width = dimensions.shape[:2]\n    total_cells = grid_height * grid_width\n\n    # Initialize polygons\n    polygons = np.zeros((total_cells, 8), dtype=np.float32)\n    polygons[:, 0:2] = dimensions.reshape(-1, 4)[:, [0, 1]]  # x1, y1\n    polygons[:, 2:4] = dimensions.reshape(-1, 4)[:, [2, 1]]  # x2, y1\n    polygons[:, 4:6] = dimensions.reshape(-1, 4)[:, [2, 3]]  # x2, y2\n    polygons[:, 6:8] = dimensions.reshape(-1, 4)[:, [0, 3]]  # x1, y2\n\n    # Generate displacements for internal grid points only\n    internal_points_height, internal_points_width = grid_height - 1, grid_width - 1\n    displacements = random_generator.integers(\n        -magnitude,\n        magnitude + 1,\n        size=(internal_points_height, internal_points_width, 2),\n    ).astype(np.float32)\n\n    # Apply displacements to internal polygon vertices\n    for i in range(1, grid_height):\n        for j in range(1, grid_width):\n            dx, dy = displacements[i - 1, j - 1]\n\n            # Bottom-right of cell (i-1, j-1)\n            polygons[(i - 1) * grid_width + (j - 1), 4:6] += [dx, dy]\n\n            # Bottom-left of cell (i-1, j)\n            polygons[(i - 1) * grid_width + j, 6:8] += [dx, dy]\n\n            # Top-right of cell (i, j-1)\n            polygons[i * grid_width + (j - 1), 2:4] += [dx, dy]\n\n            # Top-left of cell (i, j)\n            polygons[i * grid_width + j, 0:2] += [dx, dy]\n\n    return polygons\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_grid","title":"def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]","text":"

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

  • image_shape (tuple[int, int]): The shape of the image as (height, width).
  • steps_x (list[float]): List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.
  • steps_y (list[float]): List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.
  • num_steps (int): The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

  • tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays:
    • map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid.
    • map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)\n>>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n>>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n>>> num_steps = 5\n>>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n>>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_grid(\n    image_shape: tuple[int, int],\n    steps_x: list[float],\n    steps_y: list[float],\n    num_steps: int,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate a distorted grid for image transformation based on given step sizes.\n\n    This function creates two 2D arrays (map_x and map_y) that represent a distorted version\n    of the original image grid. These arrays can be used with OpenCV's remap function to\n    apply grid distortion to an image.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        steps_x (list[float]): List of step sizes for the x-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the x direction.\n        steps_y (list[float]): List of step sizes for the y-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the y direction.\n        num_steps (int): The number of steps to divide each axis into. This determines\n            the granularity of the distortion grid.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays:\n            - map_x: A 2D array of float32 values representing the x-coordinates\n              of the distorted grid.\n            - map_y: A 2D array of float32 values representing the y-coordinates\n              of the distorted grid.\n\n    Note:\n        - The function generates a grid where each cell can be distorted independently.\n        - The distortion is controlled by the steps_x and steps_y parameters, which\n          determine how much each grid line is shifted.\n        - The resulting map_x and map_y can be used directly with cv2.remap() to\n          apply the distortion to an image.\n        - The distortion is applied smoothly across each grid cell using linear\n          interpolation.\n\n    Example:\n        >>> image_shape = (100, 100)\n        >>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n        >>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n        >>> num_steps = 5\n        >>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n        >>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n    \"\"\"\n    height, width = image_shape[:2]\n    x_step = width // num_steps\n    xx = np.zeros(width, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_x):\n        x = idx * x_step\n        start = int(x)\n        end = min(int(x) + x_step, width)\n        cur = prev + x_step * step\n        xx[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    y_step = height // num_steps\n    yy = np.zeros(height, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_y):\n        y = idx * y_step\n        start = int(y)\n        end = min(int(y) + y_step, height)\n        cur = prev + y_step * step\n        yy[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    return np.meshgrid(xx, yy)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_reflected_bboxes","title":"def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

  • bboxes (np.ndarray): Original bounding boxes.
  • grid_dims (dict[str, tuple[int, int]]): Grid dimensions and original position.
  • image_shape (tuple[int, int]): Shape of the original image as (height, width).
  • center_in_origin (bool): If True, center the grid at the origin. Default is False.

Returns:

  • np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_bboxes(\n    bboxes: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected bounding boxes for the entire reflection grid.\n\n    Args:\n        bboxes (np.ndarray): Original bounding boxes.\n        grid_dims (dict[str, tuple[int, int]]): Grid dimensions and original position.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of bboxes\n    bboxes_hflipped = flip_bboxes(bboxes, flip_horizontal=True, image_shape=image_shape)\n    bboxes_vflipped = flip_bboxes(bboxes, flip_vertical=True, image_shape=image_shape)\n    bboxes_hvflipped = flip_bboxes(\n        bboxes,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [\n            original_col * cols,\n            original_row * rows,\n            original_col * cols,\n            original_row * rows,\n        ],\n    )\n    bboxes = shift_bboxes(bboxes, shift_vector)\n    bboxes_hflipped = shift_bboxes(bboxes_hflipped, shift_vector)\n    bboxes_vflipped = shift_bboxes(bboxes_vflipped, shift_vector)\n    bboxes_hvflipped = shift_bboxes(bboxes_hvflipped, shift_vector)\n\n    new_bboxes = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of bboxes to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes\n            elif (grid_row - original_row) % 2 == 0:\n                current_bboxes = bboxes_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes_vflipped\n            else:\n                current_bboxes = bboxes_hvflipped\n\n            # Shift to the current grid cell\n            cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                ],\n            )\n            shifted_bboxes = shift_bboxes(current_bboxes, cell_shift)\n\n            new_bboxes.append(shifted_bboxes)\n\n    result = np.vstack(new_bboxes)\n\n    return shift_bboxes(result, -shift_vector) if center_in_origin else result\n
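An illustrative sketch of the grid output shape (values are made up; the `fgeometric` import alias is assumed; grid_dims follows the format produced by get_pad_grid_dimensions below):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> bboxes = np.array([[10.0, 20.0, 40.0, 60.0]])\n>>> grid_dims = {\"grid_shape\": (3, 3), \"original_position\": (1, 1)}\n>>> fgeometric.generate_reflected_bboxes(bboxes, grid_dims, image_shape=(100, 200)).shape\n(9, 4)\n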
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_reflected_keypoints","title":"def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

  • keypoints (np.ndarray): Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).
  • grid_dims (dict[str, tuple[int, int]]): A dictionary containing grid dimensions and original position. It should have the following keys:
    • "grid_shape": tuple[int, int] representing (grid_rows, grid_cols)
    • "original_position": tuple[int, int] representing (original_row, original_col)
  • image_shape (tuple[int, int]): Shape of the original image as (height, width).
  • center_in_origin (bool): If True, center the grid at the origin. Default is False.

Returns:

  • np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_keypoints(\n    keypoints: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected keypoints for the entire reflection grid.\n\n    This function creates a grid of keypoints by reflecting and shifting the original keypoints.\n    It handles both centered and non-centered grids based on the `center_in_origin` parameter.\n\n    Args:\n        keypoints (np.ndarray): Original keypoints array of shape (N, 4+), where N is the number of keypoints,\n                                and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).\n        grid_dims (dict[str, tuple[int, int]]): A dictionary containing grid dimensions and original position.\n            It should have the following keys:\n            - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols)\n            - \"original_position\": tuple[int, int] representing (original_row, original_col)\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool, optional): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is\n                    (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.\n\n    Note:\n        - The function handles keypoint flipping and shifting to create a grid of reflected keypoints.\n        - It preserves the angle and scale information of the keypoints during transformations.\n        - The resulting grid can be either centered at the origin or positioned based on the original grid.\n    \"\"\"\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of keypoints\n    keypoints_hflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        image_shape=image_shape,\n    )\n    keypoints_vflipped = flip_keypoints(\n        keypoints,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n    keypoints_hvflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    rows, cols = image_shape[:2]\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [original_col * cols, original_row * rows, 0, 0],\n    )  # Only shift x and y\n    keypoints = shift_keypoints(keypoints, shift_vector)\n    keypoints_hflipped = shift_keypoints(keypoints_hflipped, shift_vector)\n    keypoints_vflipped = shift_keypoints(keypoints_vflipped, shift_vector)\n    keypoints_hvflipped = shift_keypoints(keypoints_hvflipped, shift_vector)\n\n    new_keypoints = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of keypoints to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints\n            elif (grid_row - original_row) % 2 == 0:\n                current_keypoints = keypoints_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints_vflipped\n            else:\n                current_keypoints = keypoints_hvflipped\n\n            # Shift to the current grid cell\n         
   cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    0,\n                    0,\n                ],\n            )\n            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)\n\n            new_keypoints.append(shifted_keypoints)\n\n    result = np.vstack(new_keypoints)\n\n    return shift_keypoints(result, -shift_vector) if center_in_origin else result\n
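An illustrative sketch (keypoint values are made up; the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> keypoints = np.array([[50.0, 40.0, 0.0, 1.0]])\n>>> grid_dims = {\"grid_shape\": (3, 3), \"original_position\": (1, 1)}\n>>> fgeometric.generate_reflected_keypoints(keypoints, grid_dims, image_shape=(100, 200)).shape\n(9, 4)\n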
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_shuffled_splits","title":"def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]","text":"

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

  • size (int): Total size of the dimension (height or width).
  • divisions (int): Number of divisions (rows or columns).
  • random_generator (np.random.Generator | None): The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

  • np.ndarray: Cumulative edges of the shuffled intervals.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_shuffled_splits(\n    size: int,\n    divisions: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate shuffled splits for a given dimension size and number of divisions.\n\n    Args:\n        size (int): Total size of the dimension (height or width).\n        divisions (int): Number of divisions (rows or columns).\n        random_generator (np.random.Generator | None): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: Cumulative edges of the shuffled intervals.\n    \"\"\"\n    intervals = almost_equal_intervals(size, divisions)\n    random_generator.shuffle(intervals)\n    return np.insert(np.cumsum(intervals), 0, 0)\n
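A minimal sketch (illustrative size and divisions; the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> rng = np.random.default_rng(0)\n>>> edges = fgeometric.generate_shuffled_splits(size=100, divisions=4, random_generator=rng)\n>>> len(edges), int(edges[0]), int(edges[-1])\n(5, 0, 100)\n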
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_camera_matrix_distortion_maps","title":"def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using camera matrix model.

Parameters:

  • image_shape (tuple[int, int]): Image shape.
  • k (float): Distortion coefficient.
  • center_xy (tuple[float, float]): Center of distortion.

Returns:

  • tuple of:
    • map_x: Horizontal displacement map.
    • map_y: Vertical displacement map.
Source code in albumentations/augmentations/geometric/functional.py Python
def get_camera_matrix_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using camera matrix model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n    camera_matrix = np.array(\n        [[width, 0, center_xy[0]], [0, height, center_xy[1]], [0, 0, 1]],\n        dtype=np.float32,\n    )\n    distortion = np.array([k, k, 0, 0, 0], dtype=np.float32)\n    return cv2.initUndistortRectifyMap(\n        camera_matrix,\n        distortion,\n        None,\n        None,\n        (width, height),\n        cv2.CV_32FC1,\n    )\n
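A hedged usage sketch showing how the maps plug into cv2.remap (values are illustrative; the `fgeometric` import alias is assumed):

Python
>>> import cv2\n>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(\n...     image_shape=(100, 200), k=0.05, center_xy=(100.0, 50.0),\n... )\n>>> image = np.zeros((100, 200, 3), dtype=np.uint8)\n>>> distorted = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n>>> distorted.shape\n(100, 200, 3)\n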
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_dimension_padding","title":"def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]","text":"

Calculate padding for a single dimension.

Parameters:

  • current_size (int): Current size of the dimension.
  • min_size (int | None): Minimum size requirement, if any.
  • divisor (int | None): Divisor for padding to make size divisible, if any.

Returns:

  • tuple[int, int]: (pad_before, pad_after)

Source code in albumentations/augmentations/geometric/functional.py Python
def get_dimension_padding(\n    current_size: int,\n    min_size: int | None,\n    divisor: int | None,\n) -> tuple[int, int]:\n    \"\"\"Calculate padding for a single dimension.\n\n    Args:\n        current_size: Current size of the dimension\n        min_size: Minimum size requirement, if any\n        divisor: Divisor for padding to make size divisible, if any\n\n    Returns:\n        tuple[int, int]: (pad_before, pad_after)\n    \"\"\"\n    if min_size is not None:\n        if current_size < min_size:\n            pad_before = int((min_size - current_size) / 2.0)\n            pad_after = min_size - current_size - pad_before\n            return pad_before, pad_after\n    elif divisor is not None:\n        remainder = current_size % divisor\n        if remainder > 0:\n            total_pad = divisor - remainder\n            pad_before = total_pad // 2\n            pad_after = total_pad - pad_before\n            return pad_before, pad_after\n\n    return 0, 0\n
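Two illustrative calls (values are made up; the `fgeometric` import alias is assumed):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> fgeometric.get_dimension_padding(70, min_size=75, divisor=None)\n(2, 3)\n>>> fgeometric.get_dimension_padding(70, min_size=None, divisor=32)\n(13, 13)\n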
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_fisheye_distortion_maps","title":"def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using fisheye model.

Parameters:

  • image_shape (tuple[int, int]): Image shape.
  • k (float): Distortion coefficient.
  • center_xy (tuple[float, float]): Center of distortion.

Returns:

  • tuple of:
    • map_x: Horizontal displacement map.
    • map_y: Vertical displacement map.
Source code in albumentations/augmentations/geometric/functional.py Python
def get_fisheye_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using fisheye model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n\n    center_x, center_y = center_xy\n\n    # Create coordinate grid\n    y, x = np.mgrid[:height, :width].astype(np.float32)\n\n    x = x - center_x\n    y = y - center_y\n\n    # Calculate polar coordinates\n    r = np.sqrt(x * x + y * y)\n    theta = np.arctan2(y, x)\n\n    # Normalize radius by the maximum possible radius to keep distortion in check\n    max_radius = math.sqrt(max(center_x, width - center_x) ** 2 + max(center_y, height - center_y) ** 2)\n    r_norm = r / max_radius\n\n    # Apply fisheye distortion to normalized radius\n    r_dist = r * (1 + k * r_norm * r_norm)\n\n    # Convert back to cartesian coordinates\n    map_x = r_dist * np.cos(theta) + center_x\n    map_y = r_dist * np.sin(theta) + center_y\n\n    return map_x, map_y\n
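A minimal sketch (illustrative values; the `fgeometric` import alias is assumed; the maps feed cv2.remap the same way as the camera-matrix variant above):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> map_x, map_y = fgeometric.get_fisheye_distortion_maps(\n...     image_shape=(100, 200), k=0.2, center_xy=(100.0, 50.0),\n... )\n>>> map_x.shape, map_y.shape\n((100, 200), (100, 200))\n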
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_pad_grid_dimensions","title":"def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]","text":"

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

  • pad_top (int): Number of pixels to pad above the image.
  • pad_bottom (int): Number of pixels to pad below the image.
  • pad_left (int): Number of pixels to pad to the left of the image.
  • pad_right (int): Number of pixels to pad to the right of the image.
  • image_shape (tuple[int, int]): Shape of the original image as (height, width).

Returns:

  • dict[str, tuple[int, int]]: A dictionary containing:
    • 'grid_shape': A tuple (grid_rows, grid_cols), where grid_rows is the number of times the image needs to be repeated vertically and grid_cols is the number of times it needs to be repeated horizontally.
    • 'original_position': A tuple (original_row, original_col) giving the row and column index of the original image in the grid.

Source code in albumentations/augmentations/geometric/functional.py Python
def get_pad_grid_dimensions(\n    pad_top: int,\n    pad_bottom: int,\n    pad_left: int,\n    pad_right: int,\n    image_shape: tuple[int, int],\n) -> dict[str, tuple[int, int]]:\n    \"\"\"Calculate the dimensions of the grid needed for reflection padding and the position of the original image.\n\n    Args:\n        pad_top (int): Number of pixels to pad above the image.\n        pad_bottom (int): Number of pixels to pad below the image.\n        pad_left (int): Number of pixels to pad to the left of the image.\n        pad_right (int): Number of pixels to pad to the right of the image.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n\n    Returns:\n        dict[str, tuple[int, int]]: A dictionary containing:\n            - 'grid_shape': A tuple (grid_rows, grid_cols) where:\n                - grid_rows (int): Number of times the image needs to be repeated vertically.\n                - grid_cols (int): Number of times the image needs to be repeated horizontally.\n            - 'original_position': A tuple (original_row, original_col) where:\n                - original_row (int): Row index of the original image in the grid.\n                - original_col (int): Column index of the original image in the grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    grid_rows = 1 + math.ceil(pad_top / rows) + math.ceil(pad_bottom / rows)\n    grid_cols = 1 + math.ceil(pad_left / cols) + math.ceil(pad_right / cols)\n    original_row = math.ceil(pad_top / rows)\n    original_col = math.ceil(pad_left / cols)\n\n    return {\n        \"grid_shape\": (grid_rows, grid_cols),\n        \"original_position\": (original_row, original_col),\n    }\n
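An illustrative call (padding values are made up; the `fgeometric` import alias is assumed):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> fgeometric.get_pad_grid_dimensions(\n...     pad_top=30, pad_bottom=30, pad_left=10, pad_right=10, image_shape=(100, 200),\n... )\n{'grid_shape': (3, 3), 'original_position': (1, 1)}\n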
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_padding_params","title":"def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]","text":"

Calculate padding parameters based on target dimensions.

Parameters:

  • image_shape (tuple[int, int]): (height, width) of the image.
  • min_height (int | None): Minimum height requirement, if any.
  • min_width (int | None): Minimum width requirement, if any.
  • pad_height_divisor (int | None): Divisor for height padding, if any.
  • pad_width_divisor (int | None): Divisor for width padding, if any.

Returns:

  • tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)

Source code in albumentations/augmentations/geometric/functional.py Python
def get_padding_params(\n    image_shape: tuple[int, int],\n    min_height: int | None,\n    min_width: int | None,\n    pad_height_divisor: int | None,\n    pad_width_divisor: int | None,\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate padding parameters based on target dimensions.\n\n    Args:\n        image_shape: (height, width) of the image\n        min_height: Minimum height requirement, if any\n        min_width: Minimum width requirement, if any\n        pad_height_divisor: Divisor for height padding, if any\n        pad_width_divisor: Divisor for width padding, if any\n\n    Returns:\n        tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    h_pad_top, h_pad_bottom = get_dimension_padding(\n        rows,\n        min_height,\n        pad_height_divisor,\n    )\n    w_pad_left, w_pad_right = get_dimension_padding(cols, min_width, pad_width_divisor)\n\n    return h_pad_top, h_pad_bottom, w_pad_left, w_pad_right\n
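An illustrative call combining a minimum height with a width divisor (values are made up; the `fgeometric` import alias is assumed):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> fgeometric.get_padding_params(\n...     image_shape=(100, 200), min_height=128, min_width=None,\n...     pad_height_divisor=None, pad_width_divisor=64,\n... )\n(14, 14, 28, 28)\n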
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.is_identity_matrix","title":"def is_identity_matrix (matrix) [view source on GitHub]","text":"

Check if the given matrix is an identity matrix.

Parameters:

  • matrix (np.ndarray): A 3x3 affine transformation matrix.

Returns:

  • bool: True if the matrix is an identity matrix, False otherwise.

Source code in albumentations/augmentations/geometric/functional.py Python
def is_identity_matrix(matrix: np.ndarray) -> bool:\n    \"\"\"Check if the given matrix is an identity matrix.\n\n    Args:\n        matrix (np.ndarray): A 3x3 affine transformation matrix.\n\n    Returns:\n        bool: True if the matrix is an identity matrix, False otherwise.\n    \"\"\"\n    return np.allclose(matrix, np.eye(3, dtype=matrix.dtype))\n
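A minimal sketch (the `fgeometric` import alias is assumed):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> fgeometric.is_identity_matrix(np.eye(3))\nTrue\n>>> fgeometric.is_identity_matrix(np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))\nFalse\n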
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.is_valid_component","title":"def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]","text":"

Validate if a component meets the minimum requirements.

Source code in albumentations/augmentations/geometric/functional.py Python
def is_valid_component(\n    component_area: float,\n    original_area: float,\n    min_area: float | None,\n    min_visibility: float | None,\n) -> bool:\n    \"\"\"Validate if a component meets the minimum requirements.\"\"\"\n    visibility = component_area / original_area\n    return (min_area is None or component_area >= min_area) and (min_visibility is None or visibility >= min_visibility)\n
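An illustrative sketch (area values are made up; the `fgeometric` import alias is assumed):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric\n>>> fgeometric.is_valid_component(component_area=150.0, original_area=400.0, min_area=100.0, min_visibility=0.3)\nTrue\n>>> fgeometric.is_valid_component(component_area=150.0, original_area=400.0, min_area=None, min_visibility=0.5)\nFalse\n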
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_affine","title":"def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]","text":"

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

  • keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...].
  • matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.
  • image_shape (tuple[int, int]): Shape of the image (height, width).
  • scale (dict[str, float]): Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.
  • border_mode (int): Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

  • np.ndarray: Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2π] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 1]])\n>>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n>>> scale = {'x': 1.5, 'y': 1.2}\n>>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_affine(\n    keypoints: np.ndarray,\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n    scale: XYFloat,\n    border_mode: int,\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to keypoints.\n\n    This function transforms keypoints using the given affine transformation matrix.\n    It handles reflection padding if necessary, updates coordinates, angles, and scales.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints.\n                                Each keypoint is represented as [x, y, angle, scale, ...].\n        matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        scale (dict[str, float]): Dictionary containing scale factors for x and y directions.\n                                  Expected keys are 'x' and 'y'.\n        border_mode (int): Border mode for handling keypoints near image edges.\n                            Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.\n\n    Returns:\n        np.ndarray: Transformed keypoints array with the same shape as input.\n\n    Notes:\n        - The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.\n        - Coordinates (x, y) are transformed using the affine matrix.\n        - Angles are adjusted based on the rotation component of the affine transformation.\n        - Scales are multiplied by the maximum of x and y scale factors.\n        - The @angle_2pi_range decorator ensures angles remain in the [0, 2\u03c0] range.\n\n    Example:\n        >>> keypoints = np.array([[100, 100, 0, 1]])\n        >>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n        >>> scale = {'x': 1.5, 'y': 1.2}\n        >>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    if is_identity_matrix(matrix):\n        return keypoints\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        keypoints = generate_reflected_keypoints(\n            keypoints,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Extract x, y coordinates\n    xy = keypoints[:, :2]\n\n    # Ensure matrix is 2x3\n    if matrix.shape == (3, 3):\n        matrix = matrix[:2]\n\n    # Transform x, y coordinates\n    xy_transformed = cv2.transform(xy.reshape(-1, 1, 2), matrix).squeeze()\n\n    # Calculate angle adjustment\n    angle_adjustment = rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=False)\n\n    # Update angles\n    keypoints[:, 2] = keypoints[:, 2] + angle_adjustment\n\n    # Update scales\n    max_scale = max(scale[\"x\"], scale[\"y\"])\n\n    keypoints[:, 3] *= max_scale\n\n    # Update x, y coordinates\n    keypoints[:, :2] = xy_transformed\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_d4","title":"def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

group_member D4Type

A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.

image_shape tuple[int, int]

The shape of the image as (height, width).

params Any

Not used.

Returns:

Type Description np.ndarray

The transformed keypoints with the same shape as the input.

Exceptions:

Type Description ValueError

If an invalid group member is specified, indicating that the specified transformation does not exist.

Examples:

  • Rotating a keypoint by 90 degrees (CCW) in a 100x100 image: keypoints_d4(np.array([[50, 30, 0, 1]]), 'r90', (100, 100)) moves the keypoint from (50, 30) to (30, 49), following the keypoints_rot90 convention x' = y, y' = width - 1 - x.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_d4(\n    keypoints: np.ndarray,\n    group_member: D4Type,\n    image_shape: tuple[int, int],\n    **params: Any,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a keypoint.\n\n    This function adjusts a keypoint's coordinates according to the specified `D_4` group transformation,\n    which includes rotations and reflections suitable for image processing tasks. These transformations account\n    for the dimensions of the image to ensure the keypoint remains within its boundaries.\n\n    Parameters:\n    - keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n    -group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.\n    - image_shape (tuple[int, int]): The shape of the image.\n    - params (Any): Not used\n\n    Returns:\n    - KeypointInternalType: The transformed keypoint.\n\n    Raises:\n    - ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.\n\n    Examples:\n    - Rotating a keypoint by 90 degrees in a 100x100 image:\n      `keypoint_d4((50, 30), 'r90', 100, 100)`\n      This would move the keypoint from (50, 30) to (70, 50) assuming standard coordinate transformations.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: keypoints_rot90(x, 1, image_shape),  # Rotate 90 degrees\n        \"r180\": lambda x: keypoints_rot90(x, 2, image_shape),  # Rotate 180 degrees\n        \"r270\": lambda x: keypoints_rot90(x, 3, image_shape),  # Rotate 270 degrees\n        \"v\": lambda x: keypoints_vflip(x, rows),  # Vertical flip\n        \"hvt\": lambda x: keypoints_transpose(\n            keypoints_rot90(x, 2, image_shape),\n        ),  # Reflect over anti diagonal\n        \"h\": lambda x: keypoints_hflip(x, cols),  # Horizontal flip\n        \"t\": lambda x: keypoints_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](keypoints)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_hflip","title":"def keypoints_hflip (keypoints, cols) [view source on GitHub]","text":"

Flip keypoints horizontally around the y-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

cols int

Image width.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_hflip(keypoints: np.ndarray, cols: int) -> np.ndarray:\n    \"\"\"Flip keypoints horizontally around the y-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        cols: Image width.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip x-coordinates\n    flipped_keypoints[:, 0] = (cols - 1) - keypoints[:, 0]\n\n    # Adjust angles\n    flipped_keypoints[:, 2] = np.pi - keypoints[:, 2]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_rot90","title":"def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]","text":"

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

factor int

The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].

image_shape tuple[int, int]

The shape of the image (height, width).

Returns:

Type Description np.ndarray

The rotated keypoints with the same shape as the input.

Exceptions:

Type Description ValueError

If the factor is not in the set {0, 1, 2, 3}.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_rot90(\n    keypoints: np.ndarray,\n    factor: int,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        factor (int): The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n\n    Returns:\n        np.ndarray: The rotated keypoints with the same shape as the input.\n\n    Raises:\n        ValueError: If the factor is not in the set {0, 1, 2, 3}.\n    \"\"\"\n    if factor not in {0, 1, 2, 3}:\n        raise ValueError(\"Parameter factor must be in set {0, 1, 2, 3}\")\n\n    if factor == 0:\n        return keypoints\n\n    height, width = image_shape[:2]\n    rotated_keypoints = keypoints.copy().astype(np.float32)\n\n    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2]\n\n    if factor == 1:\n        rotated_keypoints[:, 0] = y\n        rotated_keypoints[:, 1] = width - 1 - x\n        rotated_keypoints[:, 2] = angle - np.pi / 2\n    elif factor == ROT90_180_FACTOR:\n        rotated_keypoints[:, 0] = width - 1 - x\n        rotated_keypoints[:, 1] = height - 1 - y\n        rotated_keypoints[:, 2] = angle - np.pi\n    elif factor == ROT90_270_FACTOR:\n        rotated_keypoints[:, 0] = height - 1 - y\n        rotated_keypoints[:, 1] = x\n        rotated_keypoints[:, 2] = angle + np.pi / 2\n\n    return rotated_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_scale","title":"def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]","text":"

Scales keypoints by scale_x and scale_y.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

scale_x float

Scale coefficient x-axis.

scale_y float

Scale coefficient y-axis.

Returns:

Type Description np.ndarray

A numpy array of scaled keypoints with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_scale(\n    keypoints: np.ndarray,\n    scale_x: float,\n    scale_y: float,\n) -> np.ndarray:\n    \"\"\"Scales keypoints by scale_x and scale_y.\n\n    Args:\n        keypoints: A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        scale_x: Scale coefficient x-axis.\n        scale_y: Scale coefficient y-axis.\n\n    Returns:\n        A numpy array of scaled keypoints with the same shape as input.\n    \"\"\"\n    # Extract x, y, angle, and scale\n    x, y, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n    )\n\n    # Scale x and y\n    x_scaled = x * scale_x\n    y_scaled = y * scale_y\n\n    # Scale the keypoint scale by the maximum of scale_x and scale_y\n    scale_scaled = scale * max(scale_x, scale_y)\n\n    # Create the output array\n    scaled_keypoints = np.column_stack([x_scaled, y_scaled, angle, scale_scaled])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],\n        )\n\n    return scaled_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_transpose","title":"def keypoints_transpose (keypoints) [view source on GitHub]","text":"

Transposes keypoints along the main diagonal.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

Returns:

Type Description np.ndarray

An array of transposed keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_transpose(keypoints: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes keypoints along the main diagonal.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n\n    Returns:\n        np.ndarray: An array of transposed keypoints with the same shape as the input.\n    \"\"\"\n    transposed_keypoints = keypoints.copy()\n\n    # Swap x and y coordinates\n    transposed_keypoints[:, [0, 1]] = keypoints[:, [1, 0]]\n\n    # Adjust angles to reflect the coordinate swap\n    angles = keypoints[:, 2]\n    transposed_keypoints[:, 2] = np.where(\n        angles <= np.pi,\n        np.pi / 2 - angles,\n        3 * np.pi / 2 - angles,\n    )\n\n    return transposed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_vflip","title":"def keypoints_vflip (keypoints, rows) [view source on GitHub]","text":"

Flip keypoints vertically around the x-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

rows int

Image height.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_vflip(keypoints: np.ndarray, rows: int) -> np.ndarray:\n    \"\"\"Flip keypoints vertically around the x-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        rows: Image height.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip y-coordinates\n    flipped_keypoints[:, 1] = (rows - 1) - keypoints[:, 1]\n\n    # Negate angles\n    flipped_keypoints[:, 2] = -keypoints[:, 2]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.perspective_bboxes","title":"def perspective_bboxes (bboxes, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Applies perspective transformation to bounding boxes.

This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged.

image_shape tuple[int, int]

The shape of the image (height, width).

matrix np.ndarray

The perspective transformation matrix.

max_width int

The maximum width of the output image.

max_height int

The maximum height of the output image.

keep_size bool

If True, maintains the original image size after transformation.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as input. The first 4 columns contain the transformed coordinates, and any additional columns are preserved from the input.

Note

  • This function modifies only the coordinate columns (first 4) of the input bounding boxes.
  • Any additional attributes (columns beyond the first 4) are kept unchanged.
  • The function handles denormalization and renormalization of coordinates internally.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n>>> image_shape = (100, 100)\n>>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n>>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef perspective_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Applies perspective transformation to bounding boxes.\n\n    This function transforms bounding boxes using the given perspective transformation matrix.\n    It handles bounding boxes with additional attributes beyond the standard coordinates.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (num_bboxes, 4+).\n                             Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n                             Additional columns beyond the first 4 are preserved unchanged.\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        matrix (np.ndarray): The perspective transformation matrix.\n        max_width (int): The maximum width of the output image.\n        max_height (int): The maximum height of the output image.\n        keep_size (bool): If True, maintains the original image size after transformation.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as input.\n                    The first 4 columns contain the transformed coordinates, and any\n                    additional columns are preserved from the input.\n\n    Note:\n        - This function modifies only the coordinate columns (first 4) of the input bounding boxes.\n        - Any additional attributes (columns beyond the first 4) are kept unchanged.\n        - The function handles denormalization and renormalization of coordinates internally.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n        >>> image_shape = (100, 100)\n        >>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n        >>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n    \"\"\"\n    height, width = image_shape[:2]\n    transformed_bboxes = bboxes.copy()\n    denormalized_coords = denormalize_bboxes(bboxes[:, :4], image_shape)\n\n    x_min, y_min, x_max, y_max = denormalized_coords.T\n    points = np.array(\n        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],\n    ).transpose(2, 0, 1)\n    points_reshaped = points.reshape(-1, 1, 2)\n\n    transformed_points = cv2.perspectiveTransform(\n        points_reshaped.astype(np.float32),\n        matrix,\n    )\n    transformed_points = transformed_points.reshape(-1, 4, 2)\n\n    new_coords = np.array(\n        [[np.min(box[:, 0]), np.min(box[:, 1]), np.max(box[:, 0]), np.max(box[:, 1])] for box in transformed_points],\n    )\n\n    if keep_size:\n        scale_x, scale_y = width / max_width, height / max_height\n        new_coords[:, [0, 2]] *= scale_x\n        new_coords[:, [1, 3]] *= scale_y\n        output_shape = image_shape\n    else:\n        output_shape = (max_height, max_width)\n\n    normalized_coords = normalize_bboxes(new_coords, output_shape)\n    transformed_bboxes[:, :4] = normalized_coords\n\n    return transformed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.rotation2d_matrix_to_euler_angles","title":"def rotation2d_matrix_to_euler_angles (matrix, y_up) [view source on GitHub]","text":"

Convert a 2D rotation matrix to a single Euler angle.

Parameters:

Name Type Description matrix np.ndarray

Rotation matrix.

y_up bool

Whether the Y axis points up (True) or down (False).

Returns:

Type Description float

The rotation angle in radians.

Source code in albumentations/augmentations/geometric/functional.py Python
def rotation2d_matrix_to_euler_angles(matrix: np.ndarray, y_up: bool) -> float:\n    \"\"\"Args:\n    matrix (np.ndarray): Rotation matrix\n    y_up (bool): is Y axis looks up or down\n\n    \"\"\"\n    if y_up:\n        return np.arctan2(matrix[1, 0], matrix[0, 0])\n    return np.arctan2(-matrix[1, 0], matrix[0, 0])\n
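
A minimal sketch with a 90-degree rotation matrix:

Python
>>> import numpy as np\n>>> angle = np.pi / 2\n>>> matrix = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])\n>>> angle_up = rotation2d_matrix_to_euler_angles(matrix, y_up=True)    # ~= +pi/2\n>>> angle_down = rotation2d_matrix_to_euler_angles(matrix, y_up=False)  # ~= -pi/2\n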
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.shift_bboxes","title":"def shift_bboxes (bboxes, shift_vector) [view source on GitHub]","text":"

Shift bounding boxes by a given vector.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].

shift_vector np.ndarray

Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y].

Returns:

Type Description np.ndarray

Shifted bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
def shift_bboxes(bboxes: np.ndarray, shift_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Shift bounding boxes by a given vector.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where n is the number of bboxes\n                             and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].\n        shift_vector (np.ndarray): Vector to shift the bounding boxes by, with shape (4,) for\n                                   [shift_x, shift_y, shift_x, shift_y].\n\n    Returns:\n        np.ndarray: Shifted bounding boxes with the same shape as input.\n    \"\"\"\n    # Create a copy of the input array to avoid modifying it in-place\n    shifted_bboxes = bboxes.copy()\n\n    # Add the shift vector to the first 4 columns\n    shifted_bboxes[:, :4] += shift_vector\n\n    return shifted_bboxes\n
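
A minimal sketch shifting boxes by (dx, dy) = (5, 10) while preserving an extra label column:

Python
>>> import numpy as np\n>>> bboxes = np.array([[10.0, 20.0, 30.0, 40.0, 1.0]])  # [x_min, y_min, x_max, y_max, label]\n>>> shifted = shift_bboxes(bboxes, np.array([5.0, 10.0, 5.0, 10.0]))  # -> [15, 30, 35, 50, 1]; the label column is untouched\n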
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.shuffle_tiles_within_shape_groups","title":"def shuffle_tiles_within_shape_groups (shape_groups, random_generator) [view source on GitHub]","text":"

Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.

Parameters:

Name Type Description shape_groups dict[tuple[int, int], list[int]]

Groups of tile indices categorized by shape.

random_generator np.random.Generator

The random generator to use for shuffling the indices. If None, a new random generator will be used.

Returns:

Type Description list[int]

A list where each index is mapped to the new index of the tile after shuffling.

Source code in albumentations/augmentations/geometric/functional.py Python
def shuffle_tiles_within_shape_groups(\n    shape_groups: dict[tuple[int, int], list[int]],\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Shuffles indices within each group of similar shapes and creates a list where each\n    index points to the index of the tile it should be mapped to.\n\n    Args:\n        shape_groups (dict[tuple[int, int], list[int]]): Groups of tile indices categorized by shape.\n        random_generator (np.random.Generator): The random generator to use for shuffling the indices.\n            If None, a new random generator will be used.\n\n    Returns:\n        list[int]: A list where each index is mapped to the new index of the tile after shuffling.\n    \"\"\"\n    # Initialize the output list with the same size as the total number of tiles, filled with -1\n    num_tiles = sum(len(indices) for indices in shape_groups.values())\n    mapping = [-1] * num_tiles\n\n    # Prepare the random number generator\n\n    for indices in shape_groups.values():\n        shuffled_indices = indices.copy()\n        random_generator.shuffle(shuffled_indices)\n\n        for old, new in zip(indices, shuffled_indices):\n            mapping[old] = new\n\n    return mapping\n
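
A minimal sketch; indices are only permuted within a shape group, so index 2 (the lone 50x60 tile) always maps to itself:

Python
>>> import numpy as np\n>>> shape_groups = {(50, 50): [0, 1, 3], (50, 60): [2]}\n>>> mapping = shuffle_tiles_within_shape_groups(shape_groups, np.random.default_rng(0))\n>>> mapping[2]\n2\n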
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.split_uniform_grid","title":"def split_uniform_grid (image_shape, grid, random_generator) [view source on GitHub]","text":"

Splits an image shape into a uniform grid specified by the grid dimensions.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns).

random_generator np.random.Generator

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description np.ndarray

An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).

Note

The function uses generate_shuffled_splits to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.

Source code in albumentations/augmentations/geometric/functional.py Python
def split_uniform_grid(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Splits an image shape into a uniform grid specified by the grid dimensions.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns).\n        random_generator (np.random.Generator): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).\n\n    Note:\n        The function uses `generate_shuffled_splits` to generate the splits for the height and width of the image.\n        The splits are then used to calculate the coordinates of the tiles.\n    \"\"\"\n    n_rows, n_cols = grid\n\n    height_splits = generate_shuffled_splits(\n        image_shape[0],\n        grid[0],\n        random_generator=random_generator,\n    )\n    width_splits = generate_shuffled_splits(\n        image_shape[1],\n        grid[1],\n        random_generator=random_generator,\n    )\n\n    # Calculate tiles coordinates\n    tiles = [\n        (height_splits[i], width_splits[j], height_splits[i + 1], width_splits[j + 1])\n        for i in range(n_rows)\n        for j in range(n_cols)\n    ]\n\n    return np.array(tiles, dtype=np.int16)\n
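
A minimal sketch splitting a 100x100 image into a 2x2 grid (the exact split positions depend on generate_shuffled_splits, which is not shown here):

Python
>>> import numpy as np\n>>> tiles = split_uniform_grid((100, 100), (2, 2), np.random.default_rng(0))\n>>> tiles.shape  # one (start_y, start_x, end_y, end_x) row per tile\n(4, 4)\n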
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.swap_tiles_on_image","title":"def swap_tiles_on_image (image, tiles, mapping=None) [view source on GitHub]","text":"

Swap tiles on the image according to the provided mapping.

Parameters:

Name Type Description image np.ndarray

Input image.

tiles np.ndarray

Array of tiles with each tile as [start_y, start_x, end_y, end_x].

mapping list[int] | None

list of new tile indices.

Returns:

Type Description np.ndarray

Output image with tiles swapped according to the random shuffle.

Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_image(\n    image: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int] | None = None,\n) -> np.ndarray:\n    \"\"\"Swap tiles on the image according to the new format.\n\n    Args:\n        image: Input image.\n        tiles: Array of tiles with each tile as [start_y, start_x, end_y, end_x].\n        mapping: list of new tile indices.\n\n    Returns:\n        np.ndarray: Output image with tiles swapped according to the random shuffle.\n    \"\"\"\n    # If no tiles are provided, return a copy of the original image\n    if tiles.size == 0 or mapping is None:\n        return image.copy()\n\n    # Create a copy of the image to retain original for reference\n    new_image = np.empty_like(image)\n    for num, new_index in enumerate(mapping):\n        start_y, start_x, end_y, end_x = tiles[new_index]\n        start_y_orig, start_x_orig, end_y_orig, end_x_orig = tiles[num]\n        # Assign the corresponding tile from the original image to the new image\n        new_image[start_y:end_y, start_x:end_x] = image[\n            start_y_orig:end_y_orig,\n            start_x_orig:end_x_orig,\n        ]\n\n    return new_image\n
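
A minimal sketch swapping the two halves of a small image:

Python
>>> import numpy as np\n>>> image = np.arange(16, dtype=np.uint8).reshape(4, 4)\n>>> tiles = np.array([[0, 0, 4, 2], [0, 2, 4, 4]])  # left half and right half\n>>> swapped = swap_tiles_on_image(image, tiles, mapping=[1, 0])  # the left-half region of the output now holds the former right half\n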
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.swap_tiles_on_keypoints","title":"def swap_tiles_on_keypoints (keypoints, tiles, mapping) [view source on GitHub]","text":"

Swap the positions of keypoints based on a tile mapping.

This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.

Parameters:

Name Type Description keypoints np.ndarray

A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

tiles np.ndarray

A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.

mapping np.ndarray

A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with.

Returns:

Type Description np.ndarray

A 2D numpy array of the same shape as the input keypoints, containing the new positions of the keypoints after the tile swap.

Exceptions:

Type Description RuntimeWarning

If any keypoint is not found within any tile.

Notes

  • Keypoints that do not fall within any tile will remain unchanged.
  • The function assumes that the tiles do not overlap and cover the entire image space.
Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_keypoints(\n    keypoints: np.ndarray,\n    tiles: np.ndarray,\n    mapping: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Swap the positions of keypoints based on a tile mapping.\n\n    This function takes a set of keypoints and repositions them according to a mapping of tile swaps.\n    Keypoints are moved from their original tiles to new positions in the swapped tiles.\n\n    Args:\n        keypoints (np.ndarray): A 2D numpy array of shape (N, 2) where N is the number of keypoints.\n                                Each row represents a keypoint's (x, y) coordinates.\n        tiles (np.ndarray): A 2D numpy array of shape (M, 4) where M is the number of tiles.\n                            Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.\n        mapping (np.ndarray): A 1D numpy array of shape (M,) where M is the number of tiles.\n                              Each element i contains the index of the tile that tile i should be swapped with.\n\n    Returns:\n        np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions\n                    of the keypoints after the tile swap.\n\n    Raises:\n        RuntimeWarning: If any keypoint is not found within any tile.\n\n    Notes:\n        - Keypoints that do not fall within any tile will remain unchanged.\n        - The function assumes that the tiles do not overlap and cover the entire image space.\n    \"\"\"\n    if not keypoints.size:\n        return keypoints\n\n    # Broadcast keypoints and tiles for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    start_y, start_x, end_y, end_x = tiles.T  # Each shape: (num_tiles,)\n\n    # Check if each keypoint is inside each tile\n    in_tile = (kp_y >= start_y) & (kp_y < end_y) & (kp_x >= start_x) & (kp_x < end_x)\n\n    # Find which tile each keypoint belongs to\n    tile_indices = np.argmax(in_tile, axis=1)\n\n    # Check if any keypoint is not in any tile\n    not_in_any_tile = ~np.any(in_tile, axis=1)\n    if np.any(not_in_any_tile):\n        warn(\n            \"Some keypoints are not in any tile. They will be returned unchanged. This is unexpected and should be \"\n            \"investigated.\",\n            RuntimeWarning,\n            stacklevel=2,\n        )\n\n    # Get the new tile indices\n    new_tile_indices = np.array(mapping)[tile_indices]\n\n    # Calculate the offsets\n    old_start_x = tiles[tile_indices, 1]\n    old_start_y = tiles[tile_indices, 0]\n    new_start_x = tiles[new_tile_indices, 1]\n    new_start_y = tiles[new_tile_indices, 0]\n\n    # Apply the transformation\n    new_keypoints = keypoints.copy()\n    new_keypoints[:, 0] = (keypoints[:, 0] - old_start_x) + new_start_x\n    new_keypoints[:, 1] = (keypoints[:, 1] - old_start_y) + new_start_y\n\n    # Keep original coordinates for keypoints not in any tile\n    new_keypoints[not_in_any_tile] = keypoints[not_in_any_tile]\n\n    return new_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.to_distance_maps","title":"def to_distance_maps (keypoints, image_shape, inverted=False) [view source on GitHub]","text":"

Generate a (H,W,N) array of distance maps for N keypoints.

The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.

This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

image_shape tuple[int, int]

tuple[int, int] shape of the image (height, width)

inverted bool

If True, inverted distance maps are returned where each distance value d is replaced by 1/(d+1), i.e. the distance maps have values in the range (0.0, 1.0] with 1.0 denoting exactly the position of the respective keypoint.

Returns:

Type Description np.ndarray

A float32 array of shape (H, W, N) containing N distance maps for N keypoints. Each location (y, x, n) in the array denotes the euclidean distance at (y, x) to the n-th keypoint. If inverted is True, the distance d is replaced by 1/(d+1). The height and width of the array match the height and width in image_shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def to_distance_maps(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    inverted: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate a ``(H,W,N)`` array of distance maps for ``N`` keypoints.\n\n    The ``n``-th distance map contains at every location ``(y, x)`` the\n    euclidean distance to the ``n``-th keypoint.\n\n    This function can be used as a helper when augmenting keypoints with a\n    method that only supports the augmentation of images.\n\n    Args:\n        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.\n                   Each row represents a keypoint's (x, y) coordinates.\n        image_shape: tuple[int, int] shape of the image (height, width)\n        inverted (bool): If ``True``, inverted distance maps are returned where each\n            distance value d is replaced by ``d/(d+1)``, i.e. the distance\n            maps have values in the range ``(0.0, 1.0]`` with ``1.0`` denoting\n            exactly the position of the respective keypoint.\n\n    Returns:\n        np.ndarray: A ``float32`` array of shape (H, W, N) containing ``N`` distance maps for ``N``\n            keypoints. Each location ``(y, x, n)`` in the array denotes the\n            euclidean distance at ``(y, x)`` to the ``n``-th keypoint.\n            If `inverted` is ``True``, the distance ``d`` is replaced\n            by ``d/(d+1)``. The height and width of the array match the\n            height and width in ``image_shape``.\n    \"\"\"\n    height, width = image_shape[:2]\n    if len(keypoints) == 0:\n        return np.zeros((height, width, 0), dtype=np.float32)\n\n    # Create coordinate grids\n    yy, xx = np.mgrid[:height, :width]\n\n    # Convert keypoints to numpy array\n    keypoints_array = np.array(keypoints)\n\n    # Compute distances for all keypoints at once\n    distances = np.sqrt(\n        (xx[..., np.newaxis] - keypoints_array[:, 0]) ** 2 + (yy[..., np.newaxis] - keypoints_array[:, 1]) ** 2,\n    )\n\n    if inverted:\n        return (1 / (distances + 1)).astype(np.float32)\n    return distances.astype(np.float32)\n
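
A minimal sketch producing one inverted distance map per keypoint:

Python
>>> import numpy as np\n>>> keypoints = np.array([[2.0, 3.0], [7.0, 1.0]])\n>>> maps = to_distance_maps(keypoints, (10, 10), inverted=True)\n>>> maps.shape  # (height, width, num_keypoints)\n(10, 10, 2)\n>>> float(maps[3, 2, 0])  # exactly 1.0 at the first keypoint (y=3, x=2)\n1.0\n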
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.tps_transform","title":"def tps_transform (target_points, control_points, nonlinear_weights, affine_weights) [view source on GitHub]","text":"

Apply Thin Plate Spline transformation to points.

Parameters:

Name Type Description target_points np.ndarray

Points to transform with shape (num_targets, 2)

control_points np.ndarray

Original control points with shape (num_controls, 2)

nonlinear_weights np.ndarray

TPS kernel weights with shape (num_controls, 2)

affine_weights np.ndarray

Affine transformation weights with shape (3, 2)

Returns:

Type Description np.ndarray

Transformed points with shape (num_targets, 2)

Note

The transformation combines:

  • Nonlinear warping based on distances to control points
  • Global affine transformation (scale, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def tps_transform(\n    target_points: np.ndarray,\n    control_points: np.ndarray,\n    nonlinear_weights: np.ndarray,\n    affine_weights: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply Thin Plate Spline transformation to points.\n\n    Args:\n        target_points: Points to transform with shape (num_targets, 2)\n        control_points: Original control points with shape (num_controls, 2)\n        nonlinear_weights: TPS kernel weights with shape (num_controls, 2)\n        affine_weights: Affine transformation weights with shape (3, 2)\n\n    Returns:\n        Transformed points with shape (num_targets, 2)\n\n    Note:\n        The transformation combines:\n        1. Nonlinear warping based on distances to control points\n        2. Global affine transformation (scale, rotation, translation)\n    \"\"\"\n    # Compute all pairwise distances at once: (num_targets, num_controls)\n    distances = np.linalg.norm(target_points[:, None] - control_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Prepare affine terms [1, x, y] for each point\n    affine_terms = np.c_[np.ones(len(target_points)), target_points]\n\n    # Combine nonlinear and affine transformations\n    return kernel_matrix @ nonlinear_weights + affine_terms @ affine_weights\n
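
A minimal sketch: with zero nonlinear weights and identity-like affine weights, the transform reduces to the identity mapping (the weights here are illustrative, not fitted):

Python
>>> import numpy as np\n>>> target_points = np.array([[0.0, 0.0], [5.0, 2.0]])\n>>> control_points = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])\n>>> nonlinear_weights = np.zeros((3, 2))\n>>> affine_weights = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # [1, x, y] @ this = [x, y]\n>>> tps_transform(target_points, control_points, nonlinear_weights, affine_weights)\narray([[0., 0.],\n       [5., 2.]])\n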
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.transpose","title":"def transpose (img) [view source on GitHub]","text":"

Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.

Parameters:

Name Type Description img np.ndarray

Input array.

Returns:

Type Description np.ndarray

Transposed array.

Source code in albumentations/augmentations/geometric/functional.py Python
def transpose(img: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes the first two dimensions of an array of any dimensionality.\n    Retains the order of any additional dimensions.\n\n    Args:\n        img (np.ndarray): Input array.\n\n    Returns:\n        np.ndarray: Transposed array.\n    \"\"\"\n    # Generate the new axes order\n    new_axes = list(range(img.ndim))\n    new_axes[0], new_axes[1] = 1, 0  # Swap the first two dimensions\n\n    # Transpose the array using the new axes order\n    return img.transpose(new_axes)\n
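
A minimal sketch on an HWC image:

Python
>>> import numpy as np\n>>> img = np.zeros((100, 200, 3), dtype=np.uint8)\n>>> transpose(img).shape  # height and width are swapped, channels stay last\n(200, 100, 3)\n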
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_bboxes","title":"def validate_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Validate bounding boxes and remove invalid ones.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid bounding boxes, potentially with fewer boxes than the input.

Examples:

Python
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n>>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n>>> print(valid_bboxes)\n[[10 20 30 40]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
def validate_bboxes(bboxes: np.ndarray, image_shape: Sequence[int]) -> np.ndarray:\n    \"\"\"Validate bounding boxes and remove invalid ones.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.\n\n    Example:\n        >>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n        >>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n        >>> print(valid_bboxes)\n        [[10 20 30 40]]\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    valid_indices = (x_max > 0) & (y_max > 0) & (x_min < cols) & (y_min < rows)\n\n    return bboxes[valid_indices]\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_if_not_found_coords","title":"def validate_if_not_found_coords (if_not_found_coords) [view source on GitHub]","text":"

Validate and process if_not_found_coords parameter.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_if_not_found_coords(\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None,\n) -> tuple[bool, float, float]:\n    \"\"\"Validate and process `if_not_found_coords` parameter.\"\"\"\n    if if_not_found_coords is None:\n        return True, -1, -1\n    if isinstance(if_not_found_coords, (tuple, list)):\n        if len(if_not_found_coords) != PAIR:\n            msg = \"Expected tuple/list 'if_not_found_coords' to contain exactly two entries.\"\n            raise ValueError(msg)\n        return False, if_not_found_coords[0], if_not_found_coords[1]\n    if isinstance(if_not_found_coords, dict):\n        return False, if_not_found_coords[\"x\"], if_not_found_coords[\"y\"]\n\n    msg = \"Expected if_not_found_coords to be None, tuple, list, or dict.\"\n    raise ValueError(msg)\n
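
A minimal sketch of the three accepted input forms:

Python
>>> validate_if_not_found_coords(None)\n(True, -1, -1)\n>>> validate_if_not_found_coords((10, 20))\n(False, 10, 20)\n>>> validate_if_not_found_coords({'x': 10, 'y': 20})\n(False, 10, 20)\n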
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_keypoints","title":"def validate_keypoints (keypoints, image_shape) [view source on GitHub]","text":"

Validate keypoints and remove those that fall outside the image boundaries.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid keypoints that fall within the image boundaries.

Note

This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Validate keypoints and remove those that fall outside the image boundaries.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, M) where N is the number of keypoints\n                                and M >= 2. The first two columns represent x and y coordinates.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid keypoints that fall within the image boundaries.\n\n    Note:\n        This function only checks the x and y coordinates (first two columns) of the keypoints.\n        Any additional columns (e.g., angle, scale) are preserved for valid keypoints.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x, y = keypoints[:, 0], keypoints[:, 1]\n\n    valid_indices = (x >= 0) & (x < cols) & (y >= 0) & (y < rows)\n\n    return keypoints[valid_indices]\n
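
A minimal sketch; the keypoint at x=120 lies outside a 100x100 image and is dropped:

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 1.0], [120.0, 20.0, 0.0, 1.0]])\n>>> valid = validate_keypoints(keypoints, (100, 100))  # only the first row remains\n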
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize","title":"resize","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.LongestMaxSize","title":"class LongestMaxSize [view source on GitHub]","text":"

Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

Name Type Description max_size int, Sequence[int]

Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.

max_size_hw tuple[int | None, int | None]

Maximum (height, width) constraints. Supports: - (height, width): Both dimensions must fit within these bounds - (height, None): Only height is constrained, width scales proportionally - (None, width): Only width is constrained, height scales proportionally If specified, max_size must be None. Default: None.

interpolation OpenCV flag

interpolation method. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the longest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / max(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = min(H_target/H, W_target/W)\n       This ensures both dimensions fit within the specified bounds.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> # Using max_size\n>>> transform1 = A.LongestMaxSize(max_size=1024)\n>>> # Input image (1500, 800) -> Output (1024, 546)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>> # Input (800, 1500) -> Output (546, 1024)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>>\n>>> # Common use case with padding\n>>> transform4 = A.Compose([\n...     A.LongestMaxSize(max_size=1024),\n...     A.PadIfNeeded(min_height=1024, min_width=1024),\n... ])\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class LongestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints,\n        keeping the aspect ratio.\n\n    Args:\n        max_size (int, Sequence[int], optional): Maximum size of the longest side after the transformation.\n            When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must fit within these bounds\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): interpolation method. Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the longest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / max(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = min(H_target/H, W_target/W)\n               This ensures both dimensions fit within the specified bounds.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> # Using max_size\n        >>> transform1 = A.LongestMaxSize(max_size=1024)\n        >>> # Input image (1500, 800) -> Output (1024, 546)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>> # Input (800, 1500) -> Output (546, 1024)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>>\n        >>> # Common use case with padding\n        >>> transform4 = A.Compose([\n        ...     A.LongestMaxSize(max_size=1024),\n        ...     
A.PadIfNeeded(min_height=1024, min_width=1024),\n        ... ])\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / max(img_h, img_w)\n        elif self.max_size_hw is not None:\n            # We know max_size_hw is not None here due to model validator\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on longest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = min(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.MaxSizeTransform","title":"class MaxSizeTransform (max_size=1024, max_size_hw=None, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that resize based on maximum size constraints.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class MaxSizeTransform(DualTransform):\n    \"\"\"Base class for transforms that resize based on maximum size constraints.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_size: int | list[int] | None\n        max_size_hw: tuple[int | None, int | None] | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def validate_size_parameters(self) -> Self:\n            if self.max_size is None and self.max_size_hw is None:\n                raise ValueError(\"Either max_size or max_size_hw must be specified\")\n            if self.max_size is not None and self.max_size_hw is not None:\n                raise ValueError(\"Only one of max_size or max_size_hw should be specified\")\n            return self\n\n    def __init__(\n        self,\n        max_size: int | Sequence[int] | None = 1024,\n        max_size_hw: tuple[int | None, int | None] | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_size = max_size\n        self.max_size_hw = max_size_hw\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = img.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(img, (new_height, new_width), interpolation=self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = mask.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(mask, (new_height, new_width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(images, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volume, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volumes, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, 
*args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(masks3d, *args, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_size\", \"max_size_hw\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.RandomScale","title":"class RandomScale (scale_limit=(-0.1, 0.1), interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly resize the input. Output image size is different from the input image size.

Parameters:

Name Type Description scale_limit float or tuple[float, float]

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The output image size is different from the input image size.
  • A single scale factor is sampled per call and applied to both width and height, so the aspect ratio is preserved.
  • Bounding box coordinates are scaled accordingly.
  • Keypoint coordinates are scaled accordingly.

Mathematical formulation: Let (W, H) be the original image dimensions and (W', H') be the output dimensions. The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]]. Then, W' = W * s and H' = H * s.
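To make the sampling concrete, here is a minimal plain-Python sketch of the formulation above (illustrative only, not the library implementation; the bias of 1 matches the scale_limit note):

Python
import random

# Sample a single scale factor s in [1 + low, 1 + high] and apply it to both sides.
scale_limit = (-0.1, 0.1)                 # value passed to RandomScale
s = 1.0 + random.uniform(*scale_limit)    # s falls in [0.9, 1.1]
H, W = 100, 150                           # example input size
new_h, new_w = round(H * s), round(W * s)
print(s, (new_h, new_w))                  # e.g. 1.04 -> (104, 156)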

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.RandomScale(scale_limit=0.1, p=1.0)
>>> result = transform(image=image)
>>> scaled_image = result['image']
# scaled_image will have dimensions in the range [90, 110] x [90, 110]
# (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)


Source code in albumentations/augmentations/geometric/resize.py Python
class RandomScale(DualTransform):\n    \"\"\"Randomly resize the input. Output image size is different from the input image size.\n\n    Args:\n        scale_limit (float or tuple[float, float]): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The output image size is different from the input image size.\n        - Scale factor is sampled independently per image side (width and height).\n        - Bounding box coordinates are scaled accordingly.\n        - Keypoint coordinates are scaled accordingly.\n\n    Mathematical formulation:\n        Let (W, H) be the original image dimensions and (W', H') be the output dimensions.\n        The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]].\n        Then, W' = W * s and H' = H * s.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> scaled_image = result['image']\n        # scaled_image will have dimensions in the range [90, 110] x [90, 110]\n        # (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_limit: ScaleFloatType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(cls, v: ScaleFloatType) -> tuple[float, float]:\n            return to_tuple(v, bias=1.0)\n\n    def __init__(\n        self,\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def get_params(self) -> dict[str, float]:\n        return {\"scale\": self.py_random.uniform(*self.scale_limit)}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(img, scale, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n  
      scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(mask, scale, self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"interpolation\": self.interpolation,\n            \"mask_interpolation\": self.mask_interpolation,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.Resize","title":"class Resize (height, width, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Resize the input to the given height and width.

Parameters:

Name Type Description height int

desired height of the output.

width int

desired width of the output.

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
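This class has no inline example above, so here is a short usage sketch (shapes and keypoint values are chosen purely for illustration): it resizes an image and mask to 256x256 and rescales a keypoint by the width and height ratios, as described in apply_to_keypoints.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 150, 3), dtype=np.uint8)
mask = np.zeros((100, 150), dtype=np.uint8)

# Compose is used so that keypoints are processed alongside the image and mask.
transform = A.Compose(
    [A.Resize(height=256, width=256, p=1.0)],
    keypoint_params=A.KeypointParams(format="xy"),
)
result = transform(image=image, mask=mask, keypoints=[(75, 50)])
print(result["image"].shape)   # (256, 256, 3)
print(result["mask"].shape)    # (256, 256)
print(result["keypoints"])     # x scaled by 256/150 = 128, y scaled by 256/100 = 128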


Source code in albumentations/augmentations/geometric/resize.py Python
class Resize(DualTransform):\n    \"\"\"Resize the input to the given height and width.\n\n    Args:\n        height (int): desired height of the output.\n        width (int): desired width of the output.\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: int = Field(ge=1)\n        width: int = Field(ge=1)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(img, (self.height, self.width), interpolation=self.interpolation)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(mask, (self.height, self.width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        height, width = params[\"shape\"][:2]\n        scale_x = self.width / width\n        scale_y = self.height / height\n        return fgeometric.keypoints_scale(keypoints, scale_x, scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.SmallestMaxSize","title":"class SmallestMaxSize [view source on GitHub]","text":"

Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

Name Type Description max_size int, list of int

Maximum size of smallest side of the image after the transformation. When using a list, max size will be randomly selected from the values in the list. Default: 1024.

max_size_hw tuple[int | None, int | None]

Maximum (height, width) constraints. Supports:
  • (height, width): Both dimensions must be at least these values
  • (height, None): Only height is constrained, width scales proportionally
  • (None, width): Only width is constrained, height scales proportionally
If specified, max_size must be None. Default: None.

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the smallest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:
  1. The scaling factor s is calculated as:
     s = max_size / min(W, H)
  2. The new dimensions (W', H') are:
     W' = W * s
     H' = H * s

When using max_size_hw=(H_target, W_target):
  1. For both dimensions specified:
     s = max(H_target/H, W_target/W)
     This ensures both dimensions are at least as large as specified.
  2. For height only (W_target=None):
     s = H_target/H
     Width will scale proportionally.
  3. For width only (H_target=None):
     s = W_target/W
     Height will scale proportionally.
  4. The new dimensions (W', H') are:
     W' = W * s
     H' = H * s
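A compact sketch of the scale computation described above (an illustrative helper, not library code); the two example calls reproduce the input/output sizes listed in the Examples section below.

Python
def smallest_max_size_scale(img_h, img_w, max_size=None, max_size_hw=None):
    # Mirror of the rules above: exactly one of max_size or max_size_hw is given.
    if max_size is not None:
        return max_size / min(img_h, img_w)
    max_h, max_w = max_size_hw
    if max_h is not None and max_w is not None:
        return max(max_h / img_h, max_w / img_w)
    return max_h / img_h if max_h is not None else max_w / img_w

print(smallest_max_size_scale(100, 150, max_size=120))           # 1.2  -> (120, 180)
print(smallest_max_size_scale(80, 160, max_size_hw=(100, 200)))  # 1.25 -> (100, 200)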

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> # Using max_size
>>> transform1 = A.SmallestMaxSize(max_size=120)
>>> # Input image (100, 150) -> Output (120, 180)
>>>
>>> # Using max_size_hw with both dimensions
>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))
>>> # Input (80, 160) -> Output (100, 200)
>>> # Input (160, 80) -> Output (400, 200)
>>>
>>> # Using max_size_hw with only height
>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))
>>> # Input (80, 160) -> Output (100, 200)


Source code in albumentations/augmentations/geometric/resize.py Python
class SmallestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints,\n    keeping the aspect ratio.\n\n    Args:\n        max_size (int, list of int, optional): Maximum size of smallest side of the image after the transformation.\n            When using a list, max size will be randomly selected from the values in the list. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must be at least these values\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the smallest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be larger than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / min(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = max(H_target/H, W_target/W)\n               This ensures both dimensions are at least as large as specified.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. 
The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Using max_size\n        >>> transform1 = A.SmallestMaxSize(max_size=120)\n        >>> # Input image (100, 150) -> Output (120, 180)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n        >>> # Input (80, 160) -> Output (100, 200)\n        >>> # Input (160, 80) -> Output (400, 200)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n        >>> # Input (80, 160) -> Output (100, 200)\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / min(img_h, img_w)\n        elif self.max_size_hw is not None:\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on smallest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = max(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate","title":"rotate","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.RandomRotate90","title":"class RandomRotate90 [view source on GitHub]","text":"

Randomly rotate the input by 90 degrees zero or more times.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
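A short usage sketch (array shapes are illustrative): the same randomly chosen factor is applied to the image and the mask, so they stay aligned.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 150, 3), dtype=np.uint8)
mask = np.zeros((100, 150), dtype=np.uint8)

transform = A.RandomRotate90(p=1.0)
result = transform(image=image, mask=mask)

# Depending on the sampled factor, the output is (100, 150, 3) or (150, 100, 3).
print(result["image"].shape, result["mask"].shape)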


Source code in albumentations/augmentations/geometric/rotate.py Python
class RandomRotate90(DualTransform):\n    \"\"\"Randomly rotate the input by 90 degrees zero or more times.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, factor: int, **params: Any) -> np.ndarray:\n        return fgeometric.rot90(img, factor)\n\n    def get_params(self) -> dict[str, int]:\n        # Random int in the range [0, 3]\n        return {\"factor\": self.py_random.randint(0, 3)}\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        factor: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_rot90(bboxes, factor)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        factor: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_rot90(keypoints, factor, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.Rotate","title":"class Rotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

Name Type Description limit float | tuple[float, float]

Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

rotate_method str

Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

crop_border bool

Whether to crop border after rotation. If True, the output image size might differ from the input. Default: False

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation angle is randomly selected for each execution within the range specified by 'limit'.
  • When 'crop_border' is False, the output image will have the same size as the input, potentially introducing black triangles in the corners.
  • When 'crop_border' is True, the output image is cropped to remove black triangles, which may result in a smaller image.
  • Bounding boxes are rotated and may change size or shape.
  • Keypoints are rotated around the center of the image.

Mathematical Details:
  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
     [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
  5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.
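To make step 4 concrete, here is a minimal NumPy sketch (not library code) that rotates a single point by θ around the image center:

Python
import numpy as np

theta = np.deg2rad(30)                    # example angle
cx, cy = 50.0, 50.0                       # center of a 100x100 image
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x, y = 80.0, 50.0                         # point to transform
x_new, y_new = R @ np.array([x - cx, y - cy]) + np.array([cx, cy])
print(round(x_new, 2), round(y_new, 2))   # 75.98 65.0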

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Rotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees


Source code in albumentations/augmentations/geometric/rotate.py Python
class Rotate(DualTransform):\n    \"\"\"Rotate the input by an angle selected randomly from the uniform distribution.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.\n            Default: 'largest_box'\n        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ\n            from the input. Default: False\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.\n        - When 'crop_border' is False, the output image will have the same size as the input, potentially\n          introducing black triangles in the corners.\n        - When 'crop_border' is True, the output image is cropped to remove black triangles, which may result\n          in a smaller image.\n        - Bounding boxes are rotated and may change size or shape.\n        - Keypoints are rotated around the center of the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [cos(\u03b8)  -sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [sin(\u03b8)   cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n        5. 
If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Rotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        crop_border: bool\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_value(self) -> Self:\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        crop_border: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.crop_border = crop_border\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            img,\n            matrix,\n            self.interpolation,\n            self.fill,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            mask,\n            matrix,\n            self.mask_interpolation,\n            self.fill_mask,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        x_min: 
int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_out = fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            image_shape,\n            self.border_mode,\n            image_shape,\n        )\n        if self.crop_border:\n            return fcrops.crop_bboxes_by_coords(\n                bboxes_out,\n                (x_min, y_min, x_max, y_max),\n                image_shape,\n            )\n        return bboxes_out\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints_out = fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"][:2],\n            scale={\"x\": 1, \"y\": 1},\n            border_mode=self.border_mode,\n        )\n        if self.crop_border:\n            return fcrops.crop_keypoints_by_coords(\n                keypoints_out,\n                (x_min, y_min, x_max, y_max),\n            )\n        return keypoints_out\n\n    @staticmethod\n    def _rotated_rect_with_max_area(\n        height: int,\n        width: int,\n        angle: float,\n    ) -> dict[str, int]:\n        \"\"\"Given a rectangle of size wxh that has been rotated by 'angle' (in\n        degrees), computes the width and height of the largest possible\n        axis-aligned rectangle (maximal area) within the rotated rectangle.\n\n        Reference:\n            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders\n        \"\"\"\n        angle = math.radians(angle)\n        width_is_longer = width >= height\n        side_long, side_short = (width, height) if width_is_longer else (height, width)\n\n        # since the solutions for angle, -angle and 180-angle are all the same,\n        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:\n        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))\n        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:\n            # half constrained case: two crop corners touch the longer side,\n            # the other two corners are on the mid-line parallel to the longer line\n            x = 0.5 * side_short\n            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)\n        else:\n            # fully constrained case: crop touches all 4 sides\n            cos_2a = cos_a * cos_a - sin_a * sin_a\n            wr, hr = (\n                (width * cos_a - height * sin_a) / cos_2a,\n                (height * cos_a - width * sin_a) / cos_2a,\n            )\n\n        return {\n            \"x_min\": max(0, int(width / 2 - wr / 2)),\n            \"x_max\": min(width, int(width / 2 + wr / 2)),\n            \"y_min\": max(0, int(height / 2 - hr / 2)),\n            \"y_max\": min(height, int(height / 2 + hr / 2)),\n        }\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        angle = self.py_random.uniform(*self.limit)\n\n        if self.crop_border:\n            height, width = params[\"shape\"][:2]\n            out_params = self._rotated_rect_with_max_area(height, width, angle)\n        else:\n            
out_params = {\"x_min\": -1, \"x_max\": -1, \"y_min\": -1, \"y_max\": -1}\n\n        center = fgeometric.center(params[\"shape\"][:2])\n        bbox_center = fgeometric.center_bbox(params[\"shape\"][:2])\n\n        translate: fgeometric.XYInt = {\"x\": 0, \"y\": 0}\n        shear: fgeometric.XYFloat = {\"x\": 0, \"y\": 0}\n        scale: fgeometric.XYFloat = {\"x\": 1, \"y\": 1}\n        rotate = angle\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            center,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_center,\n        )\n        out_params[\"matrix\"] = matrix\n        out_params[\"bbox_matrix\"] = bbox_matrix\n\n        return out_params\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"crop_border\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.RotateInitSchema","title":"class RotateInitSchema ","text":"


Source code in albumentations/augmentations/geometric/rotate.py Python
class RotateInitSchema(BaseTransformInitSchema):\n    limit: SymmetricRangeType\n\n    interpolation: InterpolationType\n    mask_interpolation: InterpolationType\n\n    border_mode: BorderModeType\n\n    fill: ColorType | None\n    fill_mask: ColorType | None\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.SafeRotate","title":"class SafeRotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.

Parameters:

Name Type Description limit float | tuple[float, float]

Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

rotate_method Literal[\"largest_box\", \"ellipse\"]

Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation is performed around the center of the image.
  • After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
  • The output image will always have the same dimensions as the input image.
  • Bounding boxes and keypoints are transformed along with the image.

Mathematical Details:
  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
     s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
             height / (width * |sin(θ)| + height * |cos(θ)|))
  5. The combined transformation matrix T is:
     T = [s*cos(θ)  -s*sin(θ)  tx]
         [s*sin(θ)   s*cos(θ)  ty]
     where tx and ty are translation factors to keep the image centered.
  6. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [s*cos(θ)   s*sin(θ)] [x - cx]   [cx]
     [y'] = [-s*sin(θ)  s*cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
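A minimal sketch of step 4 (not library code): the scale factor that keeps the whole rotated image inside the original width x height frame.

Python
import math

def safe_rotate_scale(width, height, angle_deg):
    cos_a = abs(math.cos(math.radians(angle_deg)))
    sin_a = abs(math.sin(math.radians(angle_deg)))
    return min(width / (width * cos_a + height * sin_a),
               height / (width * sin_a + height * cos_a))

# A 45-degree rotation of a square image requires scaling by ~0.707
# so that the rotated corners stay inside the 100x100 frame.
print(round(safe_rotate_scale(100, 100, 45), 3))  # 0.707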

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.SafeRotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
# scaled to fit within the original 100x100 frame


Source code in albumentations/augmentations/geometric/rotate.py Python
class SafeRotate(Affine):\n    \"\"\"Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.\n\n    This transformation ensures that the entire rotated image fits within the original frame by scaling it\n    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the\n    rotation and scaling process.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied\n            for masks.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): Method to rotate bounding boxes.\n            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation is performed around the center of the image.\n        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.\n        - The output image will always have the same dimensions as the input image.\n        - Bounding boxes and keypoints are transformed along with the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:\n           s = min(width / (width * |cos(\u03b8)| + height * |sin(\u03b8)|),\n                   height / (width * |sin(\u03b8)| + height * |cos(\u03b8)|))\n        5. The combined transformation matrix T is:\n           T = [s*cos(\u03b8)  -s*sin(\u03b8)  tx]\n               [s*sin(\u03b8)   s*cos(\u03b8)  ty]\n           where tx and ty are translation factors to keep the image centered.\n        6. 
Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [s*cos(\u03b8)   s*sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [-s*sin(\u03b8)  s*cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.SafeRotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n        # scaled to fit within the original 100x100 frame\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            rotate=limit,\n            interpolation=interpolation,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            rotate_method=rotate_method,\n            fit_output=True,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.mask_interpolation = mask_interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"mask_interpolation\",\n        )\n\n    def _create_safe_rotate_matrix(\n        self,\n        angle: float,\n        center: tuple[float, float],\n        image_shape: tuple[int, int],\n    ) -> tuple[np.ndarray, dict[str, float]]:\n        height, width = image_shape[:2]\n        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)\n\n        # Calculate new image size\n        abs_cos = abs(rotation_mat[0, 0])\n        abs_sin = abs(rotation_mat[0, 1])\n        new_w = int(height * abs_sin + width * abs_cos)\n        new_h = int(height * abs_cos + width * abs_sin)\n\n        # Adjust the rotation matrix to take into account the new size\n        rotation_mat[0, 2] += new_w / 2 - center[0]\n        rotation_mat[1, 2] += new_h / 2 - center[1]\n\n        # Calculate scaling factors\n        scale_x = width / new_w\n        scale_y = height / new_h\n\n        # Create scaling matrix\n        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])\n\n        # Combine rotation and scaling\n        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])\n\n        return matrix, {\"x\": scale_x, \"y\": scale_y}\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        angle = self.py_random.uniform(*self.limit)\n\n        # Calculate centers for image and bbox\n        image_center = fgeometric.center(image_shape)\n        bbox_center = fgeometric.center_bbox(image_shape)\n\n        # Create matrices for image and bbox\n        matrix, scale = self._create_safe_rotate_matrix(\n            angle,\n            image_center,\n            image_shape,\n        )\n        bbox_matrix, _ = self._create_safe_rotate_matrix(\n            angle,\n            bbox_center,\n            image_shape,\n        )\n\n        return {\n            \"rotate\": angle,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": image_shape,\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms","title":"transforms","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Affine","title":"class Affine (scale=1, translate_percent=None, translate_px=None, rotate=0, shear=0, interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, fit_output=False, keep_ratio=False, rotate_method='largest_box', balanced_scale=False, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Augmentation to apply affine transformations to images.

Affine transformations involve:

- Translation (\"move\" image on the x-/y-axis)\n- Rotation\n- Scaling (\"zoom\" in/out)\n- Shear (move one side of the image, turning a square into a trapezoid)\n

All such transformations can create "new" pixels in the image without a defined content, e.g. if the image is translated to the left, pixels are created on the right. A method has to be defined to deal with these pixel values. The parameters fill and fill_mask of this class deal with this.

Some transformations involve interpolations between several pixels of the input image to generate output pixel values. The parameters interpolation and mask_interpolation deal with the interpolation method used for this.

Parameters:

Name Type Description scale number, tuple of number or dict

Scaling factor to use, where 1.0 denotes "no change" and 0.5 zooms out to 50 percent of the original size.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. The same range will be used for both the x- and y-axis. To keep the aspect ratio, set keep_ratio=True; the same value will then be used for both axes.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes; sampling then happens independently per axis, resulting in samples that differ between the axes. Note that when keep_ratio=True, the x- and y-axis ranges should be the same.

translate_percent None, number, tuple of number or dict

Translation as a fraction of the image height/width (x-translation, y-translation), where 0 denotes "no change" and 0.5 denotes "half of the axis size".
  • If None, then equivalent to 0.0 unless translate_px has a value other than None.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. That sampled fraction value will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes; sampling then happens independently per axis, resulting in samples that differ between the axes.

translate_px None, int, tuple of int or dict

Translation in pixels.
  • If None, then equivalent to 0 unless translate_percent has a value other than None.
  • If a single int, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the discrete interval [a..b]. That number will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes; sampling then happens independently per axis, resulting in samples that differ between the axes.

rotate number or tuple of number

Rotation in degrees (NOT radians), i.e. expected value range is around [-360, 360]. Rotation happens around the center of the image, not the top left corner as in some other frameworks.
  • If a number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b] and used as the rotation value.

shear number, tuple of number or dict

Shear in degrees (NOT radians), i.e. expected value range is around [-360, 360], with reasonable values being in the range [-45, 45].
  • If a number, then that value will be used for all images as the shear on the x-axis (no shear on the y-axis will be done).
  • If a tuple (a, b), then two values will be uniformly sampled per image from the interval [a, b] and used as the x- and y-shear values.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes; sampling then happens independently per axis, resulting in samples that differ between the axes.

interpolation int

OpenCV interpolation flag.

mask_interpolation int

OpenCV interpolation flag.

fill ColorType

The constant value to use when filling in newly created pixels. (E.g. translating by 1px to the right will create a new 1px-wide column of pixels on the left of the image). The value is only used when border_mode is cv2.BORDER_CONSTANT. The expected value range is [0, 255] for uint8 images.

fill_mask ColorType

Same as fill but only for masks.

border_mode int

OpenCV border flag.

fit_output bool

If True, the image plane size and position will be adjusted to tightly capture the whole image after affine transformation (translate_percent and translate_px are ignored). Otherwise (False), parts of the transformed image may end up outside the image plane. Fitting the output shape can be useful to avoid corners of the image being outside the image plane after applying rotations. Default: False

keep_ratio bool

When True, the original aspect ratio will be kept when the random scale is applied. Default: False.

rotate_method Literal[\"largest_box\", \"ellipse\"]

rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse" [1]. Default: "largest_box"

balanced_scale bool

When True, scaling factors are chosen to be either entirely below or above 1, ensuring balanced scaling. Default: False.

This is important because without it, scaling tends to lean towards upscaling. For example, if we want the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly from [0.5, 2]. With balanced_scale, the function ensures that half the time, the scaling factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in). This makes the zooming in and out process more balanced.
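A plain-Python sketch of the idea (illustrative, not the library's sampler): with a scale range of (0.5, 2.0), choose the zoom-out interval [0.5, 1] and the zoom-in interval [1, 2] with equal probability, so roughly half of the sampled factors are below 1 instead of only about a third.

Python
import random

def sample_balanced_scale(low, high):
    if random.random() < 0.5:
        return random.uniform(low, 1.0)   # zoom out
    return random.uniform(1.0, high)      # zoom in

samples = [sample_balanced_scale(0.5, 2.0) for _ in range(10_000)]
# Roughly 0.5 with balanced sampling; direct uniform sampling over [0.5, 2]
# would give about 0.33, because [0.5, 1] covers only a third of the interval.
print(sum(s < 1.0 for s in samples) / len(samples))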

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

[1] https://arxiv.org/abs/2109.13488
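A short usage sketch based on the signature above (parameter values are illustrative, not recommendations); per-axis dictionaries are used for scale, translation, and shear:

Python
import numpy as np
import albumentations as A
import cv2

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

transform = A.Affine(
    scale={"x": (0.9, 1.1), "y": (0.9, 1.1)},
    translate_percent={"x": (-0.05, 0.05), "y": (-0.05, 0.05)},
    rotate=(-15, 15),
    shear={"x": (-5, 5), "y": (-5, 5)},
    border_mode=cv2.BORDER_CONSTANT,
    fill=0,
    p=1.0,
)
result = transform(image=image)
print(result["image"].shape)  # (100, 100, 3): output keeps the input size unless fit_output=True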


Source code in albumentations/augmentations/geometric/transforms.py Python
class Affine(DualTransform):\n    \"\"\"Augmentation to apply affine transformations to images.\n\n    Affine transformations involve:\n\n        - Translation (\"move\" image on the x-/y-axis)\n        - Rotation\n        - Scaling (\"zoom\" in/out)\n        - Shear (move one side of the image, turning a square into a trapezoid)\n\n    All such transformations can create \"new\" pixels in the image without a defined content, e.g.\n    if the image is translated to the left, pixels are created on the right.\n    A method has to be defined to deal with these pixel values.\n    The parameters `fill` and `fill_mask` of this class deal with this.\n\n    Some transformations involve interpolations between several pixels\n    of the input image to generate output pixel values. The parameters `interpolation` and\n    `mask_interpolation` deals with the method of interpolation used for this.\n\n    Args:\n        scale (number, tuple of number or dict): Scaling factor to use, where ``1.0`` denotes \"no change\" and\n            ``0.5`` is zoomed out to ``50`` percent of the original size.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That the same range will be used for both x- and y-axis. To keep the aspect ratio, set\n                  ``keep_ratio=True``, then the same value will be used for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes. Note that when\n                  the ``keep_ratio=True``, the x- and y-axis ranges should be the same.\n        translate_percent (None, number, tuple of number or dict): Translation as a fraction of the image height/width\n            (x-translation, y-translation), where ``0`` denotes \"no change\"\n            and ``0.5`` denotes \"half of the axis size\".\n                * If ``None`` then equivalent to ``0.0`` unless `translate_px` has a value other than ``None``.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That sampled fraction value will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        translate_px (None, int, tuple of int or dict): Translation in pixels.\n                * If ``None`` then equivalent to ``0`` unless `translate_percent` has a value other than ``None``.\n                * If a single int, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from\n                  the discrete interval ``[a..b]``. 
That number will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        rotate (number or tuple of number): Rotation in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``. Rotation happens around the *center* of the image,\n            not the top left corner as in some other frameworks.\n                * If a number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``\n                  and used as the rotation value.\n        shear (number, tuple of number or dict): Shear in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``, with reasonable values being in the range of ``[-45, 45]``.\n                * If a number, then that value will be used for all images as\n                  the shear on the x-axis (no shear on the y-axis will be done).\n                * If a tuple ``(a, b)``, then two value will be uniformly sampled per image\n                  from the interval ``[a, b]`` and be used as the x- and y-shear value.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        interpolation (int): OpenCV interpolation flag.\n        mask_interpolation (int): OpenCV interpolation flag.\n        fill (ColorType): The constant value to use when filling in newly created pixels.\n            (E.g. translating by 1px to the right will create a new 1px-wide column of pixels\n            on the left of the image).\n            The value is only used when `mode=constant`. The expected value range is ``[0, 255]`` for ``uint8`` images.\n        fill_mask (ColorType): Same as fill but only for masks.\n        border_mode (int): OpenCV border flag.\n        fit_output (bool): If True, the image plane size and position will be adjusted to tightly capture\n            the whole image after affine transformation (`translate_percent` and `translate_px` are ignored).\n            Otherwise (``False``),  parts of the transformed image may end up outside the image plane.\n            Fitting the output shape can be useful to avoid corners of the image being outside the image plane\n            after applying rotations. Default: False\n        keep_ratio (bool): When True, the original aspect ratio will be kept when the random scale is applied.\n            Default: False.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): rotation method used for the bounding boxes.\n            Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"\n        balanced_scale (bool): When True, scaling factors are chosen to be either entirely below or above 1,\n            ensuring balanced scaling. 
Default: False.\n\n            This is important because without it, scaling tends to lean towards upscaling. For example, if we want\n            the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is\n            three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly\n            from [0.5, 2]. With `balanced_scale`, the  function ensures that half the time, the scaling\n            factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in).\n            This makes the zooming in and out process more balanced.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        [1] https://arxiv.org/abs/2109.13488\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: ScaleFloatType | fgeometric.XYFloatScale\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None\n        rotate: ScaleFloatType\n        shear: ScaleFloatType | fgeometric.XYFloatScale\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        cval: ColorType | None\n        cval_mask: ColorType | None\n        mode: BorderModeType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        fit_output: bool\n        keep_ratio: bool\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        balanced_scale: bool\n\n        @field_validator(\"shear\", \"scale\")\n        @classmethod\n        def process_shear(\n            cls,\n            value: ScaleFloatType | fgeometric.XYFloatScale,\n            info: ValidationInfo,\n        ) -> fgeometric.XYFloatDict:\n            return cast(\n                fgeometric.XYFloatDict,\n                cls._handle_dict_arg(value, info.field_name),\n            )\n\n        @field_validator(\"rotate\")\n        @classmethod\n        def process_rotate(\n            cls,\n            value: ScaleFloatType,\n        ) -> tuple[float, float]:\n            return to_tuple(value, value)\n\n        @model_validator(mode=\"after\")\n        def handle_translate(self) -> Self:\n            if self.translate_percent is None and self.translate_px is None:\n                self.translate_px = 0\n\n            if self.translate_percent is not None and self.translate_px is not None:\n                msg = \"Expected either translate_percent or translate_px to be provided, but both were provided.\"\n                raise ValueError(msg)\n\n            if self.translate_percent is not None:\n                self.translate_percent = self._handle_dict_arg(\n                    self.translate_percent,\n                    \"translate_percent\",\n                    default=0.0,\n                )  # type: ignore[assignment]\n\n            if self.translate_px is not None:\n                self.translate_px = self._handle_dict_arg(\n                    self.translate_px,\n                    \"translate_px\",\n                    default=0,\n                )  # type: ignore[assignment]\n\n            return self\n\n        @staticmethod\n        def _handle_dict_arg(\n            val: ScaleType | fgeometric.XYFloatScale | fgeometric.XYIntScale,\n            name: str | None,\n            
default: float = 1.0,\n        ) -> dict[str, Any]:\n            if isinstance(val, dict):\n                if \"x\" not in val and \"y\" not in val:\n                    raise ValueError(\n                        f'Expected {name} dictionary to contain at least key \"x\" or key \"y\". Found neither of them.',\n                    )\n                x = val.get(\"x\", default)\n                y = val.get(\"y\", default)\n                return {\"x\": to_tuple(x, x), \"y\": to_tuple(y, y)}  # type: ignore[arg-type]\n            return {\"x\": to_tuple(val, val), \"y\": to_tuple(val, val)}\n\n        @model_validator(mode=\"after\")\n        def validate_fill_types(self) -> Self:\n            if self.cval is not None:\n                self.fill = self.cval\n                warn(\"cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n            if self.cval_mask is not None:\n                self.fill_mask = self.cval_mask\n                warn(\"cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n            if self.mode is not None:\n                self.border_mode = self.mode\n                warn(\"mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType | fgeometric.XYFloatScale = 1,\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None = None,\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None = None,\n        rotate: ScaleFloatType = 0,\n        shear: ScaleFloatType | fgeometric.XYFloatScale = 0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: ColorType | None = None,\n        cval_mask: ColorType | None = None,\n        mode: int | None = None,\n        fit_output: bool = False,\n        keep_ratio: bool = False,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        balanced_scale: bool = False,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n        self.scale = cast(fgeometric.XYFloatDict, scale)\n        self.translate_percent = cast(fgeometric.XYFloatDict, translate_percent)\n        self.translate_px = cast(fgeometric.XYIntDict, translate_px)\n        self.rotate = cast(tuple[float, float], rotate)\n        self.fit_output = fit_output\n        self.shear = cast(fgeometric.XYFloatDict, shear)\n        self.keep_ratio = keep_ratio\n        self.rotate_method = rotate_method\n        self.balanced_scale = balanced_scale\n\n        if self.keep_ratio and self.scale[\"x\"] != self.scale[\"y\"]:\n            raise ValueError(\n                f\"When keep_ratio is True, the x and y scale range should be identical. 
got {self.scale}\",\n            )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"fill\",\n            \"border_mode\",\n            \"scale\",\n            \"translate_percent\",\n            \"translate_px\",\n            \"rotate\",\n            \"fit_output\",\n            \"shear\",\n            \"fill_mask\",\n            \"keep_ratio\",\n            \"rotate_method\",\n            \"balanced_scale\",\n        )\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            img,\n            matrix,\n            interpolation=self.interpolation,\n            fill=self.fill,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            mask,\n            matrix,\n            interpolation=self.mask_interpolation,\n            fill=self.fill_mask,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            params[\"shape\"][:2],\n            self.border_mode,\n            output_shape,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        scale: fgeometric.XYFloat,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"],\n            scale,\n            self.border_mode,\n        )\n\n    @staticmethod\n    def get_scale(\n        scale: fgeometric.XYFloatDict,\n        keep_ratio: bool,\n        balanced_scale: bool,\n        random_state: random.Random,\n    ) -> fgeometric.XYFloat:\n        result_scale = {}\n        for key, value in scale.items():\n            if isinstance(value, (int, float)):\n                result_scale[key] = float(value)\n            elif isinstance(value, tuple):\n                if balanced_scale:\n                    lower_interval = (value[0], 1.0) if value[0] < 1 else None\n                    upper_interval = (1.0, value[1]) if value[1] > 1 else None\n\n                    if lower_interval is not None and upper_interval is not None:\n                        selected_interval = random_state.choice(\n                            [lower_interval, upper_interval],\n                        )\n                    elif lower_interval is not None:\n                        selected_interval = lower_interval\n                    elif upper_interval is not None:\n                        selected_interval = upper_interval\n                    else:\n                        result_scale[key] = 1.0\n                        continue\n\n                    result_scale[key] = random_state.uniform(*selected_interval)\n                else:\n                    result_scale[key] = 
random_state.uniform(*value)\n            else:\n                raise TypeError(\n                    f\"Invalid scale value for key {key}: {value}. Expected a float or a tuple of two floats.\",\n                )\n\n        if keep_ratio:\n            result_scale[\"y\"] = result_scale[\"x\"]\n\n        return cast(fgeometric.XYFloat, result_scale)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        translate = self._get_translate_params(image_shape)\n        shear = self._get_shear_params()\n        scale = self.get_scale(\n            self.scale,\n            self.keep_ratio,\n            self.balanced_scale,\n            self.py_random,\n        )\n        rotate = self.py_random.uniform(*self.rotate)\n\n        image_shift = fgeometric.center(image_shape)\n        bbox_shift = fgeometric.center_bbox(image_shape)\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            image_shift,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_shift,\n        )\n\n        if self.fit_output:\n            matrix, output_shape = fgeometric.compute_affine_warp_output_shape(\n                matrix,\n                image_shape,\n            )\n            bbox_matrix, _ = fgeometric.compute_affine_warp_output_shape(\n                bbox_matrix,\n                image_shape,\n            )\n        else:\n            output_shape = image_shape\n\n        return {\n            \"rotate\": rotate,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": output_shape,\n        }\n\n    def _get_translate_params(self, image_shape: tuple[int, int]) -> fgeometric.XYInt:\n        height, width = image_shape[:2]\n        if self.translate_px is not None:\n            return {\n                \"x\": self.py_random.randint(*self.translate_px[\"x\"]),\n                \"y\": self.py_random.randint(*self.translate_px[\"y\"]),\n            }\n        if self.translate_percent is not None:\n            translate = {key: self.py_random.uniform(*value) for key, value in self.translate_percent.items()}\n            return cast(\n                fgeometric.XYInt,\n                {\"x\": int(translate[\"x\"] * width), \"y\": int(translate[\"y\"] * height)},\n            )\n        return cast(fgeometric.XYInt, {\"x\": 0, \"y\": 0})\n\n    def _get_shear_params(self) -> fgeometric.XYFloat:\n        return {\n            \"x\": -self.py_random.uniform(*self.shear[\"x\"]),\n            \"y\": -self.py_random.uniform(*self.shear[\"y\"]),\n        }\n
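A usage sketch that combines the main arguments documented above (all values are illustrative, not recommended defaults):

Python
>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.Affine(
...         scale=(0.8, 1.2),                # same range for x and y; ranges must match when keep_ratio=True
...         rotate=(-15, 15),                # degrees, around the image centre
...         translate_percent={'x': (-0.1, 0.1), 'y': (-0.1, 0.1)},
...         shear={'x': (-10, 10), 'y': (-10, 10)},
...         keep_ratio=True,
...         balanced_scale=True,             # zoom-in and zoom-out sampled with equal probability
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0,
...         p=1.0,
...     ),
... ])
>>> transformed_image = transform(image=image)['image']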
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.BaseDistortion","title":"class BaseDistortion (interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Base class for distortion-based transformations.

This class provides a foundation for implementing various types of image distortions, such as optical distortions, grid distortions, and elastic transformations. It handles the common operations of applying distortions to images, masks, bounding boxes, and keypoints.

Parameters:

  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR, cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR
  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This is an abstract base class and should not be used directly.
  • Subclasses should implement the get_params_dependent_on_data method to generate the distortion maps (map_x and map_y).
  • The distortion is applied consistently across all targets (image, mask, bboxes, keypoints) to maintain coherence in the augmented data.

Example of a subclass (a runnable variant appears below):

Python
class CustomDistortion(BaseDistortion):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Add custom parameters here

    def get_params_dependent_on_data(self, params, data):
        # Generate and return map_x and map_y based on the distortion logic
        return {"map_x": map_x, "map_y": map_y}

    def get_transform_init_args_names(self):
        return super().get_transform_init_args_names() + ("custom_param1", "custom_param2")
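The snippet above is schematic (map_x and map_y are left undefined). A minimal runnable variant, assuming BaseDistortion can be imported from albumentations.augmentations.geometric.transforms as shown in the source path below, is a fixed sine-wave distortion; the class name and constants are purely illustrative:

Python
import numpy as np
import albumentations as A
from albumentations.augmentations.geometric.transforms import BaseDistortion

class SineWaveDistortion(BaseDistortion):
    # Shift each row horizontally by a sine wave; constants are hard-coded
    # to keep the sketch minimal and avoid touching the InitSchema machinery.
    def get_params_dependent_on_data(self, params, data):
        height, width = params["shape"][:2]
        x, y = np.meshgrid(np.arange(width), np.arange(height))
        map_x = (x + 5.0 * np.sin(2 * np.pi * y / 50.0)).astype(np.float32)
        map_y = y.astype(np.float32)
        return {"map_x": map_x, "map_y": map_y}

aug = A.Compose([SineWaveDistortion(p=1.0)])
warped = aug(image=np.zeros((64, 64, 3), dtype=np.uint8))["image"]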

Source code in albumentations/augmentations/geometric/transforms.py Python
class BaseDistortion(DualTransform):\n    \"\"\"Base class for distortion-based transformations.\n\n    This class provides a foundation for implementing various types of image distortions,\n    such as optical distortions, grid distortions, and elastic transformations. It handles\n    the common operations of applying distortions to images, masks, bounding boxes, and keypoints.\n\n    Args:\n        interpolation (int): Interpolation method to be used for image transformation.\n            Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR,\n            cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This is an abstract base class and should not be used directly.\n        - Subclasses should implement the `get_params_dependent_on_data` method to generate\n          the distortion maps (map_x and map_y).\n        - The distortion is applied consistently across all targets (image, mask, bboxes, keypoints)\n          to maintain coherence in the augmented data.\n\n    Example of a subclass:\n        class CustomDistortion(BaseDistortion):\n            def __init__(self, *args, **kwargs):\n                super().__init__(*args, **kwargs)\n                # Add custom parameters here\n\n            def get_params_dependent_on_data(self, params, data):\n                # Generate and return map_x and map_y based on the distortion logic\n                return {\"map_x\": map_x, \"map_y\": map_y}\n\n            def get_transform_init_args_names(self):\n                return super().get_transform_init_args_names() + (\"custom_param1\", \"custom_param2\")\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            img,\n            map_x,\n            map_y,\n            self.interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            mask,\n            map_x,\n            map_y,\n            self.mask_interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n      
  bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        bboxes_returned = fgeometric.remap_bboxes(\n            bboxes_denorm,\n            map_x,\n            map_y,\n            image_shape,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap_keypoints(keypoints, map_x, map_y, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.D4","title":"class D4 (p=1, always_apply=None) [view source on GitHub]","text":"

Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.

The D4 group transformations include:

  • 'e' (identity): No transformation is applied.
  • 'r90' (rotation by 90 degrees counterclockwise)
  • 'r180' (rotation by 180 degrees)
  • 'r270' (rotation by 270 degrees counterclockwise)
  • 'v' (reflection across the vertical midline)
  • 'hvt' (reflection across the anti-diagonal)
  • 'h' (reflection across the horizontal midline)
  • 't' (reflection across the main diagonal)

Even if the probability (p) of applying the transform is set to 1, the identity transformation 'e' may still occur, which means the input will remain unchanged in one out of eight cases.

Parameters:

  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
  • The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
  • When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
  • This transform preserves the aspect ratio and size of the input.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.D4(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
# The resulting image will be one of the 8 possible D4 transformations of the input
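For intuition, the eight group elements can be reproduced with plain NumPy on a 2D array; this is only an illustration of the geometric operations named above, not the library's internal implementation:

Python
>>> import numpy as np
>>> x = np.arange(9).reshape(3, 3)      # a small square array
>>> d4 = {
...     'e':    x,                       # identity
...     'r90':  np.rot90(x, 1),          # 90 degrees counterclockwise
...     'r180': np.rot90(x, 2),
...     'r270': np.rot90(x, 3),
...     'v':    np.fliplr(x),            # reflection across the vertical midline
...     'h':    np.flipud(x),            # reflection across the horizontal midline
...     't':    x.T,                     # reflection across the main diagonal
...     'hvt':  np.rot90(x, 2).T,        # reflection across the anti-diagonal
... }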

Source code in albumentations/augmentations/geometric/transforms.py Python
class D4(DualTransform):\n    \"\"\"Applies one of the eight possible D4 dihedral group transformations to a square-shaped input,\n    maintaining the square shape. These transformations correspond to the symmetries of a square,\n    including rotations and reflections.\n\n    The D4 group transformations include:\n    - 'e' (identity): No transformation is applied.\n    - 'r90' (rotation by 90 degrees counterclockwise)\n    - 'r180' (rotation by 180 degrees)\n    - 'r270' (rotation by 270 degrees counterclockwise)\n    - 'v' (reflection across the vertical midline)\n    - 'hvt' (reflection across the anti-diagonal)\n    - 'h' (reflection across the horizontal midline)\n    - 't' (reflection across the main diagonal)\n\n    Even if the probability (`p`) of applying the transform is set to 1, the identity transformation\n    'e' may still occur, which means the input will remain unchanged in one out of eight cases.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for augmenting data that does not have a clear orientation,\n          such as top-view satellite or drone imagery, or certain types of medical images.\n        - The input image should be square-shaped for optimal results. Non-square inputs may lead to\n          unexpected behavior or distortions.\n        - When applied to bounding boxes or keypoints, their coordinates will be adjusted according\n          to the selected transformation.\n        - This transform preserves the aspect ratio and size of the input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.D4(p=1.0),\n        ... ])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed['image']\n        # The resulting image will be one of the 8 possible D4 transformations of the input\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.d4(img, group_element)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_d4(bboxes, group_element)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_d4(keypoints, group_element, params[\"shape\"])\n\n    def get_params(self) -> dict[str, D4Type]:\n        return {\n            \"group_element\": self.random_generator.choice(d4_group_elements),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ElasticTransform","title":"class ElasticTransform (alpha=1, sigma=50, interpolation=1, border_mode=4, value=None, mask_value=None, approximate=False, same_dxdy=False, mask_interpolation=0, noise_distribution='gaussian', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformation to images, masks, bounding boxes, and keypoints.

This transformation introduces random elastic distortions to the input data. It's particularly useful for data augmentation in training deep learning models, especially for tasks like image segmentation or object detection where you want to maintain the relative positions of features while introducing realistic deformations.

The transform works by generating random displacement fields and applying them to the input. These fields are smoothed using a Gaussian filter to create more natural-looking distortions.
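The mechanism can be sketched with plain NumPy and OpenCV; this is a simplified illustration of the idea (random fields, Gaussian smoothing, remap), not the library's exact implementation, and the function name is hypothetical:

Python
import cv2
import numpy as np

def elastic_warp(image, alpha=50.0, sigma=7.0, seed=0):
    # Random displacement fields, smoothed so that neighbouring pixels move together.
    rng = np.random.default_rng(seed)
    height, width = image.shape[:2]
    dx = cv2.GaussianBlur((rng.random((height, width)) * 2 - 1).astype(np.float32), (0, 0), sigma) * alpha
    dy = cv2.GaussianBlur((rng.random((height, width)) * 2 - 1).astype(np.float32), (0, 0), sigma) * alpha
    x, y = np.meshgrid(np.arange(width), np.arange(height))
    map_x = (x + dx).astype(np.float32)
    map_y = (y + dy).astype(np.float32)
    # For each output pixel, sample the input at the displaced location.
    return cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)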

Parameters:

  • alpha (float): Scaling factor for the random displacement fields. Higher values result in more pronounced distortions. Default: 1.0
  • sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement fields. Higher values result in smoother, more global distortions. Default: 50.0
  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR
  • approximate (bool): Whether to use an approximate version of the elastic transform. If True, uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially less accurate for large sigma values. Default: False
  • same_dxdy (bool): Whether to use the same random displacement field for both x and y directions. Can speed up the transform at the cost of less diverse distortions. Default: False
  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • noise_distribution (Literal[\"gaussian\", \"uniform\"]): Distribution used to generate the displacement fields. \"gaussian\" generates fields using normal distribution (more natural deformations). \"uniform\" generates fields using uniform distribution (more mechanical deformations). Default: \"gaussian\".
  • p (float): Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints) by using the same displacement fields for all.
  • The 'approximate' parameter determines whether to use a precise or approximate method for generating displacement fields. The approximate method can be faster but may be less accurate for large sigma values.
  • Bounding boxes that end up outside the image after transformation will be removed.
  • Keypoints that end up outside the image after transformation will be removed.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> transformed_image = transformed['image']
>>> transformed_mask = transformed['mask']
>>> transformed_bboxes = transformed['bboxes']
>>> transformed_keypoints = transformed['keypoints']

Source code in albumentations/augmentations/geometric/transforms.py Python
class ElasticTransform(BaseDistortion):\n    \"\"\"Apply elastic deformation to images, masks, bounding boxes, and keypoints.\n\n    This transformation introduces random elastic distortions to the input data. It's particularly\n    useful for data augmentation in training deep learning models, especially for tasks like\n    image segmentation or object detection where you want to maintain the relative positions of\n    features while introducing realistic deformations.\n\n    The transform works by generating random displacement fields and applying them to the input.\n    These fields are smoothed using a Gaussian filter to create more natural-looking distortions.\n\n    Args:\n        alpha (float): Scaling factor for the random displacement fields. Higher values result in\n            more pronounced distortions. Default: 1.0\n        sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement\n            fields. Higher values result in smoother, more global distortions. Default: 50.0\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        approximate (bool): Whether to use an approximate version of the elastic transform. If True,\n            uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially\n            less accurate for large sigma values. Default: False\n        same_dxdy (bool): Whether to use the same random displacement field for both x and y\n            directions. Can speed up the transform at the cost of less diverse distortions. Default: False\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        noise_distribution (Literal[\"gaussian\", \"uniform\"]): Distribution used to generate the displacement fields.\n            \"gaussian\" generates fields using normal distribution (more natural deformations).\n            \"uniform\" generates fields using uniform distribution (more mechanical deformations).\n            Default: \"gaussian\".\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints)\n          by using the same displacement fields for all.\n        - The 'approximate' parameter determines whether to use a precise or approximate method for\n          generating displacement fields. The approximate method can be faster but may be less\n          accurate for large sigma values.\n        - Bounding boxes that end up outside the image after transformation will be removed.\n        - Keypoints that end up outside the image after transformation will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        alpha: Annotated[float, Field(ge=0)]\n        sigma: Annotated[float, Field(ge=1)]\n        approximate: bool\n        same_dxdy: bool\n        noise_distribution: Literal[\"gaussian\", \"uniform\"]\n        border_mode: BorderModeType = Field(deprecated=\"Deprecated\")\n        value: ColorType | None = Field(deprecated=\"Deprecated\")\n        mask_value: ColorType | None = Field(deprecated=\"Deprecated\")\n\n    def __init__(\n        self,\n        alpha: float = 1,\n        sigma: float = 50,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        approximate: bool = False,\n        same_dxdy: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        noise_distribution: Literal[\"gaussian\", \"uniform\"] = \"gaussian\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.alpha = alpha\n        self.sigma = sigma\n        self.approximate = approximate\n        self.same_dxdy = same_dxdy\n        self.noise_distribution = noise_distribution\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        kernel_size = (0, 0) if self.approximate else (17, 17)\n\n        # Generate displacement fields\n        dx, dy = fgeometric.generate_displacement_fields(\n            (height, width),\n            self.alpha,\n            self.sigma,\n            same_dxdy=self.same_dxdy,\n            kernel_size=kernel_size,\n            random_generator=self.random_generator,\n            noise_distribution=self.noise_distribution,\n        )\n\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        map_x = np.float32(x + dx)\n        map_y = np.float32(y + dy)\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"alpha\",\n            \"sigma\",\n            \"approximate\",\n            \"same_dxdy\",\n            \"noise_distribution\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.GridDistortion","title":"class GridDistortion (num_steps=5, distort_limit=(-0.3, 0.3), interpolation=1, border_mode=4, value=None, mask_value=None, normalized=True, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply grid distortion to images, masks, bounding boxes, and keypoints.

This transformation divides the image into a grid and randomly distorts each cell, creating localized warping effects. It's particularly useful for data augmentation in tasks like medical image analysis, OCR, and other domains where local geometric variations are meaningful.
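Per call, one multiplicative step is sampled for every grid node along each axis, mirroring the parameter sampling in the source below; a minimal sketch with the standard random module:

Python
import random

num_steps, distort_limit = 5, (-0.3, 0.3)
# One factor per grid node: values > 1 stretch a cell, values < 1 compress it.
steps_x = [1 + random.uniform(*distort_limit) for _ in range(num_steps + 1)]
steps_y = [1 + random.uniform(*distort_limit) for _ in range(num_steps + 1)]

With normalized=True these steps are additionally rescaled so the distorted grid still covers the whole image.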

Parameters:

  • num_steps (int): Number of grid cells on each side of the image. Higher values create more granular distortions. Must be at least 1. Default: 5.
  • distort_limit (float or tuple[float, float]): Range of distortion. If a single float is provided, the range will be (-distort_limit, distort_limit). Higher values create stronger distortions. Should be in the range of -1 to 1. Default: (-0.3, 0.3).
  • interpolation (int): OpenCV interpolation method used for image transformation. Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.
  • normalized (bool): If True, ensures that the distortion does not move pixels outside the image boundaries. This can result in less extreme distortions but guarantees that no information is lost. Default: True.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The same distortion is applied to all targets (image, mask, bboxes, keypoints) to maintain consistency.
  • When normalized=True, the distortion is adjusted to ensure all pixels remain within the image boundaries.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> transformed_image = transformed['image']
>>> transformed_mask = transformed['mask']
>>> transformed_bboxes = transformed['bboxes']
>>> transformed_keypoints = transformed['keypoints']

Source code in albumentations/augmentations/geometric/transforms.py Python
class GridDistortion(BaseDistortion):\n    \"\"\"Apply grid distortion to images, masks, bounding boxes, and keypoints.\n\n    This transformation divides the image into a grid and randomly distorts each cell,\n    creating localized warping effects. It's particularly useful for data augmentation\n    in tasks like medical image analysis, OCR, and other domains where local geometric\n    variations are meaningful.\n\n    Args:\n        num_steps (int): Number of grid cells on each side of the image. Higher values\n            create more granular distortions. Must be at least 1. Default: 5.\n        distort_limit (float or tuple[float, float]): Range of distortion. If a single float\n            is provided, the range will be (-distort_limit, distort_limit). Higher values\n            create stronger distortions. Should be in the range of -1 to 1.\n            Default: (-0.3, 0.3).\n        interpolation (int): OpenCV interpolation method used for image transformation.\n            Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.\n        normalized (bool): If True, ensures that the distortion does not move pixels\n            outside the image boundaries. This can result in less extreme distortions\n            but guarantees that no information is lost. Default: True.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The same distortion is applied to all targets (image, mask, bboxes, keypoints)\n          to maintain consistency.\n        - When normalized=True, the distortion is adjusted to ensure all pixels remain\n          within the image boundaries.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        num_steps: Annotated[int, Field(ge=1)]\n        distort_limit: SymmetricRangeType\n        normalized: bool\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int = Field(deprecated=\"Deprecated. 
Does not have any effect.\")\n\n        @field_validator(\"distort_limit\")\n        @classmethod\n        def check_limits(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = -1, 1\n            result = to_tuple(v)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        num_steps: int = 5,\n        distort_limit: ScaleFloatType = (-0.3, 0.3),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        normalized: bool = True,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.num_steps = num_steps\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.normalized = normalized\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        steps_x = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n        steps_y = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n\n        if self.normalized:\n            normalized_params = fgeometric.normalize_grid_distortion_steps(\n                image_shape,\n                self.num_steps,\n                steps_x,\n                steps_y,\n            )\n            steps_x, steps_y = (\n                normalized_params[\"steps_x\"],\n                normalized_params[\"steps_y\"],\n            )\n\n        map_x, map_y = fgeometric.generate_grid(\n            image_shape,\n            steps_x,\n            steps_y,\n            self.num_steps,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"num_steps\",\n            \"distort_limit\",\n            \"normalized\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.GridElasticDeform","title":"class GridElasticDeform (num_grid_xy, magnitude, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.

This transformation overlays a grid on the input and applies random displacements to the grid points, resulting in local elastic distortions. The granularity and intensity of the distortions can be controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.

Parameters:

  • num_grid_xy (tuple[int, int]): Number of grid cells along the width and height. Specified as (grid_width, grid_height). Each value must be greater than 1.
  • magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.
  • interpolation (int): Interpolation method to be used for the image transformation. Default: cv2.INTER_LINEAR
  • mask_interpolation (int): Interpolation method to be used for mask transformation. Default: cv2.INTER_NEAREST
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)
>>> result = transform(image=image, mask=mask)
>>> transformed_image, transformed_mask = result['image'], result['mask']

Note

This transformation is particularly useful for data augmentation in medical imaging and other domains where elastic deformations can simulate realistic variations.
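Because bounding boxes are a supported target, the transform can also be composed with bbox_params; the boxes and labels below are hypothetical:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> bboxes = [(10, 10, 50, 50)]          # hypothetical pascal_voc box
>>> transform = A.Compose(
...     [A.GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)],
...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']),
... )
>>> out = transform(image=image, bboxes=bboxes, labels=[0])
>>> out['bboxes']                        # boxes follow the same grid deformation as the image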

Source code in albumentations/augmentations/geometric/transforms.py Python
class GridElasticDeform(DualTransform):\n    \"\"\"Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.\n\n    This transformation overlays a grid on the input and applies random displacements to the grid points,\n    resulting in local elastic distortions. The granularity and intensity of the distortions can be\n    controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.\n\n\n    Args:\n        num_grid_xy (tuple[int, int]): Number of grid cells along the width and height.\n            Specified as (grid_width, grid_height). Each value must be greater than 1.\n        magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.\n        interpolation (int): Interpolation method to be used for the image transformation.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Interpolation method to be used for mask transformation.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = result['image'], result['mask']\n\n    Note:\n        This transformation is particularly useful for data augmentation in medical imaging\n        and other domains where elastic deformations can simulate realistic variations.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        num_grid_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        magnitude: int = Field(gt=0)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        num_grid_xy: tuple[int, int],\n        magnitude: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_grid_xy = num_grid_xy\n        self.magnitude = magnitude\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    @staticmethod\n    def generate_mesh(polygons: np.ndarray, dimensions: np.ndarray) -> np.ndarray:\n        return np.hstack((dimensions.reshape(-1, 4), polygons))\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        # Replace calculate_grid_dimensions with split_uniform_grid\n        tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.num_grid_xy,\n            self.random_generator,\n        )\n\n        # Convert tiles to the format expected by generate_distorted_grid_polygons\n        dimensions = np.array(\n            [\n                [\n                    tile[1],\n                    tile[0],\n                    tile[3],\n                    tile[2],\n                ]  # Reorder to [x_min, y_min, x_max, y_max]\n                for tile in tiles\n            ],\n        ).reshape(\n            self.num_grid_xy[::-1] + (4,),\n        )  # Reshape to (grid_height, grid_width, 4)\n\n 
       polygons = fgeometric.generate_distorted_grid_polygons(\n            dimensions,\n            self.magnitude,\n            self.random_generator,\n        )\n\n        generated_mesh = self.generate_mesh(polygons, dimensions)\n\n        return {\"generated_mesh\": generated_mesh}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(img, generated_mesh, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(mask, generated_mesh, self.mask_interpolation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        bboxes_denorm = denormalize_bboxes(bboxes, params[\"shape\"][:2])\n        return normalize_bboxes(\n            fgeometric.bbox_distort_image(\n                bboxes_denorm,\n                generated_mesh,\n                params[\"shape\"][:2],\n            ),\n            params[\"shape\"][:2],\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image_keypoints(\n            keypoints,\n            generated_mesh,\n            params[\"shape\"][:2],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_grid_xy\", \"magnitude\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.HorizontalFlip","title":"class HorizontalFlip [view source on GitHub]","text":"

Flip the input horizontally around the y-axis.

Parameters:

  • p (float): probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
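A minimal usage sketch in the style of the other examples in this reference:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([A.HorizontalFlip(p=1.0)])
>>> flipped = transform(image=image)['image']   # equivalent to image[:, ::-1]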

Source code in albumentations/augmentations/geometric/transforms.py Python
class HorizontalFlip(DualTransform):\n    \"\"\"Flip the input horizontally around the y-axis.\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return hflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_hflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_hflip(keypoints, params[\"shape\"][1])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.OpticalDistortion","title":"class OpticalDistortion (distort_limit=(-0.05, 0.05), shift_limit=None, interpolation=1, border_mode=None, value=None, mask_value=None, mask_interpolation=0, mode='camera', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply optical distortion to images, masks, bounding boxes, and keypoints.

Supports two distortion models:

  1. Camera matrix model (original): Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients.
  2. Fisheye model: Direct radial distortion: r_dist = r * (1 + gamma * r²) (a sketch follows below).
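The fisheye formula can be illustrated directly with NumPy and cv2.remap; this is a simplified sketch of the radial scaling only (function name and normalization are assumptions, not the transform's implementation):

Python
import cv2
import numpy as np

def fisheye_remap(image, gamma=0.3):
    # Apply r_dist = r * (1 + gamma * r^2) around the image centre,
    # in coordinates normalized to roughly [-1, 1].
    height, width = image.shape[:2]
    y, x = np.mgrid[0:height, 0:width].astype(np.float32)
    cx, cy = (width - 1) / 2, (height - 1) / 2
    xn, yn = (x - cx) / cx, (y - cy) / cy
    r2 = xn ** 2 + yn ** 2
    scale = 1 + gamma * r2
    # For each output pixel, sample the input at the radially scaled location.
    map_x = (xn * scale * cx + cx).astype(np.float32)
    map_y = (yn * scale * cy + cy).astype(np.float32)
    return cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)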

Parameters:

  • distort_limit (float | tuple[float, float]): Range of distortion coefficient. For camera model: recommended range (-0.05, 0.05). For fisheye model: recommended range (-0.3, 0.3). Default: (-0.05, 0.05)
  • mode (Literal['camera', 'fisheye']): Distortion model to use: 'camera' (original camera matrix model) or 'fisheye' (fisheye lens model). Default: 'camera'
  • interpolation (OpenCV flag): Interpolation method used for image transformation. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.
  • The distortion coefficient (k) is randomly sampled from the distort_limit range.
  • The image center is shifted by dx and dy, randomly sampled from the shift_limit range.
  • Bounding boxes and keypoints are transformed along with the image to maintain consistency.
  • Fisheye model directly applies radial distortion
  • Both models use shift_limit to control distortion center

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n
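
The fisheye model described above can be selected via mode; a brief sketch (assuming image is a NumPy uint8 array, as in the example above):

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.OpticalDistortion(distort_limit=(-0.3, 0.3), mode="fisheye", p=1.0),
... ])
>>> distorted_image = transform(image=image)["image"]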

Source code in albumentations/augmentations/geometric/transforms.py Python
class OpticalDistortion(BaseDistortion):\n    \"\"\"Apply optical distortion to images, masks, bounding boxes, and keypoints.\n\n    Supports two distortion models:\n    1. Camera matrix model (original):\n       Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients\n\n    2. Fisheye model:\n       Direct radial distortion: r_dist = r * (1 + gamma * r\u00b2)\n\n    Args:\n        distort_limit (float | tuple[float, float]): Range of distortion coefficient.\n            For camera model: recommended range (-0.05, 0.05)\n            For fisheye model: recommended range (-0.3, 0.3)\n            Default: (-0.05, 0.05)\n\n        mode (Literal['camera', 'fisheye']): Distortion model to use:\n            - 'camera': Original camera matrix model\n            - 'fisheye': Fisheye lens model\n            Default: 'camera'\n\n        interpolation (OpenCV flag): Interpolation method used for image transformation.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC,\n            cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.\n        - The distortion coefficient (k) is randomly sampled from the distort_limit range.\n        - The image center is shifted by dx and dy, randomly sampled from the shift_limit range.\n        - Bounding boxes and keypoints are transformed along with the image to maintain consistency.\n        - Fisheye model directly applies radial distortion\n        - Both models use shift_limit to control distortion center\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        distort_limit: SymmetricRangeType\n        mode: Literal[\"camera\", \"fisheye\"]\n        shift_limit: SymmetricRangeType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int | None = Field(\n            deprecated=\"Deprecated. 
Does not have any effect.\",\n        )\n\n    def __init__(\n        self,\n        distort_limit: ScaleFloatType = (-0.05, 0.05),\n        shift_limit: ScaleFloatType | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int | None = None,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        mode: Literal[\"camera\", \"fisheye\"] = \"camera\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.mode = mode\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        height, width = image_shape\n\n        # Get distortion coefficient\n        k = self.py_random.uniform(*self.distort_limit)\n\n        # Calculate center shift\n        center_xy = fgeometric.center(image_shape)\n\n        # Get distortion maps based on mode\n        if self.mode == \"camera\":\n            map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n        else:  # fisheye\n            map_x, map_y = fgeometric.get_fisheye_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"distort_limit\",\n            \"mode\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Pad","title":"class Pad (padding=0, fill=0, fill_mask=0, border_mode=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of an image by a specified number of pixels.

Parameters:

  • padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be: an int to pad all sides by this value; a tuple[int, int] as (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y; or a tuple[int, int, int, int] as (left, top, right, bottom) for specific padding per side. See the usage sketch after the references below.

  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT

  • fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT

  • border_mode (OpenCV flag): OpenCV border mode

  • p (float): probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

References

  • https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html
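
A minimal usage sketch of the three padding formats listed above (values are illustrative, and it assumes Pad is exposed at the package level like the other transforms):

Python
>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> pad_all = A.Pad(padding=10, border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)
>>> pad_xy = A.Pad(padding=(10, 20), border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)
>>> pad_ltrb = A.Pad(padding=(5, 10, 15, 20), border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)
>>> padded_image = pad_ltrb(image=image)["image"]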

Source code in albumentations/augmentations/geometric/transforms.py Python
class Pad(DualTransform):\n    \"\"\"Pad the sides of an image by specified number of pixels.\n\n    Args:\n        padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y\n            * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT\n        fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT\n        border_mode (OpenCV flag): OpenCV border mode\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    References:\n        - https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        padding: int | tuple[int, int] | tuple[int, int, int, int]\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int] | tuple[int, int, int, int] = 0,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            img,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            mask,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill_mask,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_np = denormalize_bboxes(bboxes, params[\"shape\"])\n\n        result = fgeometric.pad_bboxes(\n            bboxes_np,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=image_shape,\n        )\n\n        rows, cols = params[\"shape\"][:2]\n        return normalize_bboxes(\n            result,\n            (rows + pad_top + pad_bottom, cols + pad_left + pad_right),\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n 
       **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_keypoints(\n            keypoints,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=params[\"shape\"][:2],\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if isinstance(self.padding, Real):\n            pad_top = pad_bottom = pad_left = pad_right = self.padding\n        elif isinstance(self.padding, (tuple, list)):\n            if len(self.padding) == NUM_PADS_XY:\n                pad_left = pad_right = self.padding[0]\n                pad_top = pad_bottom = self.padding[1]\n            elif len(self.padding) == NUM_PADS_ALL_SIDES:\n                pad_left, pad_top, pad_right, pad_bottom = self.padding  # type: ignore[misc]\n            else:\n                raise TypeError(\n                    \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n                )\n        else:\n            raise TypeError(\n                \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n            )\n\n        return {\n            \"pad_top\": pad_top,\n            \"pad_bottom\": pad_bottom,\n            \"pad_left\": pad_left,\n            \"pad_right\": pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"padding\",\n            \"fill\",\n            \"fill_mask\",\n            \"border_mode\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.PadIfNeeded","title":"class PadIfNeeded (min_height=1024, min_width=1024, pad_height_divisor=None, pad_width_divisor=None, position='center', border_mode=4, value=None, mask_value=None, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of an image if the image dimensions are less than the specified minimum dimensions. If the pad_height_divisor or pad_width_divisor is specified, the function additionally ensures that the image dimensions are divisible by these values.

Parameters:

  • min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided.

  • min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided.

  • pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.

  • pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.

  • position (Literal["center", "top_left", "top_right", "bottom_left", "bottom_right", "random"]): Position where the image is to be placed after padding. Default is 'center'.

  • border_mode (int): Specifies the border mode to use if padding is required. The default is cv2.BORDER_REFLECT_101.

  • fill (ColorType | None): Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default is None.

  • fill_mask (ColorType | None): Similar to fill but used for padding masks. Default is None.

  • p (float): Probability of applying the transform. Default is 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • Either min_height or pad_height_divisor must be set, but not both.
  • Either min_width or pad_width_divisor must be set, but not both.
  • If border_mode is set to cv2.BORDER_CONSTANT, value must be provided.
  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).
  • For bounding boxes, the coordinates will be adjusted to account for the padding.
  • For keypoints, their positions will be shifted according to the padding.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> padded_image = transformed['image']\n>>> padded_mask = transformed['mask']\n>>> adjusted_bboxes = transformed['bboxes']\n>>> adjusted_keypoints = transformed['keypoints']\n
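
Divisor-based padding works the same way; per the note above, min_height/min_width must then be left unset (a brief sketch, assuming image is a NumPy uint8 array and using an illustrative divisor of 32):

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.PadIfNeeded(min_height=None, min_width=None,
...                   pad_height_divisor=32, pad_width_divisor=32, p=1.0),
... ])
>>> padded_image = transform(image=image)["image"]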

Source code in albumentations/augmentations/geometric/transforms.py Python
class PadIfNeeded(Pad):\n    \"\"\"Pads the sides of an image if the image dimensions are less than the specified minimum dimensions.\n    If the `pad_height_divisor` or `pad_width_divisor` is specified, the function additionally ensures\n    that the image dimensions are divisible by these values.\n\n    Args:\n        min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value.\n            If not specified, pad_height_divisor must be provided.\n        min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value.\n            If not specified, pad_width_divisor must be provided.\n        pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value.\n            If not specified, min_height must be provided.\n        pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value.\n            If not specified, min_width must be provided.\n        position (Literal[\"center\", \"top_left\", \"top_right\", \"bottom_left\", \"bottom_right\", \"random\"]):\n            Position where the image is to be placed after padding. Default is 'center'.\n        border_mode (int): Specifies the border mode to use if padding is required.\n            The default is `cv2.BORDER_REFLECT_101`.\n        fill (ColorType | None): Value to fill the border pixels if the border mode is `cv2.BORDER_CONSTANT`.\n            Default is None.\n        fill_mask (ColorType | None): Similar to `fill` but used for padding masks. Default is None.\n        p (float): Probability of applying the transform. Default is 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Either `min_height` or `pad_height_divisor` must be set, but not both.\n        - Either `min_width` or `pad_width_divisor` must be set, but not both.\n        - If `border_mode` is set to `cv2.BORDER_CONSTANT`, `value` must be provided.\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).\n        - For bounding boxes, the coordinates will be adjusted to account for the padding.\n        - For keypoints, their positions will be shifted according to the padding.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> padded_image = transformed['image']\n        >>> padded_mask = transformed['mask']\n        >>> adjusted_bboxes = transformed['bboxes']\n        >>> adjusted_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        min_height: int | None = Field(ge=1)\n        min_width: int | None = Field(ge=1)\n        pad_height_divisor: int | None = Field(ge=1)\n        pad_width_divisor: int | None = Field(ge=1)\n        position: PositionType\n        border_mode: BorderModeType\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        @model_validator(mode=\"after\")\n        def validate_divisibility(self) -> Self:\n            if (self.min_height is None) == (self.pad_height_divisor is None):\n                msg = \"Only one of 'min_height' and 'pad_height_divisor' parameters must be set\"\n                raise ValueError(msg)\n            if (self.min_width is None) == (self.pad_width_divisor is None):\n                msg = \"Only one of 'min_width' and 'pad_width_divisor' parameters must be set\"\n                raise ValueError(msg)\n\n            if self.border_mode == cv2.BORDER_CONSTANT and self.fill is None:\n                msg = \"If 'border_mode' is set to 'BORDER_CONSTANT', 'fill' must be provided.\"\n                raise ValueError(msg)\n\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n\n            return self\n\n    def __init__(\n        self,\n        min_height: int | None = 1024,\n        min_width: int | None = 1024,\n        pad_height_divisor: int | None = None,\n        pad_width_divisor: int | None = None,\n        position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        # Initialize with dummy padding that will be calculated later\n        super().__init__(\n            padding=0,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            p=p,\n        )\n        self.min_height = min_height\n        self.min_width = min_width\n        self.pad_height_divisor = pad_height_divisor\n        self.pad_width_divisor = pad_width_divisor\n        self.position = position\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=params[\"shape\"][:2],\n            min_height=self.min_height,\n            min_width=self.min_width,\n            pad_height_divisor=self.pad_height_divisor,\n            pad_width_divisor=self.pad_width_divisor,\n        )\n\n        h_pad_top, h_pad_bottom, 
w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_height\",\n            \"min_width\",\n            \"pad_height_divisor\",\n            \"pad_width_divisor\",\n            \"position\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Perspective","title":"class Perspective (scale=(0.05, 0.1), keep_size=True, pad_mode=None, pad_val=None, mask_pad_val=None, fit_output=False, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random four point perspective transformation to the input.

Parameters:

  • scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Default: (0.05, 0.1).

  • keep_size (bool): Whether to resize the image back to its original size after applying the perspective transform. If set to False, the resulting images may end up having different shapes. Default: True.

  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.

  • fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT. Default: 0.

  • fit_output (bool): If True, the image plane size and position will be adjusted to still capture the whole image after perspective transformation. This is followed by image resizing if keep_size is set to True. If False, parts of the transformed image may be outside of the image plane. This setting should not be set to True when using large scale values as it could lead to very large images. Default: False.

  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR

  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

This transformation creates a perspective effect by randomly moving the four corners of the image. The amount of movement is controlled by the 'scale' parameter.

When 'keep_size' is True, the output image will have the same size as the input image, which may cause some parts of the transformed image to be cut off or padded.

When 'fit_output' is True, the transformation ensures that the entire transformed image is visible, which may result in a larger output image if keep_size is False.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n... ])\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n
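
To illustrate the fit_output/keep_size interaction described in the note, a sketch that lets the output grow instead of being resized back (reusing image from the example above; parameter values are illustrative):

Python
>>> transform = A.Compose([
...     A.Perspective(scale=(0.05, 0.1), keep_size=False, fit_output=True, p=1.0),
... ])
>>> result = transform(image=image)
>>> result["image"].shape  # may differ from the input shape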

Source code in albumentations/augmentations/geometric/transforms.py Python
class Perspective(DualTransform):\n    \"\"\"Apply random four point perspective transformation to the input.\n\n    Args:\n        scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Default: (0.05, 0.1).\n        keep_size (bool): Whether to resize image back to its original size after applying the perspective transform.\n            If set to False, the resulting images may end up having different shapes.\n            Default: True.\n        border_mode (OpenCV flag): OpenCV border mode used for padding.\n            Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n        fill_mask (ColorType): Padding value for mask if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fit_output (bool): If True, the image plane size and position will be adjusted to still capture\n            the whole image after perspective transformation. This is followed by image resizing if keep_size is set\n            to True. If False, parts of the transformed image may be outside of the image plane.\n            This setting should not be set to True when using large scale values as it could lead to very large images.\n            Default: False.\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transformation creates a perspective effect by randomly moving the four corners of the image.\n        The amount of movement is controlled by the 'scale' parameter.\n\n        When 'keep_size' is True, the output image will have the same size as the input image,\n        which may cause some parts of the transformed image to be cut off or padded.\n\n        When 'fit_output' is True, the transformation ensures that the entire transformed image is visible,\n        which may result in a larger output image if keep_size is False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n        ... 
])\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        keep_size: bool\n        pad_mode: BorderModeType | None = Field(\n            deprecated=\"Deprecated use border_mode instead\",\n        )\n        pad_val: ColorType | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_pad_val: ColorType | None = Field(\n            deprecated=\"Deprecated use fill_mask instead\",\n        )\n        fit_output: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def validate_deprecated_fields(self) -> Self:\n            if self.pad_mode is not None:\n                self.border_mode = self.pad_mode\n            if self.pad_val is not None:\n                self.fill = self.pad_val\n            if self.mask_pad_val is not None:\n                self.fill_mask = self.mask_pad_val\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.05, 0.1),\n        keep_size: bool = True,\n        pad_mode: int | None = None,\n        pad_val: ColorType | None = None,\n        mask_pad_val: ColorType | None = None,\n        fit_output: bool = False,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply=always_apply)\n        self.scale = cast(tuple[float, float], scale)\n        self.keep_size = keep_size\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.fit_output = fit_output\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            img,\n            matrix,\n            max_width,\n            max_height,\n            self.fill,\n            self.border_mode,\n            self.keep_size,\n            self.interpolation,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            mask,\n            matrix,\n            max_width,\n            max_height,\n            self.fill_mask,\n            self.border_mode,\n            self.keep_size,\n            self.mask_interpolation,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        matrix_bbox: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_bboxes(\n            bboxes,\n            params[\"shape\"],\n            matrix_bbox,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n   
     matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_keypoints(\n            keypoints,\n            params[\"shape\"],\n            matrix,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        scale = self.py_random.uniform(*self.scale)\n\n        points = fgeometric.generate_perspective_points(\n            image_shape,\n            scale,\n            self.random_generator,\n        )\n        points = fgeometric.order_points(points)\n\n        matrix, max_width, max_height = fgeometric.compute_perspective_params(\n            points,\n            image_shape,\n        )\n\n        if self.fit_output:\n            matrix, max_width, max_height = fgeometric.expand_transform(\n                matrix,\n                image_shape,\n            )\n\n        return {\n            \"matrix\": matrix,\n            \"max_height\": max_height,\n            \"max_width\": max_width,\n            \"matrix_bbox\": matrix,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"keep_size\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"fit_output\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.PiecewiseAffine","title":"class PiecewiseAffine (scale=(0.03, 0.05), nb_rows=(4, 4), nb_cols=(4, 4), interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, absolute_scale=False, p=0.5, always_apply=None, keypoints_threshold=0.01) [view source on GitHub]","text":"

Apply piecewise affine transformations to the input image.

This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points around via affine transformations. This leads to local distortions in the image.

Parameters:

  • scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Recommended values are in the range (0.01, 0.05) for small distortions, and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).

  • nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of rows. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

  • nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of columns. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

  • absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute pixel value. If set to False, it will be treated as a fraction of the image height and width. Default: False.

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This augmentation is very slow. Consider using ElasticTransform instead, which is at least 10x faster.
  • The augmentation may not always produce visible effects, especially with small scale values.
  • For keypoints and bounding boxes, the transformation might move them outside the image boundaries. In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed[\"image\"]\n
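
If absolute_scale is set, scale is interpreted in pixels rather than as a fraction of the image size; a brief sketch reusing image from the example above (the pixel range (3, 5) is illustrative):

Python
>>> transform = A.Compose([
...     A.PiecewiseAffine(scale=(3, 5), nb_rows=4, nb_cols=4, absolute_scale=True, p=1.0),
... ])
>>> transformed_image = transform(image=image)["image"]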

Source code in albumentations/augmentations/geometric/transforms.py Python
class PiecewiseAffine(BaseDistortion):\n    \"\"\"Apply piecewise affine transformations to the input image.\n\n    This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points\n    around via affine transformations. This leads to local distortions in the image.\n\n    Args:\n        scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Recommended values are in the range (0.01, 0.05) for small distortions,\n            and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).\n        nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of rows.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of columns.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute\n            pixel value. If set to False, it will be treated as a fraction of the image height and width.\n            Default: False.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This augmentation is very slow. Consider using `ElasticTransform` instead, which is at least 10x faster.\n        - The augmentation may not always produce visible effects, especially with small scale values.\n        - For keypoints and bounding boxes, the transformation might move them outside the image boundaries.\n          In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed[\"image\"]\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        nb_rows: ScaleIntType\n        nb_cols: ScaleIntType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        cval: int | None = Field(deprecated=\"Deprecated. Does not have any effect.\")\n        cval_mask: int | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = Field(\n            deprecated=\"Deprecated. Does not have any effects.\",\n        )\n\n        absolute_scale: bool\n        keypoints_threshold: float = Field(\n            deprecated=\"This parameter is not used anymore\",\n        )\n\n        @field_validator(\"nb_rows\", \"nb_cols\")\n        @classmethod\n        def process_range(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 2, BIG_INTEGER\n            result = to_tuple(value, value)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.03, 0.05),\n        nb_rows: ScaleIntType = (4, 4),\n        nb_cols: ScaleIntType = (4, 4),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: int | None = None,\n        cval_mask: int | None = None,\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = None,\n        absolute_scale: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n        keypoints_threshold: float = 0.01,\n    ):\n        super().__init__(\n            p=p,\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n        )\n\n        warn(\n            \"This augmenter is very slow. Try to use ``ElasticTransform`` instead, which is at least 10x faster.\",\n            stacklevel=2,\n        )\n\n        self.scale = cast(tuple[float, float], scale)\n        self.nb_rows = cast(tuple[int, int], nb_rows)\n        self.nb_cols = cast(tuple[int, int], nb_cols)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.absolute_scale = absolute_scale\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"nb_rows\",\n            \"nb_cols\",\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"absolute_scale\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        nb_rows = np.clip(self.py_random.randint(*self.nb_rows), 2, None)\n        nb_cols = np.clip(self.py_random.randint(*self.nb_cols), 2, None)\n        scale = self.py_random.uniform(*self.scale)\n\n        map_x, map_y = fgeometric.create_piecewise_affine_maps(\n            image_shape=image_shape,\n            grid=(nb_rows, nb_cols),\n            scale=scale,\n            absolute_scale=self.absolute_scale,\n            random_generator=self.random_generator,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.RandomGridShuffle","title":"class RandomGridShuffle (grid=(3, 3), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shuffles the grid's cells on an image, mask, or keypoints, effectively rearranging patches within the image. This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.

Parameters:

  • grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly. For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled. Default: (3, 3)

  • p (float): Probability that the transform will be applied. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This transform maintains consistency across all targets. If applied to an image and its corresponding mask or keypoints, the same shuffling will be applied to all.
  • The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2)) for the transform to have any effect.
  • Keypoints are moved along with their corresponding grid cell.
  • This transform could be useful when only micro features are important for the model, and memorizing the global structure could be harmful. For example:
  • Identifying the type of cell phone used to take a picture based on micro artifacts generated by phone post-processing algorithms, rather than the semantic features of the photo. See more at https://ieeexplore.ieee.org/abstract/document/8622031
  • Identifying stress, glucose, hydration levels based on skin images.

Mathematical Formulation:

  1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.
  2. A random permutation P of integers from 0 to (m*n - 1) is generated.
  3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.
  4. The cells are then rearranged according to the permutation P.
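
An illustrative NumPy-only sketch of this formulation (not the library's internal implementation; it assumes the grid divides the image evenly):

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> image = np.arange(36).reshape(6, 6)
>>> m, n = 2, 2  # grid size
>>> ch, cw = image.shape[0] // m, image.shape[1] // n  # cell height and width
>>> # cells numbered 0..m*n-1 in row-major order
>>> cells = [image[i * ch:(i + 1) * ch, j * cw:(j + 1) * cw] for i in range(m) for j in range(n)]
>>> perm = rng.permutation(m * n)  # random permutation P
>>> shuffled = np.block([[cells[perm[i * n + j]] for j in range(n)] for i in range(m)])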

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4]\n... ])\n>>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image might look like this (one possible outcome):\n# [[4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1]]\n

Source code in albumentations/augmentations/geometric/transforms.py Python
class RandomGridShuffle(DualTransform):\n    \"\"\"Randomly shuffles the grid's cells on an image, mask, or keypoints,\n    effectively rearranging patches within the image.\n    This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.\n\n    Args:\n        grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly.\n            For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled.\n            Default: (3, 3)\n        p (float): Probability that the transform will be applied. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform maintains consistency across all targets. If applied to an image and its corresponding\n          mask or keypoints, the same shuffling will be applied to all.\n        - The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2))\n          for the transform to have any effect.\n        - Keypoints are moved along with their corresponding grid cell.\n        - This transform could be useful when only micro features are important for the model, and memorizing\n          the global structure could be harmful. For example:\n          - Identifying the type of cell phone used to take a picture based on micro artifacts generated by\n            phone post-processing algorithms, rather than the semantic features of the photo.\n            See more at https://ieeexplore.ieee.org/abstract/document/8622031\n          - Identifying stress, glucose, hydration levels based on skin images.\n\n    Mathematical Formulation:\n        1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.\n        2. A random permutation P of integers from 0 to (m*n - 1) is generated.\n        3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.\n        4. The cells are then rearranged according to the permutation P.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4]\n        ... 
])\n        >>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image might look like this (one possible outcome):\n        # [[4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1]]\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        grid: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        grid: tuple[int, int] = (3, 3),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.grid = grid\n\n    def apply(\n        self,\n        img: np.ndarray,\n        tiles: np.ndarray,\n        mapping: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_image(img, tiles, mapping)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n        bboxes_returned = fgeometric.bboxes_grid_shuffle(\n            bboxes_denorm,\n            tiles,\n            mapping,\n            image_shape,\n            min_area=processor.params.min_area,\n            min_visibility=processor.params.min_visibility,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_keypoints(keypoints, tiles, mapping)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        original_tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.grid,\n            self.random_generator,\n        )\n        shape_groups = fgeometric.create_shape_groups(original_tiles)\n        mapping = fgeometric.shuffle_tiles_within_shape_groups(\n            shape_groups,\n            self.random_generator,\n        )\n\n        return {\"tiles\": original_tiles, \"mapping\": mapping}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"grid\",)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ShiftScaleRotate","title":"class ShiftScaleRotate (shift_limit=(-0.0625, 0.0625), scale_limit=(-0.1, 0.1), rotate_limit=(-45, 45), interpolation=1, border_mode=4, value=None, mask_value=None, shift_limit_x=None, shift_limit_y=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:

  • shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: (-0.0625, 0.0625).

  • scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

  • rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

  • interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

  • border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

  • fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.

  • fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

  • shift_limit_x ((float, float) or float): shift factor range for width. If set, this value is used instead of shift_limit for shifting width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

  • shift_limit_y ((float, float) or float): shift factor range for height. If set, this value is used instead of shift_limit for shifting height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

  • rotate_method (str): rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse". Default: "largest_box"

  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

  • p (float): probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32
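
A minimal usage sketch (the random image is a placeholder; the source below marks this transform as deprecated in favor of Affine, so a DeprecationWarning is expected):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, p=0.5),
... ])
>>> transformed_image = transform(image=image)["image"]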

Source code in albumentations/augmentations/geometric/transforms.py Python
class ShiftScaleRotate(Affine):\n    \"\"\"Randomly apply affine transforms: translate, scale and rotate the input.\n\n    Args:\n        shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit\n            is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and\n            upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).\n        scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the\n            range will be (-rotate_limit, rotate_limit). Default: (-45, 45).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        shift_limit_x ((float, float) or float): shift factor range for width. If it is set then this value\n            instead of shift_limit will be used for shifting width.  If shift_limit_x is a single float value,\n            the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in\n            the range [-1, 1]. Default: None.\n        shift_limit_y ((float, float) or float): shift factor range for height. If it is set then this value\n            instead of shift_limit will be used for shifting height.  If shift_limit_y is a single float value,\n            the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie\n            in the range [-, 1]. Default: None.\n        rotate_method (str): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\".\n            Default: \"largest_box\"\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        shift_limit: SymmetricRangeType = (-0.0625, 0.0625)\n        scale_limit: SymmetricRangeType = (-0.1, 0.1)\n        rotate_limit: SymmetricRangeType = (-45, 45)\n        interpolation: InterpolationType = cv2.INTER_LINEAR\n        border_mode: BorderModeType = cv2.BORDER_REFLECT_101\n\n        value: ColorType | None = Field(\n            default=None,\n            deprecated=\"Deprecated. Use fill instead.\",\n        )\n        mask_value: ColorType | None = Field(\n            default=None,\n            deprecated=\"Deprecated. Use fill_mask instead.\",\n        )\n\n        fill: ColorType = 0\n        fill_mask: ColorType = 0\n\n        shift_limit_x: ScaleFloatType | None = Field(default=None)\n        shift_limit_y: ScaleFloatType | None = Field(default=None)\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\"\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def check_shift_limit(self) -> Self:\n            bounds = -1, 1\n            self.shift_limit_x = to_tuple(\n                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_x, *bounds, \"shift_limit_x\")\n            self.shift_limit_y = to_tuple(\n                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_y, *bounds, \"shift_limit_y\")\n\n            return self\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> ScaleFloatType:\n            bounds = 0, float(\"inf\")\n            result = to_tuple(value, bias=1.0)\n            check_range(result, *bounds, str(info.field_name))\n            return result\n\n    def __init__(\n        self,\n        shift_limit: ScaleFloatType = (-0.0625, 0.0625),\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        rotate_limit: ScaleFloatType = (-45, 45),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        shift_limit_x: ScaleFloatType | None = None,\n        shift_limit_y: ScaleFloatType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: InterpolationType = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        shift_limit_x = cast(tuple[float, float], shift_limit_x)\n        shift_limit_y = cast(tuple[float, float], shift_limit_y)\n        super().__init__(\n            scale=scale_limit,\n            translate_percent={\"x\": shift_limit_x, \"y\": shift_limit_y},\n            rotate=rotate_limit,\n            shear=(0, 0),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            fit_output=False,\n            keep_ratio=False,\n            rotate_method=rotate_method,\n            
always_apply=always_apply,\n            p=p,\n        )\n        warn(\n            \"ShiftScaleRotate is deprecated. Please use Affine transform instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        self.shift_limit_x = shift_limit_x\n        self.shift_limit_y = shift_limit_y\n\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"shift_limit_x\": self.shift_limit_x,\n            \"shift_limit_y\": self.shift_limit_y,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n            \"rotate_limit\": self.rotate_limit,\n            \"interpolation\": self.interpolation,\n            \"border_mode\": self.border_mode,\n            \"fill\": self.fill,\n            \"fill_mask\": self.fill_mask,\n            \"rotate_method\": self.rotate_method,\n            \"mask_interpolation\": self.mask_interpolation,\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ThinPlateSpline","title":"class ThinPlateSpline (scale_range=(0.2, 0.4), num_control_points=4, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.

Imagine the image printed on a thin metal plate that can be bent and warped smoothly:

  • Control points act like pins pushing or pulling the plate
  • The plate resists sharp bending, creating smooth deformations
  • The transformation maintains continuity (no tears or folds)
  • Areas between control points are interpolated naturally

The transform works by:

  1. Creating a regular grid of control points (like pins in the plate)
  2. Randomly displacing these points (like pushing/pulling the pins)
  3. Computing a smooth interpolation (like the plate bending)
  4. Applying the resulting deformation to the image

Parameters:

Name Type Description scale_range tuple[float, float]

Range for random displacement of control points. Values should be in [0.0, 1.0]:

  • 0.0: No displacement (identity transform)
  • 0.1: Subtle warping
  • 0.2-0.4: Moderate deformation (recommended range)
  • 0.5+: Strong warping

Default: (0.2, 0.4)

num_control_points int

Number of control points per side. Creates a grid of num_control_points x num_control_points points.

  • 2: Minimal deformation (affine-like)
  • 3-4: Moderate flexibility (recommended)
  • 5+: More local deformation control

Must be >= 2. Default: 4

interpolation int

OpenCV interpolation flag. Used for image sampling. See also: cv2.INTER_* Default: cv2.INTER_LINEAR

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • The transformation preserves smoothness and continuity
  • Stronger scale values may create more extreme deformations
  • Higher number of control points allows more local deformations
  • The same deformation is applied consistently to all targets

Examples:

Python
>>> import albumentations as A
>>> # Basic usage
>>> transform = A.ThinPlateSpline()
>>>
>>> # Subtle deformation
>>> transform = A.ThinPlateSpline(
...     scale_range=(0.1, 0.2),
...     num_control_points=3
... )
>>>
>>> # Strong warping with fine control
>>> transform = A.ThinPlateSpline(
...     scale_range=(0.3, 0.5),
...     num_control_points=5,
... )
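
Because one sampled deformation is shared by every target in a call, passing a mask alongside the image warps both identically; a sketch with an illustrative synthetic mask:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> mask[25:75, 25:75] = 1  # a square region that should deform together with the image
>>> transform = A.ThinPlateSpline(scale_range=(0.2, 0.4), num_control_points=4, p=1.0)
>>> out = transform(image=image, mask=mask)
>>> warped_image, warped_mask = out["image"], out["mask"]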

References

  • \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\" by F.L. Bookstein https://doi.org/10.1109/34.24792

  • Thin Plate Splines in Computer Vision: https://en.wikipedia.org/wiki/Thin_plate_spline

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline

See Also:

  • ElasticTransform: For a different type of non-rigid deformation
  • GridDistortion: For grid-based warping
  • OpticalDistortion: For lens-like distortions

Source code in albumentations/augmentations/geometric/transforms.py Python
class ThinPlateSpline(BaseDistortion):\n    r\"\"\"Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.\n\n    Imagine the image printed on a thin metal plate that can be bent and warped smoothly:\n    - Control points act like pins pushing or pulling the plate\n    - The plate resists sharp bending, creating smooth deformations\n    - The transformation maintains continuity (no tears or folds)\n    - Areas between control points are interpolated naturally\n\n    The transform works by:\n    1. Creating a regular grid of control points (like pins in the plate)\n    2. Randomly displacing these points (like pushing/pulling the pins)\n    3. Computing a smooth interpolation (like the plate bending)\n    4. Applying the resulting deformation to the image\n\n\n    Args:\n        scale_range (tuple[float, float]): Range for random displacement of control points.\n            Values should be in [0.0, 1.0]:\n            - 0.0: No displacement (identity transform)\n            - 0.1: Subtle warping\n            - 0.2-0.4: Moderate deformation (recommended range)\n            - 0.5+: Strong warping\n            Default: (0.2, 0.4)\n\n        num_control_points (int): Number of control points per side.\n            Creates a grid of num_control_points x num_control_points points.\n            - 2: Minimal deformation (affine-like)\n            - 3-4: Moderate flexibility (recommended)\n            - 5+: More local deformation control\n            Must be >= 2. Default: 4\n\n        interpolation (int): OpenCV interpolation flag. Used for image sampling.\n            See also: cv2.INTER_*\n            Default: cv2.INTER_LINEAR\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transformation preserves smoothness and continuity\n        - Stronger scale values may create more extreme deformations\n        - Higher number of control points allows more local deformations\n        - The same deformation is applied consistently to all targets\n\n    Example:\n        >>> import albumentations as A\n        >>> # Basic usage\n        >>> transform = A.ThinPlateSpline()\n        >>>\n        >>> # Subtle deformation\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.1, 0.2),\n        ...     num_control_points=3\n        ... )\n        >>>\n        >>> # Strong warping with fine control\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.3, 0.5),\n        ...     num_control_points=5,\n        ... )\n\n    References:\n        - \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\"\n          by F.L. 
Bookstein\n          https://doi.org/10.1109/34.24792\n\n        - Thin Plate Splines in Computer Vision:\n          https://en.wikipedia.org/wiki/Thin_plate_spline\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline\n\n    See Also:\n        - ElasticTransform: For different type of non-rigid deformation\n        - GridDistortion: For grid-based warping\n        - OpticalDistortion: For lens-like distortions\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        scale_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        num_control_points: int = Field(ge=2)\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.2, 0.4),\n        num_control_points: int = 4,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale_range = scale_range\n        self.num_control_points = num_control_points\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        # Create regular grid of control points\n        grid_size = self.num_control_points\n        x = np.linspace(0, 1, grid_size)\n        y = np.linspace(0, 1, grid_size)\n        src_points = np.stack(np.meshgrid(x, y), axis=-1).reshape(-1, 2)\n\n        # Add random displacement to destination points\n        scale = self.py_random.uniform(*self.scale_range) / 10\n        dst_points = src_points + self.random_generator.normal(\n            0,\n            scale,\n            src_points.shape,\n        )\n\n        # Compute TPS weights\n        weights, affine = fgeometric.compute_tps_weights(src_points, dst_points)\n\n        # Create grid of points\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        points = np.stack([x.flatten(), y.flatten()], axis=1).astype(np.float32)\n\n        # Transform points\n        transformed = fgeometric.tps_transform(\n            points / [width, height],\n            src_points,\n            weights,\n            affine,\n        )\n        transformed *= [width, height]\n\n        return {\n            \"map_x\": transformed[:, 0].reshape(height, width).astype(np.float32),\n            \"map_y\": transformed[:, 1].reshape(height, width).astype(np.float32),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale_range\",\n            \"num_control_points\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Transpose","title":"class Transpose [view source on GitHub]","text":"

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max)
  3. For keypoints with coordinates (x, y): new_keypoint = (y, x)

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
...     [[1, 2, 3], [4, 5, 6]],
...     [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.Transpose(p=1.0)
>>> result = transform(image=image)
>>> transposed_image = result['image']
>>> print(transposed_image)
[[[ 1  2  3]
  [ 7  8  9]]
 [[ 4  5  6]
  [10 11 12]]]
# The original 2x2x3 image is now 2x2x3, with rows and columns swapped
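
To illustrate the bounding-box rule above, a sketch using Compose with pascal_voc boxes (the box coordinates and label are illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 200, 3), dtype=np.uint8)
>>> bboxes = [(10, 20, 60, 80)]  # (x_min, y_min, x_max, y_max)
>>> transform = A.Compose(
...     [A.Transpose(p=1.0)],
...     bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
... )
>>> result = transform(image=image, bboxes=bboxes, labels=["object"])
>>> result["image"].shape
(200, 100, 3)
>>> result["bboxes"]  # expected to be approximately [(20, 10, 80, 60)], i.e. x and y swapped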

Source code in albumentations/augmentations/geometric/transforms.py Python
class Transpose(DualTransform):\n    \"\"\"Transpose the input by swapping its rows and columns.\n\n    This transform flips the image over its main diagonal, effectively switching its width and height.\n    It's equivalent to a 90-degree rotation followed by a horizontal flip.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The dimensions of the output will be swapped compared to the input. For example,\n          an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).\n        - This transform is its own inverse. Applying it twice will return the original input.\n        - For multi-channel images (like RGB), the channels are preserved in their original order.\n        - Bounding boxes will have their coordinates adjusted to match the new image dimensions.\n        - Keypoints will have their x and y coordinates swapped.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (y_min, x_min, y_max, x_max)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (y, x)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.Transpose(p=1.0)\n        >>> result = transform(image=image)\n        >>> transposed_image = result['image']\n        >>> print(transposed_image)\n        [[[ 1  2  3]\n          [ 7  8  9]]\n         [[ 4  5  6]\n          [10 11 12]]]\n        # The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.transpose(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_transpose(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_transpose(keypoints)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.VerticalFlip","title":"class VerticalFlip [view source on GitHub]","text":"

Flip the input vertically around the x-axis.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min)
  3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y)

where H is the height of the image.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.array([
...     [[1, 2, 3], [4, 5, 6]],
...     [[7, 8, 9], [10, 11, 12]]
... ])
>>> transform = A.VerticalFlip(p=1.0)
>>> result = transform(image=image)
>>> flipped_image = result['image']
>>> print(flipped_image)
[[[ 7  8  9]
  [10 11 12]]
 [[ 1  2  3]
  [ 4  5  6]]]
# The original image is flipped vertically, with rows reversed
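
To illustrate the keypoint rule above, a sketch using Compose with xy keypoints (the coordinates are illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> keypoints = [(30.0, 10.0)]  # (x, y)
>>> transform = A.Compose(
...     [A.VerticalFlip(p=1.0)],
...     keypoint_params=A.KeypointParams(format="xy"),
... )
>>> result = transform(image=image, keypoints=keypoints)
>>> result["keypoints"]  # expected near (30, 90): x unchanged, y reflected about the horizontal axis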

Source code in albumentations/augmentations/geometric/transforms.py Python
class VerticalFlip(DualTransform):\n    \"\"\"Flip the input vertically around the x-axis.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform flips the image upside down. The top of the image becomes the bottom and vice versa.\n        - The dimensions of the image remain unchanged.\n        - For multi-channel images (like RGB), each channel is flipped independently.\n        - Bounding boxes are adjusted to match their new positions in the flipped image.\n        - Keypoints are moved to their new positions in the flipped image.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (x_min, H-y_max, x_max, H-y_min)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (x, H-y)\n        where H is the height of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.VerticalFlip(p=1.0)\n        >>> result = transform(image=image)\n        >>> flipped_image = result['image']\n        >>> print(flipped_image)\n        [[[ 7  8  9]\n          [10 11 12]]\n         [[ 1  2  3]\n          [ 4  5  6]]]\n        # The original image is flipped vertically, with rows reversed\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return vflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_vflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_vflip(keypoints, params[\"shape\"][0])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/","title":"Transforms (augmentations.transforms)","text":""},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.AdditiveNoise","title":"class AdditiveNoise (noise_type='uniform', spatial_mode='constant', noise_params=None, approximation=1.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random noise to image channels using various noise distributions.

This transform generates noise using different probability distributions and applies it to image channels. The noise can be generated in three spatial modes and supports multiple noise distributions, each with configurable parameters.

Parameters:

Name Type Description noise_type Literal['uniform', 'gaussian', 'laplace', 'beta']

Type of noise distribution to use. Options:

  • "uniform": Uniform distribution, good for simple random perturbations
  • "gaussian": Normal distribution, models natural random processes
  • "laplace": Similar to Gaussian but with heavier tails, good for outliers
  • "beta": Flexible bounded distribution, can be symmetric or skewed

spatial_mode Literal['constant', 'per_pixel', 'shared']

How to generate and apply the noise. Options:

  • "constant": One noise value per channel, fastest
  • "per_pixel": Independent noise value for each pixel and channel, slowest
  • "shared": One noise map shared across all channels, medium speed

approximation float

float in [0, 1], default=1.0. Controls noise generation speed vs quality tradeoff:

  • 1.0: Generate full resolution noise (slowest, highest quality)
  • 0.5: Generate noise at half resolution and upsample
  • 0.25: Generate noise at quarter resolution and upsample

Only affects 'per_pixel' and 'shared' spatial modes.

noise_params dict[str, Any] | None

Parameters for the chosen noise distribution. Must match the noise_type:

uniform:

  • ranges: list[tuple[float, float]] - List of (min, max) ranges for each channel. Each range must be in [-1, 1]. If only one range is provided, it will be used for all channels.

    [(-0.2, 0.2)]  # Same range for all channels
    [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]  # Different ranges for RGB

gaussian:

  • mean_range: tuple[float, float], default (0.0, 0.0) - Range for sampling mean value, in [-1, 1]
  • std_range: tuple[float, float], default (0.1, 0.1) - Range for sampling standard deviation, in [0, 1]

laplace:

  • mean_range: tuple[float, float], default (0.0, 0.0) - Range for sampling location parameter, in [-1, 1]
  • scale_range: tuple[float, float], default (0.1, 0.1) - Range for sampling scale parameter, in [0, 1]

beta:

  • alpha_range: tuple[float, float], default (0.5, 1.5) - Range for sampling first shape parameter, in (0, inf). Values < 1 give a U-shaped distribution, values > 1 a bell-shaped one.
  • beta_range: tuple[float, float], default (0.5, 1.5) - Range for sampling second shape parameter, in (0, inf). Values < 1 give a U-shaped distribution, values > 1 a bell-shaped one.
  • scale_range: tuple[float, float], default (0.1, 0.3) - Range for sampling output scale, in [0, 1]; smaller scales give subtler noise.

Note

Performance considerations:

  • "constant" mode is fastest as it generates only C values (C = number of channels)
  • "shared" mode generates HxW values and reuses them for all channels
  • "per_pixel" mode generates HxWxC values, slowest but most flexible

Distribution characteristics:

  • uniform: Equal probability within range, good for simple perturbations
  • gaussian: Bell-shaped, symmetric, good for natural noise
  • laplace: Like gaussian but with heavier tails, good for outliers
  • beta: Very flexible shape, can be uniform, bell-shaped, or U-shaped

Implementation details:

  • All noise is generated in normalized range and scaled by image max value
  • For uint8 images, final noise range is [-255, 255]
  • For float images, final noise range is [-1, 1]

Examples:

Constant RGB shift with different ranges per channel:

Python
>>> transform = AdditiveNoise(
...     noise_type="uniform",
...     spatial_mode="constant",
...     noise_params={"ranges": [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]}
... )

Gaussian noise shared across channels:

Python
>>> transform = AdditiveNoise(
...     noise_type="gaussian",
...     spatial_mode="shared",
...     noise_params={"mean_range": (0.0, 0.0), "std_range": (0.05, 0.15)}
... )
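
A further sketch combining beta noise, the per-pixel spatial mode, and the approximation speed-up (the parameter values mirror the documented defaults and are illustrative):

Python
>>> import albumentations as A
>>> transform = A.AdditiveNoise(
...     noise_type="beta",
...     spatial_mode="per_pixel",
...     noise_params={
...         "alpha_range": (0.5, 1.5),
...         "beta_range": (0.5, 1.5),
...         "scale_range": (0.1, 0.3),
...     },
...     approximation=0.5,  # generate noise at half resolution, then upsample
... )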

Source code in albumentations/augmentations/transforms.py Python
class AdditiveNoise(ImageOnlyTransform):\n    \"\"\"Apply random noise to image channels using various noise distributions.\n\n    This transform generates noise using different probability distributions and applies it\n    to image channels. The noise can be generated in three spatial modes and supports\n    multiple noise distributions, each with configurable parameters.\n\n    Args:\n        noise_type: Type of noise distribution to use. Options:\n            - \"uniform\": Uniform distribution, good for simple random perturbations\n            - \"gaussian\": Normal distribution, models natural random processes\n            - \"laplace\": Similar to Gaussian but with heavier tails, good for outliers\n            - \"beta\": Flexible bounded distribution, can be symmetric or skewed\n\n        spatial_mode: How to generate and apply the noise. Options:\n            - \"constant\": One noise value per channel, fastest\n            - \"per_pixel\": Independent noise value for each pixel and channel, slowest\n            - \"shared\": One noise map shared across all channels, medium speed\n\n        approximation: float in [0, 1], default=1.0\n            Controls noise generation speed vs quality tradeoff.\n            - 1.0: Generate full resolution noise (slowest, highest quality)\n            - 0.5: Generate noise at half resolution and upsample\n            - 0.25: Generate noise at quarter resolution and upsample\n            Only affects 'per_pixel' and 'shared' spatial modes.\n\n        noise_params: Parameters for the chosen noise distribution.\n            Must match the noise_type:\n\n            uniform:\n                ranges: list[tuple[float, float]]\n                    List of (min, max) ranges for each channel.\n                    Each range must be in [-1, 1].\n                    If only one range is provided, it will be used for all channels.\n\n                    [(-0.2, 0.2)]  # Same range for all channels\n                    [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]  # Different ranges for RGB\n\n            gaussian:\n                mean_range: tuple[float, float], default (0.0, 0.0)\n                    Range for sampling mean value, in [-1, 1]\n                std_range: tuple[float, float], default (0.1, 0.1)\n                    Range for sampling standard deviation, in [0, 1]\n\n            laplace:\n                mean_range: tuple[float, float], default (0.0, 0.0)\n                    Range for sampling location parameter, in [-1, 1]\n                scale_range: tuple[float, float], default (0.1, 0.1)\n                    Range for sampling scale parameter, in [0, 1]\n\n            beta:\n                alpha_range: tuple[float, float], default (0.5, 1.5)\n                    Value < 1 = U-shaped, Value > 1 = Bell-shaped\n                    Range for sampling first shape parameter, in (0, inf)\n                beta_range: tuple[float, float], default (0.5, 1.5)\n                    Value < 1 = U-shaped, Value > 1 = Bell-shaped\n                    Range for sampling second shape parameter, in (0, inf)\n                scale_range: tuple[float, float], default (0.1, 0.3)\n                    Smaller scale for subtler noise\n                    Range for sampling output scale, in [0, 1]\n\n    Note:\n        Performance considerations:\n            - \"constant\" mode is fastest as it generates only C values (C = number of channels)\n            - \"shared\" mode generates HxW values and reuses them for all channels\n            - \"per_pixel\" mode 
generates HxWxC values, slowest but most flexible\n\n        Distribution characteristics:\n            - uniform: Equal probability within range, good for simple perturbations\n            - gaussian: Bell-shaped, symmetric, good for natural noise\n            - laplace: Like gaussian but with heavier tails, good for outliers\n            - beta: Very flexible shape, can be uniform, bell-shaped, or U-shaped\n\n        Implementation details:\n            - All noise is generated in normalized range and scaled by image max value\n            - For uint8 images, final noise range is [-255, 255]\n            - For float images, final noise range is [-1, 1]\n\n    Examples:\n        Constant RGB shift with different ranges per channel:\n        >>> transform = AdditiveNoise(\n        ...     noise_type=\"uniform\",\n        ...     spatial_mode=\"constant\",\n        ...     noise_params={\"ranges\": [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]}\n        ... )\n\n        Gaussian noise shared across channels:\n        >>> transform = AdditiveNoise(\n        ...     noise_type=\"gaussian\",\n        ...     spatial_mode=\"shared\",\n        ...     noise_params={\"mean_range\": (0.0, 0.0), \"std_range\": (0.05, 0.15)}\n        ... )\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"]\n        spatial_mode: Literal[\"constant\", \"per_pixel\", \"shared\"]\n        noise_params: dict[str, Any] | None\n        approximation: float = Field(ge=0.0, le=1.0)\n\n        @model_validator(mode=\"after\")\n        def validate_noise_params(self) -> Self:\n            # Default parameters for each noise type\n            default_params = {\n                \"uniform\": {\n                    \"ranges\": [(-0.1, 0.1)],  # Single channel by default\n                },\n                \"gaussian\": {\"mean_range\": (0.0, 0.0), \"std_range\": (0.05, 0.15)},\n                \"laplace\": {\"mean_range\": (0.0, 0.0), \"scale_range\": (0.05, 0.15)},\n                \"beta\": {\n                    \"alpha_range\": (0.5, 1.5),\n                    \"beta_range\": (0.5, 1.5),\n                    \"scale_range\": (0.1, 0.3),\n                },\n            }\n\n            # Use default params if none provided\n            params_dict = self.noise_params if self.noise_params is not None else default_params[self.noise_type]\n\n            # Convert dict to appropriate NoiseParams object\n            params_class = {\n                \"uniform\": UniformParams,\n                \"gaussian\": GaussianParams,\n                \"laplace\": LaplaceParams,\n                \"beta\": BetaParams,\n            }[self.noise_type]\n\n            # Add noise_type to params if not present\n            params_dict = {**params_dict, \"noise_type\": self.noise_type}  # type: ignore[dict-item]\n            self.noise_params = params_class(**params_dict)\n\n            return self\n\n    def __init__(\n        self,\n        noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"] = \"uniform\",\n        spatial_mode: Literal[\"constant\", \"per_pixel\", \"shared\"] = \"constant\",\n        noise_params: dict[str, Any] | None = None,\n        approximation: float = 1.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.noise_type = noise_type\n        self.spatial_mode = spatial_mode\n        self.noise_params = noise_params\n        
self.approximation = approximation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        noise_map: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_noise(img, noise_map)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n\n        noise_map = fmain.generate_noise(\n            noise_type=self.noise_type,\n            spatial_mode=self.spatial_mode,\n            shape=image.shape,\n            params=self.noise_params,\n            max_value=max_value,\n            approximation=self.approximation,\n            random_generator=self.random_generator,\n        )\n        return {\"noise_map\": noise_map}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"noise_type\", \"spatial_mode\", \"noise_params\", \"approximation\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.AutoContrast","title":"class AutoContrast (p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random auto contrast to images.

Auto contrast enhances image contrast by stretching the intensity range to use the full range while preserving relative intensities. For each color channel:

  1. Compute histogram
  2. Find cumulative percentiles
  3. Clip and scale intensities to full range

Parameters:

Name Type Description p float

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32
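
A minimal usage sketch on an illustrative low-contrast synthetic image:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(100, 151, (100, 100, 3), dtype=np.uint8)  # values only span [100, 150]
>>> transform = A.AutoContrast(p=1.0)
>>> stretched = transform(image=image)["image"]  # intensities stretched toward the full uint8 range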

Source code in albumentations/augmentations/transforms.py Python
class AutoContrast(ImageOnlyTransform):\n    \"\"\"Apply random auto contrast to images.\n\n    Auto contrast enhances image contrast by stretching the intensity range\n    to use the full range while preserving relative intensities. For each\n    color channel:\n    1. Compute histogram\n    2. Find cumulative percentiles\n    3. Clip and scale intensities to full range\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.auto_contrast(img)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.BetaParams","title":"class BetaParams ","text":"

Source code in albumentations/augmentations/transforms.py Python
class BetaParams(NoiseParamsBase):\n    noise_type: Literal[\"beta\"] = \"beta\"\n    alpha_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0)),\n    ]\n    beta_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0)),\n    ]\n    scale_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.CLAHE","title":"class CLAHE (clip_limit=4.0, tile_grid_size=(8, 8), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.

CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram equalization, which operates on the entire image, CLAHE operates on small regions (tiles) in the image. This results in a more balanced equalization, preventing over-amplification of contrast in areas with initially low contrast.

Parameters:

Name Type Description clip_limit tuple[float, float] | float

Controls the contrast enhancement limit.

  • If a single float is provided, the range will be (1, clip_limit).
  • If a tuple of two floats is provided, it defines the range for random selection.

Higher values allow for more contrast enhancement, but may also increase noise. Default: (1, 4)

tile_grid_size tuple[int, int]

Defines the number of tiles in the row and column directions. Format is (rows, columns). Smaller tile sizes can lead to more localized enhancements, while larger sizes give results closer to global histogram equalization. Default: (8, 8)

p float

Probability of applying the transform. Default: 0.5

Notes

  • Supports only RGB or grayscale images.
  • For color images, CLAHE is applied to the L channel in the LAB color space.
  • The clip limit determines the maximum slope of the cumulative histogram. A lower clip limit will result in more contrast limiting.
  • Tile grid size affects the adaptiveness of the method. More tiles increase local adaptiveness but can lead to an unnatural look if set too high.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1, 3

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)
>>> result = transform(image=image)
>>> clahe_image = result["image"]
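
Since single-channel input is also supported and a scalar clip_limit is interpreted as the range (1, clip_limit), a grayscale sketch (the values are illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> gray = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
>>> transform = A.CLAHE(clip_limit=2.0, tile_grid_size=(4, 4), p=1.0)  # clip limit sampled from (1, 2)
>>> equalized = transform(image=gray)["image"]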

References

  • https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html
  • Zuiderveld, Karel. \"Contrast Limited Adaptive Histogram Equalization.\" Graphic Gems IV. Academic Press Professional, Inc., 1994.

Source code in albumentations/augmentations/transforms.py Python
class CLAHE(ImageOnlyTransform):\n    \"\"\"Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.\n\n    CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram\n    equalization, which operates on the entire image, CLAHE operates on small regions (tiles)\n    in the image. This results in a more balanced equalization, preventing over-amplification\n    of contrast in areas with initially low contrast.\n\n    Args:\n        clip_limit (tuple[float, float] | float): Controls the contrast enhancement limit.\n            - If a single float is provided, the range will be (1, clip_limit).\n            - If a tuple of two floats is provided, it defines the range for random selection.\n            Higher values allow for more contrast enhancement, but may also increase noise.\n            Default: (1, 4)\n\n        tile_grid_size (tuple[int, int]): Defines the number of tiles in the row and column directions.\n            Format is (rows, columns). Smaller tile sizes can lead to more localized enhancements,\n            while larger sizes give results closer to global histogram equalization.\n            Default: (8, 8)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Notes:\n        - Supports only RGB or grayscale images.\n        - For color images, CLAHE is applied to the L channel in the LAB color space.\n        - The clip limit determines the maximum slope of the cumulative histogram. A lower\n          clip limit will result in more contrast limiting.\n        - Tile grid size affects the adaptiveness of the method. More tiles increase local\n          adaptiveness but can lead to an unnatural look if set too high.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1, 3\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)\n        >>> result = transform(image=image)\n        >>> clahe_image = result[\"image\"]\n\n    References:\n        - https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html\n        - Zuiderveld, Karel. \"Contrast Limited Adaptive Histogram Equalization.\"\n          Graphic Gems IV. 
Academic Press Professional, Inc., 1994.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        clip_limit: OnePlusFloatRangeType\n        tile_grid_size: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    def __init__(\n        self,\n        clip_limit: ScaleFloatType = 4.0,\n        tile_grid_size: tuple[int, int] = (8, 8),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.clip_limit = cast(tuple[float, float], clip_limit)\n        self.tile_grid_size = tile_grid_size\n\n    def apply(self, img: np.ndarray, clip_limit: float, **params: Any) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"CLAHE transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n\n        return fmain.clahe(img, clip_limit, self.tile_grid_size)\n\n    def get_params(self) -> dict[str, float]:\n        return {\"clip_limit\": self.py_random.uniform(*self.clip_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"clip_limit\", \"tile_grid_size\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ChannelShuffle","title":"class ChannelShuffle [view source on GitHub]","text":"

Randomly rearrange channels of the image.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32
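
A minimal usage sketch (the synthetic image is illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ChannelShuffle(p=1.0)
>>> shuffled = transform(image=image)["image"]  # same pixel values, channel order randomly permuted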

Source code in albumentations/augmentations/transforms.py Python
class ChannelShuffle(ImageOnlyTransform):\n    \"\"\"Randomly rearrange channels of the image.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    def apply(\n        self,\n        img: np.ndarray,\n        channels_shuffled: tuple[int, ...],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.channel_shuffle(img, channels_shuffled)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        ch_arr = list(range(params[\"shape\"][2]))\n        self.random_generator.shuffle(ch_arr)\n        return {\"channels_shuffled\": ch_arr}\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ChromaticAberration","title":"class ChromaticAberration (primary_distortion_limit=(-0.02, 0.02), secondary_distortion_limit=(-0.05, 0.05), mode='green_purple', interpolation=1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Add lateral chromatic aberration by distorting the red and blue channels of the input image.

Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point. This transform simulates this effect by applying different radial distortions to the red and blue channels of the image, while leaving the green channel unchanged.

Parameters:

Name Type Description primary_distortion_limit tuple[float, float] | float

Range of the primary radial distortion coefficient. If a single float value is provided, the range will be (-primary_distortion_limit, primary_distortion_limit). This parameter controls the distortion in the center of the image:

  • Positive values result in pincushion distortion (edges bend inward)
  • Negative values result in barrel distortion (edges bend outward)

Default: (-0.02, 0.02).

secondary_distortion_limit tuple[float, float] | float

Range of the secondary radial distortion coefficient. If a single float value is provided, the range will be (-secondary_distortion_limit, secondary_distortion_limit). This parameter controls the distortion in the corners of the image:

  • Positive values enhance pincushion distortion
  • Negative values enhance barrel distortion

Default: (-0.05, 0.05).

mode Literal["green_purple", "red_blue", "random"]

Type of color fringing to apply. Options are:

  • 'green_purple': Distorts red and blue channels in opposite directions, creating green-purple fringing.
  • 'red_blue': Distorts red and blue channels in the same direction, creating red-blue fringing.
  • 'random': Randomly chooses between 'green_purple' and 'red_blue' modes for each application.

Default: 'green_purple'.

interpolation InterpolationType

Flag specifying the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • This transform only affects RGB images. Grayscale images will raise an error.
  • The strength of the effect depends on both primary and secondary distortion limits.
  • Higher absolute values for distortion limits will result in more pronounced chromatic aberration.
  • The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.

Examples:

Python
>>> import albumentations as A
>>> import cv2
>>> transform = A.ChromaticAberration(
...     primary_distortion_limit=0.05,
...     secondary_distortion_limit=0.1,
...     mode='green_purple',
...     interpolation=cv2.INTER_LINEAR,
...     p=1.0
... )
>>> transformed = transform(image=image)
>>> aberrated_image = transformed['image']
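
A self-contained variant using the 'random' mode with the default distortion limits (the synthetic image is illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ChromaticAberration(mode='random', p=1.0)  # picks 'green_purple' or 'red_blue' per call
>>> fringed = transform(image=image)["image"]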

References

  • https://en.wikipedia.org/wiki/Chromatic_aberration
  • https://www.researchgate.net/publication/320691320_Chromatic_Aberration_in_Digital_Images

Source code in albumentations/augmentations/transforms.py Python
class ChromaticAberration(ImageOnlyTransform):\n    \"\"\"Add lateral chromatic aberration by distorting the red and blue channels of the input image.\n\n    Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point.\n    This transform simulates this effect by applying different radial distortions to the red and blue channels\n    of the image, while leaving the green channel unchanged.\n\n    Args:\n        primary_distortion_limit (tuple[float, float] | float): Range of the primary radial distortion coefficient.\n            If a single float value is provided, the range\n            will be (-primary_distortion_limit, primary_distortion_limit).\n            This parameter controls the distortion in the center of the image:\n            - Positive values result in pincushion distortion (edges bend inward)\n            - Negative values result in barrel distortion (edges bend outward)\n            Default: (-0.02, 0.02).\n\n        secondary_distortion_limit (tuple[float, float] | float): Range of the secondary radial distortion coefficient.\n            If a single float value is provided, the range\n            will be (-secondary_distortion_limit, secondary_distortion_limit).\n            This parameter controls the distortion in the corners of the image:\n            - Positive values enhance pincushion distortion\n            - Negative values enhance barrel distortion\n            Default: (-0.05, 0.05).\n\n        mode (Literal[\"green_purple\", \"red_blue\", \"random\"]): Type of color fringing to apply. Options are:\n            - 'green_purple': Distorts red and blue channels in opposite directions, creating green-purple fringing.\n            - 'red_blue': Distorts red and blue channels in the same direction, creating red-blue fringing.\n            - 'random': Randomly chooses between 'green_purple' and 'red_blue' modes for each application.\n            Default: 'green_purple'.\n\n        interpolation (InterpolationType): Flag specifying the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - This transform only affects RGB images. Grayscale images will raise an error.\n        - The strength of the effect depends on both primary and secondary distortion limits.\n        - Higher absolute values for distortion limits will result in more pronounced chromatic aberration.\n        - The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.\n\n    Example:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> transform = A.ChromaticAberration(\n        ...     primary_distortion_limit=0.05,\n        ...     secondary_distortion_limit=0.1,\n        ...     mode='green_purple',\n        ...     interpolation=cv2.INTER_LINEAR,\n        ...     p=1.0\n        ... 
)\n        >>> transformed = transform(image=image)\n        >>> aberrated_image = transformed['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Chromatic_aberration\n        - https://www.researchgate.net/publication/320691320_Chromatic_Aberration_in_Digital_Images\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        primary_distortion_limit: SymmetricRangeType\n        secondary_distortion_limit: SymmetricRangeType\n        mode: ChromaticAberrationMode\n        interpolation: InterpolationType\n\n    def __init__(\n        self,\n        primary_distortion_limit: ScaleFloatType = (-0.02, 0.02),\n        secondary_distortion_limit: ScaleFloatType = (-0.05, 0.05),\n        mode: ChromaticAberrationMode = \"green_purple\",\n        interpolation: InterpolationType = cv2.INTER_LINEAR,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.primary_distortion_limit = cast(\n            tuple[float, float],\n            primary_distortion_limit,\n        )\n        self.secondary_distortion_limit = cast(\n            tuple[float, float],\n            secondary_distortion_limit,\n        )\n        self.mode = mode\n        self.interpolation = interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        primary_distortion_red: float,\n        secondary_distortion_red: float,\n        primary_distortion_blue: float,\n        secondary_distortion_blue: float,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.chromatic_aberration(\n            img,\n            primary_distortion_red,\n            secondary_distortion_red,\n            primary_distortion_blue,\n            secondary_distortion_blue,\n            self.interpolation,\n        )\n\n    def get_params(self) -> dict[str, float]:\n        primary_distortion_red = self.py_random.uniform(*self.primary_distortion_limit)\n        secondary_distortion_red = self.py_random.uniform(\n            *self.secondary_distortion_limit,\n        )\n        primary_distortion_blue = self.py_random.uniform(*self.primary_distortion_limit)\n        secondary_distortion_blue = self.py_random.uniform(\n            *self.secondary_distortion_limit,\n        )\n\n        secondary_distortion_red = self._match_sign(\n            primary_distortion_red,\n            secondary_distortion_red,\n        )\n        secondary_distortion_blue = self._match_sign(\n            primary_distortion_blue,\n            secondary_distortion_blue,\n        )\n\n        if self.mode == \"green_purple\":\n            # distortion coefficients of the red and blue channels have the same sign\n            primary_distortion_blue = self._match_sign(\n                primary_distortion_red,\n                primary_distortion_blue,\n            )\n            secondary_distortion_blue = self._match_sign(\n                secondary_distortion_red,\n                secondary_distortion_blue,\n            )\n        if self.mode == \"red_blue\":\n            # distortion coefficients of the red and blue channels have the opposite sign\n            primary_distortion_blue = self._unmatch_sign(\n                primary_distortion_red,\n                primary_distortion_blue,\n            )\n            secondary_distortion_blue = self._unmatch_sign(\n                secondary_distortion_red,\n                secondary_distortion_blue,\n            )\n\n        return {\n            
\"primary_distortion_red\": primary_distortion_red,\n            \"secondary_distortion_red\": secondary_distortion_red,\n            \"primary_distortion_blue\": primary_distortion_blue,\n            \"secondary_distortion_blue\": secondary_distortion_blue,\n        }\n\n    @staticmethod\n    def _match_sign(a: float, b: float) -> float:\n        # Match the sign of b to a\n        if (a < 0 < b) or (a > 0 > b):\n            return -b\n        return b\n\n    @staticmethod\n    def _unmatch_sign(a: float, b: float) -> float:\n        # Unmatch the sign of b to a\n        if (a < 0 and b < 0) or (a > 0 and b > 0):\n            return -b\n        return b\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return (\n            \"primary_distortion_limit\",\n            \"secondary_distortion_limit\",\n            \"mode\",\n            \"interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ColorJitter","title":"class ColorJitter (brightness=(0.8, 1.2), contrast=(0.8, 1.2), saturation=(0.8, 1.2), hue=(-0.5, 0.5), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly changes the brightness, contrast, saturation, and hue of an image.

This transform is similar to torchvision's ColorJitter but with some differences due to the use of OpenCV instead of Pillow. The main differences are:

  1. OpenCV and Pillow use different formulas to convert images to HSV format.
  2. This implementation uses value saturation instead of uint8 overflow as in Pillow.

These differences may result in slightly different output compared to torchvision's ColorJitter.
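
To make the second difference concrete, here is a minimal NumPy sketch (not library code) contrasting value saturation with raw uint8 wrap-around; the array contents and the +20 shift are illustrative only.

Python
import numpy as np

pixel = np.array([250], dtype=np.uint8)

# Value saturation (clipping), as used by this implementation:
saturated = np.clip(pixel.astype(np.int32) + 20, 0, 255).astype(np.uint8)  # -> [255]

# Raw uint8 arithmetic wraps around (overflow) instead of saturating:
overflowed = pixel + np.uint8(20)  # -> [14]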

Parameters:

Name Type Description brightness tuple[float, float] | float

How much to jitter brightness. If float: The brightness factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness]. If tuple: The brightness factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

contrast tuple[float, float] | float

How much to jitter contrast. If float: The contrast factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast]. If tuple: The contrast factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

saturation tuple[float, float] | float

How much to jitter saturation. If float: The saturation factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation]. If tuple: The saturation factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

hue float or tuple of float (min, max)

How much to jitter hue. If float: The hue factor is chosen uniformly from [-hue, hue]. Should have 0 <= hue <= 0.5. If tuple: The hue factor is sampled from the range specified. Values should be in range [-0.5, 0.5]. Default: (-0.5, 0.5)

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: 1, 3

Note

  • The order of application for these color transformations is random for each image.
  • The ranges for brightness, contrast, and saturation are applied as multiplicative factors.
  • The range for hue is applied as an additive factor.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result['image']

References

  • https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ColorJitter
  • https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html

Source code in albumentations/augmentations/transforms.py Python
class ColorJitter(ImageOnlyTransform):\n    \"\"\"Randomly changes the brightness, contrast, saturation, and hue of an image.\n\n    This transform is similar to torchvision's ColorJitter but with some differences due to the use of OpenCV\n    instead of Pillow. The main differences are:\n    1. OpenCV and Pillow use different formulas to convert images to HSV format.\n    2. This implementation uses value saturation instead of uint8 overflow as in Pillow.\n\n    These differences may result in slightly different output compared to torchvision's ColorJitter.\n\n    Args:\n        brightness (tuple[float, float] | float): How much to jitter brightness.\n            If float:\n                The brightness factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].\n            If tuple:\n                The brightness factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        contrast (tuple[float, float] | float): How much to jitter contrast.\n            If float:\n                The contrast factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].\n            If tuple:\n                The contrast factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        saturation (tuple[float, float] | float): How much to jitter saturation.\n            If float:\n                The saturation factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].\n            If tuple:\n                The saturation factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        hue (float or tuple of float (min, max)): How much to jitter hue.\n            If float:\n                The hue factor is chosen uniformly from [-hue, hue]. Should have 0 <= hue <= 0.5.\n            If tuple:\n                The hue factor is sampled from the range specified. Values should be in range [-0.5, 0.5].\n            Default: (-0.5, 0.5)\n\n         p (float): Probability of applying the transform. 
Should be in the range [0, 1].\n            Default: 0.5\n\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1, 3\n\n    Note:\n        - The order of application for these color transformations is random for each image.\n        - The ranges for brightness, contrast, and saturation are applied as multiplicative factors.\n        - The range for hue is applied as an additive factor.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> jittered_image = result['image']\n\n    References:\n        - https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ColorJitter\n        - https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness: ScaleFloatType\n        contrast: ScaleFloatType\n        saturation: ScaleFloatType\n        hue: ScaleFloatType\n\n        @field_validator(\"brightness\", \"contrast\", \"saturation\", \"hue\")\n        @classmethod\n        def check_ranges(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            if info.field_name == \"hue\":\n                bounds = -0.5, 0.5\n                bias = 0\n                clip = False\n            elif info.field_name in [\"brightness\", \"contrast\", \"saturation\"]:\n                bounds = 0, float(\"inf\")\n                bias = 1\n                clip = True\n\n            if isinstance(value, numbers.Number):\n                if value < 0:\n                    raise ValueError(\n                        f\"If {info.field_name} is a single number, it must be non negative.\",\n                    )\n                left = bias - value\n                if clip:\n                    left = max(left, 0)\n                value = (left, bias + value)\n            elif isinstance(value, tuple) and len(value) == PAIR:\n                check_range(value, *bounds, info.field_name)\n\n            return cast(tuple[float, float], value)\n\n    def __init__(\n        self,\n        brightness: ScaleFloatType = (0.8, 1.2),\n        contrast: ScaleFloatType = (0.8, 1.2),\n        saturation: ScaleFloatType = (0.8, 1.2),\n        hue: ScaleFloatType = (-0.5, 0.5),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.brightness = cast(tuple[float, float], brightness)\n        self.contrast = cast(tuple[float, float], contrast)\n        self.saturation = cast(tuple[float, float], saturation)\n        self.hue = cast(tuple[float, float], hue)\n\n        self.transforms = [\n            fmain.adjust_brightness_torchvision,\n            fmain.adjust_contrast_torchvision,\n            fmain.adjust_saturation_torchvision,\n            fmain.adjust_hue_torchvision,\n        ]\n\n    def get_params(self) -> dict[str, Any]:\n        brightness = self.py_random.uniform(*self.brightness)\n        contrast = self.py_random.uniform(*self.contrast)\n        saturation = self.py_random.uniform(*self.saturation)\n        hue = self.py_random.uniform(*self.hue)\n\n        order = [0, 1, 2, 3]\n        
self.random_generator.shuffle(order)\n\n        return {\n            \"brightness\": brightness,\n            \"contrast\": contrast,\n            \"saturation\": saturation,\n            \"hue\": hue,\n            \"order\": order,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        brightness: float,\n        contrast: float,\n        saturation: float,\n        hue: float,\n        order: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"ColorJitter transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n        color_transforms = [brightness, contrast, saturation, hue]\n        for i in order:\n            img = self.transforms[i](img, color_transforms[i])\n        return img\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"brightness\", \"contrast\", \"saturation\", \"hue\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Downscale","title":"class Downscale (scale_min=None, scale_max=None, interpolation=None, scale_range=(0.25, 0.25), interpolation_pair={'upscale': 0, 'downscale': 0}, always_apply=None, p=0.5) [view source on GitHub]","text":"

Decrease image quality by downscaling and upscaling back.

This transform simulates the effect of a low-resolution image by first downscaling the image to a lower resolution and then upscaling it back to its original size. This process introduces loss of detail and can be used to simulate low-quality images or to test the robustness of models to different image resolutions.
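
The core of the transform is a resize round trip. Below is a minimal sketch of that idea using OpenCV directly; the fixed scale of 0.25 and the nearest-neighbor interpolation are illustrative assumptions, not the library's internal code.

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
scale = 0.25  # in the real transform this is sampled from scale_range

h, w = image.shape[:2]
small = cv2.resize(image, (int(w * scale), int(h * scale)), interpolation=cv2.INTER_NEAREST)
degraded = cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)  # back to the original size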

Parameters:

Name Type Description scale_range tuple[float, float]

Range for the downscaling factor. Should be two float values between 0 and 1, where the first value is less than or equal to the second. The actual downscaling factor will be randomly chosen from this range for each image. Lower values result in more aggressive downscaling. Default: (0.25, 0.25)

interpolation_pair InterpolationDict

A dictionary specifying the interpolation methods to use for downscaling and upscaling. Should contain two keys: - 'downscale': Interpolation method for downscaling - 'upscale': Interpolation method for upscaling Values should be OpenCV interpolation flags (e.g., cv2.INTER_NEAREST, cv2.INTER_LINEAR, etc.) Default: {'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST}

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Note

  • The actual downscaling factor is randomly chosen for each image from the range specified in scale_range.
  • Using different interpolation methods for downscaling and upscaling can produce various effects. For example, using INTER_NEAREST for both can create a pixelated look, while using INTER_LINEAR or INTER_CUBIC can produce smoother results.
  • This transform can be useful for data augmentation, especially when training models that need to be robust to variations in image quality or resolution.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Downscale(
...     scale_range=(0.5, 0.75),
...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_LINEAR},
...     p=0.5
... )
>>> transformed = transform(image=image)
>>> downscaled_image = transformed['image']

Source code in albumentations/augmentations/transforms.py Python
class Downscale(ImageOnlyTransform):\n    \"\"\"Decrease image quality by downscaling and upscaling back.\n\n    This transform simulates the effect of a low-resolution image by first downscaling\n    the image to a lower resolution and then upscaling it back to its original size.\n    This process introduces loss of detail and can be used to simulate low-quality\n    images or to test the robustness of models to different image resolutions.\n\n    Args:\n        scale_range (tuple[float, float]): Range for the downscaling factor.\n            Should be two float values between 0 and 1, where the first value is less than or equal to the second.\n            The actual downscaling factor will be randomly chosen from this range for each image.\n            Lower values result in more aggressive downscaling.\n            Default: (0.25, 0.25)\n\n        interpolation_pair (InterpolationDict): A dictionary specifying the interpolation methods to use for\n            downscaling and upscaling. Should contain two keys:\n            - 'downscale': Interpolation method for downscaling\n            - 'upscale': Interpolation method for upscaling\n            Values should be OpenCV interpolation flags (e.g., cv2.INTER_NEAREST, cv2.INTER_LINEAR, etc.)\n            Default: {'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST}\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual downscaling factor is randomly chosen for each image from the range\n          specified in scale_range.\n        - Using different interpolation methods for downscaling and upscaling can produce\n          various effects. For example, using INTER_NEAREST for both can create a pixelated look,\n          while using INTER_LINEAR or INTER_CUBIC can produce smoother results.\n        - This transform can be useful for data augmentation, especially when training models\n          that need to be robust to variations in image quality or resolution.\n\n    Example:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> transform = A.Downscale(\n        ...     scale_range=(0.5, 0.75),\n        ...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_LINEAR},\n        ...     p=0.5\n        ... )\n        >>> transformed = transform(image=image)\n        >>> downscaled_image = transformed['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_min: float | None\n        scale_max: float | None\n\n        interpolation: int | Interpolation | InterpolationDict | None = Field(\n            default_factory=lambda: Interpolation(\n                downscale=cv2.INTER_NEAREST,\n                upscale=cv2.INTER_NEAREST,\n            ),\n        )\n        interpolation_pair: InterpolationPydantic\n\n        scale_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_params(self) -> Self:\n            if self.scale_min is not None and self.scale_max is not None:\n                warn(\n                    \"scale_min and scale_max are deprecated. 
Use scale_range instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n                self.scale_range = (self.scale_min, self.scale_max)\n                self.scale_min = None\n                self.scale_max = None\n\n            if self.interpolation is not None:\n                warn(\n                    \"Downscale.interpolation is deprecated. Use Downscale.interpolation_pair instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n                if isinstance(self.interpolation, dict):\n                    self.interpolation_pair = InterpolationPydantic(\n                        **self.interpolation,\n                    )\n                elif isinstance(self.interpolation, int):\n                    self.interpolation_pair = InterpolationPydantic(\n                        upscale=self.interpolation,\n                        downscale=self.interpolation,\n                    )\n                elif isinstance(self.interpolation, Interpolation):\n                    self.interpolation_pair = InterpolationPydantic(\n                        upscale=self.interpolation.upscale,\n                        downscale=self.interpolation.downscale,\n                    )\n                self.interpolation = None\n\n            return self\n\n    def __init__(\n        self,\n        scale_min: float | None = None,\n        scale_max: float | None = None,\n        interpolation: int | Interpolation | InterpolationDict | None = None,\n        scale_range: tuple[float, float] = (0.25, 0.25),\n        interpolation_pair: InterpolationDict = InterpolationDict(\n            {\"upscale\": cv2.INTER_NEAREST, \"downscale\": cv2.INTER_NEAREST},\n        ),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_range = scale_range\n        self.interpolation_pair = interpolation_pair\n\n    def apply(self, img: np.ndarray, scale: float, **params: Any) -> np.ndarray:\n        return fmain.downscale(\n            img,\n            scale=scale,\n            down_interpolation=self.interpolation_pair[\"downscale\"],\n            up_interpolation=self.interpolation_pair[\"upscale\"],\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\"scale\": self.py_random.uniform(*self.scale_range)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"scale_range\", \"interpolation_pair\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Emboss","title":"class Emboss (alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply embossing effect to the input image.

This transform creates an emboss effect by highlighting edges and creating a 3D-like texture in the image. It works by applying a specific convolution kernel to the image that emphasizes differences in adjacent pixel values.
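
The sketch below shows the kind of kernel involved: an identity kernel blended with an emboss kernel and applied with cv2.filter2D. The hard-coded alpha and strength values are illustrative; the actual transform samples them from the ranges described below.

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
alpha, strength = 0.5, 0.5  # sampled from `alpha` and `strength` in the real transform

identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)
emboss = np.array(
    [[-1 - strength, -strength, 0],
     [-strength, 1, strength],
     [0, strength, 1 + strength]],
    dtype=np.float32,
)
kernel = (1 - alpha) * identity + alpha * emboss  # the blend controls how visible the effect is
embossed = cv2.filter2D(image, -1, kernel)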

Parameters:

Name Type Description alpha tuple[float, float]

Range to choose the visibility of the embossed image. At 0, only the original image is visible; at 1.0, only its embossed version is visible. Values should be in the range [0, 1]. Alpha will be randomly selected from this range for each image. Default: (0.2, 0.5)

strength tuple[float, float]

Range to choose the strength of the embossing effect. Higher values create a more pronounced 3D effect. Values should be non-negative. Strength will be randomly selected from this range for each image. Default: (0.2, 0.7)

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Note

  • The emboss effect is created using a 3x3 convolution kernel.
  • The 'alpha' parameter controls the blend between the original image and the embossed version. A higher alpha value will result in a more pronounced emboss effect.
  • The 'strength' parameter affects the intensity of the embossing. Higher strength values will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.
  • This transform can be useful for creating artistic effects or for data augmentation in tasks where edge information is important.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)
>>> result = transform(image=image)
>>> embossed_image = result['image']

References

  • https://en.wikipedia.org/wiki/Image_embossing
  • https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing

Source code in albumentations/augmentations/transforms.py Python
class Emboss(ImageOnlyTransform):\n    \"\"\"Apply embossing effect to the input image.\n\n    This transform creates an emboss effect by highlighting edges and creating a 3D-like texture\n    in the image. It works by applying a specific convolution kernel to the image that emphasizes\n    differences in adjacent pixel values.\n\n    Args:\n        alpha (tuple[float, float]): Range to choose the visibility of the embossed image.\n            At 0, only the original image is visible, at 1.0 only its embossed version is visible.\n            Values should be in the range [0, 1].\n            Alpha will be randomly selected from this range for each image.\n            Default: (0.2, 0.5)\n\n        strength (tuple[float, float]): Range to choose the strength of the embossing effect.\n            Higher values create a more pronounced 3D effect.\n            Values should be non-negative.\n            Strength will be randomly selected from this range for each image.\n            Default: (0.2, 0.7)\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The emboss effect is created using a 3x3 convolution kernel.\n        - The 'alpha' parameter controls the blend between the original image and the embossed version.\n          A higher alpha value will result in a more pronounced emboss effect.\n        - The 'strength' parameter affects the intensity of the embossing. Higher strength values\n          will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.\n        - This transform can be useful for creating artistic effects or for data augmentation\n          in tasks where edge information is important.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)\n        >>> result = transform(image=image)\n        >>> embossed_image = result['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Image_embossing\n        - https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        strength: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, None))]\n\n    def __init__(\n        self,\n        alpha: tuple[float, float] = (0.2, 0.5),\n        strength: tuple[float, float] = (0.2, 0.7),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n        self.strength = strength\n\n    @staticmethod\n    def __generate_emboss_matrix(\n        alpha_sample: np.ndarray,\n        strength_sample: np.ndarray,\n    ) -> np.ndarray:\n        matrix_nochange = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)\n        matrix_effect = np.array(\n            [\n                [-1 - strength_sample, 0 - strength_sample, 0],\n                [0 - strength_sample, 1, 0 + strength_sample],\n                [0, 0 + strength_sample, 1 + strength_sample],\n            ],\n            dtype=np.float32,\n        )\n        return (1 - alpha_sample) * matrix_nochange + alpha_sample * 
matrix_effect\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        alpha = self.py_random.uniform(*self.alpha)\n        strength = self.py_random.uniform(*self.strength)\n        emboss_matrix = self.__generate_emboss_matrix(\n            alpha_sample=alpha,\n            strength_sample=strength,\n        )\n        return {\"emboss_matrix\": emboss_matrix}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        emboss_matrix: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.convolve(img, emboss_matrix)\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"alpha\", \"strength\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Equalize","title":"class Equalize (mode='cv', by_channels=True, mask=None, mask_params=(), always_apply=None, p=0.5) [view source on GitHub]","text":"

Equalize the image histogram.

This transform applies histogram equalization to the input image. Histogram equalization is a method in image processing of contrast adjustment using the image's histogram.
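
As a rough illustration of the by_channels=False path described below, the sketch converts the image to YCrCb, equalizes only the luminance channel with cv2.equalizeHist, and converts back; the exact color conversion used by the library may differ.

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

ycrcb = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)
ycrcb[..., 0] = cv2.equalizeHist(ycrcb[..., 0])  # equalize the luminance channel only
equalized = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2RGB)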

Parameters:

Name Type Description mode Literal['cv', 'pil']

Use OpenCV or Pillow equalization method. Default: 'cv'

by_channels bool

If True, use equalization by channels separately, else convert image to YCbCr representation and use equalization by Y channel. Default: True

mask np.ndarray, callable

If given, only the pixels selected by the mask are included in the analysis. Can be: - A 1-channel or 3-channel numpy array of the same size as the input image. - A callable (function) that generates a mask. The function should accept 'image' as its first argument, and can accept additional arguments specified in mask_params. Default: None

mask_params list[str]

Additional parameters to pass to the mask function. These parameters will be taken from the data dict passed to __call__. Default: ()

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Note

  • When mode='cv', OpenCV's equalizeHist() function is used.
  • When mode='pil', Pillow's equalize() function is used.
  • The 'by_channels' parameter determines whether equalization is applied to each color channel independently (True) or to the luminance channel only (False).
  • If a mask is provided as a numpy array, it should have the same height and width as the input image.
  • If a mask is provided as a function, it allows for dynamic mask generation based on the input image and additional parameters. This is useful for scenarios where the mask depends on the image content or external data (e.g., bounding boxes, segmentation masks).

Mask Function: When mask is a callable, it should have the following signature: mask_func(image, *args) -> np.ndarray

  • image: The input image (numpy array)
  • *args: Additional arguments as specified in mask_params

The function should return a numpy array of the same height and width as the input image, where non-zero pixels indicate areas to be equalized.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # Using a static mask
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> transform = A.Equalize(mask=mask, p=1.0)
>>> result = transform(image=image)
>>>
>>> # Using a dynamic mask function
>>> def mask_func(image, bboxes):
...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)
...     for bbox in bboxes:
...         x1, y1, x2, y2 = map(int, bbox)
...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes
...     return mask
>>>
>>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)
>>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes
>>> result = transform(image=image, bboxes=bboxes)

References

  • OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e
  • Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize
  • Histogram Equalization: https://en.wikipedia.org/wiki/Histogram_equalization

Source code in albumentations/augmentations/transforms.py Python
class Equalize(ImageOnlyTransform):\n    \"\"\"Equalize the image histogram.\n\n    This transform applies histogram equalization to the input image. Histogram equalization\n    is a method in image processing of contrast adjustment using the image's histogram.\n\n    Args:\n        mode (Literal['cv', 'pil']): Use OpenCV or Pillow equalization method.\n            Default: 'cv'\n        by_channels (bool): If True, use equalization by channels separately,\n            else convert image to YCbCr representation and use equalization by `Y` channel.\n            Default: True\n        mask (np.ndarray, callable): If given, only the pixels selected by\n            the mask are included in the analysis. Can be:\n            - A 1-channel or 3-channel numpy array of the same size as the input image.\n            - A callable (function) that generates a mask. The function should accept 'image'\n              as its first argument, and can accept additional arguments specified in mask_params.\n            Default: None\n        mask_params (list[str]): Additional parameters to pass to the mask function.\n            These parameters will be taken from the data dict passed to __call__.\n            Default: ()\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - When mode='cv', OpenCV's equalizeHist() function is used.\n        - When mode='pil', Pillow's equalize() function is used.\n        - The 'by_channels' parameter determines whether equalization is applied to each color channel\n          independently (True) or to the luminance channel only (False).\n        - If a mask is provided as a numpy array, it should have the same height and width as the input image.\n        - If a mask is provided as a function, it allows for dynamic mask generation based on the input image\n          and additional parameters. This is useful for scenarios where the mask depends on the image content\n          or external data (e.g., bounding boxes, segmentation masks).\n\n    Mask Function:\n        When mask is a callable, it should have the following signature:\n        mask_func(image, *args) -> np.ndarray\n\n        - image: The input image (numpy array)\n        - *args: Additional arguments as specified in mask_params\n\n        The function should return a numpy array of the same height and width as the input image,\n        where non-zero pixels indicate areas to be equalized.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>>\n        >>> # Using a static mask\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> transform = A.Equalize(mask=mask, p=1.0)\n        >>> result = transform(image=image)\n        >>>\n        >>> # Using a dynamic mask function\n        >>> def mask_func(image, bboxes):\n        ...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)\n        ...     for bbox in bboxes:\n        ...         x1, y1, x2, y2 = map(int, bbox)\n        ...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes\n        ...     
return mask\n        >>>\n        >>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)\n        >>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes\n        >>> result = transform(image=image, bboxes=bboxes)\n\n    References:\n        - OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e\n        - Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize\n        - Histogram Equalization: https://en.wikipedia.org/wiki/Histogram_equalization\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: ImageMode\n        by_channels: bool\n        mask: np.ndarray | Callable[..., Any] | None\n        mask_params: Sequence[str]\n\n    def __init__(\n        self,\n        mode: ImageMode = \"cv\",\n        by_channels: bool = True,\n        mask: np.ndarray | Callable[..., Any] | None = None,\n        mask_params: Sequence[str] = (),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.mode = mode\n        self.by_channels = by_channels\n        self.mask = mask\n        self.mask_params = mask_params\n\n    def apply(self, img: np.ndarray, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.equalize(\n            img,\n            mode=self.mode,\n            by_channels=self.by_channels,\n            mask=mask,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if not callable(self.mask):\n            return {\"mask\": self.mask}\n\n        mask_params = {\"image\": data[\"image\"]}\n        for key in self.mask_params:\n            if key not in data:\n                raise KeyError(\n                    f\"Required parameter '{key}' for mask function is missing in data.\",\n                )\n            mask_params[key] = data[key]\n\n        return {\"mask\": self.mask(**mask_params)}\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [*list(self.mask_params)]\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mode\", \"by_channels\", \"mask\", \"mask_params\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.FancyPCA","title":"class FancyPCA (alpha=0.1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Fancy PCA augmentation to the input image.

This augmentation technique applies PCA (Principal Component Analysis) to the image's color channels, then adds multiples of the principal components to the image, with magnitudes proportional to the corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard deviation 'alpha'.
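
For intuition, here is a sketch of the classic AlexNet-style PCA color augmentation the paragraph describes, written in plain NumPy; it is not the library's fancy_pca implementation, and dtype handling is simplified.

Python
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
alpha = 0.1

flat = image.reshape(-1, 3).astype(np.float64) / 255.0
cov = np.cov(flat - flat.mean(axis=0), rowvar=False)  # 3x3 channel covariance
eigvals, eigvecs = np.linalg.eigh(cov)                # principal components of the color space

noise = np.random.normal(0.0, alpha, size=3)          # one Gaussian draw per component
delta = eigvecs @ (eigvals * noise)                   # eigenvalue-weighted sum of components
augmented = np.clip(flat + delta, 0.0, 1.0).reshape(image.shape)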

Parameters:

Name Type Description alpha tuple[float, float] | float

Standard deviation of the Gaussian distribution used to generate random noise for each principal component. If a single float is provided, it will be used for all channels. If a tuple of two floats (min, max) is provided, the standard deviation will be uniformly sampled from this range for each run. Default: 0.1.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: any

Note

  • This augmentation is particularly effective for RGB images but can work with any number of channels.
  • For grayscale images, it applies a simplified version of the augmentation.
  • The transform preserves the mean of the image while adjusting the color/intensity variation.
  • This implementation is based on the paper by Krizhevsky et al. and is similar to the one used in the original AlexNet paper.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.FancyPCA(alpha=0.1, p=1.0)
>>> result = transform(image=image)
>>> augmented_image = result["image"]

References

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
  • https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Source code in albumentations/augmentations/transforms.py Python
class FancyPCA(ImageOnlyTransform):\n    \"\"\"Apply Fancy PCA augmentation to the input image.\n\n    This augmentation technique applies PCA (Principal Component Analysis) to the image's color channels,\n    then adds multiples of the principal components to the image, with magnitudes proportional to the\n    corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard\n    deviation 'alpha'.\n\n    Args:\n        alpha (tuple[float, float] | float): Standard deviation of the Gaussian distribution used to generate\n            random noise for each principal component. If a single float is provided, it will be used for\n            all channels. If a tuple of two floats (min, max) is provided, the standard deviation will be\n            uniformly sampled from this range for each run. Default: 0.1.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n\n    Note:\n        - This augmentation is particularly effective for RGB images but can work with any number of channels.\n        - For grayscale images, it applies a simplified version of the augmentation.\n        - The transform preserves the mean of the image while adjusting the color/intensity variation.\n        - This implementation is based on the paper by Krizhevsky et al. and is similar to the one used\n          in the original AlexNet paper.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.FancyPCA(alpha=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> augmented_image = result[\"image\"]\n\n    References:\n        - Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep\n          convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).\n        - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: float = Field(ge=0)\n\n    def __init__(\n        self,\n        alpha: float = 0.1,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha_vector: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.fancy_pca(img, alpha_vector)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        shape = params[\"shape\"]\n        num_channels = shape[-1] if len(shape) == NUM_MULTI_CHANNEL_DIMENSIONS else 1\n        alpha_vector = self.random_generator.normal(0, self.alpha, num_channels).astype(\n            np.float32,\n        )\n        return {\"alpha_vector\": alpha_vector}\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"alpha\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.FromFloat","title":"class FromFloat (dtype='uint8', max_value=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

Convert an image from floating point representation to the specified data type.

This transform is designed to convert images from a normalized floating-point representation (typically with values in the range [0, 1]) to other data types, scaling the values appropriately.
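
Conceptually the conversion is just a scale-and-cast, as in the sketch below; the inferred max_value of 255 and the rounding step are assumptions for illustration, not necessarily the exact behavior of from_float.

Python
import numpy as np

float_image = np.random.rand(100, 100, 3).astype(np.float32)  # values in [0, 1]
max_value = 255  # assumed inferred maximum for dtype='uint8'

uint8_image = np.clip(np.rint(float_image * max_value), 0, max_value).astype(np.uint8)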

Parameters:

Name Type Description dtype str

The desired output data type. Supported types include 'uint8', 'uint16', 'float32', and 'float64'. Default: 'uint8'.

max_value float | None

The maximum value for the output dtype. If None, the transform will attempt to infer the maximum value based on the dtype. Default: None.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: float32, float64

Note

  • This is the inverse transform for ToFloat.
  • Input images are expected to be in floating point format with values in the range [0, 1].
  • For integer output types (uint8, uint16, uint32), the function will scale the values to the appropriate range (e.g., 0-255 for uint8).
  • For float output types (float32, float64), the values will remain in the [0, 1] range.
  • The transform uses the from_float function internally, which ensures output values are within the valid range for the specified dtype.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> transform = A.FromFloat(dtype='uint8', max_value=None, p=1.0)
>>> image = np.random.rand(100, 100, 3).astype(np.float32)  # Float image in [0, 1] range
>>> result = transform(image=image)
>>> uint8_image = result['image']
>>> assert uint8_image.dtype == np.uint8
>>> assert uint8_image.min() >= 0 and uint8_image.max() <= 255

Source code in albumentations/augmentations/transforms.py Python
class FromFloat(ImageOnlyTransform):\n    \"\"\"Convert an image from floating point representation to the specified data type.\n\n    This transform is designed to convert images from a normalized floating-point representation\n    (typically with values in the range [0, 1]) to other data types, scaling the values appropriately.\n\n    Args:\n        dtype (str): The desired output data type. Supported types include 'uint8', 'uint16',\n                     'uint32'. Default: 'uint8'.\n        max_value (float | None): The maximum value for the output dtype. If None, the transform\n                                  will attempt to infer the maximum value based on the dtype.\n                                  Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        float32, float64\n\n    Note:\n        - This is the inverse transform for ToFloat.\n        - Input images are expected to be in floating point format with values in the range [0, 1].\n        - For integer output types (uint8, uint16, uint32), the function will scale the values\n          to the appropriate range (e.g., 0-255 for uint8).\n        - For float output types (float32, float64), the values will remain in the [0, 1] range.\n        - The transform uses the `from_float` function internally, which ensures output values\n          are within the valid range for the specified dtype.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> transform = A.FromFloat(dtype='uint8', max_value=None, p=1.0)\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)  # Float image in [0, 1] range\n        >>> result = transform(image=image)\n        >>> uint8_image = result['image']\n        >>> assert uint8_image.dtype == np.uint8\n        >>> assert uint8_image.min() >= 0 and uint8_image.max() <= 255\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        dtype: Literal[\"uint8\", \"uint16\", \"float32\", \"float64\"]\n        max_value: float | None\n\n    def __init__(\n        self,\n        dtype: Literal[\"uint8\", \"uint16\", \"float32\", \"float64\"] = \"uint8\",\n        max_value: float | None = None,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.dtype = np.dtype(dtype)\n        self.max_value = max_value\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return from_float(img, self.dtype, self.max_value)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\"dtype\": self.dtype.name, \"max_value\": self.max_value}\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussNoise","title":"class GaussNoise (var_limit=None, mean=None, std_range=(0.2, 0.44), mean_range=(0.0, 0.0), per_channel=True, noise_scale_factor=1, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Gaussian noise to the input image.

Parameters:

Name Type Description std_range tuple[float, float]

Range for noise standard deviation as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [0, 1]. Default: (0.2, 0.44).

mean_range tuple[float, float]

Range for noise mean as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [-1, 1]. Default: (0.0, 0.0).

var_limit tuple[float, float] | float

[Deprecated] Variance range for noise. If var_limit is a single float value, the range will be (0, var_limit). Default: (10.0, 50.0).

mean float

[Deprecated] Mean of the noise. Default: 0.

per_channel bool

If True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Default: True.

noise_scale_factor float

Scaling factor for noise generation. Value should be in the range (0, 1]. When set to 1, noise is sampled for each pixel independently. If less, noise is sampled for a smaller size and resized to fit the shape of the image. Smaller values make the transform faster. Default: 1.0.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The noise parameters (std_range and mean_range) are normalized to the [0, 1] range:
      • For uint8 images, they are multiplied by 255
      • For float32 images, they are used directly
  • The behavior differs between the old and new parameters (see the sketch after this list):
      • When using var_limit (deprecated): the variance is sampled uniformly and its square root is taken as the std dev
      • When using std_range: the standard deviation is sampled directly (aligned with torchvision/kornia)
  • Setting per_channel=False is faster but applies the same noise to all channels
  • The noise_scale_factor parameter allows for a trade-off between transform speed and noise granularity
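
The sketch below contrasts the two sampling behaviors; the std_range values are illustrative and mirror the logic shown in the source code further down.

Python
import math
import random

std_range = (0.2, 0.44)

# Deprecated var_limit path: sample the variance uniformly, then take its square root.
var = random.uniform(std_range[0] ** 2, std_range[1] ** 2)
sigma_legacy = math.sqrt(var)

# New std_range path: sample the standard deviation directly.
sigma_new = random.uniform(*std_range)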

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
>>>
>>> # Apply Gaussian noise with normalized std_range
>>> transform = A.GaussNoise(std_range=(0.1, 0.2), p=1.0)  # 10-20% of max value
>>> noisy_image = transform(image=image)['image']
>>>
>>> # Using deprecated var_limit (will be converted to std_range)
>>> transform = A.GaussNoise(var_limit=(50.0, 100.0), mean=10, p=1.0)
>>> noisy_image = transform(image=image)['image']

Source code in albumentations/augmentations/transforms.py Python
class GaussNoise(ImageOnlyTransform):\n    \"\"\"Apply Gaussian noise to the input image.\n\n    Args:\n        std_range (tuple[float, float]): Range for noise standard deviation as a fraction\n            of the maximum value (255 for uint8 images or 1.0 for float images).\n            Values should be in range [0, 1]. Default: (0.2, 0.44).\n        mean_range (tuple[float, float]): Range for noise mean as a fraction\n            of the maximum value (255 for uint8 images or 1.0 for float images).\n            Values should be in range [-1, 1]. Default: (0.0, 0.0).\n        var_limit (tuple[float, float] | float): [Deprecated] Variance range for noise.\n            If var_limit is a single float value, the range will be (0, var_limit).\n            Default: (10.0, 50.0).\n        mean (float): [Deprecated] Mean of the noise. Default: 0.\n        per_channel (bool): If True, noise will be sampled for each channel independently.\n            Otherwise, the noise will be sampled once for all channels. Default: True.\n        noise_scale_factor (float): Scaling factor for noise generation. Value should be in the range (0, 1].\n            When set to 1, noise is sampled for each pixel independently. If less, noise is sampled for a smaller size\n            and resized to fit the shape of the image. Smaller values make the transform faster. Default: 1.0.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The noise parameters (std_range and mean_range) are normalized to [0, 1] range:\n          * For uint8 images, they are multiplied by 255\n          * For float32 images, they are used directly\n        - The behavior differs between old and new parameters:\n          * When using var_limit (deprecated): samples variance uniformly and takes sqrt to get std dev\n          * When using std_range: samples standard deviation directly (aligned with torchvision/kornia)\n        - Setting per_channel=False is faster but applies the same noise to all channels\n        - The noise_scale_factor parameter allows for a trade-off between transform speed and noise granularity\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)\n        >>>\n        >>> # Apply Gaussian noise with normalized std_range\n        >>> transform = A.GaussNoise(std_range=(0.1, 0.2), p=1.0)  # 10-20% of max value\n        >>> noisy_image = transform(image=image)['image']\n        >>>\n        >>> # Using deprecated var_limit (will be converted to std_range)\n        >>> transform = A.GaussNoise(var_limit=(50.0, 100.0), mean=10, p=1.0)\n        >>> noisy_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        var_limit: ScaleFloatType | None\n        mean: float | None\n        std_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        mean_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        per_channel: bool\n        noise_scale_factor: float = Field(gt=0, le=1)\n\n        @model_validator(mode=\"after\")\n        def check_range(self) -> Self:\n            if 
self.var_limit is not None:\n                warnings.warn(\"`var_limit` deprecated. Use `std_range` instead.\", DeprecationWarning, stacklevel=2)\n                self.var_limit = to_tuple(self.var_limit, 0)\n                if self.var_limit[1] > 1:\n                    # Convert legacy uint8 variance to normalized std dev\n                    self.std_range = (math.sqrt(10 / 255), math.sqrt(50 / 255))\n                else:\n                    # Already normalized variance, convert to std dev\n                    self.std_range = (\n                        math.sqrt(self.var_limit[0]),\n                        math.sqrt(self.var_limit[1]),\n                    )\n\n            if self.mean is not None:\n                warn(\"`mean` deprecated. Use `mean_range` instead.\", DeprecationWarning, stacklevel=2)\n                if self.mean >= 1:\n                    # Convert legacy uint8 mean to normalized range\n                    self.mean_range = (self.mean / 255, self.mean / 255)\n                else:\n                    # Already normalized mean\n                    self.mean_range = (self.mean, self.mean)\n\n            return self\n\n    def __init__(\n        self,\n        var_limit: ScaleFloatType | None = None,\n        mean: float | None = None,\n        std_range: tuple[float, float] = (0.2, 0.44),  # sqrt(10 / 255), sqrt(50 / 255)\n        mean_range: tuple[float, float] = (0.0, 0.0),\n        per_channel: bool = True,\n        noise_scale_factor: float = 1,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.std_range = std_range\n        self.mean_range = mean_range\n        self.per_channel = per_channel\n        self.noise_scale_factor = noise_scale_factor\n\n        self.var_limit = var_limit\n\n    def apply(\n        self,\n        img: np.ndarray,\n        noise_map: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_noise(img, noise_map)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, float]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n\n        if self.var_limit is not None:\n            # Legacy behavior: sample variance uniformly then take sqrt\n            var = self.py_random.uniform(self.std_range[0] ** 2, self.std_range[1] ** 2)\n            sigma = math.sqrt(var)\n        else:\n            # New behavior: sample std dev directly (aligned with torchvision/kornia)\n            sigma = self.py_random.uniform(*self.std_range)\n\n        mean = self.py_random.uniform(*self.mean_range)\n\n        noise_map = fmain.generate_noise(\n            noise_type=\"gaussian\",\n            spatial_mode=\"per_pixel\" if self.per_channel else \"shared\",\n            shape=image.shape,\n            params={\"mean_range\": (mean, mean), \"std_range\": (sigma, sigma)},\n            max_value=max_value,\n            approximation=self.noise_scale_factor,\n            random_generator=self.random_generator,\n        )\n\n        return {\"noise_map\": noise_map}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"std_range\", \"mean_range\", \"per_channel\", \"noise_scale_factor\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussianParams","title":"class GaussianParams ","text":"

Source code in albumentations/augmentations/transforms.py Python
class GaussianParams(NoiseParamsBase):\n    noise_type: Literal[\"gaussian\"] = \"gaussian\"\n    mean_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=-1, max_val=1)),\n    ]\n    std_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.HueSaturationValue","title":"class HueSaturationValue (hue_shift_limit=(-20, 20), sat_shift_limit=(-30, 30), val_shift_limit=(-20, 20), always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly change hue, saturation and value of the input image.

This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image. It allows for independent control over each channel, providing a wide range of color and brightness modifications.
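
The sketch below shows the general idea with OpenCV (convert to HSV, shift each channel independently, convert back); it is an illustration rather than the library's shift_hsv, and the shift values are arbitrary.

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
hue_shift, sat_shift, val_shift = 10, 20, -15  # sampled from the *_shift_limit ranges in practice

hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV).astype(np.int32)
hsv[..., 0] = (hsv[..., 0] + hue_shift) % 180           # hue is circular and wraps at 180
hsv[..., 1] = np.clip(hsv[..., 1] + sat_shift, 0, 255)  # saturation saturates at the bounds
hsv[..., 2] = np.clip(hsv[..., 2] + val_shift, 0, 255)  # value saturates at the bounds
shifted = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)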

Parameters:

Name Type Description hue_shift_limit float | tuple[float, float]

Range for changing hue. If a single float value is provided, the range will be (-hue_shift_limit, hue_shift_limit). Values should be in the range [-180, 180]. Default: (-20, 20).

sat_shift_limit float | tuple[float, float]

Range for changing saturation. If a single float value is provided, the range will be (-sat_shift_limit, sat_shift_limit). Values should be in the range [-255, 255]. Default: (-30, 30).

val_shift_limit float | tuple[float, float]

Range for changing value (brightness). If a single float value is provided, the range will be (-val_shift_limit, val_shift_limit). Values should be in the range [-255, 255]. Default: (-20, 20).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The transform first converts the input RGB image to the HSV color space.
  • Each channel (Hue, Saturation, Value) is adjusted independently.
  • Hue is circular, so it wraps around at 180 degrees.
  • For float32 images, the shift values are applied as percentages of the full range.
  • This transform is particularly useful for color augmentation and simulating different lighting conditions.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.HueSaturationValue(
...     hue_shift_limit=20,
...     sat_shift_limit=30,
...     val_shift_limit=20,
...     p=0.7
... )
>>> result = transform(image=image)
>>> augmented_image = result["image"]

References

  • HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV

Source code in albumentations/augmentations/transforms.py Python
class HueSaturationValue(ImageOnlyTransform):\n    \"\"\"Randomly change hue, saturation and value of the input image.\n\n    This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image.\n    It allows for independent control over each channel, providing a wide range of color\n    and brightness modifications.\n\n    Args:\n        hue_shift_limit (float | tuple[float, float]): Range for changing hue.\n            If a single float value is provided, the range will be (-hue_shift_limit, hue_shift_limit).\n            Values should be in the range [-180, 180]. Default: (-20, 20).\n\n        sat_shift_limit (float | tuple[float, float]): Range for changing saturation.\n            If a single float value is provided, the range will be (-sat_shift_limit, sat_shift_limit).\n            Values should be in the range [-255, 255]. Default: (-30, 30).\n\n        val_shift_limit (float | tuple[float, float]): Range for changing value (brightness).\n            If a single float value is provided, the range will be (-val_shift_limit, val_shift_limit).\n            Values should be in the range [-255, 255]. Default: (-20, 20).\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The transform first converts the input RGB image to the HSV color space.\n        - Each channel (Hue, Saturation, Value) is adjusted independently.\n        - Hue is circular, so it wraps around at 180 degrees.\n        - For float32 images, the shift values are applied as percentages of the full range.\n        - This transform is particularly useful for color augmentation and simulating\n          different lighting conditions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.HueSaturationValue(\n        ...     hue_shift_limit=20,\n        ...     sat_shift_limit=30,\n        ...     val_shift_limit=20,\n        ...     p=0.7\n        ... 
)\n        >>> result = transform(image=image)\n        >>> augmented_image = result[\"image\"]\n\n    References:\n        - HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        hue_shift_limit: SymmetricRangeType\n        sat_shift_limit: SymmetricRangeType\n        val_shift_limit: SymmetricRangeType\n\n    def __init__(\n        self,\n        hue_shift_limit: ScaleFloatType = (-20, 20),\n        sat_shift_limit: ScaleFloatType = (-30, 30),\n        val_shift_limit: ScaleFloatType = (-20, 20),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.hue_shift_limit = cast(tuple[float, float], hue_shift_limit)\n        self.sat_shift_limit = cast(tuple[float, float], sat_shift_limit)\n        self.val_shift_limit = cast(tuple[float, float], val_shift_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        hue_shift: int,\n        sat_shift: int,\n        val_shift: int,\n        **params: Any,\n    ) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"HueSaturationValue transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n        return fmain.shift_hsv(img, hue_shift, sat_shift, val_shift)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"hue_shift\": self.py_random.uniform(*self.hue_shift_limit),\n            \"sat_shift\": self.py_random.uniform(*self.sat_shift_limit),\n            \"val_shift\": self.py_random.uniform(*self.val_shift_limit),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"hue_shift_limit\", \"sat_shift_limit\", \"val_shift_limit\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ISONoise","title":"class ISONoise (color_shift=(0.01, 0.05), intensity=(0.1, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies camera sensor noise to the input image, simulating high ISO settings.

This transform adds random noise to an image, mimicking the effect of using high ISO settings in digital photography. It simulates two main components of ISO noise:

  1. Color noise: random shifts in color hue
  2. Luminance noise: random variations in pixel intensity

Parameters:

Name Type Description color_shift tuple[float, float]

Range for changing color hue. Values should be in the range [0, 1], where 1 represents a full 360° hue rotation. Default: (0.01, 0.05)

intensity tuple[float, float]

Range for the noise intensity. Higher values increase the strength of both color and luminance noise. Default: (0.1, 0.5)

p float

Probability of applying the transform. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: 3

Note

  • This transform only works with RGB images. It will raise a TypeError if applied to non-RGB images.
  • The color shift is applied in the HSV color space, affecting the hue channel.
  • Luminance noise is added to all channels independently.
  • This transform can be useful for data augmentation in low-light scenarios or when training models to be robust against noisy inputs.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=0.5)\n>>> result = transform(image=image)\n>>> noisy_image = result[\"image\"]\n

References

  • ISO noise in digital photography: https://en.wikipedia.org/wiki/Image_noise#In_digital_cameras

Source code in albumentations/augmentations/transforms.py Python
class ISONoise(ImageOnlyTransform):\n    \"\"\"Applies camera sensor noise to the input image, simulating high ISO settings.\n\n    This transform adds random noise to an image, mimicking the effect of using high ISO settings\n    in digital photography. It simulates two main components of ISO noise:\n    1. Color noise: random shifts in color hue\n    2. Luminance noise: random variations in pixel intensity\n\n    Args:\n        color_shift (tuple[float, float]): Range for changing color hue.\n            Values should be in the range [0, 1], where 1 represents a full 360\u00b0 hue rotation.\n            Default: (0.01, 0.05)\n\n        intensity (tuple[float, float]): Range for the noise intensity.\n            Higher values increase the strength of both color and luminance noise.\n            Default: (0.1, 0.5)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - This transform only works with RGB images. It will raise a TypeError if applied to\n          non-RGB images.\n        - The color shift is applied in the HSV color space, affecting the hue channel.\n        - Luminance noise is added to all channels independently.\n        - This transform can be useful for data augmentation in low-light scenarios or when\n          training models to be robust against noisy inputs.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=0.5)\n        >>> result = transform(image=image)\n        >>> noisy_image = result[\"image\"]\n\n    References:\n        - ISO noise in digital photography:\n          https://en.wikipedia.org/wiki/Image_noise#In_digital_cameras\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        color_shift: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        intensity: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n\n    def __init__(\n        self,\n        color_shift: tuple[float, float] = (0.01, 0.05),\n        intensity: tuple[float, float] = (0.1, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.intensity = intensity\n        self.color_shift = color_shift\n\n    def apply(\n        self,\n        img: np.ndarray,\n        color_shift: float,\n        intensity: float,\n        random_seed: int,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.iso_noise(\n            img,\n            color_shift,\n            intensity,\n            np.random.default_rng(random_seed),\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        random_seed = self.random_generator.integers(0, 2**32 - 1)\n        return {\n            \"color_shift\": self.py_random.uniform(*self.color_shift),\n            \"intensity\": self.py_random.uniform(*self.intensity),\n            \"random_seed\": random_seed,\n        }\n\n    def 
get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"intensity\", \"color_shift\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Illumination","title":"class Illumination (mode='linear', intensity_range=(0.01, 0.2), effect_type='both', angle_range=(0, 360), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply various illumination effects to the image.

This transform simulates different lighting conditions by applying controlled illumination patterns. It can create effects like:

  • Directional lighting (linear mode)
  • Corner shadows/highlights (corner mode)
  • Spotlights or local lighting (gaussian mode)

These effects can be used to:

  • Simulate natural lighting variations
  • Add dramatic lighting effects
  • Create synthetic shadows or highlights
  • Augment training data with different lighting conditions

Parameters:

Name Type Description mode Literal["linear", "corner", "gaussian"]

Type of illumination pattern:

  • 'linear': Creates a smooth gradient across the image, simulating directional lighting like sunlight through a window
  • 'corner': Applies gradient from any corner, simulating light source from a corner
  • 'gaussian': Creates a circular spotlight effect, simulating local light sources

Default: 'linear'

intensity_range tuple[float, float]

Range for effect strength. Values between 0.01 and 0.2:

  • 0.01-0.05: Subtle lighting changes
  • 0.05-0.1: Moderate lighting effects
  • 0.1-0.2: Strong lighting effects

Default: (0.01, 0.2)

effect_type str

Type of lighting change:

  • 'brighten': Only adds light (like a spotlight)
  • 'darken': Only removes light (like a shadow)
  • 'both': Randomly chooses between brightening and darkening

Default: 'both'

angle_range tuple[float, float]

Range for gradient angle in degrees. Controls direction of linear gradient:

  • 0°: Left to right
  • 90°: Top to bottom
  • 180°: Right to left
  • 270°: Bottom to top

Only used for 'linear' mode. Default: (0, 360)

center_range tuple[float, float]

Range for spotlight position. Values between 0 and 1 representing relative position:

  • (0, 0): Top-left corner
  • (1, 1): Bottom-right corner
  • (0.5, 0.5): Center of image

Only used for 'gaussian' mode. Default: (0.1, 0.9)

sigma_range tuple[float, float]

Range for spotlight size. Values between 0.2 and 1.0:

  • 0.2: Small, focused spotlight
  • 0.5: Medium-sized light area
  • 1.0: Broad, soft lighting

Only used for 'gaussian' mode. Default: (0.2, 1.0)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import albumentations as A\n>>> # Simulate sunlight through window\n>>> transform = A.Illumination(\n...     mode='linear',\n...     intensity_range=(0.05, 0.1),\n...     effect_type='brighten',\n...     angle_range=(30, 60)\n... )\n>>>\n>>> # Create dramatic corner shadow\n>>> transform = A.Illumination(\n...     mode='corner',\n...     intensity_range=(0.1, 0.2),\n...     effect_type='darken'\n... )\n>>>\n>>> # Add multiple spotlights\n>>> transform1 = A.Illumination(\n...     mode='gaussian',\n...     intensity_range=(0.05, 0.15),\n...     effect_type='brighten',\n...     center_range=(0.2, 0.4),\n...     sigma_range=(0.2, 0.3)\n... )\n>>> transform2 = A.Illumination(\n...     mode='gaussian',\n...     intensity_range=(0.05, 0.15),\n...     effect_type='darken',\n...     center_range=(0.6, 0.8),\n...     sigma_range=(0.3, 0.5)\n... )\n>>> transforms = A.Compose([transform1, transform2])\n

References

  • Lighting in Computer Vision: https://en.wikipedia.org/wiki/Lighting_in_computer_vision

  • Image-based lighting: https://en.wikipedia.org/wiki/Image-based_lighting

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination

  • Research on lighting augmentation: \"Learning Deep Representations of Fine-grained Visual Descriptions\" https://arxiv.org/abs/1605.05395

  • Photography lighting patterns: https://en.wikipedia.org/wiki/Lighting_pattern

Note

  • The transform preserves image range and dtype
  • Effects are applied multiplicatively to preserve texture
  • Can be combined with other transforms for complex lighting scenarios
  • Useful for training models to be robust to lighting variations

Source code in albumentations/augmentations/transforms.py Python
class Illumination(ImageOnlyTransform):\n    \"\"\"Apply various illumination effects to the image.\n\n    This transform simulates different lighting conditions by applying controlled\n    illumination patterns. It can create effects like:\n    - Directional lighting (linear mode)\n    - Corner shadows/highlights (corner mode)\n    - Spotlights or local lighting (gaussian mode)\n\n    These effects can be used to:\n    - Simulate natural lighting variations\n    - Add dramatic lighting effects\n    - Create synthetic shadows or highlights\n    - Augment training data with different lighting conditions\n\n    Args:\n        mode (Literal[\"linear\", \"corner\", \"gaussian\"]): Type of illumination pattern:\n            - 'linear': Creates a smooth gradient across the image,\n                       simulating directional lighting like sunlight\n                       through a window\n            - 'corner': Applies gradient from any corner,\n                       simulating light source from a corner\n            - 'gaussian': Creates a circular spotlight effect,\n                         simulating local light sources\n            Default: 'linear'\n\n        intensity_range (tuple[float, float]): Range for effect strength.\n            Values between 0.01 and 0.2:\n            - 0.01-0.05: Subtle lighting changes\n            - 0.05-0.1: Moderate lighting effects\n            - 0.1-0.2: Strong lighting effects\n            Default: (0.01, 0.2)\n\n        effect_type (str): Type of lighting change:\n            - 'brighten': Only adds light (like a spotlight)\n            - 'darken': Only removes light (like a shadow)\n            - 'both': Randomly chooses between brightening and darkening\n            Default: 'both'\n\n        angle_range (tuple[float, float]): Range for gradient angle in degrees.\n            Controls direction of linear gradient:\n            - 0\u00b0: Left to right\n            - 90\u00b0: Top to bottom\n            - 180\u00b0: Right to left\n            - 270\u00b0: Bottom to top\n            Only used for 'linear' mode.\n            Default: (0, 360)\n\n        center_range (tuple[float, float]): Range for spotlight position.\n            Values between 0 and 1 representing relative position:\n            - (0, 0): Top-left corner\n            - (1, 1): Bottom-right corner\n            - (0.5, 0.5): Center of image\n            Only used for 'gaussian' mode.\n            Default: (0.1, 0.9)\n\n        sigma_range (tuple[float, float]): Range for spotlight size.\n            Values between 0.2 and 1.0:\n            - 0.2: Small, focused spotlight\n            - 0.5: Medium-sized light area\n            - 1.0: Broad, soft lighting\n            Only used for 'gaussian' mode.\n            Default: (0.2, 1.0)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Examples:\n        >>> import albumentations as A\n        >>> # Simulate sunlight through window\n        >>> transform = A.Illumination(\n        ...     mode='linear',\n        ...     intensity_range=(0.05, 0.1),\n        ...     effect_type='brighten',\n        ...     angle_range=(30, 60)\n        ... )\n        >>>\n        >>> # Create dramatic corner shadow\n        >>> transform = A.Illumination(\n        ...     mode='corner',\n        ...     intensity_range=(0.1, 0.2),\n        ...     effect_type='darken'\n        ... 
)\n        >>>\n        >>> # Add multiple spotlights\n        >>> transform1 = A.Illumination(\n        ...     mode='gaussian',\n        ...     intensity_range=(0.05, 0.15),\n        ...     effect_type='brighten',\n        ...     center_range=(0.2, 0.4),\n        ...     sigma_range=(0.2, 0.3)\n        ... )\n        >>> transform2 = A.Illumination(\n        ...     mode='gaussian',\n        ...     intensity_range=(0.05, 0.15),\n        ...     effect_type='darken',\n        ...     center_range=(0.6, 0.8),\n        ...     sigma_range=(0.3, 0.5)\n        ... )\n        >>> transforms = A.Compose([transform1, transform2])\n\n    References:\n        - Lighting in Computer Vision:\n          https://en.wikipedia.org/wiki/Lighting_in_computer_vision\n\n        - Image-based lighting:\n          https://en.wikipedia.org/wiki/Image-based_lighting\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination\n\n        - Research on lighting augmentation:\n          \"Learning Deep Representations of Fine-grained Visual Descriptions\"\n          https://arxiv.org/abs/1605.05395\n\n        - Photography lighting patterns:\n          https://en.wikipedia.org/wiki/Lighting_pattern\n\n    Note:\n        - The transform preserves image range and dtype\n        - Effects are applied multiplicatively to preserve texture\n        - Can be combined with other transforms for complex lighting scenarios\n        - Useful for training models to be robust to lighting variations\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: Literal[\"linear\", \"corner\", \"gaussian\"]\n        intensity_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0.01, 0.2)),\n        ]\n        effect_type: Literal[\"brighten\", \"darken\", \"both\"]\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 360)),\n        ]\n        center_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        sigma_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0.2, 1.0)),\n        ]\n\n    def __init__(\n        self,\n        mode: Literal[\"linear\", \"corner\", \"gaussian\"] = \"linear\",\n        intensity_range: tuple[float, float] = (0.01, 0.2),\n        effect_type: Literal[\"brighten\", \"darken\", \"both\"] = \"both\",\n        angle_range: tuple[float, float] = (0, 360),\n        center_range: tuple[float, float] = (0.1, 0.9),\n        sigma_range: tuple[float, float] = (0.2, 1.0),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(always_apply=always_apply, p=p)\n        self.mode = mode\n        self.intensity_range = intensity_range\n        self.effect_type = effect_type\n        self.angle_range = angle_range\n        self.center_range = center_range\n        self.sigma_range = sigma_range\n\n    def get_params(self) -> dict[str, Any]:\n        intensity = self.py_random.uniform(*self.intensity_range)\n\n        # Determine if brightening or darkening\n        sign = 1  # brighten\n        if self.effect_type == \"both\":\n            sign = 1 if self.py_random.random() > 0.5 else -1\n        elif self.effect_type == \"darken\":\n            sign = -1\n\n        intensity *= sign\n\n        if self.mode == \"linear\":\n            angle = 
self.py_random.uniform(*self.angle_range)\n            return {\n                \"intensity\": intensity,\n                \"angle\": angle,\n            }\n        if self.mode == \"corner\":\n            corner = self.py_random.randint(0, 3)  # Choose random corner\n            return {\n                \"intensity\": intensity,\n                \"corner\": corner,\n            }\n\n        x = self.py_random.uniform(*self.center_range)\n        y = self.py_random.uniform(*self.center_range)\n        sigma = self.py_random.uniform(*self.sigma_range)\n        return {\n            \"intensity\": intensity,\n            \"center\": (x, y),\n            \"sigma\": sigma,\n        }\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if self.mode == \"linear\":\n            return fmain.apply_linear_illumination(\n                img,\n                intensity=params[\"intensity\"],\n                angle=params[\"angle\"],\n            )\n        if self.mode == \"corner\":\n            return fmain.apply_corner_illumination(\n                img,\n                intensity=params[\"intensity\"],\n                corner=params[\"corner\"],\n            )\n\n        return fmain.apply_gaussian_illumination(\n            img,\n            intensity=params[\"intensity\"],\n            center=params[\"center\"],\n            sigma=params[\"sigma\"],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"mode\",\n            \"intensity_range\",\n            \"effect_type\",\n            \"angle_range\",\n            \"center_range\",\n            \"sigma_range\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ImageCompression","title":"class ImageCompression (quality_lower=None, quality_upper=None, compression_type='jpeg', quality_range=(99, 100), always_apply=None, p=0.5) [view source on GitHub]","text":"

Decrease image quality by applying JPEG or WebP compression.

This transform simulates the effect of saving an image with lower quality settings, which can introduce compression artifacts. It's useful for data augmentation and for testing model robustness against varying image qualities.

Parameters:

Name Type Description quality_range tuple[int, int]

Range for the compression quality. The values should be in [1, 100] range, where:

  • 1 is the lowest quality (maximum compression)
  • 100 is the highest quality (minimum compression)

Default: (99, 100)

compression_type Literal["jpeg", "webp"]

Type of compression to apply:

  • "jpeg": JPEG compression
  • "webp": WebP compression

Default: "jpeg"

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform expects images with 1, 3, or 4 channels.
  • For JPEG compression, alpha channels (4th channel) will be ignored.
  • WebP compression supports transparency (4 channels).
  • The actual file is not saved to disk; the compression is simulated in memory.
  • Lower quality values result in smaller file sizes but may introduce visible artifacts.
  • This transform can be useful for:
  • Data augmentation to improve model robustness
  • Testing how models perform on images of varying quality
  • Simulating images transmitted over low-bandwidth connections

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ImageCompression(quality_range=(50, 90), compression_type=\"jpeg\", p=1.0)\n>>> result = transform(image=image)\n>>> compressed_image = result[\"image\"]\n

References

  • JPEG compression: https://en.wikipedia.org/wiki/JPEG
  • WebP compression: https://developers.google.com/speed/webp

Source code in albumentations/augmentations/transforms.py Python
class ImageCompression(ImageOnlyTransform):\n    \"\"\"Decrease image quality by applying JPEG or WebP compression.\n\n    This transform simulates the effect of saving an image with lower quality settings,\n    which can introduce compression artifacts. It's useful for data augmentation and\n    for testing model robustness against varying image qualities.\n\n    Args:\n        quality_range (tuple[int, int]): Range for the compression quality.\n            The values should be in [1, 100] range, where:\n            - 1 is the lowest quality (maximum compression)\n            - 100 is the highest quality (minimum compression)\n            Default: (99, 100)\n\n        compression_type (Literal[\"jpeg\", \"webp\"]): Type of compression to apply.\n            - \"jpeg\": JPEG compression\n            - \"webp\": WebP compression\n            Default: \"jpeg\"\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform expects images with 1, 3, or 4 channels.\n        - For JPEG compression, alpha channels (4th channel) will be ignored.\n        - WebP compression supports transparency (4 channels).\n        - The actual file is not saved to disk; the compression is simulated in memory.\n        - Lower quality values result in smaller file sizes but may introduce visible artifacts.\n        - This transform can be useful for:\n          * Data augmentation to improve model robustness\n          * Testing how models perform on images of varying quality\n          * Simulating images transmitted over low-bandwidth connections\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ImageCompression(quality_range=(50, 90), compression_type=0, p=1.0)\n        >>> result = transform(image=image)\n        >>> compressed_image = result[\"image\"]\n\n    References:\n        - JPEG compression: https://en.wikipedia.org/wiki/JPEG\n        - WebP compression: https://developers.google.com/speed/webp\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        quality_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, 100)),\n            AfterValidator(nondecreasing),\n        ]\n\n        quality_lower: int | None = Field(\n            ge=1,\n            le=100,\n        )\n        quality_upper: int | None = Field(\n            ge=1,\n            le=100,\n        )\n        compression_type: Literal[\"jpeg\", \"webp\"]\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            # Update the quality_range based on the non-None values of quality_lower and quality_upper\n            if self.quality_lower is not None or self.quality_upper is not None:\n                if self.quality_lower is not None:\n                    warn(\n                        \"`quality_lower` is deprecated. Use `quality_range` as tuple\"\n                        \" (quality_lower, quality_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.quality_upper is not None:\n                    warn(\n                        \"`quality_upper` is deprecated. 
Use `quality_range` as tuple\"\n                        \" (quality_lower, quality_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.quality_lower if self.quality_lower is not None else self.quality_range[0]\n                upper = self.quality_upper if self.quality_upper is not None else self.quality_range[1]\n                self.quality_range = (lower, upper)\n                # Clear the deprecated individual quality settings\n                self.quality_lower = None\n                self.quality_upper = None\n\n            # Validate the quality_range\n            if not (1 <= self.quality_range[0] <= MAX_JPEG_QUALITY and 1 <= self.quality_range[1] <= MAX_JPEG_QUALITY):\n                raise ValueError(\n                    f\"Quality range values should be within [1, {MAX_JPEG_QUALITY}] range.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        quality_lower: int | None = None,\n        quality_upper: int | None = None,\n        compression_type: Literal[\"jpeg\", \"webp\"] = \"jpeg\",\n        quality_range: tuple[int, int] = (99, 100),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.quality_range = quality_range\n        self.compression_type = compression_type\n\n    def apply(\n        self,\n        img: np.ndarray,\n        quality: int,\n        image_type: Literal[\".jpg\", \".webp\"],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.image_compression(img, quality, image_type)\n\n    def get_params(self) -> dict[str, int | str]:\n        if self.compression_type == \"jpeg\":\n            image_type = \".jpg\"\n        elif self.compression_type == \"webp\":\n            image_type = \".webp\"\n        else:\n            raise ValueError(f\"Unknown image compression type: {self.compression_type}\")\n\n        return {\n            \"quality\": self.py_random.randint(*self.quality_range),\n            \"image_type\": image_type,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"quality_range\", \"compression_type\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.InterpolationPydantic","title":"class InterpolationPydantic ","text":"

Source code in albumentations/augmentations/transforms.py Python
class InterpolationPydantic(BaseModel):\n    upscale: InterpolationType\n    downscale: InterpolationType\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.InvertImg","title":"class InvertImg [view source on GitHub]","text":"

Invert the input image by subtracting pixel values from the maximum value of the image dtype, i.e., 255 for uint8 and 1.0 for float32.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any
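
Examples:

No example is shipped with this entry, so the following is a minimal usage sketch in the style of the other transforms in this reference:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.InvertImg(p=1.0)\n>>> inverted_image = transform(image=image)[\"image\"]\n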

Source code in albumentations/augmentations/transforms.py Python
class InvertImg(ImageOnlyTransform):\n    \"\"\"Invert the input image by subtracting pixel values from max values of the image types,\n    i.e., 255 for uint8 and 1.0 for float32.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    \"\"\"\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.invert(img)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Lambda","title":"class Lambda (image=None, mask=None, keypoints=None, bboxes=None, name=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

A flexible transformation class for applying user-defined transformation functions per target. Each function's signature must include **kwargs to accept optional arguments such as the interpolation method or image size.

Parameters:

Name Type Description image Callable[..., Any] | None

Image transformation function.

mask Callable[..., Any] | None

Mask transformation function.

keypoints Callable[..., Any] | None

Keypoints transformation function.

bboxes Callable[..., Any] | None

BBoxes transformation function.

p float

probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Number of channels: Any
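
Examples:

No example is shipped with this entry; the sketch below wires two illustrative helper functions (invert_image and keep_mask, which are not part of the library) into Lambda. Note the **kwargs in each signature, as required above:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> def invert_image(img, **kwargs):\n...     return 255 - img  # works for uint8 input\n>>> def keep_mask(mask, **kwargs):\n...     return mask\n>>> transform = A.Lambda(name=\"custom_invert\", image=invert_image, mask=keep_mask, p=1.0)\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> result = transform(image=image)\n>>> augmented_image = result[\"image\"]\n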

Source code in albumentations/augmentations/transforms.py Python
class Lambda(NoOp):\n    \"\"\"A flexible transformation class for using user-defined transformation functions per targets.\n    Function signature must include **kwargs to accept optional arguments like interpolation method, image size, etc:\n\n    Args:\n        image: Image transformation function.\n        mask: Mask transformation function.\n        keypoints: Keypoints transformation function.\n        bboxes: BBoxes transformation function.\n        p: probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    \"\"\"\n\n    def __init__(\n        self,\n        image: Callable[..., Any] | None = None,\n        mask: Callable[..., Any] | None = None,\n        keypoints: Callable[..., Any] | None = None,\n        bboxes: Callable[..., Any] | None = None,\n        name: str | None = None,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.name = name\n        self.custom_apply_fns = {\n            target_name: fmain.noop for target_name in (\"image\", \"mask\", \"keypoints\", \"bboxes\", \"global_label\")\n        }\n        for target_name, custom_apply_fn in {\n            \"image\": image,\n            \"mask\": mask,\n            \"keypoints\": keypoints,\n            \"bboxes\": bboxes,\n        }.items():\n            if custom_apply_fn is not None:\n                if isinstance(custom_apply_fn, LambdaType) and custom_apply_fn.__name__ == \"<lambda>\":\n                    warnings.warn(\n                        \"Using lambda is incompatible with multiprocessing. \"\n                        \"Consider using regular functions or partial().\",\n                        stacklevel=2,\n                    )\n\n                self.custom_apply_fns[target_name] = custom_apply_fn\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        fn = self.custom_apply_fns[\"image\"]\n        return fn(img, **params)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        fn = self.custom_apply_fns[\"mask\"]\n        return fn(mask, **params)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        is_ndarray = True\n\n        if not isinstance(bboxes, np.ndarray):\n            is_ndarray = False\n            bboxes = np.array(bboxes, dtype=np.float32)\n\n        fn = self.custom_apply_fns[\"bboxes\"]\n        result = fn(bboxes, **params)\n\n        if not is_ndarray:\n            return result.tolist()\n\n        return result\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        is_ndarray = True\n        if not isinstance(keypoints, np.ndarray):\n            is_ndarray = False\n            keypoints = np.array(keypoints, dtype=np.float32)\n\n        fn = self.custom_apply_fns[\"keypoints\"]\n        result = fn(keypoints, **params)\n\n        if not is_ndarray:\n            return result.tolist()\n\n        return result\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a Lambda transform serializable you should provide the `name` argument, \"\n                \"e.g. 
`Lambda(name='my_transform', image=<some func>, ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n\n    def __repr__(self) -> str:\n        state = {\"name\": self.name}\n        state.update(self.custom_apply_fns.items())  # type: ignore[arg-type]\n        state.update(self.get_base_init_args())\n        return f\"{self.__class__.__name__}({format_args(state)})\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.LaplaceParams","title":"class LaplaceParams ","text":"

Source code in albumentations/augmentations/transforms.py Python
class LaplaceParams(NoiseParamsBase):\n    noise_type: Literal[\"laplace\"] = \"laplace\"\n    mean_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=-1, max_val=1)),\n    ]\n    scale_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Morphological","title":"class Morphological (scale=(2, 3), operation='dilation', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply a morphological operation (dilation or erosion) to an image; this is particularly valuable for enhancing document scans.

Morphological operations modify the structure of the image. Dilation expands the white (foreground) regions in a binary or grayscale image, while erosion shrinks them. These operations are beneficial in document processing, for example:

  • Dilation helps in closing up gaps within text or making thin lines thicker, enhancing legibility for OCR (Optical Character Recognition).
  • Erosion can remove small white noise and detach connected objects, making the structure of larger objects more pronounced.

Parameters:

Name Type Description scale int or tuple/list of int

Specifies the size of the structuring element (kernel) used for the operation.

  • If an integer is provided, a square kernel of that size will be used.
  • If a tuple or list is provided, it should contain two integers representing the minimum and maximum sizes for the dilation kernel.

operation Literal["erosion", "dilation"]

The morphological operation to apply. Default is 'dilation'.

p float

The probability of applying this transformation. Default is 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

https://github.com/facebookresearch/nougat

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.Morphological(scale=(2, 3), operation='dilation', p=0.5)\n... ])\n>>> image = transform(image=image)[\"image\"]\n

Source code in albumentations/augmentations/transforms.py Python
class Morphological(DualTransform):\n    \"\"\"Apply a morphological operation (dilation or erosion) to an image,\n    with particular value for enhancing document scans.\n\n    Morphological operations modify the structure of the image.\n    Dilation expands the white (foreground) regions in a binary or grayscale image, while erosion shrinks them.\n    These operations are beneficial in document processing, for example:\n    - Dilation helps in closing up gaps within text or making thin lines thicker,\n        enhancing legibility for OCR (Optical Character Recognition).\n    - Erosion can remove small white noise and detach connected objects,\n        making the structure of larger objects more pronounced.\n\n    Args:\n        scale (int or tuple/list of int): Specifies the size of the structuring element (kernel) used for the operation.\n            - If an integer is provided, a square kernel of that size will be used.\n            - If a tuple or list is provided, it should contain two integers representing the minimum\n                and maximum sizes for the dilation kernel.\n        operation (Literal[\"erosion\", \"dilation\"]): The morphological operation to apply.\n            Default is 'dilation'.\n        p (float, optional): The probability of applying this transformation. Default is 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://github.com/facebookresearch/nougat\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        >>>     A.Morphological(scale=(2, 3), operation='dilation', p=0.5)\n        >>> ])\n        >>> image = transform(image=image)[\"image\"]\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: OnePlusIntRangeType\n        operation: MorphologyMode\n\n    def __init__(\n        self,\n        scale: ScaleIntType = (2, 3),\n        operation: MorphologyMode = \"dilation\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale = cast(tuple[int, int], scale)\n        self.operation = operation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.morphology(img, kernel, self.operation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"]\n\n        denormalized_boxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fmain.bboxes_morphology(\n            denormalized_boxes,\n            kernel,\n            self.operation,\n            image_shape,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return keypoints\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"kernel\": cv2.getStructuringElement(cv2.MORPH_ELLIPSE, self.scale),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"scale\", \"operation\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.MultiplicativeNoise","title":"class MultiplicativeNoise (multiplier=(0.9, 1.1), per_channel=False, elementwise=False, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply multiplicative noise to the input image.

This transform multiplies each pixel in the image by a random value or array of values, effectively creating a noise pattern that scales with the image intensity.

Parameters:

Name Type Description multiplier tuple[float, float]

The range for the random multiplier. Defines the range from which the multiplier is sampled. Default: (0.9, 1.1)

per_channel bool

If True, use a different random multiplier for each channel. If False, use the same multiplier for all channels. Setting this to False is slightly faster. Default: False

elementwise bool

If True, generates a unique multiplier for each pixel. If False, generates a single multiplier (or one per channel if per_channel=True). Default: False

p float

Probability of applying the transform. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • When elementwise=False and per_channel=False, a single multiplier is applied to the entire image.
  • When elementwise=False and per_channel=True, each channel gets a different multiplier.
  • When elementwise=True and per_channel=False, each pixel gets the same multiplier across all channels.
  • When elementwise=True and per_channel=True, each pixel in each channel gets a unique multiplier.
  • Setting per_channel=False is slightly faster, especially for larger images.
  • This transform can be used to simulate various lighting conditions or to create noise that scales with image intensity.
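
To make the four combinations above concrete, the following NumPy-only sketch mirrors the multiplier shapes described in the note for a 100x100 RGB image (the shapes are taken from the source code shown below; it is an illustration, not library code):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> h, w, c = 100, 100, 3\n>>> rng.uniform(0.9, 1.1, (1,)).shape        # elementwise=False, per_channel=False\n(1,)\n>>> rng.uniform(0.9, 1.1, (c,)).shape        # elementwise=False, per_channel=True\n(3,)\n>>> rng.uniform(0.9, 1.1, (h, w, 1)).shape   # elementwise=True, per_channel=False\n(100, 100, 1)\n>>> rng.uniform(0.9, 1.1, (h, w, c)).shape   # elementwise=True, per_channel=True\n(100, 100, 3)\n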

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, p=1.0)\n>>> result = transform(image=image)\n>>> noisy_image = result[\"image\"]\n

References

  • Multiplicative noise: https://en.wikipedia.org/wiki/Multiplicative_noise

Source code in albumentations/augmentations/transforms.py Python
class MultiplicativeNoise(ImageOnlyTransform):\n    \"\"\"Apply multiplicative noise to the input image.\n\n    This transform multiplies each pixel in the image by a random value or array of values,\n    effectively creating a noise pattern that scales with the image intensity.\n\n    Args:\n        multiplier (tuple[float, float]): The range for the random multiplier.\n            Defines the range from which the multiplier is sampled.\n            Default: (0.9, 1.1)\n\n        per_channel (bool): If True, use a different random multiplier for each channel.\n            If False, use the same multiplier for all channels.\n            Setting this to False is slightly faster.\n            Default: False\n\n        elementwise (bool): If True, generates a unique multiplier for each pixel.\n            If False, generates a single multiplier (or one per channel if per_channel=True).\n            Default: False\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - When elementwise=False and per_channel=False, a single multiplier is applied to the entire image.\n        - When elementwise=False and per_channel=True, each channel gets a different multiplier.\n        - When elementwise=True and per_channel=False, each pixel gets the same multiplier across all channels.\n        - When elementwise=True and per_channel=True, each pixel in each channel gets a unique multiplier.\n        - Setting per_channel=False is slightly faster, especially for larger images.\n        - This transform can be used to simulate various lighting conditions or to create noise that\n          scales with image intensity.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, p=1.0)\n        >>> result = transform(image=image)\n        >>> noisy_image = result[\"image\"]\n\n    References:\n        - Multiplicative noise: https://en.wikipedia.org/wiki/Multiplicative_noise\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        multiplier: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        per_channel: bool\n        elementwise: bool\n\n    def __init__(\n        self,\n        multiplier: ScaleFloatType = (0.9, 1.1),\n        per_channel: bool = False,\n        elementwise: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.multiplier = cast(tuple[float, float], multiplier)\n        self.elementwise = elementwise\n        self.per_channel = per_channel\n\n    def apply(\n        self,\n        img: np.ndarray,\n        multiplier: float | np.ndarray,\n        **kwargs: Any,\n    ) -> np.ndarray:\n        return multiply(img, multiplier)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if self.elementwise:\n            shape = image.shape if self.per_channel else (*image.shape[:2], 1)\n  
      else:\n            shape = (num_channels,) if self.per_channel else (1,)\n\n        multiplier = self.random_generator.uniform(\n            self.multiplier[0],\n            self.multiplier[1],\n            shape,\n        ).astype(np.float32)\n\n        if not self.per_channel and num_channels > 1:\n            # Replicate the multiplier for all channels if not per_channel\n            multiplier = np.repeat(multiplier, num_channels, axis=-1)\n\n        if not self.elementwise and self.per_channel:\n            # Reshape to broadcast correctly when not elementwise but per_channel\n            multiplier = multiplier.reshape(1, 1, -1)\n\n        if multiplier.shape != image.shape:\n            multiplier = multiplier.squeeze()\n\n        return {\"multiplier\": multiplier}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"multiplier\", \"elementwise\", \"per_channel\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.NoiseParamsBase","title":"class NoiseParamsBase ","text":"

Base class for all noise parameter models.

Source code in albumentations/augmentations/transforms.py Python
class NoiseParamsBase(BaseModel):\n    \"\"\"Base class for all noise parameter models.\"\"\"\n\n    model_config = ConfigDict(extra=\"forbid\")\n    noise_type: str\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Normalize","title":"class Normalize (mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, normalization='standard', always_apply=None, p=1.0) [view source on GitHub]","text":"

Applies various normalization techniques to an image. The specific normalization technique can be selected with the normalization parameter.

Standard normalization is applied using the formula: img = (img - mean * max_pixel_value) / (std * max_pixel_value). Other normalization techniques adjust the image based on global or per-channel statistics, or scale pixel values to a specified range.
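
As a quick numeric check of the "standard" formula, using the ImageNet defaults and an arbitrary uint8 pixel value of 128 in the first channel:

Python
>>> mean, std, max_pixel_value = 0.485, 0.229, 255.0\n>>> round((128 - mean * max_pixel_value) / (std * max_pixel_value), 4)\n0.0741\n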

Parameters:

Name Type Description mean ColorType | None

Mean values for standard normalization. For "standard" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).

std ColorType | None

Standard deviation values for standard normalization. For "standard" normalization, the default values are the ImageNet standard deviation values: (0.229, 0.224, 0.225).

max_pixel_value float | None

Maximum possible pixel value, used for scaling in standard normalization. Defaults to 255.0.

normalization Literal["standard", "image", "image_per_channel", "min_max", "min_max_per_channel"]

Specifies the normalization technique to apply. Defaults to "standard".

  • "standard": Applies the formula (img - mean * max_pixel_value) / (std * max_pixel_value). The default mean and std are based on ImageNet. You can use mean and std values of (0.5, 0.5, 0.5) for inception normalization, and mean values of (0, 0, 0) and std values of (1, 1, 1) for YOLO.
  • "image": Normalizes the whole image based on its global mean and standard deviation.
  • "image_per_channel": Normalizes the image per channel based on each channel's mean and standard deviation.
  • "min_max": Scales the image pixel values to a [0, 1] range based on the global minimum and maximum pixel values.
  • "min_max_per_channel": Scales each channel of the image pixel values to a [0, 1] range based on the per-channel minimum and maximum pixel values.

p float

Probability of applying the transform. Defaults to 1.0.

Targets

image

Image types: uint8, float32

Note

  • For \"standard\" normalization, mean, std, and max_pixel_value must be provided.
  • For other normalization types, these parameters are ignored.
  • For inception normalization, use mean values of (0.5, 0.5, 0.5).
  • For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).
  • This transform is often used as a final step in image preprocessing pipelines to prepare images for neural network input.
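
For the inception and YOLO settings mentioned in the note above, hedged configuration sketches (complementing the ImageNet example below) would be:

Python
>>> import albumentations as A\n>>> inception_norm = A.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5), max_pixel_value=255.0, p=1.0)\n>>> yolo_norm = A.Normalize(mean=(0, 0, 0), std=(1, 1, 1), max_pixel_value=255.0, p=1.0)\n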

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Standard ImageNet normalization\n>>> transform = A.Normalize(\n...     mean=(0.485, 0.456, 0.406),\n...     std=(0.229, 0.224, 0.225),\n...     max_pixel_value=255.0,\n...     p=1.0\n... )\n>>> normalized_image = transform(image=image)[\"image\"]\n>>>\n>>> # Min-max normalization\n>>> transform_minmax = A.Normalize(normalization=\"min_max\", p=1.0)\n>>> normalized_image_minmax = transform_minmax(image=image)[\"image\"]\n

References

  • ImageNet mean and std: https://pytorch.org/vision/stable/models.html
  • Inception preprocessing: https://keras.io/api/applications/inceptionv3/

Source code in albumentations/augmentations/transforms.py Python
class Normalize(ImageOnlyTransform):\n    \"\"\"Applies various normalization techniques to an image. The specific normalization technique can be selected\n        with the `normalization` parameter.\n\n    Standard normalization is applied using the formula:\n        `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`.\n        Other normalization techniques adjust the image based on global or per-channel statistics,\n        or scale pixel values to a specified range.\n\n    Args:\n        mean (ColorType | None): Mean values for standard normalization.\n            For \"standard\" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).\n        std (ColorType | None): Standard deviation values for standard normalization.\n            For \"standard\" normalization, the default values are ImageNet standard deviation :(0.229, 0.224, 0.225).\n        max_pixel_value (float | None): Maximum possible pixel value, used for scaling in standard normalization.\n            Defaults to 255.0.\n        normalization (Literal[\"standard\", \"image\", \"image_per_channel\", \"min_max\", \"min_max_per_channel\"])\n            Specifies the normalization technique to apply. Defaults to \"standard\".\n            - \"standard\": Applies the formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`.\n                The default mean and std are based on ImageNet. You can use mean and std values of (0.5, 0.5, 0.5)\n                for inception normalization. And mean values of (0, 0, 0) and std values of (1, 1, 1) for YOLO.\n            - \"image\": Normalizes the whole image based on its global mean and standard deviation.\n            - \"image_per_channel\": Normalizes the image per channel based on each channel's mean and standard deviation.\n            - \"min_max\": Scales the image pixel values to a [0, 1] range based on the global\n                minimum and maximum pixel values.\n            - \"min_max_per_channel\": Scales each channel of the image pixel values to a [0, 1]\n                range based on the per-channel minimum and maximum pixel values.\n\n        p (float): Probability of applying the transform. Defaults to 1.0.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - For \"standard\" normalization, `mean`, `std`, and `max_pixel_value` must be provided.\n        - For other normalization types, these parameters are ignored.\n        - For inception normalization, use mean values of (0.5, 0.5, 0.5).\n        - For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).\n        - This transform is often used as a final step in image preprocessing pipelines to\n          prepare images for neural network input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Standard ImageNet normalization\n        >>> transform = A.Normalize(\n        ...     mean=(0.485, 0.456, 0.406),\n        ...     std=(0.229, 0.224, 0.225),\n        ...     max_pixel_value=255.0,\n        ...     p=1.0\n        ... 
)\n        >>> normalized_image = transform(image=image)[\"image\"]\n        >>>\n        >>> # Min-max normalization\n        >>> transform_minmax = A.Normalize(normalization=\"min_max\", p=1.0)\n        >>> normalized_image_minmax = transform_minmax(image=image)[\"image\"]\n\n    References:\n        - ImageNet mean and std: https://pytorch.org/vision/stable/models.html\n        - Inception preprocessing: https://keras.io/api/applications/inceptionv3/\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mean: ColorType | None\n        std: ColorType | None\n        max_pixel_value: float | None\n        normalization: Literal[\n            \"standard\",\n            \"image\",\n            \"image_per_channel\",\n            \"min_max\",\n            \"min_max_per_channel\",\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_normalization(self) -> Self:\n            if (\n                self.mean is None\n                or self.std is None\n                or (self.max_pixel_value is None and self.normalization == \"standard\")\n            ):\n                raise ValueError(\n                    \"mean, std, and max_pixel_value must be provided for standard normalization.\",\n                )\n            return self\n\n    def __init__(\n        self,\n        mean: ColorType | None = (0.485, 0.456, 0.406),\n        std: ColorType | None = (0.229, 0.224, 0.225),\n        max_pixel_value: float | None = 255.0,\n        normalization: Literal[\n            \"standard\",\n            \"image\",\n            \"image_per_channel\",\n            \"min_max\",\n            \"min_max_per_channel\",\n        ] = \"standard\",\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.mean = mean\n        self.mean_np = np.array(mean, dtype=np.float32) * max_pixel_value\n        self.std = std\n        self.denominator = np.reciprocal(\n            np.array(std, dtype=np.float32) * max_pixel_value,\n        )\n        self.max_pixel_value = max_pixel_value\n        self.normalization = normalization\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if self.normalization == \"standard\":\n            return normalize(\n                img,\n                self.mean_np,\n                self.denominator,\n            )\n        return normalize_per_image(img, self.normalization)\n\n    @batch_transform(\"channel\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(images, **params)\n\n    @batch_transform(\"channel\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(volume, **params)\n\n    @batch_transform(\"channel\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(volumes, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mean\", \"std\", \"max_pixel_value\", \"normalization\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PixelDropout","title":"class PixelDropout (dropout_prob=0.01, per_channel=False, drop_value=0, mask_drop_value=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Drops random pixels from the image.

This transform randomly sets pixels in the image to a specified value, effectively "dropping out" those pixels. It can be applied to both the image and its corresponding mask.

Parameters:

Name Type Description

dropout_prob float

Probability of dropping out each pixel. Should be in the range [0, 1]. Default: 0.01

per_channel bool

If True, the dropout mask will be generated independently for each channel. If False, the same dropout mask will be applied to all channels. Default: False

drop_value float | Sequence[float] | None

Value to assign to the dropped pixels. If None, the value will be randomly sampled for each application: - For uint8 images: Random integer in [0, 255] - For float32 images: Random float in [0, 1] If a single number, that value will be used for all dropped pixels. If a sequence, it should contain one value per channel. Default: 0

mask_drop_value float | Sequence[float] | None

Value to assign to dropped pixels in the mask. If None, the mask will remain unchanged. If a single number, that value will be used for all dropped pixels in the mask. If a sequence, it should contain one value per channel of the mask. Note: Only applicable when per_channel=False. Default: None

always_apply bool

If True, the transform will always be applied. Default: False

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • When applied to bounding boxes, this transform may cause some boxes to have zero area if all pixels within the box are dropped. Such boxes will be removed.
  • When applied to keypoints, keypoints that fall on dropped pixels will be removed if the keypoint processor is configured to remove invisible keypoints.
  • The 'per_channel' option is not supported for mask dropout. If you need to drop pixels in a multi-channel mask independently, consider applying this transform multiple times with per_channel=False.
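
Conceptually, the dropout described above can be sketched in a few lines of NumPy (an illustration only, not the library's internal implementation; `rng`, `drop_prob`, and `drop_value` are placeholder names):

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> image = rng.integers(0, 256, (4, 4, 3), dtype=np.uint8)
>>> drop_prob, drop_value = 0.1, 0
>>> # per_channel=False: one (H, W) mask shared by all channels
>>> drop_mask = rng.random(image.shape[:2]) < drop_prob
>>> dropped = image.copy()
>>> dropped[drop_mask] = drop_value
>>> # per_channel=True: an independent mask per channel, shape (H, W, C)
>>> drop_mask_pc = rng.random(image.shape) < drop_prob
>>> dropped_pc = np.where(drop_mask_pc, drop_value, image).astype(image.dtype)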

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> transform = A.PixelDropout(dropout_prob=0.1, per_channel=True, p=1.0)
>>> result = transform(image=image, mask=mask)
>>> dropped_image, dropped_mask = result['image'], result['mask']

Source code in albumentations/augmentations/transforms.py Python
class PixelDropout(DualTransform):\n    \"\"\"Drops random pixels from the image.\n\n    This transform randomly sets pixels in the image to a specified value, effectively \"dropping out\" those pixels.\n    It can be applied to both the image and its corresponding mask.\n\n    Args:\n        dropout_prob (float): Probability of dropping out each pixel. Should be in the range [0, 1].\n            Default: 0.01\n\n        per_channel (bool): If True, the dropout mask will be generated independently for each channel.\n            If False, the same dropout mask will be applied to all channels.\n            Default: False\n\n        drop_value (float | Sequence[float] | None): Value to assign to the dropped pixels.\n            If None, the value will be randomly sampled for each application:\n                - For uint8 images: Random integer in [0, 255]\n                - For float32 images: Random float in [0, 1]\n            If a single number, that value will be used for all dropped pixels.\n            If a sequence, it should contain one value per channel.\n            Default: 0\n\n        mask_drop_value (float | Sequence[float] | None): Value to assign to dropped pixels in the mask.\n            If None, the mask will remain unchanged.\n            If a single number, that value will be used for all dropped pixels in the mask.\n            If a sequence, it should contain one value per channel of the mask.\n            Note: Only applicable when per_channel=False.\n            Default: None\n\n        always_apply (bool): If True, the transform will always be applied.\n            Default: False\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - When applied to bounding boxes, this transform may cause some boxes to have zero area\n          if all pixels within the box are dropped. Such boxes will be removed.\n        - When applied to keypoints, keypoints that fall on dropped pixels will be removed if\n          the keypoint processor is configured to remove invisible keypoints.\n        - The 'per_channel' option is not supported for mask dropout. 
If you need to drop pixels\n          in a multi-channel mask independently, consider applying this transform multiple times\n          with per_channel=False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> transform = A.PixelDropout(dropout_prob=0.1, per_channel=True, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> dropped_image, dropped_mask = result['image'], result['mask']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        dropout_prob: ProbabilityType\n        per_channel: bool\n        drop_value: ScaleFloatType | None\n        mask_drop_value: ScaleFloatType | None\n\n        @model_validator(mode=\"after\")\n        def validate_mask_drop_value(self) -> Self:\n            if self.mask_drop_value is not None and self.per_channel:\n                msg = \"PixelDropout supports mask only with per_channel=False.\"\n                raise ValueError(msg)\n            return self\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        dropout_prob: float = 0.01,\n        per_channel: bool = False,\n        drop_value: ScaleFloatType | None = 0,\n        mask_drop_value: ScaleFloatType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.dropout_prob = dropout_prob\n        self.per_channel = per_channel\n        self.drop_value = drop_value\n        self.mask_drop_value = mask_drop_value\n\n    def apply(\n        self,\n        img: np.ndarray,\n        drop_mask: np.ndarray,\n        drop_value: float | Sequence[float],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.pixel_dropout(img, drop_mask, drop_value)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        drop_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        if self.mask_drop_value is None:\n            return mask\n\n        if mask.ndim == MONO_CHANNEL_DIMENSIONS:\n            drop_mask = np.squeeze(drop_mask)\n\n        return fmain.pixel_dropout(mask, drop_mask, self.mask_drop_value)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        drop_mask: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if drop_mask is None or self.per_channel:\n            return bboxes\n\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n\n        image_shape = params[\"shape\"][:2]\n\n        denormalized_bboxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fdropout.mask_dropout_bboxes(\n            denormalized_bboxes,\n            drop_mask,\n            image_shape,\n            processor.params.min_area,\n            processor.params.min_visibility,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        drop_mask: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if drop_mask is None or self.per_channel:\n            return keypoints\n\n        processor = cast(KeypointsProcessor, self.get_processor(\"keypoints\"))\n\n        if processor is None or not processor.params.remove_invisible:\n            return keypoints\n\n        return 
fdropout.mask_dropout_keypoints(keypoints, drop_mask)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        shape = image.shape if self.per_channel else image.shape[:2]\n\n        # Use choice to create boolean matrix, if we will use binomial after that we will need type conversion\n        drop_mask = self.random_generator.choice(\n            [True, False],\n            shape,\n            p=[self.dropout_prob, 1 - self.dropout_prob],\n        )\n\n        drop_value: float | Sequence[float] | np.ndarray\n\n        if drop_mask.ndim != image.ndim:\n            drop_mask = np.expand_dims(drop_mask, -1)\n        if self.drop_value is None:\n            drop_shape = 1 if is_grayscale_image(image) else int(image.shape[-1])\n\n            if image.dtype == np.uint8:\n                drop_value = self.random_generator.integers(\n                    0,\n                    int(MAX_VALUES_BY_DTYPE[image.dtype]),\n                    size=drop_shape,\n                    dtype=image.dtype,\n                )\n            elif image.dtype == np.float32:\n                drop_value = self.random_generator.uniform(\n                    0,\n                    1,\n                    size=drop_shape,\n                ).astype(image.dtype)\n            else:\n                raise ValueError(f\"Unsupported dtype: {image.dtype}\")\n        else:\n            drop_value = self.drop_value\n\n        return {\"drop_mask\": drop_mask, \"drop_value\": drop_value}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return (\"dropout_prob\", \"per_channel\", \"drop_value\", \"mask_drop_value\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlanckianJitter","title":"class PlanckianJitter (mode='blackbody', temperature_limit=None, sampling_method='uniform', p=0.5, always_apply=None) [view source on GitHub]","text":"

Applies Planckian Jitter to the input image, simulating color temperature variations in illumination.

This transform adjusts the color of an image to mimic the effect of different color temperatures of light sources, based on Planck's law of black body radiation. It can simulate the appearance of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts.

PlanckianJitter vs. ColorJitter: PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases:
1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world color temperature changes. ColorJitter applies arbitrary color adjustments.
2. Natural effects: This transform produces color shifts that correspond to natural lighting variations, making it ideal for outdoor scene simulation or color constancy problems.
3. Single parameter: Color changes are controlled by a single, physically meaningful parameter (color temperature), unlike ColorJitter's multiple abstract parameters.
4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural light, whereas ColorJitter can make independent channel adjustments.

When to use PlanckianJitter:
  • Simulating different times of day or lighting conditions in outdoor scenes
  • Augmenting data for computer vision tasks that need to be robust to natural lighting changes
  • Preparing synthetic data to better match real-world lighting variations
  • Color constancy research or applications
  • When you need physically plausible color variations rather than arbitrary color changes

The logic behind PlanckianJitter: As the color temperature increases:
1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting.
2. Mid-range temperatures (around 5500K) correspond to daylight.
3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade.
This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.
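
For example, the `temperature_limit` parameter described below can bias the augmentation toward warm or cool lighting (a usage sketch, not an official recipe; note that the validation in the source below requires the white point, ~6500K, to lie inside the chosen limits):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Mostly warm casts (sunset / incandescent): keep the range near the lower end
>>> warm = A.PlanckianJitter(mode="blackbody", temperature_limit=(3000, 7000), p=1.0)
>>> # Mostly cool casts (overcast / shade): keep the range near the upper end
>>> cool = A.PlanckianJitter(mode="blackbody", temperature_limit=(6000, 15000), p=1.0)
>>> warm_image = warm(image=image)["image"]
>>> cool_image = cool(image=image)["image"]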

Parameters:

Name Type Description

mode Literal["blackbody", "cied"]

The mode of the transformation. - "blackbody": Simulates blackbody radiation color changes. - "cied": Uses the CIE D illuminant series for color temperature simulation. Default: "blackbody"

temperature_limit tuple[int, int] | None

The range of color temperatures (in Kelvin) to sample from. - For "blackbody" mode: Should be within [3000K, 15000K]. Default: (3000, 15000) - For "cied" mode: Should be within [4000K, 15000K]. Default: (4000, 15000) If None, the default ranges will be used based on the selected mode. Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.

sampling_method Literal[\"uniform\", \"gaussian\"]

Method to sample the temperature. - "uniform": Samples uniformly across the specified range. - "gaussian": Samples from a Gaussian distribution centered at 6500K (approximate daylight). Default: "uniform"

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The transform preserves the overall brightness of the image while shifting its color.
  • The \"blackbody\" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.
  • The \"cied\" mode is based on standard illuminants and may provide more realistic daylight variations.
  • The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.
  • Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated across channels, maintaining the natural appearance of the scene under different lighting conditions.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)
>>> transform = A.PlanckianJitter(mode="blackbody",
...                               temperature_limit=(3000, 9000),
...                               sampling_method="uniform",
...                               p=1.0)
>>> result = transform(image=image)
>>> jittered_image = result["image"]

References

  • Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law
  • CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant
  • Color temperature: https://en.wikipedia.org/wiki/Color_temperature
  • Implementation inspired by: https://github.com/TheZino/PlanckianJitter

Source code in albumentations/augmentations/transforms.py Python
class PlanckianJitter(ImageOnlyTransform):\n    \"\"\"Applies Planckian Jitter to the input image, simulating color temperature variations in illumination.\n\n    This transform adjusts the color of an image to mimic the effect of different color temperatures\n    of light sources, based on Planck's law of black body radiation. It can simulate the appearance\n    of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts.\n\n    PlanckianJitter vs. ColorJitter:\n    PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases:\n    1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world\n       color temperature changes. ColorJitter applies arbitrary color adjustments.\n    2. Natural effects: This transform produces color shifts that correspond to natural lighting\n       variations, making it ideal for outdoor scene simulation or color constancy problems.\n    3. Single parameter: Color changes are controlled by a single, physically meaningful parameter\n       (color temperature), unlike ColorJitter's multiple abstract parameters.\n    4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural\n       light, whereas ColorJitter can make independent channel adjustments.\n\n    When to use PlanckianJitter:\n    - Simulating different times of day or lighting conditions in outdoor scenes\n    - Augmenting data for computer vision tasks that need to be robust to natural lighting changes\n    - Preparing synthetic data to better match real-world lighting variations\n    - Color constancy research or applications\n    - When you need physically plausible color variations rather than arbitrary color changes\n\n    The logic behind PlanckianJitter:\n    As the color temperature increases:\n    1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting.\n    2. Mid-range temperatures (around 5500K) correspond to daylight.\n    3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade.\n    This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.\n\n    Args:\n        mode (Literal[\"blackbody\", \"cied\"]): The mode of the transformation.\n            - \"blackbody\": Simulates blackbody radiation color changes.\n            - \"cied\": Uses the CIE D illuminant series for color temperature simulation.\n            Default: \"blackbody\"\n\n        temperature_limit (tuple[int, int] | None): The range of color temperatures (in Kelvin) to sample from.\n            - For \"blackbody\" mode: Should be within [3000K, 15000K]. Default: (3000, 15000)\n            - For \"cied\" mode: Should be within [4000K, 15000K]. Default: (4000, 15000)\n            If None, the default ranges will be used based on the selected mode.\n            Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.\n\n        sampling_method (Literal[\"uniform\", \"gaussian\"]): Method to sample the temperature.\n            - \"uniform\": Samples uniformly across the specified range.\n            - \"gaussian\": Samples from a Gaussian distribution centered at 6500K (approximate daylight).\n            Default: \"uniform\"\n\n        p (float): Probability of applying the transform. 
Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The transform preserves the overall brightness of the image while shifting its color.\n        - The \"blackbody\" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.\n        - The \"cied\" mode is based on standard illuminants and may provide more realistic daylight variations.\n        - The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.\n        - Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated\n          across channels, maintaining the natural appearance of the scene under different lighting conditions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PlanckianJitter(mode=\"blackbody\",\n        ...                               temperature_range=(3000, 9000),\n        ...                               sampling_method=\"uniform\",\n        ...                               p=1.0)\n        >>> result = transform(image=image)\n        >>> jittered_image = result[\"image\"]\n\n    References:\n        - Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law\n        - CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant\n        - Color temperature: https://en.wikipedia.org/wiki/Color_temperature\n        - Implementation inspired by: https://github.com/TheZino/PlanckianJitter\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: Literal[\"blackbody\", \"cied\"]\n        temperature_limit: Annotated[tuple[int, int], AfterValidator(nondecreasing)] | None\n        sampling_method: Literal[\"uniform\", \"gaussian\"]\n\n        @model_validator(mode=\"after\")\n        def validate_temperature(self) -> Self:\n            max_temp = int(PLANKIAN_JITTER_CONST[\"MAX_TEMP\"])\n\n            if self.temperature_limit is None:\n                if self.mode == \"blackbody\":\n                    self.temperature_limit = (\n                        int(PLANKIAN_JITTER_CONST[\"MIN_BLACKBODY_TEMP\"]),\n                        max_temp,\n                    )\n                elif self.mode == \"cied\":\n                    self.temperature_limit = (\n                        int(PLANKIAN_JITTER_CONST[\"MIN_CIED_TEMP\"]),\n                        max_temp,\n                    )\n            else:\n                if self.mode == \"blackbody\" and (\n                    min(self.temperature_limit) < PLANKIAN_JITTER_CONST[\"MIN_BLACKBODY_TEMP\"]\n                    or max(self.temperature_limit) > max_temp\n                ):\n                    raise ValueError(\n                        \"Temperature limits for blackbody should be in [3000, 15000] range\",\n                    )\n                if self.mode == \"cied\" and (\n                    min(self.temperature_limit) < PLANKIAN_JITTER_CONST[\"MIN_CIED_TEMP\"]\n                    or max(self.temperature_limit) > max_temp\n                ):\n                    raise ValueError(\n                        \"Temperature limits for CIED should be in [4000, 15000] range\",\n                    )\n\n                if not self.temperature_limit[0] <= PLANKIAN_JITTER_CONST[\"WHITE_TEMP\"] <= self.temperature_limit[1]:\n                    raise ValueError(\n             
           \"White temperature should be within the temperature limits\",\n                    )\n\n            return self\n\n    def __init__(\n        self,\n        mode: Literal[\"blackbody\", \"cied\"] = \"blackbody\",\n        temperature_limit: tuple[int, int] | None = None,\n        sampling_method: Literal[\"uniform\", \"gaussian\"] = \"uniform\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ) -> None:\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.mode = mode\n        self.temperature_limit = cast(tuple[int, int], temperature_limit)\n        self.sampling_method = sampling_method\n\n    def apply(self, img: np.ndarray, temperature: int, **params: Any) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.planckian_jitter(img, temperature, mode=self.mode)\n\n    def get_params(self) -> dict[str, Any]:\n        sampling_prob_boundary = PLANKIAN_JITTER_CONST[\"SAMPLING_TEMP_PROB\"]\n        sampling_temp_boundary = PLANKIAN_JITTER_CONST[\"WHITE_TEMP\"]\n\n        if self.sampling_method == \"uniform\":\n            # Split into 2 cases to avoid selecting cold temperatures (>6000) too often\n            if self.py_random.random() < sampling_prob_boundary:\n                temperature = self.py_random.uniform(\n                    self.temperature_limit[0],\n                    sampling_temp_boundary,\n                )\n            else:\n                temperature = self.py_random.uniform(\n                    sampling_temp_boundary,\n                    self.temperature_limit[1],\n                )\n        elif self.sampling_method == \"gaussian\":\n            # Sample values from asymmetric gaussian distribution\n            if self.py_random.random() < sampling_prob_boundary:\n                # Left side\n                shift = np.abs(\n                    self.py_random.gauss(\n                        0,\n                        np.abs(sampling_temp_boundary - self.temperature_limit[0]) / 3,\n                    ),\n                )\n                temperature = sampling_temp_boundary - shift\n            else:\n                # Right side\n                shift = np.abs(\n                    self.py_random.gauss(\n                        0,\n                        np.abs(self.temperature_limit[1] - sampling_temp_boundary) / 3,\n                    ),\n                )\n                temperature = sampling_temp_boundary + shift\n        else:\n            raise ValueError(f\"Unknown sampling method: {self.sampling_method}\")\n\n        # Ensure temperature is within the valid range\n        temperature = np.clip(\n            temperature,\n            self.temperature_limit[0],\n            self.temperature_limit[1],\n        )\n\n        return {\"temperature\": int(temperature)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mode\", \"temperature_limit\", \"sampling_method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlasmaBrightnessContrast","title":"class PlasmaBrightnessContrast (brightness_range=(-0.3, 0.3), contrast_range=(-0.3, 0.3), plasma_size=256, roughness=3.0, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply plasma fractal pattern to modify image brightness and contrast.

This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns that are then used to create spatially-varying brightness and contrast adjustments. The result is a natural-looking, non-uniform modification of the image.

Parameters:

Name Type Description

brightness_range (float, float)

Range for brightness adjustment strength. Values between -1 and 1: - Positive values increase brightness - Negative values decrease brightness - 0 means no brightness change Default: (-0.3, 0.3)

contrast_range (float, float)

Range for contrast adjustment strength. Values between -1 and 1: - Positive values increase contrast - Negative values decrease contrast - 0 means no contrast change Default: (-0.3, 0.3)

plasma_size int

Size of the plasma pattern. Will be rounded up to nearest power of 2. Larger values create more detailed patterns. Default: 256

roughness float

Controls the roughness of the plasma pattern. Higher values create more rough/sharp transitions. Must be greater than 0. Typical values are between 1.0 and 5.0. Default: 3.0

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Mathematical Formulation:

1. Plasma Pattern Generation:
   The Diamond-Square algorithm generates a pattern P(x,y) ∈ [0,1] by:
   - Starting with random corner values
   - Recursively computing midpoints using: M = (V1 + V2 + V3 + V4)/4 + R(d)
   where V1..V4 are corner values and R(d) is random noise that decreases with distance d
   according to the roughness parameter.

2. Brightness Adjustment:
   For each pixel (x,y):
   O(x,y) = I(x,y) + b·P(x,y)·max_value
   where:
   - I is the input image
   - b is the brightness factor
   - P is the plasma pattern
   - max_value is the maximum possible pixel value

3. Contrast Adjustment:
   For each pixel (x,y):
   O(x,y) = μ + (I(x,y) - μ)·(1 + c·P(x,y))
   where:
   - μ is the mean pixel value
   - c is the contrast factor
   - P is the plasma pattern
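
These two steps can be sketched directly from the formulas (an illustration assuming a float image in [0, 1] and a precomputed plasma pattern `plasma`; the library's own implementation may differ in details):

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> img = rng.random((100, 100, 3)).astype(np.float32)   # float image in [0, 1]
>>> plasma = rng.random((100, 100)).astype(np.float32)   # stand-in for the fractal pattern
>>> b, c, max_value = 0.3, 0.2, 1.0                       # sampled brightness/contrast factors
>>> # 2. Brightness: O = I + b * P * max_value
>>> out = img + b * plasma[..., None] * max_value
>>> # 3. Contrast (applied sequentially): O = mu + (O - mu) * (1 + c * P)
>>> mu = out.mean()
>>> out = np.clip(mu + (out - mu) * (1 + c * plasma[..., None]), 0, max_value)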

Note

  • The plasma pattern creates smooth, organic variations in the adjustments
  • Brightness and contrast modifications are applied sequentially
  • Final values are clipped to valid range [0, max_value]
  • The same plasma pattern is used for both brightness and contrast to maintain coherent spatial variations

Examples:

Python
>>> import albumentations as A
>>> import numpy as np

Default parameters

Python
>>> transform = A.PlasmaBrightnessContrast(p=1.0)

Custom adjustments with fine pattern

Python
>>> transform = A.PlasmaBrightnessContrast(
...     brightness_range=(-0.5, 0.5),
...     contrast_range=(-0.3, 0.3),
...     plasma_size=512,  # More detailed pattern
...     roughness=2.5,    # Smoother transitions
...     p=1.0
... )

References

.. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\" Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.

.. [2] Miller, \"The Diamond-Square Algorithm: A Detailed Analysis,\" Journal of Computer Graphics Techniques, 2016. Comprehensive analysis of the algorithm and its properties.

.. [3] Ebert et al., \"Texturing & Modeling: A Procedural Approach,\" Chapter 12: Noise, Hypertexture, Antialiasing, and Gesture. Detailed coverage of procedural noise patterns.

.. [4] Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

.. [5] Plasma effect: https://lodev.org/cgtutor/plasma.html

See Also:
  • RandomBrightnessContrast: For uniform brightness/contrast adjustments
  • CLAHE: For contrast limited adaptive histogram equalization
  • FancyPCA: For color-based contrast enhancement
  • HistogramMatching: For reference-based contrast adjustment

Source code in albumentations/augmentations/transforms.py Python
class PlasmaBrightnessContrast(ImageOnlyTransform):\n    \"\"\"Apply plasma fractal pattern to modify image brightness and contrast.\n\n    This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns\n    that are then used to create spatially-varying brightness and contrast adjustments.\n    The result is a natural-looking, non-uniform modification of the image.\n\n    Args:\n        brightness_range ((float, float)): Range for brightness adjustment strength.\n            Values between -1 and 1:\n            - Positive values increase brightness\n            - Negative values decrease brightness\n            - 0 means no brightness change\n            Default: (-0.3, 0.3)\n\n        contrast_range ((float, float)): Range for contrast adjustment strength.\n            Values between -1 and 1:\n            - Positive values increase contrast\n            - Negative values decrease contrast\n            - 0 means no contrast change\n            Default: (-0.3, 0.3)\n\n        plasma_size (int): Size of the plasma pattern. Will be rounded up to nearest power of 2.\n            Larger values create more detailed patterns. Default: 256\n\n        roughness (float): Controls the roughness of the plasma pattern.\n            Higher values create more rough/sharp transitions.\n            Must be greater than 0.\n            Typical values are between 1.0 and 5.0. Default: 3.0\n\n            p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Mathematical Formulation:\n        1. Plasma Pattern Generation:\n           The Diamond-Square algorithm generates a pattern P(x,y) \u2208 [0,1] by:\n           - Starting with random corner values\n           - Recursively computing midpoints using:\n             M = (V1 + V2 + V3 + V4)/4 + R(d)\n           where V1..V4 are corner values and R(d) is random noise that\n           decreases with distance d according to the roughness parameter.\n\n        2. Brightness Adjustment:\n           For each pixel (x,y):\n           O(x,y) = I(x,y) + b\u00b7P(x,y)\u00b7max_value\n           where:\n           - I is the input image\n           - b is the brightness factor\n           - P is the plasma pattern\n           - max_value is the maximum possible pixel value\n\n        3. Contrast Adjustment:\n           For each pixel (x,y):\n           O(x,y) = \u03bc + (I(x,y) - \u03bc)\u00b7(1 + c\u00b7P(x,y))\n           where:\n           - \u03bc is the mean pixel value\n           - c is the contrast factor\n           - P is the plasma pattern\n\n    Note:\n        - The plasma pattern creates smooth, organic variations in the adjustments\n        - Brightness and contrast modifications are applied sequentially\n        - Final values are clipped to valid range [0, max_value]\n        - The same plasma pattern is used for both brightness and contrast\n          to maintain coherent spatial variations\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Default parameters\n        >>> transform = A.PlasmaBrightnessContrast(p=1.0)\n\n        # Custom adjustments with fine pattern\n        >>> transform = A.PlasmaBrightnessContrast(\n        ...     brightness_range=(-0.5, 0.5),\n        ...     contrast_range=(-0.3, 0.3),\n        ...     plasma_size=512,  # More detailed pattern\n        ...     roughness=2.5,    # Smoother transitions\n        ...     
p=1.0\n        ... )\n\n    References:\n        .. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\"\n               Communications of the ACM, 1982.\n               Paper introducing the Diamond-Square algorithm.\n\n        .. [2] Miller, \"The Diamond-Square Algorithm: A Detailed Analysis,\"\n               Journal of Computer Graphics Techniques, 2016.\n               Comprehensive analysis of the algorithm and its properties.\n\n        .. [3] Ebert et al., \"Texturing & Modeling: A Procedural Approach,\"\n               Chapter 12: Noise, Hypertexture, Antialiasing, and Gesture.\n               Detailed coverage of procedural noise patterns.\n\n        .. [4] Diamond-Square algorithm:\n               https://en.wikipedia.org/wiki/Diamond-square_algorithm\n\n        .. [5] Plasma effect:\n               https://lodev.org/cgtutor/plasma.html\n\n    See Also:\n        - RandomBrightnessContrast: For uniform brightness/contrast adjustments\n        - CLAHE: For contrast limited adaptive histogram equalization\n        - FancyPCA: For color-based contrast enhancement\n        - HistogramMatching: For reference-based contrast adjustment\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n        ]\n        contrast_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n        ]\n        plasma_size: int = Field(default=256, gt=0)\n        roughness: float = Field(default=3.0, gt=0)\n\n    def __init__(\n        self,\n        brightness_range: tuple[float, float] = (-0.3, 0.3),\n        contrast_range: tuple[float, float] = (-0.3, 0.3),\n        plasma_size: int = 256,\n        roughness: float = 3.0,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.brightness_range = brightness_range\n        self.contrast_range = contrast_range\n        self.plasma_size = plasma_size\n        self.roughness = roughness\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample adjustment strengths\n        brightness = self.py_random.uniform(*self.brightness_range)\n        contrast = self.py_random.uniform(*self.contrast_range)\n\n        # Generate plasma pattern\n        plasma = fmain.generate_plasma_pattern(\n            target_shape=image.shape[:2],\n            size=self.plasma_size,\n            roughness=self.roughness,\n            random_generator=self.random_generator,\n        )\n\n        return {\n            \"brightness_factor\": brightness,\n            \"contrast_factor\": contrast,\n            \"plasma_pattern\": plasma,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        brightness_factor: float,\n        contrast_factor: float,\n        plasma_pattern: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_plasma_brightness_contrast(\n            img,\n            brightness_factor,\n            contrast_factor,\n            plasma_pattern,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"brightness_range\", \"contrast_range\", \"plasma_size\", \"roughness\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlasmaShadow","title":"class PlasmaShadow (shadow_intensity_range=(0.3, 0.7), plasma_size=256, roughness=3.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply plasma-based shadow effect to the image.

Creates organic-looking shadows using plasma fractal noise pattern. The shadow intensity varies smoothly across the image, creating natural-looking darkening effects that can simulate shadows, shading, or lighting variations.

Parameters:

Name Type Description

shadow_intensity_range tuple[float, float]

Range for shadow intensity. Values between 0 and 1: - 0 means no shadow (original image) - 1 means maximum darkening (black) - Intermediate values create partial shadows Default: (0.3, 0.7)

plasma_size int

Size of the plasma pattern. Will be rounded up to nearest power of 2. Larger values create more detailed shadow patterns: - Small values (~64): Large, smooth shadow regions - Medium values (~256): Balanced detail level - Large values (~512+): Fine shadow details Default: 256

roughness float

Controls the roughness of the plasma pattern. Higher values create more rough/sharp shadow transitions. Must be greater than 0: - Low values (~1.0): Very smooth transitions - Medium values (~3.0): Natural-looking shadows - High values (~5.0): More dramatic, sharp shadows Default: 3.0

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • The transform darkens the image using a plasma pattern
  • Works with any number of channels (grayscale, RGB, multispectral)
  • Shadow pattern is generated using Diamond-Square algorithm
  • The same shadow pattern is applied to all channels
  • Final values are clipped to valid range [0, max_value]

Mathematical Formulation:

1. Plasma Pattern Generation:
   The Diamond-Square algorithm generates a pattern P(x,y) ∈ [0,1] with fractal
   characteristics controlled by the roughness parameter.

2. Shadow Application:
   For each pixel (x,y):
   O(x,y) = I(x,y) * (1 - i·P(x,y))
   where:
   - I is the input image
   - P is the plasma pattern
   - i is the shadow intensity
   - O is the output image
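
The shadow step maps to a one-liner in NumPy (an illustration of the formula above, assuming a float image in [0, 1] and a precomputed plasma pattern `plasma`; the library applies this through its own helper):

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> img = rng.random((100, 100, 3)).astype(np.float32)   # float image in [0, 1]
>>> plasma = rng.random((100, 100)).astype(np.float32)   # stand-in for the fractal pattern
>>> intensity = 0.5                                      # sampled from shadow_intensity_range
>>> shadowed = np.clip(img * (1.0 - intensity * plasma[..., None]), 0, 1)  # O = I * (1 - i*P)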

Examples:

Python
>>> import albumentations as A
>>> import numpy as np

Default parameters for natural shadows

Python
>>> transform = A.PlasmaShadow(p=1.0)

Subtle, smooth shadows

Python
>>> transform = A.PlasmaShadow(
...     shadow_intensity_range=(0.1, 0.3),
...     plasma_size=128,
...     roughness=1.5,
...     p=1.0
... )

Dramatic, detailed shadows

Python
>>> transform = A.PlasmaShadow(
...     shadow_intensity_range=(0.5, 0.9),
...     plasma_size=512,
...     roughness=4.0,
...     p=1.0
... )

References

.. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\" Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.

.. [2] Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

See Also:
  • PlasmaBrightnessContrast: For brightness/contrast adjustments using plasma patterns
  • RandomShadow: For geometric shadow effects
  • RandomToneCurve: For global lighting adjustments

Source code in albumentations/augmentations/transforms.py Python
class PlasmaShadow(ImageOnlyTransform):\n    \"\"\"Apply plasma-based shadow effect to the image.\n\n    Creates organic-looking shadows using plasma fractal noise pattern.\n    The shadow intensity varies smoothly across the image, creating natural-looking\n    darkening effects that can simulate shadows, shading, or lighting variations.\n\n    Args:\n        shadow_intensity_range (tuple[float, float]): Range for shadow intensity.\n            Values between 0 and 1:\n            - 0 means no shadow (original image)\n            - 1 means maximum darkening (black)\n            - Values between create partial shadows\n            Default: (0.3, 0.7)\n\n        plasma_size (int): Size of the plasma pattern. Will be rounded up to nearest power of 2.\n            Larger values create more detailed shadow patterns:\n            - Small values (~64): Large, smooth shadow regions\n            - Medium values (~256): Balanced detail level\n            - Large values (~512+): Fine shadow details\n            Default: 256\n\n        roughness (float): Controls the roughness of the plasma pattern.\n            Higher values create more rough/sharp shadow transitions.\n            Must be greater than 0:\n            - Low values (~1.0): Very smooth transitions\n            - Medium values (~3.0): Natural-looking shadows\n            - High values (~5.0): More dramatic, sharp shadows\n            Default: 3.0\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform darkens the image using a plasma pattern\n        - Works with any number of channels (grayscale, RGB, multispectral)\n        - Shadow pattern is generated using Diamond-Square algorithm\n        - The same shadow pattern is applied to all channels\n        - Final values are clipped to valid range [0, max_value]\n\n    Mathematical Formulation:\n        1. Plasma Pattern Generation:\n           The Diamond-Square algorithm generates a pattern P(x,y) \u2208 [0,1]\n           with fractal characteristics controlled by roughness parameter.\n\n        2. Shadow Application:\n           For each pixel (x,y):\n           O(x,y) = I(x,y) * (1 - i\u00b7P(x,y))\n           where:\n           - I is the input image\n           - P is the plasma pattern\n           - i is the shadow intensity\n           - O is the output image\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Default parameters for natural shadows\n        >>> transform = A.PlasmaShadow(p=1.0)\n\n        # Subtle, smooth shadows\n        >>> transform = A.PlasmaShadow(\n        ...     shadow_intensity=(0.1, 0.3),\n        ...     plasma_size=128,\n        ...     roughness=1.5,\n        ...     p=1.0\n        ... )\n\n        # Dramatic, detailed shadows\n        >>> transform = A.PlasmaShadow(\n        ...     shadow_intensity=(0.5, 0.9),\n        ...     plasma_size=512,\n        ...     roughness=4.0,\n        ...     p=1.0\n        ... )\n\n    References:\n        .. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\"\n               Communications of the ACM, 1982.\n               Paper introducing the Diamond-Square algorithm.\n\n        .. 
[2] Diamond-Square algorithm:\n               https://en.wikipedia.org/wiki/Diamond-square_algorithm\n\n    See Also:\n        - PlasmaBrightnessContrast: For brightness/contrast adjustments using plasma patterns\n        - RandomShadow: For geometric shadow effects\n        - RandomToneCurve: For global lighting adjustments\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        shadow_intensity_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        plasma_size: int = Field(default=256, gt=0)\n        roughness: float = Field(default=3.0, gt=0)\n\n    def __init__(\n        self,\n        shadow_intensity_range: tuple[float, float] = (0.3, 0.7),\n        plasma_size: int = 256,\n        roughness: float = 3.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.shadow_intensity_range = shadow_intensity_range\n        self.plasma_size = plasma_size\n        self.roughness = roughness\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample shadow intensity\n        intensity = self.py_random.uniform(*self.shadow_intensity_range)\n\n        # Generate plasma pattern\n        plasma = fmain.generate_plasma_pattern(\n            target_shape=image.shape[:2],\n            size=self.plasma_size,\n            roughness=self.roughness,\n            random_generator=self.random_generator,\n        )\n\n        return {\n            \"intensity\": intensity,\n            \"plasma_pattern\": plasma,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        intensity: float,\n        plasma_pattern: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_plasma_shadow(img, intensity, plasma_pattern)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"shadow_intensity_range\", \"plasma_size\", \"roughness\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Posterize","title":"class Posterize (num_bits=4, p=0.5, always_apply=None) [view source on GitHub]","text":"

Reduces the number of bits for each color channel in the image.

This transform applies color posterization, a technique that reduces the number of distinct colors used in an image. It works by lowering the number of bits used to represent each color channel, effectively creating a "poster-like" effect with fewer color gradations.

Parameters:

Name Type Description

num_bits int | tuple[int, int] | list[int] | list[tuple[int, int]]

Defines the number of bits to keep for each color channel. Can be specified in several ways: - Single int: Same number of bits for all channels. Range: [1, 7]. - tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7]. - list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits]. - list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)]. Default: 4

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effect becomes more pronounced as the number of bits is reduced.
  • This transform can create interesting artistic effects or be used for image compression simulation.
  • Posterization is particularly useful for:
    • Creating stylized or retro-looking images
    • Reducing the color palette for specific artistic effects
    • Simulating the look of older or lower-quality digital images
    • Data augmentation in scenarios where color depth might vary

Mathematical Background:
    For an 8-bit color channel, posterization to n bits can be expressed as:
    new_value = (old_value >> (8 - n)) << (8 - n)
    This operation keeps the n most significant bits and sets the rest to zero.
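
A quick worked example of this bit arithmetic in plain Python:

Python
>>> old_value, n = 200, 3                # 200 == 0b11001000
>>> (old_value >> (8 - n)) << (8 - n)    # keep the top 3 bits -> 0b11000000
192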

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Posterize all channels to 3 bits

Python
>>> transform = A.Posterize(num_bits=3, p=1.0)
>>> posterized_image = transform(image=image)["image"]

Randomly posterize between 2 and 5 bits

Python
>>> transform = A.Posterize(num_bits=(2, 5), p=1.0)
>>> posterized_image = transform(image=image)["image"]

Different bits for each channel

Python
>>> transform = A.Posterize(num_bits=[3, 5, 2], p=1.0)
>>> posterized_image = transform(image=image)["image"]

Range of bits for each channel

Python
>>> transform = A.Posterize(num_bits=[(1, 3), (3, 5), (2, 4)], p=1.0)
>>> posterized_image = transform(image=image)["image"]

References

  • Color Quantization: https://en.wikipedia.org/wiki/Color_quantization
  • Posterization: https://en.wikipedia.org/wiki/Posterization

Source code in albumentations/augmentations/transforms.py Python
class Posterize(ImageOnlyTransform):\n    \"\"\"Reduces the number of bits for each color channel in the image.\n\n    This transform applies color posterization, a technique that reduces the number of distinct\n    colors used in an image. It works by lowering the number of bits used to represent each\n    color channel, effectively creating a \"poster-like\" effect with fewer color gradations.\n\n    Args:\n        num_bits (int | tuple[int, int] | list[int] | list[tuple[int, int]]):\n            Defines the number of bits to keep for each color channel. Can be specified in several ways:\n            - Single int: Same number of bits for all channels. Range: [1, 7].\n            - tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7].\n            - list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits].\n            - list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)].\n            Default: 4\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effect becomes more pronounced as the number of bits is reduced.\n        - This transform can create interesting artistic effects or be used for image compression simulation.\n        - Posterization is particularly useful for:\n          * Creating stylized or retro-looking images\n          * Reducing the color palette for specific artistic effects\n          * Simulating the look of older or lower-quality digital images\n          * Data augmentation in scenarios where color depth might vary\n\n    Mathematical Background:\n        For an 8-bit color channel, posterization to n bits can be expressed as:\n        new_value = (old_value >> (8 - n)) << (8 - n)\n        This operation keeps the n most significant bits and sets the rest to zero.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Posterize all channels to 3 bits\n        >>> transform = A.Posterize(num_bits=3, p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Randomly posterize between 2 and 5 bits\n        >>> transform = A.Posterize(num_bits=(2, 5), p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Different bits for each channel\n        >>> transform = A.Posterize(num_bits=[3, 5, 2], p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Range of bits for each channel\n        >>> transform = A.Posterize(num_bits=[(1, 3), (3, 5), (2, 4)], p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n    References:\n        - Color Quantization: https://en.wikipedia.org/wiki/Color_quantization\n        - Posterization: https://en.wikipedia.org/wiki/Posterization\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_bits: int | tuple[int, int] | list[tuple[int, int]]\n\n        @field_validator(\"num_bits\")\n        @classmethod\n        def validate_num_bits(\n            cls,\n            num_bits: Any,\n        ) -> tuple[int, int] | list[tuple[int, int]]:\n            if isinstance(num_bits, int):\n                if num_bits < 1 or num_bits > SEVEN:\n                    raise ValueError(\"num_bits must be in the range [1, 7]\")\n              
  return (num_bits, num_bits)\n            if isinstance(num_bits, Sequence) and len(num_bits) > PAIR:\n                return [to_tuple(i, i) for i in num_bits]\n            return cast(tuple[int, int], to_tuple(num_bits, num_bits))\n\n    def __init__(\n        self,\n        num_bits: int | tuple[int, int] | list[tuple[int, int]] = 4,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_bits = cast(Union[tuple[int, int], list[tuple[int, int]]], num_bits)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        num_bits: Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.posterize(img, num_bits)\n\n    def get_params(self) -> dict[str, Any]:\n        if isinstance(self.num_bits, list):\n            num_bits = [self.py_random.randint(*i) for i in self.num_bits]\n            return {\"num_bits\": num_bits}\n        return {\"num_bits\": self.py_random.randint(*self.num_bits)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"num_bits\",)\n
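To make the bit-shift formula from the Posterize docstring above concrete, here is a minimal standalone NumPy sketch of the `(value >> (8 - n)) << (8 - n)` operation for uint8 images. It only illustrates the formula; it is not the library's internal implementation, and the helper name `posterize_uint8` is ours.

Python
import numpy as np

def posterize_uint8(image: np.ndarray, num_bits: int) -> np.ndarray:
    """Keep the num_bits most significant bits of each uint8 value."""
    shift = 8 - num_bits
    return ((image >> shift) << shift).astype(np.uint8)

image = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
posterized = posterize_uint8(image, num_bits=3)
# With num_bits=3 only the levels 0, 32, 64, ..., 224 remain possible.
assert set(np.unique(posterized)) <= set(range(0, 256, 32))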
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RGBShift","title":"class RGBShift (r_shift_limit=(-20, 20), g_shift_limit=(-20, 20), b_shift_limit=(-20, 20), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shift values for each channel of the input RGB image.

A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels. Each channel (R,G,B) can have its own shift range specified.

Parameters:

Name Type Description r_shift_limit ((int, int) or int)

Range for shifting the red channel. Options:
  • If tuple (min, max): the shift value is sampled from this range.
  • If int: the shift value is sampled from (-r_shift_limit, r_shift_limit).
  • For uint8 images: values represent absolute shifts in [0, 255].
  • For float images: values represent relative shifts in [0, 1].
Default: (-20, 20)

g_shift_limit ((int, int) or int)

Range for shifting the green channel. Same options as r_shift_limit. Default: (-20, 20)

b_shift_limit ((int, int) or int)

Range for shifting the blue channel. Same options as r_shift_limit. Default: (-20, 20)

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • Values are shifted independently for each channel (see the sketch after this note).
  • For uint8 images:
    • Input ranges like (-20, 20) represent pixel value shifts.
    • A shift of 20 means adding 20 to that channel.
    • Final values are clipped to [0, 255].
  • For float32 images:
    • Input ranges like (-0.1, 0.1) represent relative shifts.
    • A shift of 0.1 means adding 0.1 to that channel.
    • Final values are clipped to [0, 1].
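The following minimal NumPy sketch illustrates the shift-and-clip rules above for a single channel. `shift_channel` is an illustrative helper, not part of the library; the actual RGBShift implementation delegates to AdditiveNoise, as shown in the source code further down.

Python
import numpy as np

def shift_channel(channel: np.ndarray, shift: float) -> np.ndarray:
    """Add a constant shift to a single channel and clip to the valid range."""
    if channel.dtype == np.uint8:
        # uint8: the shift is an absolute pixel value, result clipped to [0, 255]
        return np.clip(channel.astype(np.int16) + shift, 0, 255).astype(np.uint8)
    # float32: the shift is a relative value, result clipped to [0, 1]
    return np.clip(channel + shift, 0.0, 1.0).astype(np.float32)

red_u8 = np.full((2, 2), 250, dtype=np.uint8)
print(shift_channel(red_u8, 20))     # every value is clipped to 255

red_f32 = np.full((2, 2), 0.95, dtype=np.float32)
print(shift_channel(red_f32, 0.1))   # every value is clipped to 1.0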

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--shift-rgb-channels-of-uint8-image","title":"Shift RGB channels of uint8 image","text":"Python
>>> transform = A.RGBShift(\n...     r_shift_limit=30,  # Will sample red shift from [-30, 30]\n...     g_shift_limit=(-20, 20),  # Will sample green shift from [-20, 20]\n...     b_shift_limit=(-10, 10),  # Will sample blue shift from [-10, 10]\n...     p=1.0\n... )\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> shifted = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--same-effect-using-additivenoise","title":"Same effect using AdditiveNoise","text":"Python
>>> transform = A.AdditiveNoise(\n...     noise_type=\"uniform\",\n...     spatial_mode=\"constant\",  # One value per channel\n...     noise_params={\n...         \"ranges\": [(-30/255, 30/255), (-20/255, 20/255), (-10/255, 10/255)]\n...     },\n...     p=1.0\n... )\n

See Also:

  • AdditiveNoise: a more general noise transform with various options:
    • different noise distributions (uniform, gaussian, laplace, beta)
    • spatial modes (constant, per-pixel, shared)
    • approximation for faster computation
  • RandomToneCurve: for non-linear color transformations
  • RandomBrightnessContrast: for combined brightness and contrast adjustments
  • PlankianJitter: for color temperature adjustments
  • HueSaturationValue: for HSV color space adjustments
  • ColorJitter: for combined brightness, contrast, saturation adjustments


Source code in albumentations/augmentations/transforms.py Python
class RGBShift(AdditiveNoise):\n    \"\"\"Randomly shift values for each channel of the input RGB image.\n\n    A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels.\n    Each channel (R,G,B) can have its own shift range specified.\n\n    Args:\n        r_shift_limit ((int, int) or int): Range for shifting the red channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-r_shift_limit, r_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        g_shift_limit ((int, int) or int): Range for shifting the green channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-g_shift_limit, g_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        b_shift_limit ((int, int) or int): Range for shifting the blue channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-b_shift_limit, b_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Values are shifted independently for each channel\n        - For uint8 images:\n            * Input ranges like (-20, 20) represent pixel value shifts\n            * A shift of 20 means adding 20 to that channel\n            * Final values are clipped to [0, 255]\n        - For float32 images:\n            * Input ranges like (-0.1, 0.1) represent relative shifts\n            * A shift of 0.1 means adding 0.1 to that channel\n            * Final values are clipped to [0, 1]\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n\n        # Shift RGB channels of uint8 image\n        >>> transform = A.RGBShift(\n        ...     r_shift_limit=30,  # Will sample red shift from [-30, 30]\n        ...     g_shift_limit=(-20, 20),  # Will sample green shift from [-20, 20]\n        ...     b_shift_limit=(-10, 10),  # Will sample blue shift from [-10, 10]\n        ...     p=1.0\n        ... )\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> shifted = transform(image=image)[\"image\"]\n\n        # Same effect using AdditiveNoise\n        >>> transform = A.AdditiveNoise(\n        ...     noise_type=\"uniform\",\n        ...     spatial_mode=\"constant\",  # One value per channel\n        ...     noise_params={\n        ...         \"ranges\": [(-30/255, 30/255), (-20/255, 20/255), (-10/255, 10/255)]\n        ...     },\n        ...     p=1.0\n        ... 
)\n\n    See Also:\n        - AdditiveNoise: More general noise transform with various options:\n            * Different noise distributions (uniform, gaussian, laplace, beta)\n            * Spatial modes (constant, per-pixel, shared)\n            * Approximation for faster computation\n        - RandomToneCurve: For non-linear color transformations\n        - RandomBrightnessContrast: For combined brightness and contrast adjustments\n        - PlankianJitter: For color temperature adjustments\n        - HueSaturationValue: For HSV color space adjustments\n        - ColorJitter: For combined brightness, contrast, saturation adjustments\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        r_shift_limit: SymmetricRangeType\n        g_shift_limit: SymmetricRangeType\n        b_shift_limit: SymmetricRangeType\n\n    def __init__(\n        self,\n        r_shift_limit: ScaleFloatType = (-20, 20),\n        g_shift_limit: ScaleFloatType = (-20, 20),\n        b_shift_limit: ScaleFloatType = (-20, 20),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        # Convert RGB shift limits to normalized ranges if needed\n        def normalize_range(limit: tuple[float, float]) -> tuple[float, float]:\n            # If any value is > 1, assume uint8 range and normalize\n            if abs(limit[0]) > 1 or abs(limit[1]) > 1:\n                return (limit[0] / 255.0, limit[1] / 255.0)\n            return limit\n\n        ranges = [\n            normalize_range(cast(tuple[float, float], r_shift_limit)),\n            normalize_range(cast(tuple[float, float], g_shift_limit)),\n            normalize_range(cast(tuple[float, float], b_shift_limit)),\n        ]\n\n        # Initialize with fixed noise type and spatial mode\n        super().__init__(\n            noise_type=\"uniform\",\n            spatial_mode=\"constant\",\n            noise_params={\"ranges\": ranges},\n            approximation=1.0,\n            p=p,\n        )\n\n        # Store original limits for get_transform_init_args\n        self.r_shift_limit = cast(tuple[float, float], r_shift_limit)\n        self.g_shift_limit = cast(tuple[float, float], g_shift_limit)\n        self.b_shift_limit = cast(tuple[float, float], b_shift_limit)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"r_shift_limit\", \"g_shift_limit\", \"b_shift_limit\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomBrightnessContrast","title":"class RandomBrightnessContrast (brightness_limit=(-0.2, 0.2), contrast_limit=(-0.2, 0.2), brightness_by_max=True, ensure_safe_range=False, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly changes the brightness and contrast of the input image.

This transform adjusts the brightness and contrast of an image simultaneously, allowing for a wide range of lighting and contrast variations. It's particularly useful for data augmentation in computer vision tasks, helping models become more robust to different lighting conditions.

Parameters:

Name Type Description brightness_limit float | tuple[float, float]

Factor range for changing brightness. If a single float value is provided, the range will be (-brightness_limit, brightness_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum brightness, and -1.0 means minimum brightness. Default: (-0.2, 0.2).

contrast_limit float | tuple[float, float]

Factor range for changing contrast. If a single float value is provided, the range will be (-contrast_limit, contrast_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast. Default: (-0.2, 0.2).

brightness_by_max bool

If True, adjusts brightness by scaling pixel values up to the maximum value of the image's dtype. If False, uses the mean pixel value for adjustment. Default: True.

ensure_safe_range bool

If True, adjusts alpha and beta to prevent overflow/underflow. This ensures output values stay within the valid range for the image dtype without clipping. Default: False.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The order of operation is: contrast adjustment, then brightness adjustment.
  • For uint8 images, the output is clipped to the [0, 255] range.
  • For float32 images, the output is clipped to the [0, 1] range.
  • The brightness_by_max parameter affects how brightness is adjusted:
    • If True, brightness adjustment is more pronounced and can lead to more saturated results.
    • If False, brightness adjustment is more subtle and preserves the overall lighting better.
  • This transform is useful for:
    • Simulating different lighting conditions
    • Enhancing low-light or overexposed images
    • Data augmentation to improve model robustness

Mathematical Formulation: Let a be the contrast adjustment factor and β be the brightness adjustment factor. For each pixel value x:

  1. Contrast adjustment: x' = clip((x - mean) * (1 + a) + mean)
  2. Brightness adjustment:
     If brightness_by_max is True:  x'' = clip(x' * (1 + β))
     If brightness_by_max is False: x'' = clip(x' + β * max_value)

where clip() ensures values stay within the valid range for the image dtype.
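As a rough sketch of how the sampled factors are applied, the `get_params_dependent_on_data` and `apply` methods in the source below effectively compute `clip(alpha * x + beta)`, assuming `albucore.multiply_add(img, alpha, beta)` evaluates `alpha * img + beta`. The helper below is illustrative only, not the library implementation.

Python
import numpy as np

rng = np.random.default_rng(0)

def brightness_contrast_uint8(img: np.ndarray,
                              brightness_limit=(-0.2, 0.2),
                              contrast_limit=(-0.2, 0.2),
                              brightness_by_max=True) -> np.ndarray:
    """Sample alpha/beta as in the source below and apply clip(alpha * img + beta)."""
    alpha = 1.0 + rng.uniform(*contrast_limit)   # contrast factor
    beta = rng.uniform(*brightness_limit)        # brightness factor
    max_value = 255.0                            # valid range for uint8 images
    beta = beta * max_value if brightness_by_max else beta * img.mean()
    return np.clip(img.astype(np.float32) * alpha + beta, 0, 255).astype(np.uint8)

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
augmented = brightness_contrast_uint8(image)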

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage","title":"Default usage","text":"Python
>>> transform = A.RandomBrightnessContrast(p=1.0)\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-brightness-and-contrast-limits","title":"Custom brightness and contrast limits","text":"Python
>>> transform = A.RandomBrightnessContrast(\n...     brightness_limit=0.3,\n...     contrast_limit=0.3,\n...     p=1.0\n... )\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--adjust-brightness-based-on-mean-value","title":"Adjust brightness based on mean value","text":"Python
>>> transform = A.RandomBrightnessContrast(\n...     brightness_limit=0.2,\n...     contrast_limit=0.2,\n...     brightness_by_max=False,\n...     p=1.0\n... )\n>>> augmented_image = transform(image=image)[\"image\"]\n

References

  • Brightness: https://en.wikipedia.org/wiki/Brightness
  • Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)


Source code in albumentations/augmentations/transforms.py Python
class RandomBrightnessContrast(ImageOnlyTransform):\n    \"\"\"Randomly changes the brightness and contrast of the input image.\n\n    This transform adjusts the brightness and contrast of an image simultaneously, allowing for\n    a wide range of lighting and contrast variations. It's particularly useful for data augmentation\n    in computer vision tasks, helping models become more robust to different lighting conditions.\n\n    Args:\n        brightness_limit (float | tuple[float, float]): Factor range for changing brightness.\n            If a single float value is provided, the range will be (-brightness_limit, brightness_limit).\n            Values should typically be in the range [-1.0, 1.0], where 0 means no change,\n            1.0 means maximum brightness, and -1.0 means minimum brightness.\n            Default: (-0.2, 0.2).\n\n        contrast_limit (float | tuple[float, float]): Factor range for changing contrast.\n            If a single float value is provided, the range will be (-contrast_limit, contrast_limit).\n            Values should typically be in the range [-1.0, 1.0], where 0 means no change,\n            1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast.\n            Default: (-0.2, 0.2).\n\n        brightness_by_max (bool): If True, adjusts brightness by scaling pixel values up to the\n            maximum value of the image's dtype. If False, uses the mean pixel value for adjustment.\n            Default: True.\n\n        ensure_safe_range (bool): If True, adjusts alpha and beta to prevent overflow/underflow.\n            This ensures output values stay within the valid range for the image dtype without clipping.\n            Default: False.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The order of operation is: contrast adjustment, then brightness adjustment.\n        - For uint8 images, the output is clipped to [0, 255] range.\n        - For float32 images, the output is clipped to [0, 1] range.\n        - The `brightness_by_max` parameter affects how brightness is adjusted:\n          * If True, brightness adjustment is more pronounced and can lead to more saturated results.\n          * If False, brightness adjustment is more subtle and preserves the overall lighting better.\n        - This transform is useful for:\n          * Simulating different lighting conditions\n          * Enhancing low-light or overexposed images\n          * Data augmentation to improve model robustness\n\n    Mathematical Formulation:\n        Let a be the contrast adjustment factor and \u03b2 be the brightness adjustment factor.\n        For each pixel value x:\n        1. Contrast adjustment: x' = clip((x - mean) * (1 + a) + mean)\n        2. 
Brightness adjustment:\n           If brightness_by_max is True:  x'' = clip(x' * (1 + \u03b2))\n           If brightness_by_max is False: x'' = clip(x' + \u03b2 * max_value)\n        Where clip() ensures values stay within the valid range for the image dtype.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomBrightnessContrast(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom brightness and contrast limits\n        >>> transform = A.RandomBrightnessContrast(\n        ...     brightness_limit=0.3,\n        ...     contrast_limit=0.3,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Adjust brightness based on mean value\n        >>> transform = A.RandomBrightnessContrast(\n        ...     brightness_limit=0.2,\n        ...     contrast_limit=0.2,\n        ...     brightness_by_max=False,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Brightness: https://en.wikipedia.org/wiki/Brightness\n        - Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness_limit: SymmetricRangeType\n        contrast_limit: SymmetricRangeType\n        brightness_by_max: bool\n        ensure_safe_range: bool\n\n    def __init__(\n        self,\n        brightness_limit: ScaleFloatType = (-0.2, 0.2),\n        contrast_limit: ScaleFloatType = (-0.2, 0.2),\n        brightness_by_max: bool = True,\n        ensure_safe_range: bool = False,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.brightness_limit = cast(tuple[float, float], brightness_limit)\n        self.contrast_limit = cast(tuple[float, float], contrast_limit)\n        self.brightness_by_max = brightness_by_max\n        self.ensure_safe_range = ensure_safe_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha: float,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return albucore.multiply_add(img, alpha, beta, inplace=False)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, float]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample initial values\n        alpha = 1.0 + self.py_random.uniform(*self.contrast_limit)\n        beta = self.py_random.uniform(*self.brightness_limit)\n\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n        # Scale beta according to brightness_by_max setting\n        beta = beta * max_value if self.brightness_by_max else beta * np.mean(image)\n\n        # Clip values to safe ranges if needed\n        if self.ensure_safe_range:\n            alpha, beta = fmain.get_safe_brightness_contrast_params(\n                alpha,\n                beta,\n                max_value,\n            )\n\n        return {\n            \"alpha\": alpha,\n            \"beta\": beta,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"brightness_limit\",\n            \"contrast_limit\",\n            \"brightness_by_max\",\n            
\"ensure_safe_range\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomFog","title":"class RandomFog (fog_coef_lower=None, fog_coef_upper=None, alpha_coef=0.08, fog_coef_range=(0.3, 1), always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates fog for the image by adding random fog-like artifacts.

This transform creates a fog effect by generating semi-transparent overlays that mimic the visual characteristics of fog. The fog intensity and distribution can be controlled to create various fog-like conditions.

Parameters:

Name Type Description fog_coef_range tuple[float, float]

Range for fog intensity coefficient. Should be in [0, 1] range.

alpha_coef float

Transparency of the fog circles. Should be in [0, 1] range. Default: 0.08.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The fog effect is created by overlaying semi-transparent circles on the image.
  • Higher fog coefficient values result in denser fog effects.
  • The fog is typically denser in the center of the image and gradually decreases towards the edges.
  • This transform is useful for:
    • Simulating various weather conditions in outdoor scenes
    • Data augmentation for improving model robustness to foggy conditions
    • Creating atmospheric effects in image editing

Mathematical Formulation: For each fog particle:

  1. A position (x, y) is randomly generated within the image.
  2. A circle with a random radius is drawn at this position.
  3. The circle's alpha (transparency) is determined by alpha_coef.
  4. These circles are overlaid on the original image to create the fog effect.

The final pixel value is calculated as:

    output = (1 - alpha) * original_pixel + alpha * fog_color

where alpha is influenced by the fog_coef and alpha_coef parameters.
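A minimal NumPy sketch of the alpha-blending formula above, using a toy center-weighted mask. It is not the library's `add_fog` implementation; `blend_fog` and the mask construction are illustrative only.

Python
import numpy as np

def blend_fog(image: np.ndarray, fog_mask: np.ndarray, fog_color: float = 255.0) -> np.ndarray:
    """Blend a per-pixel alpha mask (values in [0, 1]) with a constant fog color."""
    alpha = fog_mask[..., None]  # broadcast the mask over the color channels
    out = (1.0 - alpha) * image.astype(np.float32) + alpha * fog_color
    return np.clip(out, 0, 255).astype(np.uint8)

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
# Toy mask that is denser in the image center, as described above.
yy, xx = np.mgrid[0:100, 0:100]
dist = np.sqrt((yy - 50.0) ** 2 + (xx - 50.0) ** 2)
fog_mask = np.clip(0.6 * (1.0 - dist / dist.max()), 0.0, 1.0)
foggy = blend_fog(image, fog_mask)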

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage","title":"Default usage","text":"Python
>>> transform = A.RandomFog(p=1.0)\n>>> foggy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-fog-intensity-range","title":"Custom fog intensity range","text":"Python
>>> transform = A.RandomFog(fog_coef_range=(0.3, 0.8), p=1.0)\n>>> foggy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--adjust-fog-transparency","title":"Adjust fog transparency","text":"Python
>>> transform = A.RandomFog(fog_coef_range=(0.2, 0.5), alpha_coef=0.1, p=1.0)\n>>> foggy_image = transform(image=image)[\"image\"]\n

References

  • Fog: https://en.wikipedia.org/wiki/Fog
  • Atmospheric perspective: https://en.wikipedia.org/wiki/Aerial_perspective


Source code in albumentations/augmentations/transforms.py Python
class RandomFog(ImageOnlyTransform):\n    \"\"\"Simulates fog for the image by adding random fog-like artifacts.\n\n    This transform creates a fog effect by generating semi-transparent overlays\n    that mimic the visual characteristics of fog. The fog intensity and distribution\n    can be controlled to create various fog-like conditions.\n\n    Args:\n        fog_coef_range (tuple[float, float]): Range for fog intensity coefficient. Should be in [0, 1] range.\n        alpha_coef (float): Transparency of the fog circles. Should be in [0, 1] range. Default: 0.08.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The fog effect is created by overlaying semi-transparent circles on the image.\n        - Higher fog coefficient values result in denser fog effects.\n        - The fog is typically denser in the center of the image and gradually decreases towards the edges.\n        - This transform is useful for:\n          * Simulating various weather conditions in outdoor scenes\n          * Data augmentation for improving model robustness to foggy conditions\n          * Creating atmospheric effects in image editing\n\n    Mathematical Formulation:\n        For each fog particle:\n        1. A position (x, y) is randomly generated within the image.\n        2. A circle with random radius is drawn at this position.\n        3. The circle's alpha (transparency) is determined by the alpha_coef.\n        4. These circles are overlaid on the original image to create the fog effect.\n\n        The final pixel value is calculated as:\n        output = (1 - alpha) * original_pixel + alpha * fog_color\n\n        where alpha is influenced by the fog_coef and alpha_coef parameters.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomFog(p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n        # Custom fog intensity range\n        >>> transform = A.RandomFog(fog_coef_lower=0.3, fog_coef_upper=0.8, p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n        # Adjust fog transparency\n        >>> transform = A.RandomFog(fog_coef_lower=0.2, fog_coef_upper=0.5, alpha_coef=0.1, p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Fog: https://en.wikipedia.org/wiki/Fog\n        - Atmospheric perspective: https://en.wikipedia.org/wiki/Aerial_perspective\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        fog_coef_lower: float | None = Field(\n            ge=0,\n            le=1,\n        )\n        fog_coef_upper: float | None = Field(\n            ge=0,\n            le=1,\n        )\n        fog_coef_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        alpha_coef: float = Field(ge=0, le=1)\n\n        @model_validator(mode=\"after\")\n        def validate_fog_coefficients(self) -> Self:\n            if self.fog_coef_lower is not None:\n                warn(\n                    \"`fog_coef_lower` is deprecated, use `fog_coef_range` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n            if 
self.fog_coef_upper is not None:\n                warn(\n                    \"`fog_coef_upper` is deprecated, use `fog_coef_range` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            lower = self.fog_coef_lower if self.fog_coef_lower is not None else self.fog_coef_range[0]\n            upper = self.fog_coef_upper if self.fog_coef_upper is not None else self.fog_coef_range[1]\n            self.fog_coef_range = (lower, upper)\n\n            self.fog_coef_lower = None\n            self.fog_coef_upper = None\n\n            return self\n\n    def __init__(\n        self,\n        fog_coef_lower: float | None = None,\n        fog_coef_upper: float | None = None,\n        alpha_coef: float = 0.08,\n        fog_coef_range: tuple[float, float] = (0.3, 1),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.fog_coef_range = fog_coef_range\n        self.alpha_coef = alpha_coef\n\n    def apply(\n        self,\n        img: np.ndarray,\n        particle_positions: list[tuple[int, int]],\n        radiuses: list[int],\n        intensity: float,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.add_fog(\n            img,\n            intensity,\n            self.alpha_coef,\n            particle_positions,\n            radiuses,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        # Select a random fog intensity within the specified range\n        intensity = self.py_random.uniform(*self.fog_coef_range)\n\n        image_shape = params[\"shape\"][:2]\n\n        image_height, image_width = image_shape\n\n        # Calculate the size of the fog effect region based on image width and fog intensity\n        fog_region_size = max(1, int(image_width // 3 * intensity))\n\n        particle_positions = []\n\n        # Initialize the central region where fog will be most dense\n        center_x, center_y = (int(x) for x in fgeometric.center(image_shape))\n\n        # Define the initial size of the foggy area\n        current_width = image_width\n        current_height = image_height\n\n        # Define shrink factor for reducing the foggy area each iteration\n        shrink_factor = 0.1\n\n        max_iterations = 10  # Prevent infinite loop\n        iteration = 0\n\n        while current_width > fog_region_size and current_height > fog_region_size and iteration < max_iterations:\n            # Calculate the number of particles for this region\n            area = current_width * current_height\n            particles_in_region = int(\n                area / (fog_region_size * fog_region_size) * intensity * 10,\n            )\n\n            for _ in range(particles_in_region):\n                # Generate random positions within the current region\n                x = self.py_random.randint(\n                    center_x - current_width // 2,\n                    center_x + current_width // 2,\n                )\n                y = self.py_random.randint(\n                    center_y - current_height // 2,\n                    center_y + current_height // 2,\n                )\n                particle_positions.append((x, y))\n\n            # Shrink the region for the next iteration\n            current_width = int(current_width * (1 - shrink_factor))\n            current_height = 
int(current_height * (1 - shrink_factor))\n\n            iteration += 1\n\n        radiuses = fmain.get_fog_particle_radiuses(\n            image_shape,\n            len(particle_positions),\n            intensity,\n            self.random_generator,\n        )\n\n        return {\n            \"particle_positions\": particle_positions,\n            \"intensity\": intensity,\n            \"radiuses\": radiuses,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"fog_coef_range\", \"alpha_coef\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomGamma","title":"class RandomGamma (gamma_limit=(80, 120), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies random gamma correction to the input image.

Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance or tristimulus values in imaging systems. This transform can adjust the brightness of an image while preserving the relative differences between darker and lighter areas, making it useful for simulating different lighting conditions or correcting for display characteristics.

Parameters:

Name Type Description gamma_limit float | tuple[float, float]

If gamma_limit is a single float value, the range will be (1, gamma_limit). If it's a tuple of two floats, they will serve as the lower and upper bounds for gamma adjustment. Values are in terms of percentage change, e.g., (80, 120) means the gamma will be between 80% and 120% of the original. Default: (80, 120).

eps

A small value added to the gamma to avoid division by zero or log of zero errors. Default: 1e-7.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The gamma correction is applied using the formula: output = input^gamma
  • Gamma values > 1 will make the image darker, while values < 1 will make it brighter.
  • This transform is particularly useful for:
    • Simulating different lighting conditions
    • Correcting for non-linear display characteristics
    • Enhancing contrast in certain regions of the image
    • Data augmentation in computer vision tasks

Mathematical Formulation: Let I be the input image and G (gamma) be the correction factor. The gamma correction is applied as follows:

  1. Normalize the image to the [0, 1] range: I_norm = I / 255 (for uint8 images)
  2. Apply gamma correction: I_corrected = I_norm ^ G
  3. Scale back to the original range: output = I_corrected * 255 (for uint8 images)

The gamma value is calculated as G = random_value / 100, where random_value is sampled uniformly from the gamma_limit range (see get_params in the source below), so the default gamma_limit=(80, 120) yields G in [0.8, 1.2].
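A minimal NumPy sketch of this normalize / power / rescale sequence for uint8 images; `gamma_correct_uint8` is illustrative and not the library's `gamma_transform`.

Python
import numpy as np

def gamma_correct_uint8(image: np.ndarray, gamma: float) -> np.ndarray:
    """Apply output = (input / 255) ** gamma, scaled back to [0, 255]."""
    normalized = image.astype(np.float32) / 255.0
    corrected = np.power(normalized, gamma)
    return np.clip(corrected * 255.0, 0, 255).astype(np.uint8)

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
# gamma_limit=(80, 120) corresponds to gamma values in [0.8, 1.2]
darker = gamma_correct_uint8(image, gamma=1.2)    # gamma > 1 darkens
brighter = gamma_correct_uint8(image, gamma=0.8)  # gamma < 1 brightens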

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage","title":"Default usage","text":"Python
>>> transform = A.RandomGamma(p=1.0)\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-gamma-range","title":"Custom gamma range","text":"Python
>>> transform = A.RandomGamma(gamma_limit=(50, 150), p=1.0)\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--applying-with-other-transforms","title":"Applying with other transforms","text":"Python
>>> transform = A.Compose([\n...     A.RandomGamma(gamma_limit=(80, 120), p=0.5),\n...     A.RandomBrightnessContrast(p=0.5),\n... ])\n>>> augmented_image = transform(image=image)[\"image\"]\n

References

  • Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction
  • Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm


Source code in albumentations/augmentations/transforms.py Python
class RandomGamma(ImageOnlyTransform):\n    \"\"\"Applies random gamma correction to the input image.\n\n    Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance\n    or tristimulus values in imaging systems. This transform can adjust the brightness of an image\n    while preserving the relative differences between darker and lighter areas, making it useful\n    for simulating different lighting conditions or correcting for display characteristics.\n\n    Args:\n        gamma_limit (float | tuple[float, float]): If gamma_limit is a single float value, the range\n            will be (1, gamma_limit). If it's a tuple of two floats, they will serve as\n            the lower and upper bounds for gamma adjustment. Values are in terms of percentage change,\n            e.g., (80, 120) means the gamma will be between 80% and 120% of the original.\n            Default: (80, 120).\n        eps: A small value added to the gamma to avoid division by zero or log of zero errors.\n            Default: 1e-7.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The gamma correction is applied using the formula: output = input^gamma\n        - Gamma values > 1 will make the image darker, while values < 1 will make it brighter\n        - This transform is particularly useful for:\n          * Simulating different lighting conditions\n          * Correcting for non-linear display characteristics\n          * Enhancing contrast in certain regions of the image\n          * Data augmentation in computer vision tasks\n\n    Mathematical Formulation:\n        Let I be the input image and G (gamma) be the correction factor.\n        The gamma correction is applied as follows:\n        1. Normalize the image to [0, 1] range: I_norm = I / 255 (for uint8 images)\n        2. Apply gamma correction: I_corrected = I_norm ^ (1 / G)\n        3. Scale back to original range: output = I_corrected * 255 (for uint8 images)\n\n        The actual gamma value used is calculated as:\n        G = 1 + (random_value / 100), where random_value is sampled from gamma_limit range.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomGamma(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom gamma range\n        >>> transform = A.RandomGamma(gamma_limit=(50, 150), p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Applying with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomGamma(gamma_limit=(80, 120), p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... 
])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction\n        - Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        gamma_limit: OnePlusFloatRangeType\n\n    def __init__(\n        self,\n        gamma_limit: ScaleFloatType = (80, 120),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.gamma_limit = cast(tuple[float, float], gamma_limit)\n\n    def apply(self, img: np.ndarray, gamma: float, **params: Any) -> np.ndarray:\n        return fmain.gamma_transform(img, gamma=gamma)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"gamma\": self.py_random.uniform(self.gamma_limit[0], self.gamma_limit[1]) / 100.0,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"gamma_limit\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomGravel","title":"class RandomGravel (gravel_roi=(0.1, 0.4, 0.9, 0.9), number_of_patches=2, always_apply=None, p=0.5) [view source on GitHub]","text":"

Adds gravel-like artifacts to the input image.

This transform simulates the appearance of gravel or small stones scattered across specific regions of an image. It's particularly useful for augmenting datasets of road or terrain images, adding realistic texture variations.

Parameters:

Name Type Description gravel_roi tuple[float, float, float, float]

Region of interest where gravel will be added, specified as (x_min, y_min, x_max, y_max) in relative coordinates [0, 1]. Default: (0.1, 0.4, 0.9, 0.9).

number_of_patches int

Number of gravel patch regions to generate within the ROI. Each patch will contain multiple gravel particles. Default: 2.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The gravel effect is created by modifying the saturation channel in the HLS color space.
  • Gravel particles are distributed within randomly generated patches inside the specified ROI.
  • This transform is particularly useful for:
    • Augmenting datasets for road condition analysis
    • Simulating variations in terrain for computer vision tasks
    • Adding realistic texture to synthetic images of outdoor scenes

Mathematical Formulation: For each gravel patch:

  1. A rectangular region is randomly generated within the specified ROI.
  2. Within this region, multiple gravel particles are placed.
  3. For each particle:
     • Random (x, y) coordinates are generated within the patch.
     • A random radius (r) between 1 and 3 pixels is assigned.
     • A random saturation value (sat) between 0 and 255 is assigned.
  4. The saturation channel of the image is modified for each particle:
     image_hls[y-r:y+r, x-r:x+r, 1] = sat
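A minimal OpenCV/NumPy sketch of the patch-and-saturation idea described above. Note that OpenCV's HLS layout is (H, L, S), so the saturation channel is index 2 in this sketch. `add_toy_gravel`, the ROI constants, and the particle count are illustrative and do not reproduce the library's `add_gravel` implementation.

Python
import cv2
import numpy as np

rng = np.random.default_rng(0)

def add_toy_gravel(image_rgb: np.ndarray, num_particles: int = 200) -> np.ndarray:
    """Scatter small patches of random saturation inside a fixed ROI (HLS space)."""
    hls = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2HLS)
    height, width = hls.shape[:2]
    # ROI roughly matching the default gravel_roi=(0.1, 0.4, 0.9, 0.9)
    x_min, y_min = int(0.1 * width), int(0.4 * height)
    x_max, y_max = int(0.9 * width), int(0.9 * height)
    for _ in range(num_particles):
        x = int(rng.integers(x_min, x_max))
        y = int(rng.integers(y_min, y_max))
        r = int(rng.integers(1, 4))      # radius between 1 and 3 pixels
        sat = int(rng.integers(0, 256))  # random saturation value
        # OpenCV HLS layout is (H, L, S): saturation is channel index 2 here
        hls[max(y - r, 0):y + r, max(x - r, 0):x + r, 2] = sat
    return cv2.cvtColor(hls, cv2.COLOR_HLS2RGB)

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
gravel_image = add_toy_gravel(image)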

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage","title":"Default usage","text":"Python
>>> transform = A.RandomGravel(p=1.0)\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-roi-and-number-of-patches","title":"Custom ROI and number of patches","text":"Python
>>> transform = A.RandomGravel(\n...     gravel_roi=(0.2, 0.2, 0.8, 0.8),\n...     number_of_patches=5,\n...     p=1.0\n... )\n>>> augmented_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--combining-with-other-transforms","title":"Combining with other transforms","text":"Python
>>> transform = A.Compose([\n...     A.RandomGravel(p=0.7),\n...     A.RandomBrightnessContrast(p=0.5),\n... ])\n>>> augmented_image = transform(image=image)[\"image\"]\n

References

  • Road surface textures: https://en.wikipedia.org/wiki/Road_surface
  • HLS color space: https://en.wikipedia.org/wiki/HSL_and_HSV


Source code in albumentations/augmentations/transforms.py Python
class RandomGravel(ImageOnlyTransform):\n    \"\"\"Adds gravel-like artifacts to the input image.\n\n    This transform simulates the appearance of gravel or small stones scattered across\n    specific regions of an image. It's particularly useful for augmenting datasets of\n    road or terrain images, adding realistic texture variations.\n\n    Args:\n        gravel_roi (tuple[float, float, float, float]): Region of interest where gravel\n            will be added, specified as (x_min, y_min, x_max, y_max) in relative coordinates\n            [0, 1]. Default: (0.1, 0.4, 0.9, 0.9).\n        number_of_patches (int): Number of gravel patch regions to generate within the ROI.\n            Each patch will contain multiple gravel particles. Default: 2.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The gravel effect is created by modifying the saturation channel in the HLS color space.\n        - Gravel particles are distributed within randomly generated patches inside the specified ROI.\n        - This transform is particularly useful for:\n          * Augmenting datasets for road condition analysis\n          * Simulating variations in terrain for computer vision tasks\n          * Adding realistic texture to synthetic images of outdoor scenes\n\n    Mathematical Formulation:\n        For each gravel patch:\n        1. A rectangular region is randomly generated within the specified ROI.\n        2. Within this region, multiple gravel particles are placed.\n        3. For each particle:\n           - Random (x, y) coordinates are generated within the patch.\n           - A random radius (r) between 1 and 3 pixels is assigned.\n           - A random saturation value (sat) between 0 and 255 is assigned.\n        4. The saturation channel of the image is modified for each particle:\n           image_hls[y-r:y+r, x-r:x+r, 1] = sat\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomGravel(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom ROI and number of patches\n        >>> transform = A.RandomGravel(\n        ...     gravel_roi=(0.2, 0.2, 0.8, 0.8),\n        ...     number_of_patches=5,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Combining with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomGravel(p=0.7),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... ])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Road surface textures: https://en.wikipedia.org/wiki/Road_surface\n        - HLS color space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        gravel_roi: tuple[float, float, float, float]\n        number_of_patches: int = Field(ge=1)\n\n        @model_validator(mode=\"after\")\n        def validate_gravel_roi(self) -> Self:\n            gravel_lower_x, gravel_lower_y, gravel_upper_x, gravel_upper_y = self.gravel_roi\n            if not 0 <= gravel_lower_x < gravel_upper_x <= 1 or not 0 <= gravel_lower_y < gravel_upper_y <= 1:\n                raise ValueError(f\"Invalid gravel_roi. 
Got: {self.gravel_roi}.\")\n            return self\n\n    def __init__(\n        self,\n        gravel_roi: tuple[float, float, float, float] = (0.1, 0.4, 0.9, 0.9),\n        number_of_patches: int = 2,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p, always_apply)\n        self.gravel_roi = gravel_roi\n        self.number_of_patches = number_of_patches\n\n    def generate_gravel_patch(\n        self,\n        rectangular_roi: tuple[int, int, int, int],\n    ) -> np.ndarray:\n        x_min, y_min, x_max, y_max = rectangular_roi\n        area = abs((x_max - x_min) * (y_max - y_min))\n        count = area // 10\n        gravels = np.empty([count, 2], dtype=np.int64)\n        gravels[:, 0] = self.random_generator.integers(x_min, x_max, count)\n        gravels[:, 1] = self.random_generator.integers(y_min, y_max, count)\n        return gravels\n\n    def apply(\n        self,\n        img: np.ndarray,\n        gravels_infos: list[Any],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_gravel(img, gravels_infos)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n\n        # Calculate ROI in pixels\n        x_min, y_min, x_max, y_max = (\n            int(coord * dim) for coord, dim in zip(self.gravel_roi, [width, height, width, height])\n        )\n\n        roi_width = x_max - x_min\n        roi_height = y_max - y_min\n\n        gravels_info = []\n\n        for _ in range(self.number_of_patches):\n            # Generate a random rectangular region within the ROI\n            patch_width = self.py_random.randint(roi_width // 10, roi_width // 5)\n            patch_height = self.py_random.randint(roi_height // 10, roi_height // 5)\n\n            patch_x = self.py_random.randint(x_min, x_max - patch_width)\n            patch_y = self.py_random.randint(y_min, y_max - patch_height)\n\n            # Generate gravel particles within this patch\n            num_particles = (patch_width * patch_height) // 100  # Adjust this divisor to control density\n\n            for _ in range(num_particles):\n                x = self.py_random.randint(patch_x, patch_x + patch_width)\n                y = self.py_random.randint(patch_y, patch_y + patch_height)\n                r = self.py_random.randint(1, 3)\n                sat = self.py_random.randint(0, 255)\n\n                gravels_info.append(\n                    [\n                        max(y - r, 0),  # min_y\n                        min(y + r, height - 1),  # max_y\n                        max(x - r, 0),  # min_x\n                        min(x + r, width - 1),  # max_x\n                        sat,  # saturation\n                    ],\n                )\n\n        return {\"gravels_infos\": np.array(gravels_info, dtype=np.int64)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"gravel_roi\", \"number_of_patches\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomRain","title":"class RandomRain (slant_lower=None, slant_upper=None, slant_range=(-10, 10), drop_length=20, drop_width=1, drop_color=(200, 200, 200), blur_value=7, brightness_coefficient=0.7, rain_type='default', always_apply=None, p=0.5) [view source on GitHub]","text":"

Adds rain effects to an image.

This transform simulates rainfall by overlaying semi-transparent streaks onto the image, creating a realistic rain effect. It can be used to augment datasets for computer vision tasks that need to perform well in rainy conditions.

Parameters:

Name Type Description slant_range tuple[int, int]

Range for the rain slant angle in degrees. Negative values slant to the left, positive to the right. Default: (-10, 10).

drop_length int

Length of the rain drops in pixels. Default: 20.

drop_width int

Width of the rain drops in pixels. Default: 1.

drop_color tuple[int, int, int]

Color of the rain drops in RGB format. Default: (200, 200, 200).

blur_value int

Blur value for simulating rain effect. Rainy views are typically blurry. Default: 7.

brightness_coefficient float

Coefficient to adjust the brightness of the image. Rainy scenes are usually darker. Should be in the range (0, 1]. Default: 0.7.

rain_type Literal["drizzle", "heavy", "torrential", "default"]

Type of rain to simulate. Default: "default".

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The rain effect is created by drawing semi-transparent lines on the image.
  • The slant of the rain can be controlled to simulate wind effects.
  • Different rain types (drizzle, heavy, torrential) adjust the density and appearance of the rain.
  • The transform also adjusts image brightness and applies a blur to simulate the visual effects of rain.
  • This transform is particularly useful for:
    • Augmenting datasets for autonomous driving in rainy conditions
    • Testing the robustness of computer vision models to weather effects
    • Creating realistic rainy scenes for image editing or film production

Mathematical Formulation: For each raindrop:

  1. A start position (x1, y1) is randomly generated within the image.
  2. The end position (x2, y2) is calculated from drop_length and slant:
     x2 = x1 + drop_length * sin(slant)
     y2 = y1 + drop_length * cos(slant)
  3. A line is drawn from (x1, y1) to (x2, y2) with the specified drop_color and drop_width.
  4. The image is then blurred and its brightness is adjusted.
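A minimal OpenCV sketch of the streak-drawing, blur, and darkening steps described above. `draw_toy_rain` and its constants are illustrative and do not reproduce the library's `add_rain` implementation.

Python
import math

import cv2
import numpy as np

rng = np.random.default_rng(0)

def draw_toy_rain(image: np.ndarray, num_drops: int = 150, slant: int = 10,
                  drop_length: int = 20, drop_width: int = 1,
                  drop_color=(200, 200, 200)) -> np.ndarray:
    """Draw slanted streaks with (x2, y2) = (x1 + L*sin(slant), y1 + L*cos(slant))."""
    out = image.copy()
    height, width = image.shape[:2]
    slant_rad = math.radians(slant)
    for _ in range(num_drops):
        x1 = int(rng.integers(0, width))
        y1 = int(rng.integers(0, max(height - drop_length, 1)))
        x2 = int(x1 + drop_length * math.sin(slant_rad))
        y2 = int(y1 + drop_length * math.cos(slant_rad))
        cv2.line(out, (x1, y1), (x2, y2), drop_color, drop_width)
    out = cv2.blur(out, (7, 7))                          # rainy views look blurry
    return np.clip(out * 0.7, 0, 255).astype(np.uint8)   # and darker

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
rainy = draw_toy_rain(image)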

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage","title":"Default usage","text":"Python
>>> transform = A.RandomRain(p=1.0)\n>>> rainy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-rain-parameters","title":"Custom rain parameters","text":"Python
>>> transform = A.RandomRain(\n...     slant_range=(-15, 15),\n...     drop_length=30,\n...     drop_width=2,\n...     drop_color=(180, 180, 180),\n...     blur_value=5,\n...     brightness_coefficient=0.8,\n...     p=1.0\n... )\n>>> rainy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--simulating-heavy-rain","title":"Simulating heavy rain","text":"Python
>>> transform = A.RandomRain(rain_type=\"heavy\", p=1.0)\n>>> heavy_rain_image = transform(image=image)[\"image\"]\n

References

  • Rain visualization techniques: https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-27-real-time-rain-rendering
  • Weather effects in computer vision: https://www.sciencedirect.com/science/article/pii/S1077314220300692


Source code in albumentations/augmentations/transforms.py Python
class RandomRain(ImageOnlyTransform):\n    \"\"\"Adds rain effects to an image.\n\n    This transform simulates rainfall by overlaying semi-transparent streaks onto the image,\n    creating a realistic rain effect. It can be used to augment datasets for computer vision\n    tasks that need to perform well in rainy conditions.\n\n    Args:\n        slant_range (tuple[int, int]): Range for the rain slant angle in degrees.\n            Negative values slant to the left, positive to the right. Default: (-10, 10).\n        drop_length (int): Length of the rain drops in pixels. Default: 20.\n        drop_width (int): Width of the rain drops in pixels. Default: 1.\n        drop_color (tuple[int, int, int]): Color of the rain drops in RGB format. Default: (200, 200, 200).\n        blur_value (int): Blur value for simulating rain effect. Rainy views are typically blurry. Default: 7.\n        brightness_coefficient (float): Coefficient to adjust the brightness of the image.\n            Rainy scenes are usually darker. Should be in the range (0, 1]. Default: 0.7.\n        rain_type (Literal[\"drizzle\", \"heavy\", \"torrential\", \"default\"]): Type of rain to simulate.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The rain effect is created by drawing semi-transparent lines on the image.\n        - The slant of the rain can be controlled to simulate wind effects.\n        - Different rain types (drizzle, heavy, torrential) adjust the density and appearance of the rain.\n        - The transform also adjusts image brightness and applies a blur to simulate the visual effects of rain.\n        - This transform is particularly useful for:\n          * Augmenting datasets for autonomous driving in rainy conditions\n          * Testing the robustness of computer vision models to weather effects\n          * Creating realistic rainy scenes for image editing or film production\n\n    Mathematical Formulation:\n        For each raindrop:\n        1. Start position (x1, y1) is randomly generated within the image.\n        2. End position (x2, y2) is calculated based on drop_length and slant:\n           x2 = x1 + drop_length * sin(slant)\n           y2 = y1 + drop_length * cos(slant)\n        3. A line is drawn from (x1, y1) to (x2, y2) with the specified drop_color and drop_width.\n        4. The image is then blurred and its brightness is adjusted.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomRain(p=1.0)\n        >>> rainy_image = transform(image=image)[\"image\"]\n\n        # Custom rain parameters\n        >>> transform = A.RandomRain(\n        ...     slant_range=(-15, 15),\n        ...     drop_length=30,\n        ...     drop_width=2,\n        ...     drop_color=(180, 180, 180),\n        ...     blur_value=5,\n        ...     brightness_coefficient=0.8,\n        ...     p=1.0\n        ... 
)\n        >>> rainy_image = transform(image=image)[\"image\"]\n\n        # Simulating heavy rain\n        >>> transform = A.RandomRain(rain_type=\"heavy\", p=1.0)\n        >>> heavy_rain_image = transform(image=image)[\"image\"]\n\n    References:\n        - Rain visualization techniques: https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-27-real-time-rain-rendering\n        - Weather effects in computer vision: https://www.sciencedirect.com/science/article/pii/S1077314220300692\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        slant_lower: int | None = Field(default=None)\n        slant_upper: int | None = Field(default=None)\n        slant_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]\n        drop_length: int = Field(ge=1)\n        drop_width: int = Field(ge=1)\n        drop_color: tuple[int, int, int]\n        blur_value: int = Field(ge=1)\n        brightness_coefficient: float = Field(gt=0, le=1)\n        rain_type: RainMode\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            if self.slant_lower is not None or self.slant_upper is not None:\n                if self.slant_lower is not None:\n                    warn(\n                        \"`slant_lower` deprecated. Use `slant_range` as tuple (slant_lower, slant_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.slant_upper is not None:\n                    warn(\n                        \"`slant_upper` deprecated. Use `slant_range` as tuple (slant_lower, slant_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.slant_lower if self.slant_lower is not None else self.slant_range[0]\n                upper = self.slant_upper if self.slant_upper is not None else self.slant_range[1]\n                self.slant_range = (lower, upper)\n                self.slant_lower = None\n                self.slant_upper = None\n\n            # Validate the slant_range\n            if not (-MAX_RAIN_ANGLE <= self.slant_range[0] <= self.slant_range[1] <= MAX_RAIN_ANGLE):\n                raise ValueError(\n                    f\"slant_range values should be increasing within [-{MAX_RAIN_ANGLE}, {MAX_RAIN_ANGLE}] range.\",\n                )\n            return self\n\n    def __init__(\n        self,\n        slant_lower: int | None = None,\n        slant_upper: int | None = None,\n        slant_range: tuple[int, int] = (-10, 10),\n        drop_length: int = 20,\n        drop_width: int = 1,\n        drop_color: tuple[int, int, int] = (200, 200, 200),\n        blur_value: int = 7,\n        brightness_coefficient: float = 0.7,\n        rain_type: RainMode = \"default\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.slant_range = slant_range\n        self.drop_length = drop_length\n        self.drop_width = drop_width\n        self.drop_color = drop_color\n        self.blur_value = blur_value\n        self.brightness_coefficient = brightness_coefficient\n        self.rain_type = rain_type\n\n    def apply(\n        self,\n        img: np.ndarray,\n        slant: int,\n        drop_length: int,\n        rain_drops: list[tuple[int, int]],\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n\n        return 
fmain.add_rain(\n            img,\n            slant,\n            drop_length,\n            self.drop_width,\n            self.drop_color,\n            self.blur_value,\n            self.brightness_coefficient,\n            rain_drops,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        slant = int(self.py_random.uniform(*self.slant_range))\n\n        height, width = params[\"shape\"][:2]\n        area = height * width\n\n        if self.rain_type == \"drizzle\":\n            num_drops = area // 770\n            drop_length = 10\n        elif self.rain_type == \"heavy\":\n            num_drops = width * height // 600\n            drop_length = 30\n        elif self.rain_type == \"torrential\":\n            num_drops = area // 500\n            drop_length = 60\n        else:\n            drop_length = self.drop_length\n            num_drops = area // 600\n\n        rain_drops = []\n\n        for _ in range(num_drops):  # If You want heavy rain, try increasing this\n            x = self.py_random.randint(slant, width) if slant < 0 else self.py_random.randint(0, max(width - slant, 0))\n            y = self.py_random.randint(0, max(height - drop_length, 0))\n\n            rain_drops.append((x, y))\n\n        return {\"drop_length\": drop_length, \"slant\": slant, \"rain_drops\": rain_drops}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"slant_range\",\n            \"drop_length\",\n            \"drop_width\",\n            \"drop_color\",\n            \"blur_value\",\n            \"brightness_coefficient\",\n            \"rain_type\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomShadow","title":"class RandomShadow (shadow_roi=(0, 0.5, 1, 1), num_shadows_limit=(1, 2), num_shadows_lower=None, num_shadows_upper=None, shadow_dimension=5, shadow_intensity_range=(0.5, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates shadows for the image by reducing the brightness of the image in shadow regions.

This transform adds realistic shadow effects to images, which can be useful for augmenting datasets for outdoor scene analysis, autonomous driving, or any computer vision task where shadows may be present.

Parameters:

shadow_roi tuple[float, float, float, float]

Region of the image where shadows will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1]. Default: (0, 0.5, 1, 1).

num_shadows_limit tuple[int, int]

Lower and upper limits for the possible number of shadows. Default: (1, 2).

shadow_dimension int

Number of edges in the shadow polygons. Default: 5.

shadow_intensity_range tuple[float, float]

Range for the shadow intensity. Larger value means darker shadow. Should be two float values between 0 and 1. Default: (0.5, 0.5).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • Shadows are created by generating random polygons within the specified ROI and reducing the brightness of the image in these areas.
  • The number of shadows, their shapes, and intensities can be randomized for variety.
  • This transform is particularly useful for:
    • Augmenting datasets for outdoor scene understanding
    • Improving robustness of object detection models to shadowed conditions
    • Simulating different lighting conditions in synthetic datasets

Mathematical Formulation: For each shadow:

1. A polygon with shadow_dimension vertices is generated within the shadow ROI.
2. The shadow intensity a is randomly chosen from shadow_intensity_range.
3. For each pixel (x, y) within the polygon:
   new_pixel_value = original_pixel_value * (1 - a)
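
To make the formulation concrete, here is a minimal NumPy/OpenCV sketch that darkens the pixels inside one random polygon by a factor (1 - a). It is an illustration only, not the library's internal implementation (fmain.add_shadow); the fixed image size, the use of cv2.fillPoly, and the variable names are choices made for this sketch.

Python
import cv2
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (100, 100, 3), dtype=np.uint8)

shadow_dimension = 5   # number of polygon vertices
intensity = 0.5        # shadow intensity "a" sampled from shadow_intensity_range

# Random polygon inside the full image (the transform samples it inside shadow_roi)
vertices = rng.integers(0, 100, size=(shadow_dimension, 2)).astype(np.int32)

mask = np.zeros(image.shape[:2], dtype=np.uint8)
cv2.fillPoly(mask, [vertices], 1)

shadowed = image.astype(np.float32)
shadowed[mask == 1] *= 1 - intensity   # new_pixel_value = original_pixel_value * (1 - a)
shadowed = np.clip(shadowed, 0, 255).astype(np.uint8)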

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomShadow(p=1.0)
>>> shadowed_image = transform(image=image)["image"]

Custom shadow parameters

Python
>>> transform = A.RandomShadow(
...     shadow_roi=(0.2, 0.2, 0.8, 0.8),
...     num_shadows_limit=(2, 4),
...     shadow_dimension=8,
...     shadow_intensity_range=(0.3, 0.7),
...     p=1.0
... )
>>> shadowed_image = transform(image=image)["image"]

Combining with other transforms

Python
>>> transform = A.Compose([
...     A.RandomShadow(p=0.5),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> augmented_image = transform(image=image)["image"]

References

  • Shadow detection and removal: https://www.sciencedirect.com/science/article/pii/S1047320315002035
  • Shadows in computer vision: https://en.wikipedia.org/wiki/Shadow_detection

Source code in albumentations/augmentations/transforms.py Python
class RandomShadow(ImageOnlyTransform):\n    \"\"\"Simulates shadows for the image by reducing the brightness of the image in shadow regions.\n\n    This transform adds realistic shadow effects to images, which can be useful for augmenting\n    datasets for outdoor scene analysis, autonomous driving, or any computer vision task where\n    shadows may be present.\n\n    Args:\n        shadow_roi (tuple[float, float, float, float]): Region of the image where shadows\n            will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1].\n            Default: (0, 0.5, 1, 1).\n        num_shadows_limit (tuple[int, int]): Lower and upper limits for the possible number of shadows.\n            Default: (1, 2).\n        shadow_dimension (int): Number of edges in the shadow polygons. Default: 5.\n        shadow_intensity_range (tuple[float, float]): Range for the shadow intensity. Larger value\n            means darker shadow. Should be two float values between 0 and 1. Default: (0.5, 0.5).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - Shadows are created by generating random polygons within the specified ROI and\n          reducing the brightness of the image in these areas.\n        - The number of shadows, their shapes, and intensities can be randomized for variety.\n        - This transform is particularly useful for:\n          * Augmenting datasets for outdoor scene understanding\n          * Improving robustness of object detection models to shadowed conditions\n          * Simulating different lighting conditions in synthetic datasets\n\n    Mathematical Formulation:\n        For each shadow:\n        1. A polygon with `shadow_dimension` vertices is generated within the shadow ROI.\n        2. The shadow intensity a is randomly chosen from `shadow_intensity_range`.\n        3. For each pixel (x, y) within the polygon:\n           new_pixel_value = original_pixel_value * (1 - a)\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomShadow(p=1.0)\n        >>> shadowed_image = transform(image=image)[\"image\"]\n\n        # Custom shadow parameters\n        >>> transform = A.RandomShadow(\n        ...     shadow_roi=(0.2, 0.2, 0.8, 0.8),\n        ...     num_shadows_limit=(2, 4),\n        ...     shadow_dimension=8,\n        ...     shadow_intensity_range=(0.3, 0.7),\n        ...     p=1.0\n        ... )\n        >>> shadowed_image = transform(image=image)[\"image\"]\n\n        # Combining with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomShadow(p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... 
])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Shadow detection and removal: https://www.sciencedirect.com/science/article/pii/S1047320315002035\n        - Shadows in computer vision: https://en.wikipedia.org/wiki/Shadow_detection\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        shadow_roi: tuple[float, float, float, float]\n        num_shadows_limit: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n        num_shadows_lower: int | None\n        num_shadows_upper: int | None\n        shadow_dimension: int = Field(ge=3)\n\n        shadow_intensity_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_shadows(self) -> Self:\n            if self.num_shadows_lower is not None:\n                warn(\n                    \"`num_shadows_lower` is deprecated. Use `num_shadows_limit` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.num_shadows_upper is not None:\n                warn(\n                    \"`num_shadows_upper` is deprecated. Use `num_shadows_limit` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.num_shadows_lower is not None or self.num_shadows_upper is not None:\n                num_shadows_lower = (\n                    self.num_shadows_lower if self.num_shadows_lower is not None else self.num_shadows_limit[0]\n                )\n                num_shadows_upper = (\n                    self.num_shadows_upper if self.num_shadows_upper is not None else self.num_shadows_limit[1]\n                )\n\n                self.num_shadows_limit = (num_shadows_lower, num_shadows_upper)\n                self.num_shadows_lower = None\n                self.num_shadows_upper = None\n\n            shadow_lower_x, shadow_lower_y, shadow_upper_x, shadow_upper_y = self.shadow_roi\n\n            if not 0 <= shadow_lower_x <= shadow_upper_x <= 1 or not 0 <= shadow_lower_y <= shadow_upper_y <= 1:\n                raise ValueError(f\"Invalid shadow_roi. Got: {self.shadow_roi}\")\n\n            if isinstance(self.shadow_intensity_range, float):\n                if not (0 <= self.shadow_intensity_range <= 1):\n                    raise ValueError(\n                        f\"shadow_intensity_range value should be within [0, 1] range. \"\n                        f\"Got: {self.shadow_intensity_range}\",\n                    )\n            elif isinstance(self.shadow_intensity_range, tuple):\n                if not (0 <= self.shadow_intensity_range[0] <= self.shadow_intensity_range[1] <= 1):\n                    raise ValueError(\n                        f\"shadow_intensity_range values should be within [0, 1] range and increasing. 
\"\n                        f\"Got: {self.shadow_intensity_range}\",\n                    )\n            else:\n                raise TypeError(\n                    \"shadow_intensity_range should be an float or a tuple of floats.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        shadow_roi: tuple[float, float, float, float] = (0, 0.5, 1, 1),\n        num_shadows_limit: tuple[int, int] = (1, 2),\n        num_shadows_lower: int | None = None,\n        num_shadows_upper: int | None = None,\n        shadow_dimension: int = 5,\n        shadow_intensity_range: tuple[float, float] = (0.5, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.shadow_roi = shadow_roi\n        self.shadow_dimension = shadow_dimension\n        self.num_shadows_limit = num_shadows_limit\n        self.shadow_intensity_range = shadow_intensity_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        vertices_list: list[np.ndarray],\n        intensities: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_shadow(img, vertices_list, intensities)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, list[np.ndarray]]:\n        height, width = params[\"shape\"][:2]\n\n        num_shadows = self.py_random.randint(*self.num_shadows_limit)\n\n        x_min, y_min, x_max, y_max = self.shadow_roi\n\n        x_min = int(x_min * width)\n        x_max = int(x_max * width)\n        y_min = int(y_min * height)\n        y_max = int(y_max * height)\n\n        vertices_list = [\n            np.stack(\n                [\n                    self.random_generator.integers(\n                        x_min,\n                        x_max,\n                        size=self.shadow_dimension,\n                    ),\n                    self.random_generator.integers(\n                        y_min,\n                        y_max,\n                        size=self.shadow_dimension,\n                    ),\n                ],\n                axis=1,\n            )\n            for _ in range(num_shadows)\n        ]\n\n        # Sample shadow intensity for each shadow\n        intensities = self.random_generator.uniform(\n            *self.shadow_intensity_range,\n            size=num_shadows,\n        )\n\n        return {\"vertices_list\": vertices_list, \"intensities\": intensities}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"shadow_roi\",\n            \"num_shadows_limit\",\n            \"shadow_dimension\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomSnow","title":"class RandomSnow (snow_point_lower=None, snow_point_upper=None, brightness_coeff=2.5, snow_point_range=(0.1, 0.3), method='bleach', always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies a random snow effect to the input image.

This transform simulates snowfall by either bleaching out some pixel values or adding a snow texture to the image, depending on the chosen method.

Parameters:

snow_point_range tuple[float, float]

Range for the snow point threshold. Both values should be in the (0, 1) range. Default: (0.1, 0.3).

brightness_coeff float

Coefficient applied to increase the brightness of pixels below the snow_point threshold. Larger values lead to more pronounced snow effects. Should be > 0. Default: 2.5.

method Literal[\"bleach\", \"texture\"]

The snow simulation method to use. Options are: - \"bleach\": Uses a simple pixel value thresholding technique. - \"texture\": Applies a more realistic snow texture overlay. Default: \"texture\".

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • The \"bleach\" method increases the brightness of pixels above a certain threshold, creating a simple snow effect. This method is faster but may look less realistic.
  • The \"texture\" method creates a more realistic snow effect through the following steps:
  • Converts the image to HSV color space for better control over brightness.
  • Increases overall image brightness to simulate the reflective nature of snow.
  • Generates a snow texture using Gaussian noise, which is then smoothed with a Gaussian filter.
  • Applies a depth effect to the snow texture, making it more prominent at the top of the image.
  • Blends the snow texture with the original image using alpha compositing.
  • Adds a slight blue tint to simulate the cool color of snow.
  • Adds random sparkle effects to simulate light reflecting off snow crystals. This method produces a more realistic result but is computationally more expensive.

Mathematical Formulation:

For the "bleach" method: Let L be the lightness channel in HLS color space. For each pixel (i, j):
If L[i, j] > snow_point:
    L[i, j] = L[i, j] * brightness_coeff

For the "texture" method:
1. Brightness adjustment: V_new = V * (1 + brightness_coeff * snow_point)
2. Snow texture generation: T = GaussianFilter(GaussianNoise(μ=0.5, sigma=0.3))
3. Depth effect: D = LinearGradient(1.0 to 0.2)
4. Final pixel value: P = (1 - alpha) * original_pixel + alpha * (T * D * 255),
   where alpha is the snow intensity factor derived from snow_point.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage (bleach method)

Python
>>> transform = A.RandomSnow(p=1.0)
>>> snowy_image = transform(image=image)["image"]

Using texture method with custom parameters

Python
>>> transform = A.RandomSnow(
...     snow_point_range=(0.2, 0.4),
...     brightness_coeff=2.0,
...     method="texture",
...     p=1.0
... )
>>> snowy_image = transform(image=image)["image"]

References

  • Bleach method: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
  • Texture method: Inspired by computer graphics techniques for snow rendering and atmospheric scattering simulations.

Source code in albumentations/augmentations/transforms.py Python
class RandomSnow(ImageOnlyTransform):\n    \"\"\"Applies a random snow effect to the input image.\n\n    This transform simulates snowfall by either bleaching out some pixel values or\n    adding a snow texture to the image, depending on the chosen method.\n\n    Args:\n        snow_point_range (tuple[float, float]): Range for the snow point threshold.\n            Both values should be in the (0, 1) range. Default: (0.1, 0.3).\n        brightness_coeff (float): Coefficient applied to increase the brightness of pixels\n            below the snow_point threshold. Larger values lead to more pronounced snow effects.\n            Should be > 0. Default: 2.5.\n        method (Literal[\"bleach\", \"texture\"]): The snow simulation method to use. Options are:\n            - \"bleach\": Uses a simple pixel value thresholding technique.\n            - \"texture\": Applies a more realistic snow texture overlay.\n            Default: \"texture\".\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The \"bleach\" method increases the brightness of pixels above a certain threshold,\n          creating a simple snow effect. This method is faster but may look less realistic.\n        - The \"texture\" method creates a more realistic snow effect through the following steps:\n          1. Converts the image to HSV color space for better control over brightness.\n          2. Increases overall image brightness to simulate the reflective nature of snow.\n          3. Generates a snow texture using Gaussian noise, which is then smoothed with a Gaussian filter.\n          4. Applies a depth effect to the snow texture, making it more prominent at the top of the image.\n          5. Blends the snow texture with the original image using alpha compositing.\n          6. Adds a slight blue tint to simulate the cool color of snow.\n          7. Adds random sparkle effects to simulate light reflecting off snow crystals.\n          This method produces a more realistic result but is computationally more expensive.\n\n    Mathematical Formulation:\n        For the \"bleach\" method:\n        Let L be the lightness channel in HLS color space.\n        For each pixel (i, j):\n        If L[i, j] > snow_point:\n            L[i, j] = L[i, j] * brightness_coeff\n\n        For the \"texture\" method:\n        1. Brightness adjustment: V_new = V * (1 + brightness_coeff * snow_point)\n        2. Snow texture generation: T = GaussianFilter(GaussianNoise(\u03bc=0.5, sigma=0.3))\n        3. Depth effect: D = LinearGradient(1.0 to 0.2)\n        4. Final pixel value: P = (1 - alpha) * original_pixel + alpha * (T * D * 255)\n           where alpha is the snow intensity factor derived from snow_point.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage (bleach method)\n        >>> transform = A.RandomSnow(p=1.0)\n        >>> snowy_image = transform(image=image)[\"image\"]\n\n        # Using texture method with custom parameters\n        >>> transform = A.RandomSnow(\n        ...     snow_point_range=(0.2, 0.4),\n        ...     brightness_coeff=2.0,\n        ...     method=\"texture\",\n        ...     p=1.0\n        ... 
)\n        >>> snowy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Bleach method: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n        - Texture method: Inspired by computer graphics techniques for snow rendering\n          and atmospheric scattering simulations.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        snow_point_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        snow_point_lower: float | None = Field(\n            gt=0,\n            lt=1,\n        )\n        snow_point_upper: float | None = Field(\n            gt=0,\n            lt=1,\n        )\n        brightness_coeff: float = Field(gt=0)\n        method: Literal[\"bleach\", \"texture\"]\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            if self.snow_point_lower is not None or self.snow_point_upper is not None:\n                if self.snow_point_lower is not None:\n                    warn(\n                        \"`snow_point_lower` deprecated. Use `snow_point_range` as tuple\"\n                        \" (snow_point_lower, snow_point_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.snow_point_upper is not None:\n                    warn(\n                        \"`snow_point_upper` deprecated. Use `snow_point_range` as tuple\"\n                        \"(snow_point_lower, snow_point_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.snow_point_lower if self.snow_point_lower is not None else self.snow_point_range[0]\n                upper = self.snow_point_upper if self.snow_point_upper is not None else self.snow_point_range[1]\n                self.snow_point_range = (lower, upper)\n                self.snow_point_lower = None\n                self.snow_point_upper = None\n\n            # Validate the snow_point_range\n            if not (0 < self.snow_point_range[0] <= self.snow_point_range[1] < 1):\n                raise ValueError(\n                    \"snow_point_range values should be increasing within (0, 1) range.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        snow_point_lower: float | None = None,\n        snow_point_upper: float | None = None,\n        brightness_coeff: float = 2.5,\n        snow_point_range: tuple[float, float] = (0.1, 0.3),\n        method: Literal[\"bleach\", \"texture\"] = \"bleach\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.snow_point_range = snow_point_range\n        self.brightness_coeff = brightness_coeff\n        self.method = method\n\n    def apply(\n        self,\n        img: np.ndarray,\n        snow_point: float,\n        snow_texture: np.ndarray,\n        sparkle_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n\n        if self.method == \"bleach\":\n            return fmain.add_snow_bleach(img, snow_point, self.brightness_coeff)\n        if self.method == \"texture\":\n            return fmain.add_snow_texture(\n                img,\n                snow_point,\n                self.brightness_coeff,\n                snow_texture,\n       
         sparkle_mask,\n            )\n\n        raise ValueError(f\"Unknown snow method: {self.method}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray | None]:\n        image_shape = params[\"shape\"][:2]\n        result = {\n            \"snow_point\": self.py_random.uniform(*self.snow_point_range),\n            \"snow_texture\": None,\n            \"sparkle_mask\": None,\n        }\n\n        if self.method == \"texture\":\n            snow_texture, sparkle_mask = fmain.generate_snow_textures(\n                img_shape=image_shape,\n                random_generator=self.random_generator,\n            )\n            result[\"snow_texture\"] = snow_texture\n            result[\"sparkle_mask\"] = sparkle_mask\n\n        return result\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"snow_point_range\", \"brightness_coeff\", \"method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomSunFlare","title":"class RandomSunFlare (flare_roi=(0, 0, 1, 0.5), angle_lower=None, angle_upper=None, num_flare_circles_lower=None, num_flare_circles_upper=None, src_radius=400, src_color=(255, 255, 255), angle_range=(0, 1), num_flare_circles_range=(6, 10), method='overlay', always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates a sun flare effect on the image by adding circles of light.

This transform creates a sun flare effect by overlaying multiple semi-transparent circles of varying sizes and intensities along a line originating from a \"sun\" point. It offers two methods: a simple overlay technique and a more complex physics-based approach.

Parameters:

flare_roi tuple[float, float, float, float]

Region of interest where the sun flare can appear. Values are in the range [0, 1] and represent (x_min, y_min, x_max, y_max) in relative coordinates. Default: (0, 0, 1, 0.5).

angle_range tuple[float, float]

Range of angles (in radians) for the flare direction. Values should be in the range [0, 1], where 0 represents 0 radians and 1 represents 2π radians. Default: (0, 1).

num_flare_circles_range tuple[int, int]

Range for the number of flare circles to generate. Default: (6, 10).

src_radius int

Radius of the sun circle in pixels. Default: 400.

src_color tuple[int, int, int]

Color of the sun in RGB format. Default: (255, 255, 255).

method Literal[\"overlay\", \"physics_based\"]

Method to use for generating the sun flare. \"overlay\" uses a simple alpha blending technique, while \"physics_based\" simulates more realistic optical phenomena. Default: \"physics_based\".

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

The transform offers two methods for generating sun flares:

  1. Overlay Method (\"overlay\"):
  2. Creates a simple sun flare effect using basic alpha blending.
  3. Steps: a. Generate the main sun circle with a radial gradient. b. Create smaller flare circles along the flare line. c. Blend these elements with the original image using alpha compositing.
  4. Characteristics:

    • Faster computation
    • Less realistic appearance
    • Suitable for basic augmentation or when performance is a priority
  5. Physics-based Method (\"physics_based\"):

  6. Simulates more realistic optical phenomena observed in actual lens flares.
  7. Steps: a. Create a separate flare layer for complex manipulations. b. Add the main sun circle and diffraction spikes to simulate light diffraction. c. Generate and add multiple flare circles with varying properties. d. Apply Gaussian blur to create a soft, glowing effect. e. Create and apply a radial gradient mask for natural fading from the center. f. Simulate chromatic aberration by applying different blurs to color channels. g. Blend the flare with the original image using screen blending mode.
  8. Characteristics:
    • More computationally intensive
    • Produces more realistic and visually appealing results
    • Includes effects like diffraction spikes and chromatic aberration
    • Suitable for high-quality augmentation or realistic image synthesis

Mathematical Formulation: For both methods:

1. Sun position (x_s, y_s) is randomly chosen within the specified ROI.
2. Flare angle θ is randomly chosen from angle_range (and scaled to [0, 2π] radians).
3. For each flare circle i:
   • Position (x_i, y_i) = (x_s + t_i * cos(θ), y_s + t_i * sin(θ)), where t_i is a random distance along the flare line.
   • Radius r_i is randomly chosen, with larger circles closer to the sun.
   • Alpha (transparency) alpha_i is randomly chosen in the range [0.05, 0.2].
   • Color (R_i, G_i, B_i) is randomly chosen close to src_color.
4. Each flare circle is blended with the image:

   Overlay method blending (alpha compositing):
   new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i

   Physics-based method blending (screen mode):
   new_pixel = 255 - ((255 - original_pixel) * (255 - flare_pixel) / 255)
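
The overlay blending step can be reproduced for a single flare circle with plain OpenCV alpha compositing, as in the hedged sketch below. It is not the library's fmain.add_sun_flare_overlay; the circle position, radius, color, and alpha value are fixed here purely for illustration.

Python
import cv2
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (200, 200, 3), dtype=np.uint8)

alpha = 0.15                     # transparency alpha_i, sampled from [0.05, 0.2] in the transform
center, radius = (120, 80), 25   # one flare circle along the flare line (illustrative values)
color = (255, 255, 255)          # close to src_color

overlay = image.copy()
cv2.circle(overlay, center, radius, color, -1)

# new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i
flared = cv2.addWeighted(overlay, alpha, image, 1 - alpha, 0)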

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)

Default sun flare (overlay method)

Python
>>> transform = A.RandomSunFlare(p=1.0)
>>> flared_image = transform(image=image)["image"]

Physics-based sun flare with custom parameters

Python
>>> transform = A.RandomSunFlare(
...     flare_roi=(0.1, 0, 0.9, 0.3),
...     angle_range=(0.25, 0.75),
...     num_flare_circles_range=(5, 15),
...     src_radius=200,
...     src_color=(255, 200, 100),
...     method="physics_based",
...     p=1.0
... )
>>> flared_image = transform(image=image)["image"]

References

  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Diffraction: https://en.wikipedia.org/wiki/Diffraction
  • Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
  • Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen

Source code in albumentations/augmentations/transforms.py Python
class RandomSunFlare(ImageOnlyTransform):\n    \"\"\"Simulates a sun flare effect on the image by adding circles of light.\n\n    This transform creates a sun flare effect by overlaying multiple semi-transparent\n    circles of varying sizes and intensities along a line originating from a \"sun\" point.\n    It offers two methods: a simple overlay technique and a more complex physics-based approach.\n\n    Args:\n        flare_roi (tuple[float, float, float, float]): Region of interest where the sun flare\n            can appear. Values are in the range [0, 1] and represent (x_min, y_min, x_max, y_max)\n            in relative coordinates. Default: (0, 0, 1, 0.5).\n        angle_range (tuple[float, float]): Range of angles (in radians) for the flare direction.\n            Values should be in the range [0, 1], where 0 represents 0 radians and 1 represents 2\u03c0 radians.\n            Default: (0, 1).\n        num_flare_circles_range (tuple[int, int]): Range for the number of flare circles to generate.\n            Default: (6, 10).\n        src_radius (int): Radius of the sun circle in pixels. Default: 400.\n        src_color (tuple[int, int, int]): Color of the sun in RGB format. Default: (255, 255, 255).\n        method (Literal[\"overlay\", \"physics_based\"]): Method to use for generating the sun flare.\n            \"overlay\" uses a simple alpha blending technique, while \"physics_based\" simulates\n            more realistic optical phenomena. Default: \"physics_based\".\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        The transform offers two methods for generating sun flares:\n\n        1. Overlay Method (\"overlay\"):\n           - Creates a simple sun flare effect using basic alpha blending.\n           - Steps:\n             a. Generate the main sun circle with a radial gradient.\n             b. Create smaller flare circles along the flare line.\n             c. Blend these elements with the original image using alpha compositing.\n           - Characteristics:\n             * Faster computation\n             * Less realistic appearance\n             * Suitable for basic augmentation or when performance is a priority\n\n        2. Physics-based Method (\"physics_based\"):\n           - Simulates more realistic optical phenomena observed in actual lens flares.\n           - Steps:\n             a. Create a separate flare layer for complex manipulations.\n             b. Add the main sun circle and diffraction spikes to simulate light diffraction.\n             c. Generate and add multiple flare circles with varying properties.\n             d. Apply Gaussian blur to create a soft, glowing effect.\n             e. Create and apply a radial gradient mask for natural fading from the center.\n             f. Simulate chromatic aberration by applying different blurs to color channels.\n             g. Blend the flare with the original image using screen blending mode.\n           - Characteristics:\n             * More computationally intensive\n             * Produces more realistic and visually appealing results\n             * Includes effects like diffraction spikes and chromatic aberration\n             * Suitable for high-quality augmentation or realistic image synthesis\n\n    Mathematical Formulation:\n        For both methods:\n        1. 
Sun position (x_s, y_s) is randomly chosen within the specified ROI.\n        2. Flare angle \u03b8 is randomly chosen from the angle_range.\n        3. For each flare circle i:\n           - Position (x_i, y_i) = (x_s + t_i * cos(\u03b8), y_s + t_i * sin(\u03b8))\n             where t_i is a random distance along the flare line.\n           - Radius r_i is randomly chosen, with larger circles closer to the sun.\n           - Alpha (transparency) alpha_i is randomly chosen in the range [0.05, 0.2].\n           - Color (R_i, G_i, B_i) is randomly chosen close to src_color.\n\n        Overlay method blending:\n        new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n\n        Physics-based method blending:\n        new_pixel = 255 - ((255 - original_pixel) * (255 - flare_pixel) / 255)\n\n        4. Each flare circle is blended with the image using alpha compositing:\n           new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n\n        # Default sun flare (overlay method)\n        >>> transform = A.RandomSunFlare(p=1.0)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n        # Physics-based sun flare with custom parameters\n\n        # Default sun flare\n        >>> transform = A.RandomSunFlare(p=1.0)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n        # Custom sun flare parameters\n\n        >>> transform = A.RandomSunFlare(\n        ...     flare_roi=(0.1, 0, 0.9, 0.3),\n        ...     angle_range=(0.25, 0.75),\n        ...     num_flare_circles_range=(5, 15),\n        ...     src_radius=200,\n        ...     src_color=(255, 200, 100),\n        ...     method=\"physics_based\",\n        ...     p=1.0\n        ... 
)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n    References:\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Diffraction: https://en.wikipedia.org/wiki/Diffraction\n        - Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration\n        - Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        flare_roi: tuple[float, float, float, float]\n        angle_lower: float | None = Field(ge=0, le=1)\n        angle_upper: float | None = Field(ge=0, le=1)\n\n        num_flare_circles_lower: int | None = Field(\n            ge=0,\n        )\n        num_flare_circles_upper: int | None = Field(\n            gt=0,\n        )\n        src_radius: int = Field(gt=1)\n        src_color: tuple[int, ...]\n\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        num_flare_circles_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n        method: Literal[\"overlay\", \"physics_based\"]\n\n        @model_validator(mode=\"after\")\n        def validate_parameters(self) -> Self:\n            (\n                flare_center_lower_x,\n                flare_center_lower_y,\n                flare_center_upper_x,\n                flare_center_upper_y,\n            ) = self.flare_roi\n            if (\n                not 0 <= flare_center_lower_x < flare_center_upper_x <= 1\n                or not 0 <= flare_center_lower_y < flare_center_upper_y <= 1\n            ):\n                raise ValueError(f\"Invalid flare_roi. Got: {self.flare_roi}\")\n\n            if self.angle_lower is not None or self.angle_upper is not None:\n                if self.angle_lower is not None:\n                    warn(\n                        \"`angle_lower` deprecated. Use `angle_range` as tuple (angle_lower, angle_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.angle_upper is not None:\n                    warn(\n                        \"`angle_upper` deprecated. Use `angle_range` as tuple(angle_lower, angle_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.angle_lower if self.angle_lower is not None else self.angle_range[0]\n                upper = self.angle_upper if self.angle_upper is not None else self.angle_range[1]\n                self.angle_range = (lower, upper)\n\n            if self.num_flare_circles_lower is not None or self.num_flare_circles_upper is not None:\n                if self.num_flare_circles_lower is not None:\n                    warn(\n                        \"`num_flare_circles_lower` deprecated. Use `num_flare_circles_range` as tuple\"\n                        \" (num_flare_circles_lower, num_flare_circles_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.num_flare_circles_upper is not None:\n                    warn(\n                        \"`num_flare_circles_upper` deprecated. 
Use `num_flare_circles_range` as tuple\"\n                        \" (num_flare_circles_lower, num_flare_circles_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = (\n                    self.num_flare_circles_lower\n                    if self.num_flare_circles_lower is not None\n                    else self.num_flare_circles_range[0]\n                )\n                upper = (\n                    self.num_flare_circles_upper\n                    if self.num_flare_circles_upper is not None\n                    else self.num_flare_circles_range[1]\n                )\n                self.num_flare_circles_range = (lower, upper)\n\n            return self\n\n    def __init__(\n        self,\n        flare_roi: tuple[float, float, float, float] = (0, 0, 1, 0.5),\n        angle_lower: float | None = None,\n        angle_upper: float | None = None,\n        num_flare_circles_lower: int | None = None,\n        num_flare_circles_upper: int | None = None,\n        src_radius: int = 400,\n        src_color: tuple[int, ...] = (255, 255, 255),\n        angle_range: tuple[float, float] = (0, 1),\n        num_flare_circles_range: tuple[int, int] = (6, 10),\n        method: Literal[\"overlay\", \"physics_based\"] = \"overlay\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.angle_range = angle_range\n        self.num_flare_circles_range = num_flare_circles_range\n\n        self.src_radius = src_radius\n        self.src_color = src_color\n        self.flare_roi = flare_roi\n        self.method = method\n\n    def apply(\n        self,\n        img: np.ndarray,\n        flare_center: tuple[float, float],\n        circles: list[Any],\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        if self.method == \"overlay\":\n            return fmain.add_sun_flare_overlay(\n                img,\n                flare_center,\n                self.src_radius,\n                self.src_color,\n                circles,\n            )\n        if self.method == \"physics_based\":\n            return fmain.add_sun_flare_physics_based(\n                img,\n                flare_center,\n                self.src_radius,\n                self.src_color,\n                circles,\n            )\n\n        raise ValueError(f\"Invalid method: {self.method}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        diagonal = math.sqrt(height**2 + width**2)\n\n        angle = 2 * math.pi * self.py_random.uniform(*self.angle_range)\n\n        # Calculate flare center in pixel coordinates\n        x_min, y_min, x_max, y_max = self.flare_roi\n        flare_center_x = int(width * self.py_random.uniform(x_min, x_max))\n        flare_center_y = int(height * self.py_random.uniform(y_min, y_max))\n\n        num_circles = self.py_random.randint(*self.num_flare_circles_range)\n\n        # Calculate parameters relative to image size\n        step_size = max(1, int(diagonal * 0.01))  # 1% of diagonal, minimum 1 pixel\n        max_radius = max(2, int(height * 0.01))  # 1% of height, minimum 2 pixels\n        color_range = int(max(self.src_color) * 0.2)  # 20% of max color value\n\n        def line(t: float) -> tuple[float, float]:\n            return (\n        
        flare_center_x + t * math.cos(angle),\n                flare_center_y + t * math.sin(angle),\n            )\n\n        # Generate points along the flare line\n        t_range = range(-flare_center_x, width - flare_center_x, step_size)\n        points = [line(t) for t in t_range]\n\n        circles = []\n        for _ in range(num_circles):\n            alpha = self.py_random.uniform(0.05, 0.2)\n            point = self.py_random.choice(points)\n            rad = self.py_random.randint(1, max_radius)\n\n            # Generate colors relative to src_color\n            colors = [self.py_random.randint(max(c - color_range, 0), c) for c in self.src_color]\n\n            circles.append(\n                (\n                    alpha,\n                    (int(point[0]), int(point[1])),\n                    pow(rad, 3),\n                    tuple(colors),\n                ),\n            )\n\n        return {\n            \"circles\": circles,\n            \"flare_center\": (flare_center_x, flare_center_y),\n        }\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"flare_roi\": self.flare_roi,\n            \"angle_range\": self.angle_range,\n            \"num_flare_circles_range\": self.num_flare_circles_range,\n            \"src_radius\": self.src_radius,\n            \"src_color\": self.src_color,\n        }\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomToneCurve","title":"class RandomToneCurve (scale=0.1, per_channel=False, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve.

This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast in a non-linear manner. It can be applied to the entire image or to each channel separately.

Parameters:

scale float

Standard deviation of the normal distribution used to sample random distances to move two control points that modify the image's curve. Values should be in range [0, 1]. Higher values will result in more dramatic changes to the image. Default: 0.1

per_channel bool

If True, the tone curve will be applied to each channel of the input image separately, which can lead to color distortion. If False, the same curve is applied to all channels, preserving the original color relationships. Default: False

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.
  • The S-curve is defined by moving two control points of a quadratic Bézier curve.
  • When per_channel is False, the same curve is applied to all channels, maintaining color balance.
  • When per_channel is True, different curves are applied to each channel, which can create color shifts.
  • This transform can be used to adjust image contrast and brightness in a more natural way than linear transforms.
  • The effect can range from subtle contrast adjustments to more dramatic "vintage" or "faded" looks.

Mathematical Formulation:

1. Two control points are randomly moved from their default positions (0.25, 0.25) and (0.75, 0.75).
2. The new positions are sampled from a normal distribution N(μ, σ²), where μ is the original position and σ is the scale parameter.
3. These points, along with fixed points at (0, 0) and (1, 1), define a quadratic Bézier curve.
4. The curve is applied as a lookup table to the image intensities:
   new_intensity = curve(original_intensity)
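
A minimal sketch of applying such a tone curve as a lookup table is shown below. The exact Bézier parametrization used internally by fmain.move_tone_curve may differ; the particular blend of the two moved control values used here is an assumption made for illustration.

Python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (100, 100, 3), dtype=np.uint8)

scale = 0.1
low_y = float(np.clip(rng.normal(0.25, scale), 0, 1))    # moved control point near (0.25, 0.25)
high_y = float(np.clip(rng.normal(0.75, scale), 0, 1))   # moved control point near (0.75, 0.75)

t = np.linspace(0.0, 1.0, 256)
# Bézier-style curve through (0, 0) and (1, 1) shaped by the two moved control values (assumed form)
curve = 3 * (1 - t) ** 2 * t * low_y + 3 * (1 - t) * t ** 2 * high_y + t ** 3
lut = np.clip(np.rint(curve * 255), 0, 255).astype(np.uint8)

toned = lut[image]   # new_intensity = curve(original_intensity)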

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

Apply a random tone curve to all channels together

Python
>>> transform = A.RandomToneCurve(scale=0.1, per_channel=False, p=1.0)
>>> augmented_image = transform(image=image)['image']

Apply random tone curves to each channel separately

Python
>>> transform = A.RandomToneCurve(scale=0.2, per_channel=True, p=1.0)
>>> augmented_image = transform(image=image)['image']

References

  • \"What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance\" by Mahmoud Afifi and Michael S. Brown, ICCV 2019.
  • Bézier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves
  • Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping

Source code in albumentations/augmentations/transforms.py Python
class RandomToneCurve(ImageOnlyTransform):\n    \"\"\"Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve.\n\n    This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast\n    in a non-linear manner. It can be applied to the entire image or to each channel separately.\n\n    Args:\n        scale (float): Standard deviation of the normal distribution used to sample random distances\n            to move two control points that modify the image's curve. Values should be in range [0, 1].\n            Higher values will result in more dramatic changes to the image. Default: 0.1\n        per_channel (bool): If True, the tone curve will be applied to each channel of the input image separately,\n            which can lead to color distortion. If False, the same curve is applied to all channels,\n            preserving the original color relationships. Default: False\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.\n        - The S-curve is defined by moving two control points of a quadratic B\u00e9zier curve.\n        - When per_channel is False, the same curve is applied to all channels, maintaining color balance.\n        - When per_channel is True, different curves are applied to each channel, which can create color shifts.\n        - This transform can be used to adjust image contrast and brightness in a more natural way than linear\n            transforms.\n        - The effect can range from subtle contrast adjustments to more dramatic \"vintage\" or \"faded\" looks.\n\n    Mathematical Formulation:\n        1. Two control points are randomly moved from their default positions (0.25, 0.25) and (0.75, 0.75).\n        2. The new positions are sampled from a normal distribution: N(\u03bc, \u03c3\u00b2), where \u03bc is the original position\n        and alpha is the scale parameter.\n        3. These points, along with fixed points at (0, 0) and (1, 1), define a quadratic B\u00e9zier curve.\n        4. The curve is applied as a lookup table to the image intensities:\n           new_intensity = curve(original_intensity)\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply a random tone curve to all channels together\n        >>> transform = A.RandomToneCurve(scale=0.1, per_channel=False, p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n        # Apply random tone curves to each channel separately\n        >>> transform = A.RandomToneCurve(scale=0.2, per_channel=True, p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n    References:\n        - \"What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance\"\n          by Mahmoud Afifi and Michael S. 
Brown, ICCV 2019.\n        - B\u00e9zier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves\n        - Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: float = Field(\n            ge=0,\n            le=1,\n        )\n        per_channel: bool\n\n    def __init__(\n        self,\n        scale: float = 0.1,\n        per_channel: bool = False,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale = scale\n        self.per_channel = per_channel\n\n    def apply(\n        self,\n        img: np.ndarray,\n        low_y: float | np.ndarray,\n        high_y: float | np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.move_tone_curve(img, low_y, high_y)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if self.per_channel and num_channels != 1:\n            return {\n                \"low_y\": np.clip(\n                    self.random_generator.normal(\n                        loc=0.25,\n                        scale=self.scale,\n                        size=(num_channels,),\n                    ),\n                    0,\n                    1,\n                ),\n                \"high_y\": np.clip(\n                    self.random_generator.normal(\n                        loc=0.75,\n                        scale=self.scale,\n                        size=(num_channels,),\n                    ),\n                    0,\n                    1,\n                ),\n            }\n        # Same values for all channels\n        low_y = np.clip(self.random_generator.normal(loc=0.25, scale=self.scale), 0, 1)\n        high_y = np.clip(self.random_generator.normal(loc=0.75, scale=self.scale), 0, 1)\n\n        return {\"low_y\": low_y, \"high_y\": high_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"scale\", \"per_channel\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RingingOvershoot","title":"class RingingOvershoot (blur_limit=(7, 15), cutoff=(0.7853981633974483, 1.5707963267948966), p=0.5, always_apply=None) [view source on GitHub]","text":"

Create ringing or overshoot artifacts by convolving the image with a 2D sinc filter.

This transform simulates the ringing artifacts that can occur in digital image processing, particularly after sharpening or edge enhancement operations. It creates oscillations or overshoots near sharp transitions in the image.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Maximum kernel size for the sinc filter. Must be an odd number in the range [3, inf). If a single int is provided, the kernel size will be randomly chosen from the range (3, blur_limit). If a tuple (min, max) is provided, the kernel size will be randomly chosen from the range (min, max). Default: (7, 15).

cutoff tuple[float, float]

Range to choose the cutoff frequency in radians. Values should be in the range (0, \u03c0). A lower cutoff frequency will result in more pronounced ringing effects. Default: (\u03c0/4, \u03c0/2).

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • Ringing artifacts are oscillations of the image intensity function in the neighborhood of sharp transitions, such as edges or object boundaries.
  • This transform uses a 2D sinc filter (also known as a 2D cardinal sine function) to introduce these artifacts.
  • The severity of the ringing effect is controlled by both the kernel size (blur_limit) and the cutoff frequency.
  • Larger kernel sizes and lower cutoff frequencies will generally produce more noticeable ringing effects.
  • This transform can be useful for:
  • Simulating imperfections in image processing or transmission systems
  • Testing the robustness of computer vision models to ringing artifacts
  • Creating artistic effects that emphasize edges and transitions in images

Mathematical Formulation: The 2D sinc filter kernel is defined as:

K(x, y) = cutoff * J\u2081(cutoff * \u221a(x\u00b2 + y\u00b2)) / (2\u03c0 * \u221a(x\u00b2 + y\u00b2))\n\nwhere:\n- J\u2081 is the Bessel function of the first kind of order 1\n- cutoff is the chosen cutoff frequency\n- x and y are the distances from the kernel center\n\nThe filtered image I' is obtained by convolving the input image I with the kernel K:\n\nI'(x, y) = \u2211\u2211 I(x-u, y-v) * K(u, v)\n\nThe convolution operation introduces the ringing artifacts near sharp transitions.\n
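To make this formulation concrete, here is a minimal sketch of building such a 2D sinc kernel with SciPy's Bessel function and applying it by convolution. The `ksize` and `cutoff` values are illustrative samples, and this is a sketch of the formula above, not a drop-in replacement for the transform.

Python
import numpy as np
import cv2
from scipy import special

def sinc_kernel(ksize: int, cutoff: float) -> np.ndarray:
    # Radial distance of each kernel element from the kernel center.
    center = (ksize - 1) / 2
    with np.errstate(divide="ignore", invalid="ignore"):
        kernel = np.fromfunction(
            lambda x, y: cutoff
            * special.j1(cutoff * np.sqrt((x - center) ** 2 + (y - center) ** 2))
            / (2 * np.pi * np.sqrt((x - center) ** 2 + (y - center) ** 2)),
            (ksize, ksize),
        )
    # Limit of the expression at the center (r -> 0) is cutoff**2 / (4 * pi).
    kernel[int(center), int(center)] = cutoff**2 / (4 * np.pi)
    return (kernel / kernel.sum()).astype(np.float32)

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
kernel = sinc_kernel(ksize=15, cutoff=np.pi / 4)
ringing = cv2.filter2D(image, ddepth=-1, kernel=kernel)  # convolution introduces the ringing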

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-ringing-effect-with-default-parameters","title":"Apply ringing effect with default parameters","text":"Python
>>> transform = A.RingingOvershoot(p=1.0)\n>>> ringing_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-ringing-effect-with-custom-parameters","title":"Apply ringing effect with custom parameters","text":"Python
>>> transform = A.RingingOvershoot(\n...     blur_limit=(9, 17),\n...     cutoff=(np.pi/6, np.pi/3),\n...     p=1.0\n... )\n>>> ringing_image = transform(image=image)['image']\n

References

  • Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts
  • Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter
  • \"The Importance of Ringing Artifacts in Image Processing\" by Jae S. Lim, 1981
  • \"Digital Image Processing\" by Rafael C. Gonzalez and Richard E. Woods, 4th Edition


Source code in albumentations/augmentations/transforms.py Python
class RingingOvershoot(ImageOnlyTransform):\n    \"\"\"Create ringing or overshoot artifacts by convolving the image with a 2D sinc filter.\n\n    This transform simulates the ringing artifacts that can occur in digital image processing,\n    particularly after sharpening or edge enhancement operations. It creates oscillations\n    or overshoots near sharp transitions in the image.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Maximum kernel size for the sinc filter.\n            Must be an odd number in the range [3, inf).\n            If a single int is provided, the kernel size will be randomly chosen\n            from the range (3, blur_limit). If a tuple (min, max) is provided,\n            the kernel size will be randomly chosen from the range (min, max).\n            Default: (7, 15).\n        cutoff (tuple[float, float]): Range to choose the cutoff frequency in radians.\n            Values should be in the range (0, \u03c0). A lower cutoff frequency will\n            result in more pronounced ringing effects.\n            Default: (\u03c0/4, \u03c0/2).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - Ringing artifacts are oscillations of the image intensity function in the neighborhood\n          of sharp transitions, such as edges or object boundaries.\n        - This transform uses a 2D sinc filter (also known as a 2D cardinal sine function)\n          to introduce these artifacts.\n        - The severity of the ringing effect is controlled by both the kernel size (blur_limit)\n          and the cutoff frequency.\n        - Larger kernel sizes and lower cutoff frequencies will generally produce more\n          noticeable ringing effects.\n        - This transform can be useful for:\n          * Simulating imperfections in image processing or transmission systems\n          * Testing the robustness of computer vision models to ringing artifacts\n          * Creating artistic effects that emphasize edges and transitions in images\n\n    Mathematical Formulation:\n        The 2D sinc filter kernel is defined as:\n\n        K(x, y) = cutoff * J\u2081(cutoff * \u221a(x\u00b2 + y\u00b2)) / (2\u03c0 * \u221a(x\u00b2 + y\u00b2))\n\n        where:\n        - J\u2081 is the Bessel function of the first kind of order 1\n        - cutoff is the chosen cutoff frequency\n        - x and y are the distances from the kernel center\n\n        The filtered image I' is obtained by convolving the input image I with the kernel K:\n\n        I'(x, y) = \u2211\u2211 I(x-u, y-v) * K(u, v)\n\n        The convolution operation introduces the ringing artifacts near sharp transitions.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Apply ringing effect with default parameters\n        >>> transform = A.RingingOvershoot(p=1.0)\n        >>> ringing_image = transform(image=image)['image']\n\n        # Apply ringing effect with custom parameters\n        >>> transform = A.RingingOvershoot(\n        ...     blur_limit=(9, 17),\n        ...     cutoff=(np.pi/6, np.pi/3),\n        ...     p=1.0\n        ... 
)\n        >>> ringing_image = transform(image=image)['image']\n\n    References:\n        - Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts\n        - Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter\n        - \"The Importance of Ringing Artifacts in Image Processing\" by Jae S. Lim, 1981\n        - \"Digital Image Processing\" by Rafael C. Gonzalez and Richard E. Woods, 4th Edition\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        blur_limit: ScaleIntType\n        cutoff: Annotated[tuple[float, float], nondecreasing]\n\n        @field_validator(\"cutoff\")\n        @classmethod\n        def check_cutoff(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 0, np.pi\n            check_range(v, *bounds, info.field_name)\n            return v\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (7, 15),\n        cutoff: tuple[float, float] = (np.pi / 4, np.pi / 2),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.cutoff = cutoff\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        ksize = self.py_random.randrange(self.blur_limit[0], self.blur_limit[1] + 1, 2)\n        if ksize % 2 == 0:\n            raise ValueError(f\"Kernel size must be odd. Got: {ksize}\")\n\n        cutoff = self.py_random.uniform(*self.cutoff)\n\n        # From dsp.stackexchange.com/questions/58301/2-d-circularly-symmetric-low-pass-filter\n        with np.errstate(divide=\"ignore\", invalid=\"ignore\"):\n            kernel = np.fromfunction(\n                lambda x, y: cutoff\n                * special.j1(\n                    cutoff * np.sqrt((x - (ksize - 1) / 2) ** 2 + (y - (ksize - 1) / 2) ** 2),\n                )\n                / (2 * np.pi * np.sqrt((x - (ksize - 1) / 2) ** 2 + (y - (ksize - 1) / 2) ** 2)),\n                [ksize, ksize],\n            )\n        kernel[(ksize - 1) // 2, (ksize - 1) // 2] = cutoff**2 / (4 * np.pi)\n\n        # Normalize kernel\n        kernel = kernel.astype(np.float32) / np.sum(kernel)\n\n        return {\"kernel\": kernel}\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fmain.convolve(img, kernel)\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"blur_limit\", \"cutoff\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.SaltAndPepper","title":"class SaltAndPepper (amount=(0.01, 0.06), salt_vs_pepper=(0.4, 0.6), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply salt and pepper noise to the input image.

Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt) or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.

Parameters:

Name Type Description amount float, float

Range for total amount of noise (both salt and pepper). Values between 0 and 1. For example:
  • 0.05 means 5% of all pixels will be replaced with noise
  • (0.01, 0.06) will sample amount uniformly from 1% to 6%
Default: (0.01, 0.06)

salt_vs_pepper float, float

Range for ratio of salt (white) vs pepper (black) noise. Values between 0 and 1. For example:
  • 0.5 means equal amounts of salt and pepper
  • 0.7 means 70% of noisy pixels will be salt, 30% pepper
  • (0.4, 0.6) will sample ratio uniformly from 40% to 60%
Default: (0.4, 0.6)

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • Salt noise sets pixels to maximum value (255 for uint8, 1.0 for float32)
  • Pepper noise sets pixels to 0
  • Salt and pepper masks are generated independently, so a pixel could theoretically be selected for both (in this case, pepper overrides salt)
  • The actual number of affected pixels might slightly differ from the specified amount due to random sampling and potential overlap of salt and pepper masks

Mathematical Formulation: For an input image I, the output O is:
O[x,y] = max_value, if salt_mask[x,y] = True
O[x,y] = 0,         if pepper_mask[x,y] = True
O[x,y] = I[x,y],    otherwise

where:\nP(salt_mask[x,y] = True) = amount * salt_ratio\nP(pepper_mask[x,y] = True) = amount * (1 - salt_ratio)\namount \u2208 [amount_min, amount_max]\nsalt_ratio \u2208 [salt_vs_pepper_min, salt_vs_pepper_max]\n
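As a rough illustration of this sampling scheme (not the library's exact code), the masks can be generated with NumPy as follows; `amount` and `salt_ratio` are assumed to be scalars already sampled from their ranges.

Python
import numpy as np

rng = np.random.default_rng(42)
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

amount, salt_ratio = 0.05, 0.5          # assumed samples from amount / salt_vs_pepper
salt_mask = rng.random(image.shape) < amount * salt_ratio
pepper_mask = rng.random(image.shape) < amount * (1 - salt_ratio)

noisy = image.copy()
noisy[salt_mask] = 255                  # salt: maximum value for uint8
noisy[pepper_mask] = 0                  # pepper applied second, so it overrides salt on overlap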

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-salt-and-pepper-noise-with-default-parameters","title":"Apply salt and pepper noise with default parameters","text":"Python
>>> transform = A.SaltAndPepper(p=1.0)\n>>> noisy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--heavy-noise-with-more-salt-than-pepper","title":"Heavy noise with more salt than pepper","text":"Python
>>> transform = A.SaltAndPepper(\n...     amount=(0.1, 0.2),       # 10-20% of pixels will be noisy\n...     salt_vs_pepper=(0.7, 0.9),  # 70-90% of noise will be salt\n...     p=1.0\n... )\n>>> noisy_image = transform(image=image)[\"image\"]\n

References

.. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\" Chapter 5: Image Restoration and Reconstruction.

.. [2] A. K. Jain, \"Fundamentals of Digital Image Processing,\" Chapter 7: Image Degradation and Restoration.

.. [3] Salt and pepper noise: https://en.wikipedia.org/wiki/Salt-and-pepper_noise

See Also:
  • GaussNoise: For additive Gaussian noise
  • MultiplicativeNoise: For multiplicative noise
  • ISONoise: For camera sensor noise simulation


Source code in albumentations/augmentations/transforms.py Python
class SaltAndPepper(ImageOnlyTransform):\n    \"\"\"Apply salt and pepper noise to the input image.\n\n    Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt)\n    or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.\n\n    Args:\n        amount ((float, float)): Range for total amount of noise (both salt and pepper).\n            Values between 0 and 1. For example:\n            - 0.05 means 5% of all pixels will be replaced with noise\n            - (0.01, 0.06) will sample amount uniformly from 1% to 6%\n            Default: (0.01, 0.06)\n\n        salt_vs_pepper ((float, float)): Range for ratio of salt (white) vs pepper (black) noise.\n            Values between 0 and 1. For example:\n            - 0.5 means equal amounts of salt and pepper\n            - 0.7 means 70% of noisy pixels will be salt, 30% pepper\n            - (0.4, 0.6) will sample ratio uniformly from 40% to 60%\n            Default: (0.4, 0.6)\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Salt noise sets pixels to maximum value (255 for uint8, 1.0 for float32)\n        - Pepper noise sets pixels to 0\n        - Salt and pepper masks are generated independently, so a pixel could theoretically\n          be selected for both (in this case, pepper overrides salt)\n        - The actual number of affected pixels might slightly differ from the specified amount\n          due to random sampling and potential overlap of salt and pepper masks\n\n    Mathematical Formulation:\n        For an input image I, the output O is:\n        O[x,y] = max_value,  if salt_mask[x,y] = True\n        O[x,y] = 0,         if pepper_mask[x,y] = True\n        O[x,y] = I[x,y],    otherwise\n\n        where:\n        P(salt_mask[x,y] = True) = amount * salt_ratio\n        P(pepper_mask[x,y] = True) = amount * (1 - salt_ratio)\n        amount \u2208 [amount_min, amount_max]\n        salt_ratio \u2208 [salt_vs_pepper_min, salt_vs_pepper_max]\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Apply salt and pepper noise with default parameters\n        >>> transform = A.SaltAndPepper(p=1.0)\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n        # Heavy noise with more salt than pepper\n        >>> transform = A.SaltAndPepper(\n        ...     amount=(0.1, 0.2),       # 10-20% of pixels will be noisy\n        ...     salt_vs_pepper=(0.7, 0.9),  # 70-90% of noise will be salt\n        ...     p=1.0\n        ... )\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n    References:\n        .. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\"\n               Chapter 5: Image Restoration and Reconstruction.\n\n        .. [2] A. K. Jain, \"Fundamentals of Digital Image Processing,\"\n               Chapter 7: Image Degradation and Restoration.\n\n        .. 
[3] Salt and pepper noise:\n               https://en.wikipedia.org/wiki/Salt-and-pepper_noise\n\n    See Also:\n        - GaussNoise: For additive Gaussian noise\n        - MultiplicativeNoise: For multiplicative noise\n        - ISONoise: For camera sensor noise simulation\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        amount: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        salt_vs_pepper: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n\n    def __init__(\n        self,\n        amount: tuple[float, float] = (0.01, 0.06),\n        salt_vs_pepper: tuple[float, float] = (0.4, 0.6),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.amount = amount\n        self.salt_vs_pepper = salt_vs_pepper\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample total amount and salt ratio\n        total_amount = self.py_random.uniform(*self.amount)\n        salt_ratio = self.py_random.uniform(*self.salt_vs_pepper)\n\n        # Calculate individual probabilities\n        prob_salt = total_amount * salt_ratio\n        prob_pepper = total_amount * (1 - salt_ratio)\n\n        # Generate masks\n        salt_mask = self.random_generator.random(image.shape) < prob_salt\n        pepper_mask = self.random_generator.random(image.shape) < prob_pepper\n\n        return {\n            \"salt_mask\": salt_mask,\n            \"pepper_mask\": pepper_mask,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        salt_mask: np.ndarray,\n        pepper_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_salt_and_pepper(img, salt_mask, pepper_mask)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"amount\", \"salt_vs_pepper\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Sharpen","title":"class Sharpen (alpha=(0.2, 0.5), lightness=(0.5, 1.0), method='kernel', kernel_size=5, sigma=1.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Sharpen the input image using either kernel-based or Gaussian interpolation method.

Implements two different approaches to image sharpening:
  1. Traditional kernel-based method using Laplacian operator
  2. Gaussian interpolation method (similar to Kornia's approach)

Parameters:

Name Type Description alpha tuple[float, float]

Range for the visibility of sharpening effect. At 0, only the original image is visible, at 1.0 only its processed version is visible. Values should be in the range [0, 1]. Used in both methods. Default: (0.2, 0.5).

lightness tuple[float, float]

Range for the lightness of the sharpened image. Only used in 'kernel' method. Larger values create higher contrast. Values should be greater than 0. Default: (0.5, 1.0).

method Literal['kernel', 'gaussian']

Sharpening algorithm to use:
  • 'kernel': Traditional kernel-based sharpening using Laplacian operator
  • 'gaussian': Interpolation between Gaussian blurred and original image
Default: 'kernel'

kernel_size int

Size of the Gaussian blur kernel for 'gaussian' method. Must be odd. Default: 5

sigma float

Standard deviation for Gaussian kernel in 'gaussian' method. Default: 1.0

p float

Probability of applying the transform. Default: 0.5.

Image types: uint8, float32

Number of channels: Any

Mathematical Formulation:
1. Kernel Method: The sharpening operation is based on the Laplacian operator L:
   L = [[-1, -1, -1],
        [-1,  8, -1],
        [-1, -1, -1]]

   The final kernel K is a weighted sum:\n   K = (1 - a)I + a(L + \u03bbI)\n\n   where:\n   - a is the alpha value\n   - \u03bb is the lightness value\n   - I is the identity kernel\n\n   The output image O is computed as:\n   O = K * I  (convolution)\n\n2. Gaussian Method:\n   Based on the unsharp mask principle:\n   O = aI + (1-a)G\n\n   where:\n   - I is the input image\n   - G is the Gaussian blurred version of I\n   - a is the alpha value (sharpness)\n\n   The Gaussian kernel G(x,y) is defined as:\n   G(x,y) = (1/(2\u03c0s\u00b2))exp(-(x\u00b2+y\u00b2)/(2s\u00b2))\n
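The kernel branch of this formulation can be reproduced in a few lines. The sketch below uses assumed `alpha` and `lightness` values and mirrors the weighted-sum construction; it is an illustration of the formula, not the library's exact implementation.

Python
import numpy as np
import cv2

alpha, lightness = 0.3, 0.7  # assumed samples from the alpha / lightness ranges
identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)
effect = np.array([[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]], dtype=np.float32)

# K = (1 - a) * I + a * (L + lambda * I), with the lightness folded into the center tap
kernel = (1 - alpha) * identity + alpha * effect

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
sharpened = cv2.filter2D(image, ddepth=-1, kernel=kernel)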

Note

  • Kernel sizes must be odd to maintain spatial alignment
  • Methods produce different visual results:
  • Kernel method: More pronounced edges, possible artifacts
  • Gaussian method: More natural look, limited to original sharpness

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--traditional-kernel-sharpening","title":"Traditional kernel sharpening","text":"Python
>>> transform = A.Sharpen(\n...     alpha=(0.2, 0.5),\n...     lightness=(0.5, 1.0),\n...     method='kernel',\n...     p=1.0\n... )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--gaussian-interpolation-sharpening","title":"Gaussian interpolation sharpening","text":"Python
>>> transform = A.Sharpen(\n...     alpha=(0.5, 1.0),\n...     method='gaussian',\n...     kernel_size=5,\n...     sigma=1.0,\n...     p=1.0\n... )\n

References

.. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\" Chapter 3: Intensity Transformations and Spatial Filtering.

.. [2] J. C. Russ, \"The Image Processing Handbook (7th Edition),\" Chapter 4: Image Enhancement.

.. [3] T. Acharya and A. K. Ray, \"Image Processing: Principles and Applications,\" Chapter 5: Image Enhancement.

.. [4] Unsharp masking: https://en.wikipedia.org/wiki/Unsharp_masking

.. [5] Laplacian operator: https://en.wikipedia.org/wiki/Laplace_operator

.. [6] Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur

See Also:
  • Blur: For Gaussian blurring
  • UnsharpMask: Alternative sharpening method
  • RandomBrightnessContrast: For adjusting image contrast


Source code in albumentations/augmentations/transforms.py Python
class Sharpen(ImageOnlyTransform):\n    \"\"\"Sharpen the input image using either kernel-based or Gaussian interpolation method.\n\n    Implements two different approaches to image sharpening:\n    1. Traditional kernel-based method using Laplacian operator\n    2. Gaussian interpolation method (similar to Kornia's approach)\n\n    Args:\n        alpha (tuple[float, float]): Range for the visibility of sharpening effect.\n            At 0, only the original image is visible, at 1.0 only its processed version is visible.\n            Values should be in the range [0, 1].\n            Used in both methods. Default: (0.2, 0.5).\n\n        lightness (tuple[float, float]): Range for the lightness of the sharpened image.\n            Only used in 'kernel' method. Larger values create higher contrast.\n            Values should be greater than 0. Default: (0.5, 1.0).\n\n        method (Literal['kernel', 'gaussian']): Sharpening algorithm to use:\n            - 'kernel': Traditional kernel-based sharpening using Laplacian operator\n            - 'gaussian': Interpolation between Gaussian blurred and original image\n            Default: 'kernel'\n\n        kernel_size (int): Size of the Gaussian blur kernel for 'gaussian' method.\n            Must be odd. Default: 5\n\n        sigma (float): Standard deviation for Gaussian kernel in 'gaussian' method.\n            Default: 1.0\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Mathematical Formulation:\n        1. Kernel Method:\n           The sharpening operation is based on the Laplacian operator L:\n           L = [[-1, -1, -1],\n                [-1,  8, -1],\n                [-1, -1, -1]]\n\n           The final kernel K is a weighted sum:\n           K = (1 - a)I + a(L + \u03bbI)\n\n           where:\n           - a is the alpha value\n           - \u03bb is the lightness value\n           - I is the identity kernel\n\n           The output image O is computed as:\n           O = K * I  (convolution)\n\n        2. Gaussian Method:\n           Based on the unsharp mask principle:\n           O = aI + (1-a)G\n\n           where:\n           - I is the input image\n           - G is the Gaussian blurred version of I\n           - a is the alpha value (sharpness)\n\n           The Gaussian kernel G(x,y) is defined as:\n           G(x,y) = (1/(2\u03c0s\u00b2))exp(-(x\u00b2+y\u00b2)/(2s\u00b2))\n\n    Note:\n        - Kernel sizes must be odd to maintain spatial alignment\n        - Methods produce different visual results:\n          * Kernel method: More pronounced edges, possible artifacts\n          * Gaussian method: More natural look, limited to original sharpness\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Traditional kernel sharpening\n        >>> transform = A.Sharpen(\n        ...     alpha=(0.2, 0.5),\n        ...     lightness=(0.5, 1.0),\n        ...     method='kernel',\n        ...     p=1.0\n        ... )\n\n        # Gaussian interpolation sharpening\n        >>> transform = A.Sharpen(\n        ...     alpha=(0.5, 1.0),\n        ...     method='gaussian',\n        ...     kernel_size=5,\n        ...     sigma=1.0,\n        ...     p=1.0\n        ... )\n\n    References:\n        .. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\"\n               Chapter 3: Intensity Transformations and Spatial Filtering.\n\n        .. [2] J. C. 
Russ, \"The Image Processing Handbook (7th Edition),\"\n               Chapter 4: Image Enhancement.\n\n        .. [3] T. Acharya and A. K. Ray, \"Image Processing: Principles and Applications,\"\n               Chapter 5: Image Enhancement.\n\n        .. [4] Unsharp masking:\n               https://en.wikipedia.org/wiki/Unsharp_masking\n\n        .. [5] Laplacian operator:\n               https://en.wikipedia.org/wiki/Laplace_operator\n\n        .. [6] Gaussian blur:\n               https://en.wikipedia.org/wiki/Gaussian_blur\n\n    See Also:\n        - Blur: For Gaussian blurring\n        - UnsharpMask: Alternative sharpening method\n        - RandomBrightnessContrast: For adjusting image contrast\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        lightness: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, None))]\n        method: Literal[\"kernel\", \"gaussian\"]\n        kernel_size: int = Field(ge=3)\n        sigma: float = Field(gt=0)\n\n    @field_validator(\"kernel_size\")\n    @classmethod\n    def check_kernel_size(cls, value: int) -> int:\n        return value + 1 if value % 2 == 0 else value\n\n    def __init__(\n        self,\n        alpha: tuple[float, float] = (0.2, 0.5),\n        lightness: tuple[float, float] = (0.5, 1.0),\n        method: Literal[\"kernel\", \"gaussian\"] = \"kernel\",\n        kernel_size: int = 5,\n        sigma: float = 1.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n        self.lightness = lightness\n        self.method = method\n        self.kernel_size = kernel_size\n        self.sigma = sigma\n\n    @staticmethod\n    def __generate_sharpening_matrix(\n        alpha: np.ndarray,\n        lightness: np.ndarray,\n    ) -> np.ndarray:\n        matrix_nochange = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)\n        matrix_effect = np.array(\n            [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]],\n            dtype=np.float32,\n        )\n\n        return (1 - alpha) * matrix_nochange + alpha * matrix_effect\n\n    def get_params(self) -> dict[str, Any]:\n        alpha = self.py_random.uniform(*self.alpha)\n\n        if self.method == \"kernel\":\n            lightness = self.py_random.uniform(*self.lightness)\n            return {\n                \"alpha\": alpha,\n                \"sharpening_matrix\": self.__generate_sharpening_matrix(\n                    alpha,\n                    lightness,\n                ),\n            }\n\n        return {\"alpha\": alpha, \"sharpening_matrix\": None}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha: float,\n        sharpening_matrix: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if self.method == \"kernel\":\n            return fmain.convolve(img, sharpening_matrix)\n        return fmain.sharpen_gaussian(img, alpha, self.kernel_size, self.sigma)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"alpha\", \"lightness\", \"method\", \"kernel_size\", \"sigma\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ShotNoise","title":"class ShotNoise (scale_range=(0.1, 0.3), p=0.5, always_apply=False) [view source on GitHub]","text":"

Apply shot noise to the image by modeling photon counting as a Poisson process.

Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light. When photons hit an imaging sensor, they arrive at random times following Poisson statistics. This transform simulates this physical process in linear light space by:
  1. Converting to linear space (removing gamma)
  2. Treating each pixel value as an expected photon count
  3. Sampling actual photon counts from a Poisson distribution
  4. Converting back to display space (reapplying gamma)

The noise characteristics follow real camera behavior:
  • Noise variance equals signal mean in linear space (Poisson statistics)
  • Brighter regions have more absolute noise but less relative noise
  • Darker regions have less absolute noise but more relative noise
  • Noise is generated independently for each pixel and color channel
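A minimal NumPy sketch of the four-step pipeline described above (gamma removal, Poisson sampling, gamma reapplication) might look like this. The gamma value of 2.2 comes from the note below, the `scale` value is an assumed sample from `scale_range`, and the mapping of `scale` to expected photon count follows the "reciprocal" description rather than the library's exact `shot_noise` code.

Python
import numpy as np

rng = np.random.default_rng(0)
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

scale = 0.3                                            # assumed sample from scale_range
linear = (image.astype(np.float32) / 255.0) ** 2.2     # remove gamma -> linear light

# Expected photon count per pixel taken as intensity / scale; sample actual counts.
photons = rng.poisson(linear / scale)
noisy_linear = np.clip(photons * scale, 0.0, 1.0)      # rescale back to [0, 1] intensity

noisy = (noisy_linear ** (1 / 2.2) * 255.0).astype(np.uint8)  # reapply gamma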

Parameters:

Name Type Description scale_range tuple[float, float]

Range for sampling the noise scale factor. Represents the reciprocal of the expected photon count per unit intensity. Higher values mean more noise:
  • scale = 0.1: ~100 photons per unit intensity (low noise)
  • scale = 1.0: ~1 photon per unit intensity (moderate noise)
  • scale = 10.0: ~0.1 photons per unit intensity (high noise)
Default: (0.1, 0.3)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • Performs calculations in linear light space (gamma = 2.2)
  • Preserves the image's mean intensity
  • Memory efficient with in-place operations
  • Thread-safe with independent random seeds

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> # Generate synthetic image\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> # Apply moderate shot noise\n>>> transform = A.ShotNoise(scale_range=(0.1, 1.0), p=1.0)\n>>> noisy_image = transform(image=image)[\"image\"]\n

References

  • Shot noise: https://en.wikipedia.org/wiki/Shot_noise
  • Original paper: https://doi.org/10.1002/andp.19183622304 (Schottky, 1918)
  • Poisson process: https://en.wikipedia.org/wiki/Poisson_point_process
  • Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction


Source code in albumentations/augmentations/transforms.py Python
class ShotNoise(ImageOnlyTransform):\n    \"\"\"Apply shot noise to the image by modeling photon counting as a Poisson process.\n\n    Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light.\n    When photons hit an imaging sensor, they arrive at random times following Poisson statistics.\n    This transform simulates this physical process in linear light space by:\n    1. Converting to linear space (removing gamma)\n    2. Treating each pixel value as an expected photon count\n    3. Sampling actual photon counts from a Poisson distribution\n    4. Converting back to display space (reapplying gamma)\n\n    The noise characteristics follow real camera behavior:\n    - Noise variance equals signal mean in linear space (Poisson statistics)\n    - Brighter regions have more absolute noise but less relative noise\n    - Darker regions have less absolute noise but more relative noise\n    - Noise is generated independently for each pixel and color channel\n\n    Args:\n        scale_range (tuple[float, float]): Range for sampling the noise scale factor.\n            Represents the reciprocal of the expected photon count per unit intensity.\n            Higher values mean more noise:\n            - scale = 0.1: ~100 photons per unit intensity (low noise)\n            - scale = 1.0: ~1 photon per unit intensity (moderate noise)\n            - scale = 10.0: ~0.1 photons per unit intensity (high noise)\n            Default: (0.1, 0.3)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Performs calculations in linear light space (gamma = 2.2)\n        - Preserves the image's mean intensity\n        - Memory efficient with in-place operations\n        - Thread-safe with independent random seeds\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Generate synthetic image\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> # Apply moderate shot noise\n        >>> transform = A.ShotNoise(scale_range=(0.1, 1.0), p=1.0)\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Shot noise: https://en.wikipedia.org/wiki/Shot_noise\n        - Original paper: https://doi.org/10.1002/andp.19183622304 (Schottky, 1918)\n        - Poisson process: https://en.wikipedia.org/wiki/Poisson_point_process\n        - Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.1, 0.3),\n        p: float = 0.5,\n        always_apply: bool = False,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_range = scale_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        random_seed: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.shot_noise(img, scale, np.random.default_rng(random_seed))\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"scale\": self.py_random.uniform(*self.scale_range),\n            \"random_seed\": self.random_generator.integers(0, 2**32 - 1),\n        }\n\n    def 
get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"scale_range\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Solarize","title":"class Solarize (threshold=None, threshold_range=(0.5, 0.5), p=0.5, always_apply=None) [view source on GitHub]","text":"

Invert all pixel values above a threshold.

This transform applies a solarization effect to the input image. Solarization is a phenomenon in photography in which the image recorded on a negative or on a photographic print is wholly or partially reversed in tone. Dark areas appear light or light areas appear dark.

In this implementation, all pixel values above a threshold are inverted.

Parameters:

Name Type Description threshold_range tuple[float, float]

Range for solarizing threshold as a fraction of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the maximum value of the image type (255 for uint8 images or 1.0 for float images). Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value
  • For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value
  • The threshold is applied to each channel independently
  • The threshold is calculated in two steps:
  • Sample a value from threshold_range
  • Multiply by the image's maximum value:
    • For uint8: threshold = sampled_value * 255
    • For float32: threshold = sampled_value * 1.0
  • This transform can create interesting artistic effects or be used for data augmentation

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Solarize uint8 image with fixed threshold at 50% of max value (127.5)\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n>>>\n# Solarize uint8 image with random threshold between 40-60% of max value (102-153)\n>>> transform = A.Solarize(threshold_range=(0.4, 0.6), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n>>>\n# Solarize float32 image at 50% of max value (0.5)\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)\n>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n

Mathematical Formulation: Let f be a value sampled from threshold_range (min, max). For each pixel value p:
threshold = f * max_value
if p > threshold: p_new = max_value - p
else: p_new = p

Where max_value is 255 for uint8 images and 1.0 for float32 images.\n
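In NumPy, the thresholding rule above reduces to a single `np.where`; this is a sketch of the formulation with an assumed threshold sample, not the library's `solarize` function.

Python
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

f = 0.5                      # assumed sample from threshold_range
max_value = 255              # 1.0 for float32 images
threshold = f * max_value

# Invert only the pixels above the threshold, keep the rest unchanged.
solarized = np.where(image > threshold, max_value - image, image).astype(image.dtype)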

See Also: Invert: For inverting all pixel values regardless of a threshold.


Source code in albumentations/augmentations/transforms.py Python
class Solarize(ImageOnlyTransform):\n    \"\"\"Invert all pixel values above a threshold.\n\n    This transform applies a solarization effect to the input image. Solarization is a phenomenon in\n    photography in which the image recorded on a negative or on a photographic print is wholly or\n    partially reversed in tone. Dark areas appear light or light areas appear dark.\n\n    In this implementation, all pixel values above a threshold are inverted.\n\n    Args:\n        threshold_range (tuple[float, float]): Range for solarizing threshold as a fraction\n            of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the\n            maximum value of the image type (255 for uint8 images or 1.0 for float images).\n            Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value\n        - For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value\n        - The threshold is applied to each channel independently\n        - The threshold is calculated in two steps:\n          1. Sample a value from threshold_range\n          2. Multiply by the image's maximum value:\n             * For uint8: threshold = sampled_value * 255\n             * For float32: threshold = sampled_value * 1.0\n        - This transform can create interesting artistic effects or be used for data augmentation\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Solarize uint8 image with fixed threshold at 50% of max value (127.5)\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n        >>>\n        # Solarize uint8 image with random threshold between 40-60% of max value (102-153)\n        >>> transform = A.Solarize(threshold_range=(0.4, 0.6), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n        >>>\n        # Solarize float32 image at 50% of max value (0.5)\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n\n    Mathematical Formulation:\n        Let f be a value sampled from threshold_range (min, max).\n        For each pixel value p:\n        threshold = f * max_value\n        if p > threshold:\n            p_new = max_value - p\n        else:\n            p_new = p\n\n        Where max_value is 255 for uint8 images and 1.0 for float32 images.\n\n    See Also:\n        Invert: For inverting all pixel values regardless of a threshold.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        threshold: ScaleFloatType | None\n        threshold_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @staticmethod\n        def normalize_threshold(\n            threshold: ScaleFloatType | None,\n            threshold_range: tuple[float, float],\n        ) -> tuple[float, float]:\n            \"\"\"Convert legacy 
threshold or use threshold_range, normalizing to [0,1] range.\"\"\"\n            if threshold is not None:\n                warn(\"`threshold` deprecated. Use `threshold_range` instead.\", DeprecationWarning, stacklevel=2)\n                value = to_tuple(threshold, threshold)\n                return (value[0] / 255, value[1] / 255) if value[1] > 1 else value\n            return threshold_range\n\n        @model_validator(mode=\"after\")\n        def process_threshold(self) -> Self:\n            self.threshold_range = self.normalize_threshold(\n                self.threshold,\n                self.threshold_range,\n            )\n            return self\n\n    def __init__(\n        self,\n        threshold: ScaleFloatType | None = None,\n        threshold_range: tuple[float, float] = (0.5, 0.5),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.threshold_range = threshold_range\n\n    def apply(self, img: np.ndarray, threshold: float, **params: Any) -> np.ndarray:\n        return fmain.solarize(img, threshold)\n\n    def get_params(self) -> dict[str, float]:\n        return {\"threshold\": self.py_random.uniform(*self.threshold_range)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"threshold_range\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Spatter","title":"class Spatter (mean=(0.65, 0.65), std=(0.3, 0.3), gauss_sigma=(2, 2), cutout_threshold=(0.68, 0.68), intensity=(0.6, 0.6), mode='rain', color=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.

Parameters:

Name Type Description mean tuple[float, float] | float

Mean value of the normal distribution for generating the liquid layer. If a single float, mean will be sampled from (0, mean). If a tuple of floats, mean will be sampled from the range (mean[0], mean[1]). If you want a constant value, use (mean, mean). Default: (0.65, 0.65)

std tuple[float, float] | float

Standard deviation of the normal distribution for generating the liquid layer. If a single float, the number will be sampled from (0, std). If a tuple of floats, std will be sampled from the range (std[0], std[1]). If you want a constant value, use (std, std). Default: (0.3, 0.3).

gauss_sigma tuple[float, float] | float

Sigma value for Gaussian filtering of the liquid layer. If a single float, the number will be sampled from (0, gauss_sigma). If a tuple of floats, gauss_sigma will be sampled from the range (gauss_sigma[0], gauss_sigma[1]). If you want a constant value, use (gauss_sigma, gauss_sigma). Default: (2, 2).

cutout_threshold tuple[float, float] | float

Threshold for filtering the liquid layer (determines the number of drops). If a single float, the threshold will be sampled from (0, cutout_threshold). If a tuple of floats, cutout_threshold will be sampled from the range (cutout_threshold[0], cutout_threshold[1]). If you want a constant value, use (cutout_threshold, cutout_threshold). Default: (0.68, 0.68).

intensity tuple[float, float] | float

Intensity of corruption. If a single float, the number will be sampled from (0, intensity). If a tuple of floats, intensity will be sampled from the range (intensity[0], intensity[1]). If you want a constant value, use (intensity, intensity). Default: (0.6, 0.6).

mode str, or list[str]

Type of corruption. Currently, supported options are 'rain' and 'mud'. If a list is provided, the type of corruption will be sampled from the list. Default: "rain".

color list of (r, g, b) or dict or None

Color of the corruption elements. If a list is provided, it is used as the color for the specified mode. If a dict is provided, it must contain a color for each specified mode. If None, default colors are used (rain: (238, 238, 175), mud: (20, 42, 63)).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Reference

https://arxiv.org/abs/1903.12261 https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py
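Since this section has no usage example, here is a short one based only on the parameters documented above; the argument values are illustrative and p=1.0 is used only so the effect is always applied.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Rain-like spatter with the documented defaults.
rainy = A.Spatter(mode="rain", p=1.0)(image=image)["image"]

# Mud-like spatter with an explicit color for the "mud" mode.
muddy = A.Spatter(mode="mud", color={"mud": [20, 42, 63]}, p=1.0)(image=image)["image"]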


Source code in albumentations/augmentations/transforms.py Python
class Spatter(ImageOnlyTransform):\n    \"\"\"Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.\n\n    Args:\n        mean (tuple[float, float] | float): Mean value of normal distribution for generating liquid layer.\n            If single float mean will be sampled from `(0, mean)`\n            If tuple of float mean will be sampled from range `(mean[0], mean[1])`.\n            If you want constant value use (mean, mean).\n            Default (0.65, 0.65)\n        std (tuple[float, float] | float): Standard deviation value of normal distribution for generating liquid layer.\n            If single float the number will be sampled from `(0, std)`.\n            If tuple of float std will be sampled from range `(std[0], std[1])`.\n            If you want constant value use (std, std).\n            Default: (0.3, 0.3).\n        gauss_sigma (tuple[float, float] | floats): Sigma value for gaussian filtering of liquid layer.\n            If single float the number will be sampled from `(0, gauss_sigma)`.\n            If tuple of float gauss_sigma will be sampled from range `(gauss_sigma[0], gauss_sigma[1])`.\n            If you want constant value use (gauss_sigma, gauss_sigma).\n            Default: (2, 3).\n        cutout_threshold (tuple[float, float] | floats): Threshold for filtering liqued layer\n            (determines number of drops). If single float it will used as cutout_threshold.\n            If single float the number will be sampled from `(0, cutout_threshold)`.\n            If tuple of float cutout_threshold will be sampled from range `(cutout_threshold[0], cutout_threshold[1])`.\n            If you want constant value use `(cutout_threshold, cutout_threshold)`.\n            Default: (0.68, 0.68).\n        intensity (tuple[float, float] | floats): Intensity of corruption.\n            If single float the number will be sampled from `(0, intensity)`.\n            If tuple of float intensity will be sampled from range `(intensity[0], intensity[1])`.\n            If you want constant value use `(intensity, intensity)`.\n            Default: (0.6, 0.6).\n        mode (str, or list[str]): Type of corruption. Currently, supported options are 'rain' and 'mud'.\n             If list is provided type of corruption will be sampled list. Default: (\"rain\").\n        color (list of (r, g, b) or dict or None): Corruption elements color.\n            If list uses provided list as color for specified mode.\n            If dict uses provided color for specified mode. Color for each specified mode should be provided in dict.\n            If None uses default colors (rain: (238, 238, 175), mud: (20, 42, 63)).\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://arxiv.org/abs/1903.12261\n        https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mean: ZeroOneRangeType = (0.65, 0.65)\n        std: ZeroOneRangeType = (0.3, 0.3)\n        gauss_sigma: NonNegativeFloatRangeType = (2, 2)\n        cutout_threshold: ZeroOneRangeType = (0.68, 0.68)\n        intensity: ZeroOneRangeType = (0.6, 0.6)\n        mode: SpatterMode | Sequence[SpatterMode]\n        color: Sequence[int] | dict[str, Sequence[int]] | None = None\n\n        @field_validator(\"mode\")\n        @classmethod\n        def check_mode(\n            cls,\n            mode: SpatterMode | Sequence[SpatterMode],\n        ) -> Sequence[SpatterMode]:\n            if isinstance(mode, str):\n                return [mode]\n            return mode\n\n        @model_validator(mode=\"after\")\n        def check_color(self) -> Self:\n            if self.color is None:\n                self.color = {\"rain\": [238, 238, 175], \"mud\": [20, 42, 63]}\n\n            elif isinstance(self.color, (list, tuple)) and len(self.mode) == 1:\n                if len(self.color) != NUM_RGB_CHANNELS:\n                    msg = \"Color must be a list of three integers for RGB format.\"\n                    raise ValueError(msg)\n                self.color = {self.mode[0]: self.color}\n            elif isinstance(self.color, dict):\n                result = {}\n                for mode in self.mode:\n                    if mode not in self.color:\n                        raise ValueError(f\"Color for mode {mode} is not specified.\")\n                    if len(self.color[mode]) != NUM_RGB_CHANNELS:\n                        raise ValueError(\n                            f\"Color for mode {mode} must be in RGB format.\",\n                        )\n                    result[mode] = self.color[mode]\n            else:\n                msg = \"Color must be a list of RGB values or a dict mapping mode to RGB values.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        mean: ScaleFloatType = (0.65, 0.65),\n        std: ScaleFloatType = (0.3, 0.3),\n        gauss_sigma: ScaleFloatType = (2, 2),\n        cutout_threshold: ScaleFloatType = (0.68, 0.68),\n        intensity: ScaleFloatType = (0.6, 0.6),\n        mode: SpatterMode | Sequence[SpatterMode] = \"rain\",\n        color: Sequence[int] | dict[str, Sequence[int]] | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.mean = cast(tuple[float, float], mean)\n        self.std = cast(tuple[float, float], std)\n        self.gauss_sigma = cast(tuple[float, float], gauss_sigma)\n        self.cutout_threshold = cast(tuple[float, float], cutout_threshold)\n        self.intensity = cast(tuple[float, float], intensity)\n        self.mode = mode\n        self.color = cast(dict[str, Sequence[int]], color)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        non_mud: np.ndarray,\n        mud: np.ndarray,\n        drops: np.ndarray,\n        mode: SpatterMode,\n        **params: dict[str, Any],\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.spatter(img, non_mud, mud, drops, mode)\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        mean = self.py_random.uniform(*self.mean)\n        std = self.py_random.uniform(*self.std)\n        cutout_threshold = self.py_random.uniform(*self.cutout_threshold)\n        sigma = self.py_random.uniform(*self.gauss_sigma)\n        mode = self.py_random.choice(self.mode)\n        intensity = self.py_random.uniform(*self.intensity)\n        color = np.array(self.color[mode]) / 255.0\n\n        liquid_layer = self.random_generator.normal(\n            size=(height, width),\n            loc=mean,\n            scale=std,\n        )\n        liquid_layer = gaussian_filter(liquid_layer, sigma=sigma, mode=\"nearest\")\n        liquid_layer[liquid_layer < cutout_threshold] = 0\n\n        if mode == \"rain\":\n            liquid_layer = clip(liquid_layer * 255, np.uint8, inplace=False)\n            dist = 255 - cv2.Canny(liquid_layer, 50, 150)\n            dist = cv2.distanceTransform(dist, cv2.DIST_L2, 5)\n            _, dist = cv2.threshold(dist, 20, 20, cv2.THRESH_TRUNC)\n            dist = clip(fblur.blur(dist, 3), np.uint8, inplace=True)\n            dist = fmain.equalize(dist)\n\n            ker = np.array([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]])\n            dist = fmain.convolve(dist, ker)\n            dist = fblur.blur(dist, 3).astype(np.float32)\n\n            m = liquid_layer * dist\n            m *= 1 / np.max(m, axis=(0, 1))\n\n            drops = m[:, :, None] * color * intensity\n            mud = None\n            non_mud = None\n        else:\n            m = np.where(liquid_layer > cutout_threshold, 1, 0)\n            m = gaussian_filter(m.astype(np.float32), sigma=sigma, mode=\"nearest\")\n            m[m < 1.2 * cutout_threshold] = 0\n            m = m[..., np.newaxis]\n\n            mud = m * color\n            non_mud = 1 - m\n            drops = None\n\n        return {\n            \"non_mud\": non_mud,\n            \"mud\": mud,\n            \"drops\": drops,\n            \"mode\": mode,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str, str, str, str]:\n        return (\n            \"mean\",\n            \"std\",\n            \"gauss_sigma\",\n            \"intensity\",\n            \"cutout_threshold\",\n            \"mode\",\n            \"color\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Superpixels","title":"class Superpixels (p_replace=(0, 0.1), n_segments=(100, 100), max_size=128, interpolation=1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Transform images partially/completely to their superpixel representation.

Parameters:

Name Type Description p_replace tuple[float, float] | float

Defines for any segment the probability that the pixels within that segment are replaced by their average color (otherwise, the pixels are not changed).

  • A probability of 0.0 means that the pixels in no segment are replaced by their average color (the image is not changed at all).
  • A probability of 0.5 means that around half of all segments are replaced by their average color.
  • A probability of 1.0 means that all segments are replaced by their average color (resulting in a Voronoi image).

Behavior based on the chosen data type for this parameter:
  • If a float, then that float will always be used.
  • If a tuple (a, b), then a random probability will be sampled from the interval [a, b] per image.
Default: (0, 0.1)

n_segments tuple[int, int] | int

Rough target number of superpixels to generate. The algorithm may deviate from this number. Lower values lead to coarser superpixels; higher values are computationally more intensive and will hence lead to a slowdown. If a tuple (a, b), then a value from the discrete interval [a..b] will be sampled per image. Default: (100, 100)

max_size int | None

Maximum image size at which the augmentation is performed. If the width or height of an image exceeds this value, the image is downscaled before the augmentation so that the longest side matches max_size. This is done to speed up the process. The final output image has the same size as the input image. Note that if p_replace is below 1.0, the down-/upscaling also affects the pixels that are not replaced. Use None to apply no down-/upscaling. Default: 128

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • This transform can significantly change the visual appearance of the image.
  • The transform makes use of a superpixel algorithm, which tends to be slow. If performance is a concern, consider using max_size to limit the image size.
  • The effect of this transform can vary greatly depending on the p_replace and n_segments parameters.
  • When p_replace is high, the image can become highly abstracted, resembling a Voronoi diagram.
  • The transform preserves the original image type (uint8 or float32).

Mathematical Formulation:
  1. The image is segmented into approximately n_segments superpixels using the SLIC algorithm.
  2. For each superpixel:
     - With probability p_replace, all pixels in the superpixel are replaced with their mean color.
     - With probability 1 - p_replace, the superpixel is left unchanged.
  3. If the image was resized due to max_size, it is resized back to its original dimensions.
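The mean-color replacement step can be illustrated with a minimal NumPy sketch. This is not the library's implementation; the segments array below is a hypothetical stand-in for the SLIC label map.

Python
import numpy as np

# Hypothetical inputs: an RGB image and a per-pixel superpixel label map
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
segments = np.random.randint(0, 50, (100, 100))  # stand-in for SLIC output
p_replace = 0.5
rng = np.random.default_rng(0)

result = image.copy()
for label in np.unique(segments):
    if rng.random() < p_replace:
        mask = segments == label
        # Replace every pixel in the segment with the segment's mean color
        result[mask] = image[mask].mean(axis=0).astype(image.dtype)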

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-superpixels-with-default-parameters","title":"Apply superpixels with default parameters","text":"Python
>>> transform = A.Superpixels(p=1.0)\n>>> augmented_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-superpixels-with-custom-parameters","title":"Apply superpixels with custom parameters","text":"Python
>>> transform = A.Superpixels(\n...     p_replace=(0.5, 0.7),\n...     n_segments=(50, 100),\n...     max_size=None,\n...     interpolation=cv2.INTER_NEAREST,\n...     p=1.0\n... )\n>>> augmented_image = transform(image=image)['image']\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class Superpixels(ImageOnlyTransform):\n    \"\"\"Transform images partially/completely to their superpixel representation.\n\n    Args:\n        p_replace (tuple[float, float] | float): Defines for any segment the probability that the pixels within that\n            segment are replaced by their average color (otherwise, the pixels are not changed).\n\n\n            * A probability of ``0.0`` would mean, that the pixels in no\n                segment are replaced by their average color (image is not\n                changed at all).\n            * A probability of ``0.5`` would mean, that around half of all\n                segments are replaced by their average color.\n            * A probability of ``1.0`` would mean, that all segments are\n                replaced by their average color (resulting in a voronoi\n                image).\n\n            Behavior based on chosen data types for this parameter:\n            * If a ``float``, then that ``float`` will always be used.\n            * If ``tuple`` ``(a, b)``, then a random probability will be\n            sampled from the interval ``[a, b]`` per image.\n            Default: (0.1, 0.3)\n\n        n_segments (tuple[int, int] | int): Rough target number of how many superpixels to generate.\n            The algorithm may deviate from this number.\n            Lower value will lead to coarser superpixels.\n            Higher values are computationally more intensive and will hence lead to a slowdown.\n            If tuple ``(a, b)``, then a value from the discrete interval ``[a..b]`` will be sampled per image.\n            Default: (15, 120)\n\n        max_size (int | None): Maximum image size at which the augmentation is performed.\n            If the width or height of an image exceeds this value, it will be\n            downscaled before the augmentation so that the longest side matches `max_size`.\n            This is done to speed up the process. The final output image has the same size as the input image.\n            Note that in case `p_replace` is below ``1.0``,\n            the down-/upscaling will affect the not-replaced pixels too.\n            Use ``None`` to apply no down-/upscaling.\n            Default: 128\n\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform can significantly change the visual appearance of the image.\n        - The transform makes use of a superpixel algorithm, which tends to be slow.\n        If performance is a concern, consider using `max_size` to limit the image size.\n        - The effect of this transform can vary greatly depending on the `p_replace` and `n_segments` parameters.\n        - When `p_replace` is high, the image can become highly abstracted, resembling a voronoi diagram.\n        - The transform preserves the original image type (uint8 or float32).\n\n    Mathematical Formulation:\n        1. The image is segmented into approximately `n_segments` superpixels using the SLIC algorithm.\n        2. 
For each superpixel:\n        - With probability `p_replace`, all pixels in the superpixel are replaced with their mean color.\n        - With probability `1 - p_replace`, the superpixel is left unchanged.\n        3. If the image was resized due to `max_size`, it is resized back to its original dimensions.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply superpixels with default parameters\n        >>> transform = A.Superpixels(p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n        # Apply superpixels with custom parameters\n        >>> transform = A.Superpixels(\n        ...     p_replace=(0.5, 0.7),\n        ...     n_segments=(50, 100),\n        ...     max_size=None,\n        ...     interpolation=cv2.INTER_NEAREST,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        p_replace: ZeroOneRangeType\n        n_segments: OnePlusIntRangeType\n        max_size: int | None = Field(ge=1)\n        interpolation: InterpolationType\n\n    def __init__(\n        self,\n        p_replace: ScaleFloatType = (0, 0.1),\n        n_segments: ScaleIntType = (100, 100),\n        max_size: int | None = 128,\n        interpolation: int = cv2.INTER_LINEAR,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.p_replace = cast(tuple[float, float], p_replace)\n        self.n_segments = cast(tuple[int, int], n_segments)\n        self.max_size = max_size\n        self.interpolation = interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"p_replace\", \"n_segments\", \"max_size\", \"interpolation\"\n\n    def get_params(self) -> dict[str, Any]:\n        n_segments = self.py_random.randint(*self.n_segments)\n        p = self.py_random.uniform(*self.p_replace)\n        return {\n            \"replace_samples\": self.random_generator.random(n_segments) < p,\n            \"n_segments\": n_segments,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        replace_samples: Sequence[bool],\n        n_segments: int,\n        **kwargs: Any,\n    ) -> np.ndarray:\n        return fmain.superpixels(\n            img,\n            n_segments,\n            replace_samples,\n            self.max_size,\n            self.interpolation,\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToFloat","title":"class ToFloat (max_value=None, p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert the input image to a floating-point representation.

This transform divides pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. It's useful for normalizing image data before feeding it into neural networks or other algorithms that expect float input.

Parameters:

Name Type Description max_value float | None

The maximum possible input value. If None, the transform will infer the maximum value from the data type of the input image:
  • uint8: 255
  • uint16: 65535
  • uint32: 4294967295
  • float32: 1.0
Default: None.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: uint8, uint16, uint32, float32

Returns:

Type Description np.ndarray

Image in floating point representation, with values in range [0, 1.0].

Note

  • If the input image is already float32 with values in [0, 1], it will be returned unchanged.
  • For integer types (uint8, uint16, uint32), the function will scale the values to [0, 1] range.
  • The output will always be float32, regardless of the input type.
  • This transform is often used as a preprocessing step before applying other transformations or feeding the image into a neural network.
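For a uint8 input with max_value=None, the scaling described above reduces to a simple division (a minimal NumPy sketch, not the library's exact code path):

Python
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
max_value = 255.0  # inferred from the uint8 dtype when max_value is None
float_image = image.astype(np.float32) / max_value
assert float_image.dtype == np.float32
assert 0.0 <= float_image.min() <= float_image.max() <= 1.0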

Exceptions:

Type Description TypeError

If the input image data type is not supported.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Convert uint8 image to float\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ToFloat(max_value=None)\n>>> float_image = transform(image=image)['image']\n>>> assert float_image.dtype == np.float32\n>>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n>>>\n# Convert uint16 image to float with custom max_value\n>>> image = np.random.randint(0, 4096, (100, 100, 3), dtype=np.uint16)\n>>> transform = A.ToFloat(max_value=4095)\n>>> float_image = transform(image=image)['image']\n>>> assert float_image.dtype == np.float32\n>>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n

See Also: FromFloat: The inverse operation, converting from float back to the original data type.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class ToFloat(ImageOnlyTransform):\n    \"\"\"Convert the input image to a floating-point representation.\n\n    This transform divides pixel values by `max_value` to get a float32 output array\n    where all values lie in the range [0, 1.0]. It's useful for normalizing image data\n    before feeding it into neural networks or other algorithms that expect float input.\n\n    Args:\n        max_value (float | None): The maximum possible input value. If None, the transform\n            will try to infer the maximum value by inspecting the data type of the input image:\n            - uint8: 255\n            - uint16: 65535\n            - uint32: 4294967295\n            - float32: 1.0\n            Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, uint16, uint32, float32\n\n    Returns:\n        np.ndarray: Image in floating point representation, with values in range [0, 1.0].\n\n    Note:\n        - If the input image is already float32 with values in [0, 1], it will be returned unchanged.\n        - For integer types (uint8, uint16, uint32), the function will scale the values to [0, 1] range.\n        - The output will always be float32, regardless of the input type.\n        - This transform is often used as a preprocessing step before applying other transformations\n          or feeding the image into a neural network.\n\n    Raises:\n        TypeError: If the input image data type is not supported.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Convert uint8 image to float\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ToFloat(max_value=None)\n        >>> float_image = transform(image=image)['image']\n        >>> assert float_image.dtype == np.float32\n        >>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n        >>>\n        # Convert uint16 image to float with custom max_value\n        >>> image = np.random.randint(0, 4096, (100, 100, 3), dtype=np.uint16)\n        >>> transform = A.ToFloat(max_value=4095)\n        >>> float_image = transform(image=image)['image']\n        >>> assert float_image.dtype == np.float32\n        >>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n\n    See Also:\n        FromFloat: The inverse operation, converting from float back to the original data type.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        max_value: float | None\n\n    def __init__(\n        self,\n        max_value: float | None = None,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply)\n        self.max_value = max_value\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return to_float(img, self.max_value)\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"max_value\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToGray","title":"class ToGray (num_output_channels=3, method='weighted_average', always_apply=None, p=0.5) [view source on GitHub]","text":"

Convert an image to grayscale and optionally replicate the grayscale channel.

This transform first converts a color image to a single-channel grayscale image using various methods, then replicates the grayscale channel if num_output_channels is greater than 1.

Parameters:

Name Type Description num_output_channels int

The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.

method Literal[\"weighted_average\", \"from_lab\", \"desaturation\", \"average\", \"max\", \"pca\"]

The method used for grayscale conversion:
  • "weighted_average": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception.
  • "from_lab": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results.
  • "desaturation": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well.
  • "average": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results.
  • "max": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results.
  • "pca": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.

p float

Probability of applying the transform. Default: 0.5.

Exceptions:

Type Description TypeError

If the input image doesn't have 3 channels for methods that require it.

Note

  • The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
  • \"weighted_average\" and \"from_lab\" are typically used in image processing and computer vision applications where accurate representation of human perception is important.
  • \"desaturation\" and \"average\" are often used in simple image manipulation tools or when computational speed is a priority.
  • \"max\" method can be useful in scenarios where preserving bright features is important, such as in some medical imaging applications.
  • \"pca\" might be used in advanced image analysis tasks or when dealing with hyperspectral images.

Image types: uint8, float32

Returns:

Type Description np.ndarray

Grayscale image with the specified number of channels.
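A usage sketch, together with the manual equivalent of the "weighted_average" luminance described above (illustrative values; the transform itself also handles channel replication and dtype):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
transform = A.ToGray(num_output_channels=3, method="weighted_average", p=1.0)
gray = transform(image=image)["image"]
assert gray.shape == (100, 100, 3)

# Manual single-channel equivalent of the "weighted_average" method
weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
luminance = image.astype(np.float32) @ weights  # shape (100, 100)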

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class ToGray(ImageOnlyTransform):\n    \"\"\"Convert an image to grayscale and optionally replicate the grayscale channel.\n\n    This transform first converts a color image to a single-channel grayscale image using various methods,\n    then replicates the grayscale channel if num_output_channels is greater than 1.\n\n    Args:\n        num_output_channels (int): The number of channels in the output image. If greater than 1,\n            the grayscale channel will be replicated. Default: 3.\n        method (Literal[\"weighted_average\", \"from_lab\", \"desaturation\", \"average\", \"max\", \"pca\"]):\n            The method used for grayscale conversion:\n            - \"weighted_average\": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B).\n              Works only with 3-channel images. Provides realistic results based on human perception.\n            - \"from_lab\": Extracts the L channel from the LAB color space.\n              Works only with 3-channel images. Gives perceptually uniform results.\n            - \"desaturation\": Averages the maximum and minimum values across channels.\n              Works with any number of channels. Fast but may not preserve perceived brightness well.\n            - \"average\": Simple average of all channels.\n              Works with any number of channels. Fast but may not give realistic results.\n            - \"max\": Takes the maximum value across all channels.\n              Works with any number of channels. Tends to produce brighter results.\n            - \"pca\": Applies Principal Component Analysis to reduce channels.\n              Works with any number of channels. Can preserve more information but is computationally intensive.\n        p (float): Probability of applying the transform. 
Default: 0.5.\n\n    Raises:\n        TypeError: If the input image doesn't have 3 channels for methods that require it.\n\n    Note:\n        - The transform first converts the input image to single-channel grayscale, then replicates\n          this channel if num_output_channels > 1.\n        - \"weighted_average\" and \"from_lab\" are typically used in image processing and computer vision\n          applications where accurate representation of human perception is important.\n        - \"desaturation\" and \"average\" are often used in simple image manipulation tools or when\n          computational speed is a priority.\n        - \"max\" method can be useful in scenarios where preserving bright features is important,\n          such as in some medical imaging applications.\n        - \"pca\" might be used in advanced image analysis tasks or when dealing with hyperspectral images.\n\n    Image types:\n        uint8, float32\n\n    Returns:\n        np.ndarray: Grayscale image with the specified number of channels.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_output_channels: int = Field(\n            default=3,\n            description=\"The number of output channels.\",\n            ge=1,\n        )\n        method: Literal[\n            \"weighted_average\",\n            \"from_lab\",\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        ]\n\n    def __init__(\n        self,\n        num_output_channels: int = 3,\n        method: Literal[\n            \"weighted_average\",\n            \"from_lab\",\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        ] = \"weighted_average\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_output_channels = num_output_channels\n        self.method = method\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_grayscale_image(img):\n            warnings.warn(\"The image is already gray.\", stacklevel=2)\n            return img\n\n        num_channels = get_num_channels(img)\n\n        if num_channels != NUM_RGB_CHANNELS and self.method not in {\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        }:\n            msg = \"ToGray transformation expects 3-channel images.\"\n            raise TypeError(msg)\n\n        return fmain.to_gray(img, self.num_output_channels, self.method)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_output_channels\", \"method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToRGB","title":"class ToRGB (num_output_channels=3, p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert an input image from grayscale to RGB format.

Parameters:

Name Type Description num_output_channels int

The number of channels in the output image. Default: 3.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1

Note

  • For single-channel (grayscale) images, the channel is replicated to create an RGB image.
  • If the input is already a 3-channel RGB image, it is returned unchanged.
  • This transform does not change the data type of the image (e.g., uint8 remains uint8).

Exceptions:

Type Description TypeError

If the input image has more than 1 channel.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n>>> # Convert a grayscale image to RGB\n>>> transform = A.Compose([A.ToRGB(p=1.0)])\n>>> grayscale_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n>>> rgb_image = transform(image=grayscale_image)['image']\n>>> assert rgb_image.shape == (100, 100, 3)\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class ToRGB(ImageOnlyTransform):\n    \"\"\"Convert an input image from grayscale to RGB format.\n\n    Args:\n        num_output_channels (int): The number of channels in the output image. Default: 3.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1\n\n    Note:\n        - For single-channel (grayscale) images, the channel is replicated to create an RGB image.\n        - If the input is already a 3-channel RGB image, it is returned unchanged.\n        - This transform does not change the data type of the image (e.g., uint8 remains uint8).\n\n    Raises:\n        TypeError: If the input image has more than 1 channel.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        >>> # Convert a grayscale image to RGB\n        >>> transform = A.Compose([A.ToRGB(p=1.0)])\n        >>> grayscale_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n        >>> rgb_image = transform(image=grayscale_image)['image']\n        >>> assert rgb_image.shape == (100, 100, 3)\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_output_channels: int = Field(ge=1)\n\n    def __init__(\n        self,\n        num_output_channels: int = 3,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.num_output_channels = num_output_channels\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_rgb_image(img):\n            warnings.warn(\"The image is already an RGB.\", stacklevel=2)\n            return np.ascontiguousarray(img)\n        if not is_grayscale_image(img):\n            msg = \"ToRGB transformation expects 2-dim images or 3-dim with the last dimension equal to 1.\"\n            raise TypeError(msg)\n\n        return fmain.grayscale_to_multichannel(\n            img,\n            num_output_channels=self.num_output_channels,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"num_output_channels\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToSepia","title":"class ToSepia (p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply a sepia filter to the input image.

This transform converts a color image to a sepia tone, giving it a warm, brownish tint that is reminiscent of old photographs. The sepia effect is achieved by applying a specific color transformation matrix to the RGB channels of the input image. For grayscale images, the transform is a no-op and returns the original image.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1,3

Note

  • The sepia effect only works with RGB images (3 channels). For grayscale images, the original image is returned unchanged since the sepia transformation would have no visible effect when R=G=B.
  • The sepia effect is created using a fixed color transformation matrix: [[0.393, 0.769, 0.189], [0.349, 0.686, 0.168], [0.272, 0.534, 0.131]]
  • The output image will have the same data type as the input image.
  • For float32 images, ensure the input values are in the range [0, 1].

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Apply sepia effect to a uint8 RGB image\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ToSepia(p=1.0)\n>>> sepia_image = transform(image=image)['image']\n>>> assert sepia_image.shape == image.shape\n>>> assert sepia_image.dtype == np.uint8\n>>>\n# Apply sepia effect to a float32 RGB image\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)\n>>> transform = A.ToSepia(p=1.0)\n>>> sepia_image = transform(image=image)['image']\n>>> assert sepia_image.shape == image.shape\n>>> assert sepia_image.dtype == np.float32\n>>> assert 0 <= sepia_image.min() <= sepia_image.max() <= 1.0\n>>>\n# No effect on grayscale images\n>>> gray_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n>>> transform = A.ToSepia(p=1.0)\n>>> result = transform(image=gray_image)['image']\n>>> assert np.array_equal(result, gray_image)\n

Mathematical Formulation: Given an input pixel [R, G, B], the sepia tone is calculated as:
  R_sepia = 0.393*R + 0.769*G + 0.189*B
  G_sepia = 0.349*R + 0.686*G + 0.168*B
  B_sepia = 0.272*R + 0.534*G + 0.131*B

For grayscale images where R=G=B, this transformation would result in a simple scaling of the original value, so it is skipped.

The output values are clipped to the valid range for the image's data type.
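A minimal NumPy sketch of this matrix multiplication for a float32 RGB image in [0, 1] (illustrative only; the transform itself handles clipping and dtype preservation):

Python
import numpy as np

sepia_matrix = np.array(
    [[0.393, 0.769, 0.189],
     [0.349, 0.686, 0.168],
     [0.272, 0.534, 0.131]],
)

image = np.random.rand(100, 100, 3).astype(np.float32)  # RGB in [0, 1]
# Each output channel is a linear combination of the input RGB channels
sepia = np.clip(image @ sepia_matrix.T, 0.0, 1.0)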

See Also: ToGray: For converting images to grayscale instead of sepia.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class ToSepia(ImageOnlyTransform):\n    \"\"\"Apply a sepia filter to the input image.\n\n    This transform converts a color image to a sepia tone, giving it a warm, brownish tint\n    that is reminiscent of old photographs. The sepia effect is achieved by applying a\n    specific color transformation matrix to the RGB channels of the input image.\n    For grayscale images, the transform is a no-op and returns the original image.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1,3\n\n    Note:\n        - The sepia effect only works with RGB images (3 channels). For grayscale images,\n          the original image is returned unchanged since the sepia transformation would\n          have no visible effect when R=G=B.\n        - The sepia effect is created using a fixed color transformation matrix:\n          [[0.393, 0.769, 0.189],\n           [0.349, 0.686, 0.168],\n           [0.272, 0.534, 0.131]]\n        - The output image will have the same data type as the input image.\n        - For float32 images, ensure the input values are in the range [0, 1].\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Apply sepia effect to a uint8 RGB image\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> sepia_image = transform(image=image)['image']\n        >>> assert sepia_image.shape == image.shape\n        >>> assert sepia_image.dtype == np.uint8\n        >>>\n        # Apply sepia effect to a float32 RGB image\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> sepia_image = transform(image=image)['image']\n        >>> assert sepia_image.shape == image.shape\n        >>> assert sepia_image.dtype == np.float32\n        >>> assert 0 <= sepia_image.min() <= sepia_image.max() <= 1.0\n        >>>\n        # No effect on grayscale images\n        >>> gray_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> result = transform(image=gray_image)['image']\n        >>> assert np.array_equal(result, gray_image)\n\n    Mathematical Formulation:\n        Given an input pixel [R, G, B], the sepia tone is calculated as:\n        R_sepia = 0.393*R + 0.769*G + 0.189*B\n        G_sepia = 0.349*R + 0.686*G + 0.168*B\n        B_sepia = 0.272*R + 0.534*G + 0.131*B\n\n        For grayscale images where R=G=B, this transformation would result in a simple\n        scaling of the original value, so we skip it.\n\n        The output values are clipped to the valid range for the image's data type.\n\n    See Also:\n        ToGray: For converting images to grayscale instead of sepia.\n    \"\"\"\n\n    def __init__(self, p: float = 0.5, always_apply: bool | None = None):\n        super().__init__(p, always_apply)\n        self.sepia_transformation_matrix = np.array(\n            [[0.393, 0.769, 0.189], [0.349, 0.686, 0.168], [0.272, 0.534, 0.131]],\n        )\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_grayscale_image(img):\n            return img\n\n        if not is_rgb_image(img):\n            msg = \"ToSepia transformation expects 1 or 3-channel images.\"\n            raise TypeError(msg)\n        return fmain.linear_transformation_rgb(img, 
self.sepia_transformation_matrix)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.UniformParams","title":"class UniformParams [view source on GitHub]","text":"

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class UniformParams(NoiseParamsBase):\n    noise_type: Literal[\"uniform\"] = \"uniform\"\n    ranges: list[Sequence[float]] = Field(\n        description=\"List of (min, max) ranges for each channel\",\n        min_length=1,\n    )\n\n    @field_validator(\"ranges\", mode=\"after\")\n    @classmethod\n    def validate_ranges(cls, v: list[Sequence[float]]) -> list[tuple[float, float]]:\n        result = []\n        for range_values in v:\n            if len(range_values) != PAIR:\n                raise ValueError(\"Each range must have exactly 2 values\")\n            min_val, max_val = range_values\n            if not (-1 <= min_val <= max_val <= 1):\n                raise ValueError(\"Range values must be in [-1, 1] and min <= max\")\n            result.append((float(min_val), float(max_val)))\n        return result\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.UnsharpMask","title":"class UnsharpMask (blur_limit=(3, 7), sigma_limit=0.0, alpha=(0.2, 0.5), threshold=10, p=0.5, always_apply=None) [view source on GitHub]","text":"

Sharpen the input image using unsharp masking and overlay the result with the original image.

Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased sharpness. This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask which is combined with the original image to enhance edges and fine details.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in the range [0, inf). If set to 0, it will be computed from sigma as round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1. If a single value is provided, blur_limit will be in the range (0, blur_limit). Default: (3, 7).

sigma_limit tuple[float, float] | float

Gaussian kernel standard deviation. Must be in the range [0, inf). If a single value is provided, sigma_limit will be in the range (0, sigma_limit). If set to 0, sigma is computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Default: 0.

alpha tuple[float, float]

Range from which the visibility of the sharpened image is sampled. At 0, only the original image is visible; at 1.0, only its sharpened version is visible. Default: (0.2, 0.5).

threshold int

Value that limits sharpening to areas with a high pixel difference between the original image and its smoothed version. A higher threshold means less sharpening on flat areas. Must be in the range [0, 255]. Default: 10.

p float

probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Note

  • The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian blurred version.
  • The final image is computed as: output = I + M if |I - G| > threshold, else I.
  • Higher alpha values increase the strength of the sharpening effect.
  • Higher threshold values limit the sharpening effect to areas with more significant edges or details.
  • The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.
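A rough NumPy/OpenCV sketch of the masking logic in the note above (simplified; the library implementation differs in details such as kernel sampling and blending):

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8).astype(np.float32)
blurred = cv2.GaussianBlur(image, (5, 5), sigmaX=1.0)

alpha, threshold = 0.5, 10
mask = (image - blurred) * alpha
# Sharpen only where the local difference exceeds the threshold
sharpened = np.where(np.abs(image - blurred) > threshold, image + mask, image)
sharpened = np.clip(sharpened, 0, 255).astype(np.uint8)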

References

  • https://en.wikipedia.org/wiki/Unsharp_masking
  • https://arxiv.org/pdf/2107.10833.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>>\n# Apply UnsharpMask with default parameters\n>>> transform = A.UnsharpMask(p=1.0)\n>>> sharpened_image = transform(image=image)['image']\n>>>\n# Apply UnsharpMask with custom parameters\n>>> transform = A.UnsharpMask(\n...     blur_limit=(3, 7),\n...     sigma_limit=(0.1, 0.5),\n...     alpha=(0.2, 0.7),\n...     threshold=15,\n...     p=1.0\n... )\n>>> sharpened_image = transform(image=image)['image']\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class UnsharpMask(ImageOnlyTransform):\n    \"\"\"Sharpen the input image using Unsharp Masking processing and overlays the result with the original image.\n\n    Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased\n        sharpness.\n    This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask\n    which is combined with the original image to enhance edges and fine details.\n\n    Args:\n        blur_limit (tuple[int, int] | int): maximum Gaussian kernel size for blurring the input image.\n            Must be zero or odd and in range [0, inf). If set to 0 it will be computed from sigma\n            as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`.\n            If set single value `blur_limit` will be in range (0, blur_limit).\n            Default: (3, 7).\n        sigma_limit (tuple[float, float] | float): Gaussian kernel standard deviation. Must be in range [0, inf).\n            If set single value `sigma_limit` will be in range (0, sigma_limit).\n            If set to 0 sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. Default: 0.\n        alpha (tuple[float, float]): range to choose the visibility of the sharpened image.\n            At 0, only the original image is visible, at 1.0 only its sharpened version is visible.\n            Default: (0.2, 0.5).\n        threshold (int): Value to limit sharpening only for areas with high pixel difference between original image\n            and it's smoothed version. Higher threshold means less sharpening on flat areas.\n            Must be in range [0, 255]. Default: 10.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian\n            blurred version.\n        - The final image is computed as: output = I + M if |I - G| > threshold, else I.\n        - Higher alpha values increase the strength of the sharpening effect.\n        - Higher threshold values limit the sharpening effect to areas with more significant edges or details.\n        - The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.\n\n    References:\n        - https://en.wikipedia.org/wiki/Unsharp_masking\n        - https://arxiv.org/pdf/2107.10833.pdf\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>>\n        # Apply UnsharpMask with default parameters\n        >>> transform = A.UnsharpMask(p=1.0)\n        >>> sharpened_image = transform(image=image)['image']\n        >>>\n        # Apply UnsharpMask with custom parameters\n        >>> transform = A.UnsharpMask(\n        ...     blur_limit=(3, 7),\n        ...     sigma_limit=(0.1, 0.5),\n        ...     alpha=(0.2, 0.7),\n        ...     threshold=15,\n        ...     p=1.0\n        ... 
)\n        >>> sharpened_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        sigma_limit: NonNegativeFloatRangeType\n        alpha: ZeroOneRangeType\n        threshold: int = Field(ge=0, le=255)\n        blur_limit: ScaleIntType\n\n        @field_validator(\"blur_limit\")\n        @classmethod\n        def process_blur(\n            cls,\n            value: ScaleIntType,\n            info: ValidationInfo,\n        ) -> tuple[int, int]:\n            return fblur.process_blur_limit(value, info, min_value=3)\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_limit: ScaleFloatType = 0.0,\n        alpha: ScaleFloatType = (0.2, 0.5),\n        threshold: int = 10,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_limit = cast(tuple[float, float], sigma_limit)\n        self.alpha = cast(tuple[float, float], alpha)\n        self.threshold = threshold\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"ksize\": self.py_random.randrange(\n                self.blur_limit[0],\n                self.blur_limit[1] + 1,\n                2,\n            ),\n            \"sigma\": self.py_random.uniform(*self.sigma_limit),\n            \"alpha\": self.py_random.uniform(*self.alpha),\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        ksize: int,\n        sigma: int,\n        alpha: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.unsharp_mask(\n            img,\n            ksize,\n            sigma=sigma,\n            alpha=alpha,\n            threshold=self.threshold,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"blur_limit\", \"sigma_limit\", \"alpha\", \"threshold\"\n
"},{"location":"api_reference/augmentations/blur/","title":"Index","text":"
  • Blur transforms (albumentations.augmentations.blur.transforms)
"},{"location":"api_reference/augmentations/blur/functional/","title":"Blur functional transforms (augmentations.blur.functional)","text":""},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.create_motion_kernel","title":"def create_motion_kernel (kernel_size, angle, direction, allow_shifted, random_state) [view source on GitHub]","text":"

Create a motion blur kernel.

Parameters:

Name Type Description kernel_size int

Size of the kernel (must be odd)

angle float

Angle in degrees (counter-clockwise)

direction float

Blur direction (-1.0 to 1.0)

allow_shifted bool

Allow kernel to be randomly shifted from center

random_state Random

Python's random.Random instance

Returns:

Type Description np.ndarray

Motion blur kernel

Source code in albumentations/augmentations/blur/functional.py Python
def create_motion_kernel(\n    kernel_size: int,\n    angle: float,\n    direction: float,\n    allow_shifted: bool,\n    random_state: Random,\n) -> np.ndarray:\n    \"\"\"Create a motion blur kernel.\n\n    Args:\n        kernel_size: Size of the kernel (must be odd)\n        angle: Angle in degrees (counter-clockwise)\n        direction: Blur direction (-1.0 to 1.0)\n        allow_shifted: Allow kernel to be randomly shifted from center\n        random_state: Python's random.Random instance\n\n    Returns:\n        Motion blur kernel\n    \"\"\"\n    kernel = np.zeros((kernel_size, kernel_size), dtype=np.float32)\n    center = kernel_size // 2\n\n    # Convert angle to radians\n    angle_rad = np.deg2rad(angle)\n\n    # Calculate direction vector\n    dx = np.cos(angle_rad)\n    dy = np.sin(angle_rad)\n\n    # Create line points with direction bias\n    line_length = kernel_size // 2\n    t = np.linspace(-line_length, line_length, kernel_size * 2)\n\n    # Apply direction bias\n    if direction != 0:\n        t = t * (1 + abs(direction))\n        if direction < 0:\n            t = t * -1\n\n    # Generate line coordinates\n    x = center + dx * t\n    y = center + dy * t\n\n    # Apply random shift if allowed\n    if allow_shifted and random_state is not None:\n        shift_x = random_state.uniform(-1, 1) * line_length / 2\n        shift_y = random_state.uniform(-1, 1) * line_length / 2\n        x += shift_x\n        y += shift_y\n\n    # Round coordinates and clip to kernel bounds\n    x = np.clip(np.round(x), 0, kernel_size - 1).astype(int)\n    y = np.clip(np.round(y), 0, kernel_size - 1).astype(int)\n\n    # Keep only unique points to avoid multiple assignments\n    points = np.unique(np.column_stack([y, x]), axis=0)\n    kernel[points[:, 0], points[:, 1]] = 1\n\n    # Ensure at least one point is set\n    if not kernel.any():\n        kernel[center, center] = 1\n\n    return kernel\n
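A usage sketch, assuming the function is importable from the module path shown above; the kernel is normalized and applied with cv2.filter2D:

Python
import random

import cv2
import numpy as np
from albumentations.augmentations.blur.functional import create_motion_kernel

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
kernel = create_motion_kernel(
    kernel_size=7,
    angle=45.0,
    direction=0.0,
    allow_shifted=True,
    random_state=random.Random(0),
)
kernel = kernel / kernel.sum()  # normalize so overall brightness is preserved
blurred = cv2.filter2D(image, -1, kernel)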
"},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.process_blur_limit","title":"def process_blur_limit (value, info, min_value=0) [view source on GitHub]","text":"

Process blur limit to ensure valid kernel sizes.

Source code in albumentations/augmentations/blur/functional.py Python
def process_blur_limit(value: ScaleIntType, info: ValidationInfo, min_value: int = 0) -> tuple[int, int]:\n    \"\"\"Process blur limit to ensure valid kernel sizes.\"\"\"\n    result = value if isinstance(value, Sequence) else (min_value, value)\n\n    result = _ensure_min_value(result, min_value, info.field_name)\n    result = _ensure_odd_values(result, info.field_name)\n\n    if result[0] > result[1]:\n        final_result = (result[1], result[1])\n        warn(\n            f\"{info.field_name}: Invalid range {result} (min > max). \"\n            f\"Range automatically adjusted to {final_result}.\",\n            UserWarning,\n            stacklevel=2,\n        )\n        return final_result\n\n    return result\n
"},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.sample_odd_from_range","title":"def sample_odd_from_range (random_state, low, high) [view source on GitHub]","text":"

Sample an odd number from the range [low, high] (inclusive).

Parameters:

Name Type Description random_state Random

instance of random.Random

low int

lower bound (will be converted to nearest valid odd number)

high int

upper bound (will be converted to nearest valid odd number)

Returns:

Type Description int

Randomly sampled odd number from the range

Note

  • Input values will be converted to nearest valid odd numbers:
  • Values less than 3 will become 3
  • Even values will be rounded up to next odd number
  • After normalization, high must be >= low
Source code in albumentations/augmentations/blur/functional.py Python
def sample_odd_from_range(random_state: Random, low: int, high: int) -> int:\n    \"\"\"Sample an odd number from the range [low, high] (inclusive).\n\n    Args:\n        random_state: instance of random.Random\n        low: lower bound (will be converted to nearest valid odd number)\n        high: upper bound (will be converted to nearest valid odd number)\n\n    Returns:\n        Randomly sampled odd number from the range\n\n    Note:\n        - Input values will be converted to nearest valid odd numbers:\n          * Values less than 3 will become 3\n          * Even values will be rounded up to next odd number\n        - After normalization, high must be >= low\n    \"\"\"\n    # Normalize low value\n    low = max(3, low + (low % 2 == 0))\n    # Normalize high value\n    high = max(3, high + (high % 2 == 0))\n\n    # Ensure high >= low after normalization\n    high = max(high, low)\n\n    if low == high:\n        return low\n\n    # Calculate number of possible odd values\n    num_odd_values = (high - low) // 2 + 1\n    # Generate random index and convert to corresponding odd number\n    rand_idx = random_state.randint(0, num_odd_values - 1)\n    return low + (2 * rand_idx)\n
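A small usage sketch illustrating the normalization described above: an even lower bound of 4 is bumped to 5 and an even upper bound of 10 to 11, so the result is an odd value in [5, 11].

Python
import random

from albumentations.augmentations.blur.functional import sample_odd_from_range

rng = random.Random(0)
ksize = sample_odd_from_range(rng, low=4, high=10)  # effective range becomes [5, 11]
assert ksize % 2 == 1 and 5 <= ksize <= 11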
"},{"location":"api_reference/augmentations/blur/transforms/","title":"Blur transforms (augmentations.blur.transforms)","text":""},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.AdvancedBlur","title":"class AdvancedBlur (blur_limit=(3, 7), sigma_x_limit=(0.2, 1.0), sigma_y_limit=(0.2, 1.0), sigmaX_limit=None, sigmaY_limit=None, rotate_limit=(-90, 90), beta_limit=(0.5, 8.0), noise_limit=(0.9, 1.1), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies a Generalized Gaussian blur to the input image with randomized parameters for advanced data augmentation.

This transform creates a custom blur kernel based on the Generalized Gaussian distribution, which allows for a wide range of blur effects beyond standard Gaussian blur. It then applies this kernel to the input image through convolution. The transform also incorporates noise into the kernel, resulting in a unique combination of blurring and noise injection.

Key features of this augmentation:

  1. Generalized Gaussian Kernel: Uses a generalized normal distribution to create kernels that can range from box-like blurs to very peaked blurs, controlled by the beta parameter.

  2. Anisotropic Blurring: Allows for different blur strengths in horizontal and vertical directions (controlled by sigma_x and sigma_y), and rotation of the kernel.

  3. Kernel Noise: Adds multiplicative noise to the kernel before applying it to the image, creating more diverse and realistic blur effects.

Implementation Details: The kernel is generated using a 2D Generalized Gaussian function. The process involves:
  1. Creating a 2D grid based on the kernel size
  2. Applying rotation to this grid
  3. Calculating the kernel values using the Generalized Gaussian formula
  4. Adding multiplicative noise to the kernel
  5. Normalizing the kernel

The resulting kernel is then applied to the image using convolution.
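A condensed NumPy sketch of steps 1-3 and 5 (noise injection omitted; parameter values are illustrative, not the library's defaults):

Python
import numpy as np

ksize, sigma_x, sigma_y, angle_deg, beta = 7, 0.8, 0.4, 30.0, 2.0

# 1. 2D grid centered at zero, shape (ksize, ksize, 2)
ax = np.arange(-(ksize // 2), ksize // 2 + 1, dtype=np.float32)
grid = np.stack(np.meshgrid(ax, ax), axis=-1)

# 2. Rotated anisotropic covariance matrix
theta = np.deg2rad(angle_deg)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
cov = rot @ np.diag([sigma_x**2, sigma_y**2]) @ rot.T

# 3. Generalized Gaussian values: exp(-0.5 * (x^T cov^-1 x) ** beta)
quad = np.einsum("...i,ij,...j->...", grid, np.linalg.inv(cov), grid)
kernel = np.exp(-0.5 * np.power(quad, beta))

# 5. Normalize so the kernel sums to one
kernel = (kernel / kernel.sum()).astype(np.float32)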

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the size of the blur kernel. If a single int is provided, the kernel size will be randomly chosen between 3 and that value. Must be odd and ≥ 3. Larger values create stronger blur effects. Default: (3, 7)

sigma_x_limit tuple[float, float] | float

Controls the spread of the blur in the x direction. Higher values increase blur strength. If a single float is provided, the range will be (0, limit). Default: (0.2, 1.0)

sigma_y_limit tuple[float, float] | float

Controls the spread of the blur in the y direction. Higher values increase blur strength. If a single float is provided, the range will be (0, limit). Default: (0.2, 1.0)

rotate_limit tuple[int, int] | int

Range of angles (in degrees) for rotating the kernel. This rotation allows for diagonal blur directions. If limit is a single int, an angle is picked from (-rotate_limit, rotate_limit). Default: (-90, 90)

beta_limit tuple[float, float] | float

Shape parameter of the Generalized Gaussian distribution.
  • beta = 1 gives a standard Gaussian distribution
  • beta < 1 creates heavier tails, resulting in more uniform, box-like blur
  • beta > 1 creates lighter tails, resulting in more peaked, focused blur
Default: (0.5, 8.0)

noise_limit tuple[float, float] | float

Controls the strength of multiplicative noise applied to the kernel. Values around 1.0 keep the original kernel mostly intact, while values further from 1.0 introduce more variation. Default: (0.9, 1.1)

p float

Probability of applying the transform. Default: 0.5

Notes

  • This transform is particularly useful for simulating complex, real-world blur effects that go beyond simple Gaussian blur.
  • The combination of blur and noise can help in creating more robust models by simulating a wider range of image degradations.
  • Extreme values, especially for beta and noise, may result in unrealistic effects and should be used cautiously.

Reference

This transform is inspired by techniques described in: \"Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data\" https://arxiv.org/abs/2107.10833

Targets

image

Image types: uint8, float32
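A usage sketch, following the pattern of the examples for the other transforms in this reference (parameter values chosen for illustration):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
transform = A.AdvancedBlur(
    blur_limit=(3, 7),
    sigma_x_limit=(0.2, 1.0),
    sigma_y_limit=(0.2, 1.0),
    rotate_limit=(-90, 90),
    beta_limit=(0.5, 8.0),
    noise_limit=(0.9, 1.1),
    p=1.0,
)
blurred_image = transform(image=image)["image"]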

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/blur/transforms.py Python
class AdvancedBlur(ImageOnlyTransform):\n    \"\"\"Applies a Generalized Gaussian blur to the input image with randomized parameters for advanced data augmentation.\n\n    This transform creates a custom blur kernel based on the Generalized Gaussian distribution,\n    which allows for a wide range of blur effects beyond standard Gaussian blur. It then applies\n    this kernel to the input image through convolution. The transform also incorporates noise\n    into the kernel, resulting in a unique combination of blurring and noise injection.\n\n    Key features of this augmentation:\n\n    1. Generalized Gaussian Kernel: Uses a generalized normal distribution to create kernels\n       that can range from box-like blurs to very peaked blurs, controlled by the beta parameter.\n\n    2. Anisotropic Blurring: Allows for different blur strengths in horizontal and vertical\n       directions (controlled by sigma_x and sigma_y), and rotation of the kernel.\n\n    3. Kernel Noise: Adds multiplicative noise to the kernel before applying it to the image,\n       creating more diverse and realistic blur effects.\n\n    Implementation Details:\n        The kernel is generated using a 2D Generalized Gaussian function. The process involves:\n        1. Creating a 2D grid based on the kernel size\n        2. Applying rotation to this grid\n        3. Calculating the kernel values using the Generalized Gaussian formula\n        4. Adding multiplicative noise to the kernel\n        5. Normalizing the kernel\n\n        The resulting kernel is then applied to the image using convolution.\n\n    Args:\n        blur_limit (tuple[int, int] | int, optional): Controls the size of the blur kernel. If a single int\n            is provided, the kernel size will be randomly chosen between 3 and that value.\n            Must be odd and \u2265 3. Larger values create stronger blur effects.\n            Default: (3, 7)\n\n        sigma_x_limit (tuple[float, float] | float): Controls the spread of the blur in the x direction.\n            Higher values increase blur strength.\n            If a single float is provided, the range will be (0, limit).\n            Default: (0.2, 1.0)\n\n        sigma_y_limit (tuple[float, float] | float): Controls the spread of the blur in the y direction.\n            Higher values increase blur strength.\n            If a single float is provided, the range will be (0, limit).\n            Default: (0.2, 1.0)\n\n        rotate_limit (tuple[int, int] | int): Range of angles (in degrees) for rotating the kernel.\n            This rotation allows for diagonal blur directions. If limit is a single int, an angle is picked\n            from (-rotate_limit, rotate_limit).\n            Default: (-90, 90)\n\n        beta_limit (tuple[float, float] | float): Shape parameter of the Generalized Gaussian distribution.\n            - beta = 1 gives a standard Gaussian distribution\n            - beta < 1 creates heavier tails, resulting in more uniform, box-like blur\n            - beta > 1 creates lighter tails, resulting in more peaked, focused blur\n            Default: (0.5, 8.0)\n\n        noise_limit (tuple[float, float] | float): Controls the strength of multiplicative noise\n            applied to the kernel. Values around 1.0 keep the original kernel mostly intact,\n            while values further from 1.0 introduce more variation.\n            Default: (0.75, 1.25)\n\n        p (float): Probability of applying the transform. 
Default: 0.5\n\n    Notes:\n        - This transform is particularly useful for simulating complex, real-world blur effects\n          that go beyond simple Gaussian blur.\n        - The combination of blur and noise can help in creating more robust models by simulating\n          a wider range of image degradations.\n        - Extreme values, especially for beta and noise, may result in unrealistic effects and\n          should be used cautiously.\n\n    Reference:\n        This transform is inspired by techniques described in:\n        \"Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data\"\n        https://arxiv.org/abs/2107.10833\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        sigma_x_limit: NonNegativeFloatRangeType\n        sigma_y_limit: NonNegativeFloatRangeType\n        beta_limit: NonNegativeFloatRangeType\n        noise_limit: NonNegativeFloatRangeType\n        rotate_limit: SymmetricRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_beta_limit(cls, value: ScaleFloatType) -> tuple[float, float]:\n            result = to_tuple(value, low=0)\n            if not (result[0] < 1.0 < result[1]):\n                msg = \"beta_limit is expected to include 1.0.\"\n                raise ValueError(msg)\n            return result\n\n        @model_validator(mode=\"after\")\n        def validate_limits(self) -> Self:\n            if (\n                isinstance(self.sigma_x_limit, (tuple, list))\n                and self.sigma_x_limit[0] == 0\n                and isinstance(self.sigma_y_limit, (tuple, list))\n                and self.sigma_y_limit[0] == 0\n            ):\n                msg = \"sigma_x_limit and sigma_y_limit minimum value cannot be both equal to 0.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_x_limit: ScaleFloatType = (0.2, 1.0),\n        sigma_y_limit: ScaleFloatType = (0.2, 1.0),\n        sigmaX_limit: ScaleFloatType | None = None,  # noqa: N803\n        sigmaY_limit: ScaleFloatType | None = None,  # noqa: N803\n        rotate_limit: ScaleIntType = (-90, 90),\n        beta_limit: ScaleFloatType = (0.5, 8.0),\n        noise_limit: ScaleFloatType = (0.9, 1.1),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        if sigmaX_limit is not None:\n            warnings.warn(\n                \"sigmaX_limit is deprecated; use sigma_x_limit instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            sigma_x_limit = sigmaX_limit\n\n        if sigmaY_limit is not None:\n            warnings.warn(\n                \"sigmaY_limit is deprecated; use sigma_y_limit instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            sigma_y_limit = sigmaY_limit\n\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_x_limit = cast(tuple[float, float], sigma_x_limit)\n        self.sigma_y_limit = cast(tuple[float, float], sigma_y_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n        self.noise_limit = cast(tuple[float, float], noise_limit)\n\n    def apply(self, img: np.ndarray, kernel: np.ndarray, **params: Any) -> 
np.ndarray:\n        return fmain.convolve(img, kernel=kernel)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n        sigma_x = self.py_random.uniform(*self.sigma_x_limit)\n        sigma_y = self.py_random.uniform(*self.sigma_y_limit)\n        angle = np.deg2rad(self.py_random.uniform(*self.rotate_limit))\n\n        # Split into 2 cases to avoid selection of narrow kernels (beta > 1) too often.\n        beta = (\n            self.py_random.uniform(self.beta_limit[0], 1)\n            if self.py_random.random() < HALF\n            else self.py_random.uniform(1, self.beta_limit[1])\n        )\n\n        noise_matrix = self.random_generator.uniform(\n            *self.noise_limit,\n            size=(ksize, ksize),\n        )\n\n        # Generate mesh grid centered at zero.\n        ax = np.arange(-ksize // 2 + 1.0, ksize // 2 + 1.0)\n        # > Shape (ksize, ksize, 2)\n        grid = np.stack(np.meshgrid(ax, ax), axis=-1)\n\n        # Calculate rotated sigma matrix\n        d_matrix = np.array([[sigma_x**2, 0], [0, sigma_y**2]])\n        u_matrix = np.array(\n            [[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]],\n        )\n        sigma_matrix = np.dot(u_matrix, np.dot(d_matrix, u_matrix.T))\n\n        inverse_sigma = np.linalg.inv(sigma_matrix)\n        # Described in \"Parameter Estimation For Multivariate Generalized Gaussian Distributions\"\n        kernel = np.exp(\n            -0.5 * np.power(np.sum(np.dot(grid, inverse_sigma) * grid, 2), beta),\n        )\n        # Add noise\n        kernel *= noise_matrix\n\n        # Normalize kernel\n        kernel = kernel.astype(np.float32) / np.sum(kernel)\n        return {\"kernel\": kernel}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str, str, str]:\n        return (\n            \"blur_limit\",\n            \"sigma_x_limit\",\n            \"sigma_y_limit\",\n            \"rotate_limit\",\n            \"beta_limit\",\n            \"noise_limit\",\n        )\n
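
For completeness, a minimal usage sketch for AdvancedBlur in the same doctest style as the other transforms on this page; the parameter values simply repeat the documented defaults and are illustrative rather than tuned recommendations.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.AdvancedBlur(
...     blur_limit=(3, 7),
...     sigma_x_limit=(0.2, 1.0),
...     sigma_y_limit=(0.2, 1.0),
...     rotate_limit=(-90, 90),
...     beta_limit=(0.5, 8.0),
...     p=1.0,
... )
>>> result = transform(image=image)
>>> blurred_image = result["image"]
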
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.Blur","title":"class Blur (blur_limit=(3, 7), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply uniform box blur to the input image using a randomly sized square kernel.

This transform uses OpenCV's cv2.blur function, which performs a simple box filter blur. The size of the blur kernel is randomly selected for each application, allowing for varying degrees of blur intensity.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the range of the blur kernel size. - If a single int is provided, the kernel size will be randomly chosen between 3 and that value. - If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes. The kernel size must be odd and greater than or equal to 3. Larger kernel sizes produce stronger blur effects. Default: (3, 7)

p float

Probability of applying the transform. Default: 0.5

Notes

  • The blur kernel is always square (same width and height).
  • Only odd kernel sizes are used to ensure the blur has a clear center pixel.
  • Box blur is faster than Gaussian blur but may produce less natural results.
  • This blur method averages all pixels under the kernel area, which can reduce noise but also reduce image detail.

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Blur(blur_limit=(3, 7), p=1.0)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n
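
Since the Notes above describe box blur as a plain average over the kernel window, the following sketch (assuming OpenCV is installed; this is not the library's internal code path) shows that cv2.blur is equivalent, up to integer rounding, to convolving with a uniform kernel.

Python
>>> import cv2
>>> import numpy as np
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> ksize = 5  # an odd kernel size, as drawn from blur_limit
>>> boxed = cv2.blur(image, (ksize, ksize))
>>> kernel = np.ones((ksize, ksize), dtype=np.float32) / ksize**2
>>> manual = cv2.filter2D(image, -1, kernel)
>>> diff = np.abs(boxed.astype(int) - manual.astype(int)).max()  # at most 1, from rounding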


Source code in albumentations/augmentations/blur/transforms.py Python
class Blur(ImageOnlyTransform):\n    \"\"\"Apply uniform box blur to the input image using a randomly sized square kernel.\n\n    This transform uses OpenCV's cv2.blur function, which performs a simple box filter blur.\n    The size of the blur kernel is randomly selected for each application, allowing for\n    varying degrees of blur intensity.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Controls the range of the blur kernel size.\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 3 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            The kernel size must be odd and greater than or equal to 3.\n            Larger kernel sizes produce stronger blur effects.\n            Default: (3, 7)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Notes:\n        - The blur kernel is always square (same width and height).\n        - Only odd kernel sizes are used to ensure the blur has a clear center pixel.\n        - Box blur is faster than Gaussian blur but may produce less natural results.\n        - This blur method averages all pixels under the kernel area, which can\n          reduce noise but also reduce image detail.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Blur(blur_limit=(3, 7), p=1.0)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        pass\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fblur.blur(img, kernel)\n\n    def get_params(self) -> dict[str, Any]:\n        kernel = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n        return {\"kernel\": kernel}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"blur_limit\",)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.BlurInitSchema","title":"class BlurInitSchema [view source on GitHub]","text":"


Source code in albumentations/augmentations/blur/transforms.py Python
class BlurInitSchema(BaseTransformInitSchema):\n    blur_limit: ScaleIntType\n\n    @field_validator(\"blur_limit\")\n    @classmethod\n    def process_blur(cls, value: ScaleIntType, info: ValidationInfo) -> tuple[int, int]:\n        return fblur.process_blur_limit(value, info, min_value=3)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.Defocus","title":"class Defocus (radius=(3, 10), alias_blur=(0.1, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply defocus blur to the input image.

This transform simulates the effect of an out-of-focus camera by applying a defocus blur to the image. It uses a combination of disc kernels and Gaussian blur to create a realistic defocus effect.

Parameters:

Name Type Description radius tuple[int, int] | int

Range for the radius of the defocus blur. If a single int is provided, the range will be [1, radius]. Larger values create a stronger blur effect. Default: (3, 10)

alias_blur tuple[float, float] | float

Range for the standard deviation of the Gaussian blur applied after the main defocus blur. This helps to reduce aliasing artifacts. If a single float is provided, the range will be (0, alias_blur). Larger values create a smoother, less aliased effect. Default: (0.1, 0.5)

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • The defocus effect is created using a disc kernel, which simulates the shape of a camera's aperture.
  • The additional Gaussian blur (alias_blur) helps to soften the edges of the disc kernel, creating a more natural-looking defocus effect.
  • Larger radius values will create a stronger, more noticeable defocus effect.
  • The alias_blur parameter can be used to fine-tune the appearance of the defocus, with larger values creating a smoother, potentially more realistic effect.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Defocus(radius=(4, 8), alias_blur=(0.2, 0.4), always_apply=True)\n>>> result = transform(image=image)\n>>> defocused_image = result['image']\n
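
As a rough illustration of the disc-kernel idea described above (plain OpenCV/NumPy, not the library's internal defocus helper), one can build a normalized disc the size of the aperture, soften its edge with a small Gaussian controlled by alias_blur, and convolve the image with it.

Python
>>> import cv2
>>> import numpy as np
>>> radius, alias_blur = 5, 0.3
>>> ax = np.arange(-radius, radius + 1)
>>> xx, yy = np.meshgrid(ax, ax)
>>> disc = (xx**2 + yy**2 <= radius**2).astype(np.float32)  # disc-shaped aperture
>>> disc = cv2.GaussianBlur(disc, (3, 3), sigmaX=alias_blur)  # soften the edge (alias_blur)
>>> disc /= disc.sum()  # normalize so overall brightness is preserved
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> defocused = cv2.filter2D(image, -1, disc)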

References

  • https://en.wikipedia.org/wiki/Defocus_aberration
  • https://www.researchgate.net/publication/261311609_Realistic_Defocus_Blur_for_Multiplane_Computer-Generated_Holography


Source code in albumentations/augmentations/blur/transforms.py Python
class Defocus(ImageOnlyTransform):\n    \"\"\"Apply defocus blur to the input image.\n\n    This transform simulates the effect of an out-of-focus camera by applying a defocus blur\n    to the image. It uses a combination of disc kernels and Gaussian blur to create a realistic\n    defocus effect.\n\n    Args:\n        radius (tuple[int, int] | int): Range for the radius of the defocus blur.\n            If a single int is provided, the range will be [1, radius].\n            Larger values create a stronger blur effect.\n            Default: (3, 10)\n\n        alias_blur (tuple[float, float] | float): Range for the standard deviation of the Gaussian blur\n            applied after the main defocus blur. This helps to reduce aliasing artifacts.\n            If a single float is provided, the range will be (0, alias_blur).\n            Larger values create a smoother, more aliased effect.\n            Default: (0.1, 0.5)\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The defocus effect is created using a disc kernel, which simulates the shape of a camera's aperture.\n        - The additional Gaussian blur (alias_blur) helps to soften the edges of the disc kernel, creating a\n          more natural-looking defocus effect.\n        - Larger radius values will create a stronger, more noticeable defocus effect.\n        - The alias_blur parameter can be used to fine-tune the appearance of the defocus, with larger values\n          creating a smoother, potentially more realistic effect.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Defocus(radius=(4, 8), alias_blur=(0.2, 0.4), always_apply=True)\n        >>> result = transform(image=image)\n        >>> defocused_image = result['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Defocus_aberration\n        - https://www.researchgate.net/publication/261311609_Realistic_Defocus_Blur_for_Multiplane_Computer-Generated_Holography\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        radius: OnePlusIntRangeType\n        alias_blur: NonNegativeFloatRangeType\n\n    def __init__(\n        self,\n        radius: ScaleIntType = (3, 10),\n        alias_blur: ScaleFloatType = (0.1, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.radius = cast(tuple[int, int], radius)\n        self.alias_blur = cast(tuple[float, float], alias_blur)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        radius: int,\n        alias_blur: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.defocus(img, radius, alias_blur)\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"radius\": self.py_random.randint(*self.radius),\n            \"alias_blur\": self.py_random.uniform(*self.alias_blur),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"radius\", \"alias_blur\")\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.GaussianBlur","title":"class GaussianBlur (blur_limit=(3, 7), sigma_limit=0, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Gaussian blur to the input image using a randomly sized kernel.

This transform blurs the input image using a Gaussian filter with a random kernel size and sigma value. Gaussian blur is a widely used image processing technique that reduces image noise and detail, creating a smoothing effect.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the range of the Gaussian kernel size. - If a single int is provided, the kernel size will be randomly chosen between 0 and that value. - If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1. Larger kernel sizes produce stronger blur effects. Default: (3, 7)

sigma_limit tuple[float, float] | float

Range for the Gaussian kernel standard deviation (sigma). Must be in range [0, inf). - If a single float is provided, sigma will be randomly chosen between 0 and that value. - If a tuple of two floats is provided, it defines the inclusive range of possible sigma values. If set to 0, sigma will be computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Larger sigma values produce stronger blur effects. Default: 0

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The relationship between kernel size and sigma affects the blur strength: larger kernel sizes allow for stronger blurring effects.
  • When both blur_limit and sigma_limit are set to ranges starting from 0, the blur_limit minimum is automatically set to 3 to ensure a valid kernel size.
  • For uint8 images, the computation might be faster than for floating-point images.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.GaussianBlur(blur_limit=(3, 7), sigma_limit=(0.1, 2), p=1)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n
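
For intuition, a quick check of the sigma fallback formula quoted in the sigma_limit description (the standard OpenCV convention when sigma is left at 0):

Python
>>> ksize = 7
>>> sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8  # -> 1.4 for a 7x7 kernel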


Source code in albumentations/augmentations/blur/transforms.py Python
class GaussianBlur(ImageOnlyTransform):\n    \"\"\"Apply Gaussian blur to the input image using a randomly sized kernel.\n\n    This transform blurs the input image using a Gaussian filter with a random kernel size\n    and sigma value. Gaussian blur is a widely used image processing technique that reduces\n    image noise and detail, creating a smoothing effect.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Controls the range of the Gaussian kernel size.\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 0 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            Must be zero or odd and in range [0, inf). If set to 0, it will be computed\n            from sigma as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`.\n            Larger kernel sizes produce stronger blur effects.\n            Default: (3, 7)\n\n        sigma_limit (tuple[float, float] | float): Range for the Gaussian kernel standard\n            deviation (sigma). Must be in range [0, inf).\n            - If a single float is provided, sigma will be randomly chosen\n              between 0 and that value.\n            - If a tuple of two floats is provided, it defines the inclusive range\n              of possible sigma values.\n            If set to 0, sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`.\n            Larger sigma values produce stronger blur effects.\n            Default: 0\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The relationship between kernel size and sigma affects the blur strength:\n          larger kernel sizes allow for stronger blurring effects.\n        - When both blur_limit and sigma_limit are set to ranges starting from 0,\n          the blur_limit minimum is automatically set to 3 to ensure a valid kernel size.\n        - For uint8 images, the computation might be faster than for floating-point images.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.GaussianBlur(blur_limit=(3, 7), sigma_limit=(0.1, 2), p=1)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        sigma_limit: NonNegativeFloatRangeType\n\n        @field_validator(\"blur_limit\")\n        @classmethod\n        def process_blur(\n            cls,\n            value: ScaleIntType,\n            info: ValidationInfo,\n        ) -> tuple[int, int]:\n            return fblur.process_blur_limit(value, info, min_value=0)\n\n        @model_validator(mode=\"after\")\n        def validate_limits(self) -> Self:\n            if (\n                isinstance(self.blur_limit, (tuple, list))\n                and self.blur_limit[0] == 0\n                and isinstance(self.sigma_limit, (tuple, list))\n                and self.sigma_limit[0] == 0\n            ):\n                self.blur_limit = 3, max(3, self.blur_limit[1])\n                warnings.warn(\n                    \"blur_limit and sigma_limit minimum value can not be both equal to 0. 
\"\n                    \"blur_limit minimum value changed to 3.\",\n                    stacklevel=2,\n                )\n\n            return self\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_limit: ScaleFloatType = 0,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p, always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_limit = cast(tuple[float, float], sigma_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        ksize: int,\n        sigma: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.gaussian_blur(img, ksize, sigma=sigma)\n\n    def get_params(self) -> dict[str, float]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n\n        return {\"ksize\": ksize, \"sigma\": self.py_random.uniform(*self.sigma_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"blur_limit\", \"sigma_limit\"\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.GlassBlur","title":"class GlassBlur (sigma=0.7, max_delta=4, iterations=2, mode='fast', always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply a glass blur effect to the input image.

This transform simulates the effect of looking through textured glass by locally shuffling pixels in the image. It creates a distorted, frosted glass-like appearance.

Parameters:

Name Type Description sigma float

Standard deviation for the Gaussian kernel used in the process. Higher values increase the blur effect. Must be non-negative. Default: 0.7

max_delta int

Maximum distance in pixels for shuffling. Determines how far pixels can be moved. Larger values create more distortion. Must be a positive integer. Default: 4

iterations int

Number of times to apply the glass blur effect. More iterations create a stronger effect but increase computation time. Must be a positive integer. Default: 2

mode Literal[\"fast\", \"exact\"]

Mode of computation. Options are: - \"fast\": Uses a faster but potentially less accurate method. - \"exact\": Uses a slower but more precise method. Default: \"fast\"

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform is particularly effective for creating a 'looking through glass' effect or simulating the view through a frosted window.
  • The 'fast' mode is recommended for most use cases as it provides a good balance between effect quality and computation speed.
  • Increasing 'iterations' will strengthen the effect but also increase the processing time linearly.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.GlassBlur(sigma=0.7, max_delta=4, iterations=3, mode=\"fast\", p=1)\n>>> result = transform(image=image)\n>>> glass_blurred_image = result[\"image\"]\n
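
A deliberately slow sketch of the local pixel-swap idea the description refers to; the actual implementation also applies Gaussian blur before and after the swaps and is vectorized, so treat this only as intuition for max_delta.

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
>>> out = img.copy()
>>> max_delta = 4
>>> for y in range(max_delta, img.shape[0] - max_delta):
...     for x in range(max_delta, img.shape[1] - max_delta):
...         dy, dx = rng.integers(-max_delta, max_delta, size=2)
...         # swap each pixel with a randomly chosen neighbor within max_delta
...         out[y, x], out[y + dy, x + dx] = img[y + dy, x + dx], img[y, x]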

References

  • This implementation is based on the technique described in: \"ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness\" https://arxiv.org/abs/1903.12261
  • Original implementation: https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py


Source code in albumentations/augmentations/blur/transforms.py Python
class GlassBlur(ImageOnlyTransform):\n    \"\"\"Apply a glass blur effect to the input image.\n\n    This transform simulates the effect of looking through textured glass by locally\n    shuffling pixels in the image. It creates a distorted, frosted glass-like appearance.\n\n    Args:\n        sigma (float): Standard deviation for the Gaussian kernel used in the process.\n            Higher values increase the blur effect. Must be non-negative.\n            Default: 0.7\n\n        max_delta (int): Maximum distance in pixels for shuffling.\n            Determines how far pixels can be moved. Larger values create more distortion.\n            Must be a positive integer.\n            Default: 4\n\n        iterations (int): Number of times to apply the glass blur effect.\n            More iterations create a stronger effect but increase computation time.\n            Must be a positive integer.\n            Default: 2\n\n        mode (Literal[\"fast\", \"exact\"]): Mode of computation. Options are:\n            - \"fast\": Uses a faster but potentially less accurate method.\n            - \"exact\": Uses a slower but more precise method.\n            Default: \"fast\"\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform is particularly effective for creating a 'looking through\n          glass' effect or simulating the view through a frosted window.\n        - The 'fast' mode is recommended for most use cases as it provides a good\n          balance between effect quality and computation speed.\n        - Increasing 'iterations' will strengthen the effect but also increase the\n          processing time linearly.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.GlassBlur(sigma=0.7, max_delta=4, iterations=3, mode=\"fast\", p=1)\n        >>> result = transform(image=image)\n        >>> glass_blurred_image = result[\"image\"]\n\n    References:\n        - This implementation is based on the technique described in:\n          \"ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness\"\n          https://arxiv.org/abs/1903.12261\n        - Original implementation:\n          https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        sigma: float = Field(ge=0)\n        max_delta: int = Field(ge=1)\n        iterations: int = Field(ge=1)\n        mode: Literal[\"fast\", \"exact\"]\n\n    def __init__(\n        self,\n        sigma: float = 0.7,\n        max_delta: int = 4,\n        iterations: int = 2,\n        mode: Literal[\"fast\", \"exact\"] = \"fast\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.sigma = sigma\n        self.max_delta = max_delta\n        self.iterations = iterations\n        self.mode = mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        *args: Any,\n        dxy: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.glass_blur(\n            img,\n            self.sigma,\n            self.max_delta,\n            
self.iterations,\n            dxy,\n            self.mode,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n\n        # generate array containing all necessary values for transformations\n        width_pixels = height - self.max_delta * 2\n        height_pixels = width - self.max_delta * 2\n        total_pixels = int(width_pixels * height_pixels)\n        dxy = self.random_generator.integers(\n            -self.max_delta,\n            self.max_delta,\n            size=(total_pixels, self.iterations, 2),\n        )\n\n        return {\"dxy\": dxy}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"sigma\", \"max_delta\", \"iterations\", \"mode\"\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.MedianBlur","title":"class MedianBlur (blur_limit=7, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply median blur to the input image.

This transform uses a median filter to blur the input image. Median filtering is particularly effective at removing salt-and-pepper noise while preserving edges, making it a popular choice for noise reduction in image processing.

Parameters:

Name Type Description blur_limit int | tuple[int, int]

Maximum aperture linear size for blurring the input image. Must be odd and in the range [3, inf). - If a single int is provided, the kernel size will be randomly chosen between 3 and that value. - If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes. Default: (3, 7)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The kernel size (aperture linear size) must always be odd and greater than 1.
  • Unlike mean blur or Gaussian blur, median blur uses the median of all pixels under the kernel area, making it more robust to outliers.
  • This transform is particularly useful for:
  • Removing salt-and-pepper noise
  • Preserving edges while smoothing images
  • Pre-processing images for edge detection algorithms
  • For color images, the median is calculated independently for each channel.
  • Larger kernel sizes result in stronger blurring effects but may also remove fine details from the image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.MedianBlur(blur_limit=(3, 7), p=0.5)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n
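
To see the "robust to outliers" point from the Note, here is a small comparison using plain OpenCV (not the library helper); the 5% salt/pepper thresholds are arbitrary.

Python
>>> import cv2
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> img = np.full((100, 100), 128, dtype=np.uint8)
>>> mask = rng.random(img.shape)
>>> img[mask < 0.05] = 0      # pepper noise
>>> img[mask > 0.95] = 255    # salt noise
>>> median = cv2.medianBlur(img, 5)              # impulse noise mostly removed
>>> gaussian = cv2.GaussianBlur(img, (5, 5), 0)  # noise is smeared rather than removed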

References

  • Median filter: https://en.wikipedia.org/wiki/Median_filter
  • OpenCV medianBlur: https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga564869aa33e58769b4469101aac458f9


Source code in albumentations/augmentations/blur/transforms.py Python
class MedianBlur(Blur):\n    \"\"\"Apply median blur to the input image.\n\n    This transform uses a median filter to blur the input image. Median filtering is particularly\n    effective at removing salt-and-pepper noise while preserving edges, making it a popular choice\n    for noise reduction in image processing.\n\n    Args:\n        blur_limit (int | tuple[int, int]): Maximum aperture linear size for blurring the input image.\n            Must be odd and in the range [3, inf).\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 3 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            Default: (3, 7)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The kernel size (aperture linear size) must always be odd and greater than 1.\n        - Unlike mean blur or Gaussian blur, median blur uses the median of all pixels under\n          the kernel area, making it more robust to outliers.\n        - This transform is particularly useful for:\n          * Removing salt-and-pepper noise\n          * Preserving edges while smoothing images\n          * Pre-processing images for edge detection algorithms\n        - For color images, the median is calculated independently for each channel.\n        - Larger kernel sizes result in stronger blurring effects but may also remove\n          fine details from the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.MedianBlur(blur_limit=(3, 7), p=0.5)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n\n    References:\n        - Median filter: https://en.wikipedia.org/wiki/Median_filter\n        - OpenCV medianBlur: https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga564869aa33e58769b4469101aac458f9\n    \"\"\"\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = 7,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(blur_limit=blur_limit, p=p, always_apply=always_apply)\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fblur.median_blur(img, kernel)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.MotionBlur","title":"class MotionBlur (blur_limit=7, allow_shifted=True, angle_range=(0, 360), direction_range=(-1.0, 1.0), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply motion blur to the input image using a directional kernel.

This transform simulates motion blur effects that occur during image capture, such as camera shake or object movement. It creates a directional blur using a line-shaped kernel with controllable angle, direction, and position.

Parameters:

Name Type Description blur_limit int | tuple[int, int]

Maximum kernel size for blurring. Should be in range [3, inf). - If int: kernel size will be randomly chosen from [3, blur_limit] - If tuple: kernel size will be randomly chosen from [min, max] Larger values create stronger blur effects. Default: (3, 7)

angle_range tuple[float, float]

Range of possible angles in degrees. Controls the rotation of the motion blur line: - 0\u00b0: Horizontal motion blur \u2192 - 45\u00b0: Diagonal motion blur \u2197 - 90\u00b0: Vertical motion blur \u2191 - 135\u00b0: Diagonal motion blur \u2196 Default: (0, 360)

direction_range tuple[float, float]

Range for motion bias. Controls how the blur extends from the center: - -1.0: Blur extends only backward (\u2190) - 0.0: Blur extends equally in both directions (\u2190\u2192) - 1.0: Blur extends only forward (\u2192) For example, with angle=0: - direction=-1.0: \u2190\u2022 - direction=0.0: \u2190\u2022\u2192 - direction=1.0: \u2022\u2192 Default: (-1.0, 1.0)

allow_shifted bool

Allow random kernel position shifts. - If True: Kernel can be randomly offset from center - If False: Kernel will always be centered Default: True

p float

Probability of applying the transform. Default: 0.5

Examples of angle vs direction:

1. Horizontal motion (angle=0\u00b0): direction=0.0: \u2190\u2022\u2192 (symmetric blur); direction=1.0: \u2022\u2192 (forward blur); direction=-1.0: \u2190\u2022 (backward blur)

2. Vertical motion (angle=90\u00b0): direction=0.0: \u2191\u2022\u2193 (symmetric blur); direction=1.0: \u2022\u2191 (upward blur); direction=-1.0: \u2193\u2022 (downward blur)

3. Diagonal motion (angle=45\u00b0): direction=0.0: \u2199\u2022\u2197 (symmetric blur); direction=1.0: \u2022\u2197 (forward diagonal blur); direction=-1.0: \u2199\u2022 (backward diagonal blur)

Note

  • angle controls the orientation of the motion line
  • direction controls the distribution of the blur along that line
  • Together they can simulate various motion effects:
  • Camera shake: Small angle range + direction near 0
  • Object motion: Specific angle + direction=1.0
  • Complex motion: Random angle + random direction

Examples:

Python
>>> import albumentations as A\n>>> # Horizontal camera shake (symmetric)\n>>> transform = A.MotionBlur(\n...     angle_range=(-5, 5),      # Near-horizontal motion\n...     direction_range=(0, 0),    # Symmetric blur\n...     p=1.0\n... )\n>>>\n>>> # Object moving right\n>>> transform = A.MotionBlur(\n...     angle_range=(0, 0),        # Horizontal motion\n...     direction_range=(0.8, 1.0), # Strong forward bias\n...     p=1.0\n... )\n
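
For intuition about how angle and direction shape the kernel, a hand-rolled sketch of a symmetric horizontal motion kernel (angle=0, direction=0); the library builds its kernels with an internal helper, so this is illustration only.

Python
>>> import numpy as np
>>> ksize = 7
>>> kernel = np.zeros((ksize, ksize), dtype=np.float32)
>>> kernel[ksize // 2, :] = 1.0  # a horizontal line of ones -> angle = 0
>>> kernel /= kernel.sum()       # normalize so image brightness is preserved
>>> # direction > 0 would keep only the right-hand part of the line (forward bias),
>>> # direction < 0 only the left-hand part (backward bias)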

References

  • Motion blur fundamentals: https://en.wikipedia.org/wiki/Motion_blur

  • Directional blur kernels: https://www.sciencedirect.com/topics/computer-science/directional-blur

  • OpenCV filter2D (used for convolution): https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga27c049795ce870216ddfb366086b5a04

  • Research on motion blur simulation: \"Understanding and Evaluating Blind Deconvolution Algorithms\" (CVPR 2009) https://doi.org/10.1109/CVPR.2009.5206815

  • Motion blur in photography: \"The Manual of Photography\", Chapter 7: Motion in Photography ISBN: 978-0240520377

  • Kornia's implementation (similar approach): https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomMotionBlur

See Also

  • GaussianBlur: For uniform blur effects
  • MedianBlur: For noise reduction while preserving edges
  • RandomRain: Another motion-based effect
  • Perspective: For geometric motion-like distortions


Source code in albumentations/augmentations/blur/transforms.py Python
class MotionBlur(Blur):\n    \"\"\"Apply motion blur to the input image using a directional kernel.\n\n    This transform simulates motion blur effects that occur during image capture,\n    such as camera shake or object movement. It creates a directional blur using\n    a line-shaped kernel with controllable angle, direction, and position.\n\n    Args:\n        blur_limit (int | tuple[int, int]): Maximum kernel size for blurring.\n            Should be in range [3, inf).\n            - If int: kernel size will be randomly chosen from [3, blur_limit]\n            - If tuple: kernel size will be randomly chosen from [min, max]\n            Larger values create stronger blur effects.\n            Default: (3, 7)\n\n        angle_range (tuple[float, float]): Range of possible angles in degrees.\n            Controls the rotation of the motion blur line:\n            - 0\u00b0: Horizontal motion blur \u2192\n            - 45\u00b0: Diagonal motion blur \u2197\n            - 90\u00b0: Vertical motion blur \u2191\n            - 135\u00b0: Diagonal motion blur \u2196\n            Default: (0, 360)\n\n        direction_range (tuple[float, float]): Range for motion bias.\n            Controls how the blur extends from the center:\n            - -1.0: Blur extends only backward (\u2190)\n            -  0.0: Blur extends equally in both directions (\u2190\u2192)\n            -  1.0: Blur extends only forward (\u2192)\n            For example, with angle=0:\n            - direction=-1.0: \u2190\u2022\n            - direction=0.0:  \u2190\u2022\u2192\n            - direction=1.0:   \u2022\u2192\n            Default: (-1.0, 1.0)\n\n        allow_shifted (bool): Allow random kernel position shifts.\n            - If True: Kernel can be randomly offset from center\n            - If False: Kernel will always be centered\n            Default: True\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Examples of angle vs direction:\n        1. Horizontal motion (angle=0\u00b0):\n           - direction=0.0:   \u2190\u2022\u2192   (symmetric blur)\n           - direction=1.0:    \u2022\u2192   (forward blur)\n           - direction=-1.0:  \u2190\u2022    (backward blur)\n\n        2. Vertical motion (angle=90\u00b0):\n           - direction=0.0:   \u2191\u2022\u2193   (symmetric blur)\n           - direction=1.0:    \u2022\u2191   (upward blur)\n           - direction=-1.0:  \u2193\u2022    (downward blur)\n\n        3. Diagonal motion (angle=45\u00b0):\n           - direction=0.0:   \u2199\u2022\u2197   (symmetric blur)\n           - direction=1.0:    \u2022\u2197   (forward diagonal blur)\n           - direction=-1.0:  \u2199\u2022    (backward diagonal blur)\n\n    Note:\n        - angle controls the orientation of the motion line\n        - direction controls the distribution of the blur along that line\n        - Together they can simulate various motion effects:\n          * Camera shake: Small angle range + direction near 0\n          * Object motion: Specific angle + direction=1.0\n          * Complex motion: Random angle + random direction\n\n    Example:\n        >>> import albumentations as A\n        >>> # Horizontal camera shake (symmetric)\n        >>> transform = A.MotionBlur(\n        ...     angle_range=(-5, 5),      # Near-horizontal motion\n        ...     direction_range=(0, 0),    # Symmetric blur\n        ...     p=1.0\n        ... )\n        >>>\n        >>> # Object moving right\n        >>> transform = A.MotionBlur(\n        ...     
angle_range=(0, 0),        # Horizontal motion\n        ...     direction_range=(0.8, 1.0), # Strong forward bias\n        ...     p=1.0\n        ... )\n\n    References:\n        - Motion blur fundamentals:\n          https://en.wikipedia.org/wiki/Motion_blur\n\n        - Directional blur kernels:\n          https://www.sciencedirect.com/topics/computer-science/directional-blur\n\n        - OpenCV filter2D (used for convolution):\n          https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga27c049795ce870216ddfb366086b5a04\n\n        - Research on motion blur simulation:\n          \"Understanding and Evaluating Blind Deconvolution Algorithms\" (CVPR 2009)\n          https://doi.org/10.1109/CVPR.2009.5206815\n\n        - Motion blur in photography:\n          \"The Manual of Photography\", Chapter 7: Motion in Photography\n          ISBN: 978-0240520377\n\n        - Kornia's implementation (similar approach):\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomMotionBlur\n\n    See Also:\n        - GaussianBlur: For uniform blur effects\n        - MedianBlur: For noise reduction while preserving edges\n        - RandomRain: Another motion-based effect\n        - Perspective: For geometric motion-like distortions\n\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        allow_shifted: bool\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 360)),\n        ]\n        direction_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(min_val=-1.0, max_val=1.0)),\n        ]\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = 7,\n        allow_shifted: bool = True,\n        angle_range: tuple[float, float] = (0, 360),\n        direction_range: tuple[float, float] = (-1.0, 1.0),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(blur_limit=blur_limit, p=p)\n        self.allow_shifted = allow_shifted\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.angle_range = angle_range\n        self.direction_range = direction_range\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"allow_shifted\",\n            \"angle_range\",\n            \"direction_range\",\n        )\n\n    def apply(self, img: np.ndarray, kernel: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.convolve(img, kernel=kernel)\n\n    def get_params(self) -> dict[str, Any]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n\n        angle = self.py_random.uniform(*self.angle_range)\n        direction = self.py_random.uniform(*self.direction_range)\n\n        # Create motion blur kernel\n        kernel = fblur.create_motion_kernel(\n            ksize,\n            angle,\n            direction,\n            allow_shifted=self.allow_shifted,\n            random_state=self.py_random,\n        )\n\n        return {\"kernel\": kernel.astype(np.float32) / np.sum(kernel)}\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.ZoomBlur","title":"class ZoomBlur (max_factor=(1, 1.31), step_factor=(0.01, 0.03), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply zoom blur transform.

Parameters:

Name Type Description max_factor (float, float) or float

Range for the max factor for blurring. If max_factor is a single float, the range will be (1, limit). Default: (1, 1.31). All max_factor values should be larger than 1.

step_factor (float, float) or float

If a single float is provided, it will be used as the step parameter for np.arange. If a tuple of floats is provided, step_factor will be sampled from [step_factor[0], step_factor[1]). Default: (0.01, 0.03). All step_factor values should be positive.

p float

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Reference

https://arxiv.org/abs/1903.12261
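
A minimal usage sketch in the same style as the other transforms on this page, using the documented default ranges:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ZoomBlur(max_factor=(1, 1.31), step_factor=(0.01, 0.03), p=1.0)
>>> result = transform(image=image)
>>> zoom_blurred_image = result["image"]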


Source code in albumentations/augmentations/blur/transforms.py Python
class ZoomBlur(ImageOnlyTransform):\n    \"\"\"Apply zoom blur transform.\n\n    Args:\n        max_factor ((float, float) or float): range for max factor for blurring.\n            If max_factor is a single float, the range will be (1, limit). Default: (1, 1.31).\n            All max_factor values should be larger than 1.\n        step_factor ((float, float) or float): If single float will be used as step parameter for np.arange.\n            If tuple of float step_factor will be in range `[step_factor[0], step_factor[1])`. Default: (0.01, 0.03).\n            All step_factor values should be positive.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        unit8, float32\n\n    Reference:\n        https://arxiv.org/abs/1903.12261\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        max_factor: OnePlusFloatRangeType\n        step_factor: NonNegativeFloatRangeType\n\n    def __init__(\n        self,\n        max_factor: ScaleFloatType = (1, 1.31),\n        step_factor: ScaleFloatType = (0.01, 0.03),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_factor = cast(tuple[float, float], max_factor)\n        self.step_factor = cast(tuple[float, float], step_factor)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        zoom_factors: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.zoom_blur(img, zoom_factors)\n\n    def get_params(self) -> dict[str, Any]:\n        step_factor = self.py_random.uniform(*self.step_factor)\n        max_factor = max(1 + step_factor, self.py_random.uniform(*self.max_factor))\n        return {\"zoom_factors\": np.arange(1.0, max_factor, step_factor)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"max_factor\", \"step_factor\")\n
"},{"location":"api_reference/augmentations/crops/","title":"Index","text":"
  • Crop functional transforms (albumentations.augmentations.crops.functional)
  • Crop transforms (albumentations.augmentations.crops.transforms)
"},{"location":"api_reference/augmentations/crops/functional/","title":"Crop functional transforms (augmentations.crops.functional)","text":""},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_and_pad_keypoints","title":"def crop_and_pad_keypoints (keypoints, crop_params=None, pad_params=None, image_shape=(0, 0), result_shape=(0, 0), keep_size=False) [view source on GitHub]","text":"

Crop and pad multiple keypoints simultaneously.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).

crop_params Sequence[int]

Crop parameters [crop_x1, crop_y1, ...].

pad_params Sequence[int]

Pad parameters [top, bottom, left, right].

image_shape Tuple[int, int]

Original image shape (rows, cols).

result_shape Tuple[int, int]

Result image shape (rows, cols).

keep_size bool

Whether to keep the original size.

Returns:

Type Description np.ndarray

Array of transformed keypoints with the same shape as input.

Source code in albumentations/augmentations/crops/functional.py Python
@handle_empty_array(\"keypoints\")\ndef crop_and_pad_keypoints(\n    keypoints: np.ndarray,\n    crop_params: tuple[int, int, int, int] | None = None,\n    pad_params: tuple[int, int, int, int] | None = None,\n    image_shape: tuple[int, int] = (0, 0),\n    result_shape: tuple[int, int] = (0, 0),\n    keep_size: bool = False,\n) -> np.ndarray:\n    \"\"\"Crop and pad multiple keypoints simultaneously.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).\n        crop_params (Sequence[int], optional): Crop parameters [crop_x1, crop_y1, ...].\n        pad_params (Sequence[int], optional): Pad parameters [top, bottom, left, right].\n        image_shape (Tuple[int, int]): Original image shape (rows, cols).\n        result_shape (Tuple[int, int]): Result image shape (rows, cols).\n        keep_size (bool): Whether to keep the original size.\n\n    Returns:\n        np.ndarray: Array of transformed keypoints with the same shape as input.\n    \"\"\"\n    transformed_keypoints = keypoints.copy()\n\n    if crop_params is not None:\n        crop_x1, crop_y1 = crop_params[:2]\n        transformed_keypoints[:, 0] -= crop_x1\n        transformed_keypoints[:, 1] -= crop_y1\n\n    if pad_params is not None:\n        top, _, left, _ = pad_params\n        transformed_keypoints[:, 0] += left\n        transformed_keypoints[:, 1] += top\n\n    rows, cols = image_shape[:2]\n    result_rows, result_cols = result_shape[:2]\n\n    if keep_size and (result_cols != cols or result_rows != rows):\n        scale_x = cols / result_cols\n        scale_y = rows / result_rows\n        return fgeometric.keypoints_scale(transformed_keypoints, scale_x, scale_y)\n\n    return transformed_keypoints\n
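
A small worked example based on the signature above (the import path follows the module name of this page); the numbers are chosen only to make the arithmetic easy to follow.

Python
>>> import numpy as np
>>> from albumentations.augmentations.crops.functional import crop_and_pad_keypoints
>>> keypoints = np.array([[50.0, 60.0, 0.0, 1.0]])  # (x, y, angle, scale)
>>> crop_and_pad_keypoints(
...     keypoints,
...     crop_params=(10, 10, 90, 90),  # crop removes 10 px from the left and top
...     pad_params=(5, 0, 3, 0),       # then pad 5 px on top and 3 px on the left
...     image_shape=(100, 100),
...     result_shape=(85, 83),
...     keep_size=False,
... )
>>> # x: 50 - 10 + 3 = 43, y: 60 - 10 + 5 = 55 -> [[43., 55., 0., 1.]]
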
"},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_bboxes_by_coords","title":"def crop_bboxes_by_coords (bboxes, crop_coords, image_shape, normalized_input=True) [view source on GitHub]","text":"

Crop bounding boxes based on given crop coordinates.

This function adjusts bounding boxes to fit within a cropped image.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where each row is [x_min, y_min, x_max, y_max, ...]. The bounding box coordinates can be either normalized (in [0, 1]) if normalized_input=True or absolute pixel values if normalized_input=False.

crop_coords tuple[int, int, int, int]

Crop coordinates (x_min, y_min, x_max, y_max) in absolute pixel values.

image_shape tuple[int, int]

Original image shape (height, width).

normalized_input bool

Whether input boxes are in normalized coordinates. If True, assumes input is normalized [0,1] and returns normalized coordinates. If False, assumes input is in absolute pixels and returns absolute coordinates. Default: True for backward compatibility.

Returns:

Type Description np.ndarray

Array of cropped bounding boxes. Coordinates will be in the same format as input (normalized if normalized_input=True, absolute pixels if normalized_input=False).

Note

Bounding boxes that fall completely outside the crop area will be removed. Bounding boxes that partially overlap with the crop area will be adjusted to fit within it.

Source code in albumentations/augmentations/crops/functional.py Python
def crop_bboxes_by_coords(\n    bboxes: np.ndarray,\n    crop_coords: tuple[int, int, int, int],\n    image_shape: tuple[int, int],\n    normalized_input: bool = True,\n) -> np.ndarray:\n    \"\"\"Crop bounding boxes based on given crop coordinates.\n\n    This function adjusts bounding boxes to fit within a cropped image.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (N, 4+) where each row is\n                             [x_min, y_min, x_max, y_max, ...]. The bounding box coordinates\n                             can be either normalized (in [0, 1]) if normalized_input=True or\n                             absolute pixel values if normalized_input=False.\n        crop_coords (tuple[int, int, int, int]): Crop coordinates (x_min, y_min, x_max, y_max)\n                                                 in absolute pixel values.\n        image_shape (tuple[int, int]): Original image shape (height, width).\n        normalized_input (bool): Whether input boxes are in normalized coordinates.\n                               If True, assumes input is normalized [0,1] and returns normalized coordinates.\n                               If False, assumes input is in absolute pixels and returns absolute coordinates.\n                               Default: True for backward compatibility.\n\n    Returns:\n        np.ndarray: Array of cropped bounding boxes. Coordinates will be in the same format as input\n                   (normalized if normalized_input=True, absolute pixels if normalized_input=False).\n\n    Note:\n        Bounding boxes that fall completely outside the crop area will be removed.\n        Bounding boxes that partially overlap with the crop area will be adjusted to fit within it.\n    \"\"\"\n    if not bboxes.size:\n        return bboxes\n\n    # Convert to absolute coordinates if needed\n    if normalized_input:\n        cropped_bboxes = denormalize_bboxes(bboxes.copy().astype(np.float32), image_shape)\n    else:\n        cropped_bboxes = bboxes.copy().astype(np.float32)\n\n    x_min, y_min = crop_coords[:2]\n\n    # Subtract crop coordinates\n    cropped_bboxes[:, [0, 2]] -= x_min\n    cropped_bboxes[:, [1, 3]] -= y_min\n\n    # Calculate crop shape\n    crop_height = crop_coords[3] - crop_coords[1]\n    crop_width = crop_coords[2] - crop_coords[0]\n    crop_shape = (crop_height, crop_width)\n\n    # Return in same format as input\n    return normalize_bboxes(cropped_bboxes, crop_shape) if normalized_input else cropped_bboxes\n
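
A small worked example (import path taken from the module name of this page); the box is given in absolute pixels, so normalized_input=False and the result stays in absolute pixels of the cropped image.

Python
>>> import numpy as np
>>> from albumentations.augmentations.crops.functional import crop_bboxes_by_coords
>>> bboxes = np.array([[20.0, 30.0, 60.0, 80.0]])  # [x_min, y_min, x_max, y_max]
>>> crop_bboxes_by_coords(
...     bboxes,
...     crop_coords=(10, 10, 90, 90),
...     image_shape=(100, 100),
...     normalized_input=False,
... )
>>> # the crop origin (10, 10) is subtracted -> [[10., 20., 50., 70.]]
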
"},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_keypoints_by_coords","title":"def crop_keypoints_by_coords (keypoints, crop_coords) [view source on GitHub]","text":"

Crop keypoints using the provided coordinates of bottom-left and top-right corners in pixels.

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).

crop_coords tuple

Crop box coords (x1, y1, x2, y2).

Returns:

Type Description np.ndarray

An array of cropped keypoints with the same shape as the input.

Source code in albumentations/augmentations/crops/functional.py Python
@handle_empty_array(\"keypoints\")\ndef crop_keypoints_by_coords(\n    keypoints: np.ndarray,\n    crop_coords: tuple[int, int, int, int],\n) -> np.ndarray:\n    \"\"\"Crop keypoints using the provided coordinates of bottom-left and top-right corners in pixels.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).\n        crop_coords (tuple): Crop box coords (x1, y1, x2, y2).\n\n    Returns:\n        np.ndarray: An array of cropped keypoints with the same shape as the input.\n    \"\"\"\n    x1, y1 = crop_coords[:2]\n\n    cropped_keypoints = keypoints.copy()\n    cropped_keypoints[:, 0] -= x1  # Adjust x coordinates\n    cropped_keypoints[:, 1] -= y1  # Adjust y coordinates\n\n    return cropped_keypoints\n
"},{"location":"api_reference/augmentations/crops/transforms/","title":"Crop transforms (augmentations.crops.transforms)","text":""},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.AtLeastOneBBoxRandomCrop","title":"class AtLeastOneBBoxRandomCrop (height, width, erosion_factor=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crops an image to a fixed resolution, while ensuring that at least one bounding box is always in the crop. The maximal erosion factor defines by how much the target bounding box can be thinned out. For example, erosion_factor = 0.2 means that the bounding box dimensions can be thinned by up to 20%.

Parameters:

Name Type Description height int

Height of the crop.

width int

Width of the crop.

erosion_factor float

Maximal erosion factor of the height and width of the target bounding box. Default: 0.0.

p float

The probability of applying the transform. Default: 1.0.

always_apply bool | None

Whether to apply the transform systematically.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
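
A minimal usage sketch following the Compose/BboxParams pattern used elsewhere in this reference; the sizes and labels are illustrative only.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((300, 300, 3), dtype=np.uint8)
>>> bboxes = [(50, 60, 120, 140)]
>>> transform = A.Compose(
...     [A.AtLeastOneBBoxRandomCrop(height=200, width=200, erosion_factor=0.2, p=1.0)],
...     bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
... )
>>> out = transform(image=image, bboxes=bboxes, labels=["cat"])
>>> out["image"].shape  # (200, 200, 3); the crop is guaranteed to overlap the reference bbox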


Source code in albumentations/augmentations/crops/transforms.py Python
class AtLeastOneBBoxRandomCrop(BaseCrop):\n    \"\"\"Crops an image to a fixed resolution, while ensuring that at least one bounding box is always in the crop.\n    The maximal erosion factor define by how much the target bounding box can be thinned out.\n    For example, erosion_factor = 0.2 means that the bounding box dimensions can be thinned by up to 20%.\n\n    Args:\n        height: Height of the crop.\n        width: Width of the crop.\n        erosion_factor: Maximal erosion factor of the height and width of the target bounding box. Default: 0.0.\n        p: The probability of applying the transform. Default: 1.0.\n        always_apply: Whether to apply the transform systematically.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseCrop.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        erosion_factor: Annotated[float, Field(ge=0.0, le=1.0)]\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        erosion_factor: float = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.erosion_factor = erosion_factor\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_height, image_width = params[\"shape\"][:2]\n        bboxes = data.get(\"bboxes\", [])\n\n        if self.height > image_height or self.width > image_width:\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_height, image_width}\",\n            )\n\n        if len(bboxes) > 0:\n            # Pick a bbox amongst all possible as our reference bbox.\n            bboxes = denormalize_bboxes(bboxes, image_shape=(image_height, image_width))\n            bbox = self.py_random.choice(bboxes)\n\n            x1, y1, x2, y2 = bbox[:4]\n\n            w = x2 - x1\n            h = y2 - y1\n\n            # Compute the eroded width and height\n            ew = w * (1.0 - self.erosion_factor)\n            eh = h * (1.0 - self.erosion_factor)\n\n            # Compute the lower and upper bounds for the x-axis and y-axis.\n            ax1 = np.clip(\n                a=x1 + ew - self.width,\n                a_min=0.0,\n                a_max=image_width - self.width,\n            )\n            bx1 = np.clip(\n                a=x2 - ew,\n                a_min=0.0,\n                a_max=image_width - self.width,\n            )\n\n            ay1 = np.clip(\n                a=y1 + eh - self.height,\n                a_min=0.0,\n                a_max=image_height - self.height,\n            )\n            by1 = np.clip(\n                a=y2 - eh,\n                a_min=0.0,\n                a_max=image_height - self.height,\n            )\n        else:\n            # If there are no bboxes, just crop anywhere in the image.\n            ax1 = 0.0\n            bx1 = image_width - self.width\n\n            ay1 = 0.0\n            by1 = image_height - self.height\n\n        # Randomly draw the upper-left corner.\n        x1 = int(self.py_random.uniform(a=ax1, b=bx1))\n        y1 = int(self.py_random.uniform(a=ay1, 
b=by1))\n\n        x2 = x1 + self.width\n        y2 = y1 + self.height\n\n        return {\"crop_coords\": (x1, y1, x2, y2)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"erosion_factor\"\n
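A minimal usage sketch in the style of the other examples on this page; the crop size, erosion_factor, and labels are illustrative, and the transform is assumed to be exported at the top level as A.AtLeastOneBBoxRandomCrop:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((300, 300, 3), dtype=np.uint8)
>>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]
>>> transform = A.Compose([
...     A.AtLeastOneBBoxRandomCrop(height=128, width=128, erosion_factor=0.2, p=1.0),
... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))
>>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])
>>> assert transformed['image'].shape[:2] == (128, 128)  # crop contains at least one box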
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BBoxSafeRandomCrop","title":"class BBoxSafeRandomCrop (erosion_rate=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input without loss of bounding boxes.

This transform performs a random crop of the input image while ensuring that all bounding boxes remain within the cropped area. It's particularly useful for object detection tasks where preserving all objects in the image is crucial.

Parameters:

erosion_rate float

A value between 0.0 and 1.0 that determines the minimum allowable size of the crop as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be at least 80% of the original image height. Default: 0.0 (no minimum size).

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

This transform ensures that all bounding boxes in the original image are fully contained within the cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out), it will default to cropping the entire image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.ones((300, 300, 3), dtype=np.uint8)\n>>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n>>> transform = A.Compose([\n...     A.BBoxSafeRandomCrop(erosion_rate=0.2, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n>>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n>>> transformed_image = transformed['image']\n>>> transformed_bboxes = transformed['bboxes']\n


Source code in albumentations/augmentations/crops/transforms.py Python
class BBoxSafeRandomCrop(BaseCrop):\n    \"\"\"Crop a random part of the input without loss of bounding boxes.\n\n    This transform performs a random crop of the input image while ensuring that all bounding boxes remain within\n    the cropped area. It's particularly useful for object detection tasks where preserving all objects in the image\n    is crucial.\n\n    Args:\n        erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop\n            as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be\n            at least 80% of the original image height. Default: 0.0 (no minimum size).\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transform ensures that all bounding boxes in the original image are fully contained within the\n        cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out),\n        it will default to cropping the entire image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.ones((300, 300, 3), dtype=np.uint8)\n        >>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n        >>> transform = A.Compose([\n        ...     A.BBoxSafeRandomCrop(erosion_rate=0.2, p=1.0),\n        ... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n        >>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n        >>> transformed_image = transformed['image']\n        >>> transformed_bboxes = transformed['bboxes']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        erosion_rate: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n\n    def __init__(self, erosion_rate: float = 0.0, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p)\n        self.erosion_rate = erosion_rate\n\n    def _get_coords_no_bbox(self, image_shape: tuple[int, int]) -> tuple[int, int, int, int]:\n        image_height, image_width = image_shape\n\n        erosive_h = int(image_height * (1.0 - self.erosion_rate))\n        crop_height = image_height if erosive_h >= image_height else self.py_random.randint(erosive_h, image_height)\n\n        crop_width = int(crop_height * image_width / image_height)\n\n        h_start = self.py_random.random()\n        w_start = self.py_random.random()\n\n        crop_shape = (crop_height, crop_width)\n\n        return fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n\n        if len(data[\"bboxes\"]) == 0:  # less likely, this class is for use with bboxes.\n            crop_coords = self._get_coords_no_bbox(image_shape)\n            return {\"crop_coords\": crop_coords}\n\n        bbox_union = union_of_bboxes(bboxes=data[\"bboxes\"], erosion_rate=self.erosion_rate)\n\n        if bbox_union is None:\n            crop_coords = self._get_coords_no_bbox(image_shape)\n            return {\"crop_coords\": crop_coords}\n\n        x_min, y_min, x_max, y_max = bbox_union\n\n        x_min = np.clip(x_min, 0, 1)\n        y_min = np.clip(y_min, 0, 1)\n    
    x_max = np.clip(x_max, x_min, 1)\n        y_max = np.clip(y_max, y_min, 1)\n\n        image_height, image_width = image_shape\n\n        crop_x_min = int(x_min * self.py_random.random() * image_width)\n        crop_y_min = int(y_min * self.py_random.random() * image_height)\n\n        bbox_xmax = x_max + (1 - x_max) * self.py_random.random()\n        bbox_ymax = y_max + (1 - y_max) * self.py_random.random()\n        crop_x_max = int(bbox_xmax * image_width)\n        crop_y_max = int(bbox_ymax * image_height)\n\n        return {\"crop_coords\": (crop_x_min, crop_y_min, crop_x_max, crop_y_max)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"erosion_rate\",)\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseCrop","title":"class BaseCrop [view source on GitHub]","text":"

Base class for transforms that only perform cropping.


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseCrop(DualTransform):\n    \"\"\"Base class for transforms that only perform cropping.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop(img, x_min=crop_coords[0], y_min=crop_coords[1], x_max=crop_coords[2], y_max=crop_coords[3])\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_bboxes_by_coords(bboxes, crop_coords, params[\"shape\"][:2])\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_keypoints_by_coords(keypoints, crop_coords)\n\n    @staticmethod\n    def _clip_bbox(bbox: tuple[int, int, int, int], image_shape: tuple[int, int]) -> tuple[int, int, int, int]:\n        height, width = image_shape[:2]\n        x_min, y_min, x_max, y_max = bbox\n        x_min = np.clip(x_min, 0, width)\n        y_min = np.clip(y_min, 0, height)\n\n        x_max = np.clip(x_max, x_min, width)\n        y_max = np.clip(y_max, y_min, height)\n        return x_min, y_min, x_max, y_max\n
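As the source above shows, BaseCrop already implements apply, apply_to_bboxes, and apply_to_keypoints, so a subclass only needs to produce crop_coords. The following is a hypothetical minimal subclass, shown purely as a sketch (TopLeftCrop is not part of the library; the import follows the module path shown above):

Python
from typing import Any

from albumentations.augmentations.crops.transforms import BaseCrop


class TopLeftCrop(BaseCrop):
    """Illustrative subclass: crop a square window anchored at the top-left corner."""

    def __init__(self, size: int = 100, p: float = 1.0):
        super().__init__(p=p)
        self.size = size

    def get_params_dependent_on_data(
        self,
        params: dict[str, Any],
        data: dict[str, Any],
    ) -> dict[str, tuple[int, int, int, int]]:
        height, width = params["shape"][:2]
        side = min(self.size, height, width)
        # The inherited apply/apply_to_bboxes/apply_to_keypoints consume these coordinates.
        return {"crop_coords": (0, 0, side, side)}

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return ("size",)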
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseCropAndPad","title":"class BaseCropAndPad (pad_if_needed, border_mode, fill, fill_mask, pad_position, p, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that need both cropping and padding.


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseCropAndPad(BaseCrop):\n    \"\"\"Base class for transforms that need both cropping and padding.\"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        pad_if_needed: bool\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_position: PositionType\n\n    def __init__(\n        self,\n        pad_if_needed: bool,\n        border_mode: int,\n        fill: ColorType,\n        fill_mask: ColorType,\n        pad_position: PositionType,\n        p: float,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p)\n        self.pad_if_needed = pad_if_needed\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.pad_position = pad_position\n\n    def _get_pad_params(self, image_shape: tuple[int, int], target_shape: tuple[int, int]) -> dict[str, Any] | None:\n        \"\"\"Calculate padding parameters if needed.\"\"\"\n        if not self.pad_if_needed:\n            return None\n\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=image_shape,\n            min_height=target_shape[0],\n            min_width=target_shape[1],\n            pad_height_divisor=None,\n            pad_width_divisor=None,\n        )\n\n        if h_pad_top == h_pad_bottom == w_pad_left == w_pad_right == 0:\n            return None\n\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.pad_position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        if pad_params is not None:\n            img = fgeometric.pad_with_params(\n                img,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                border_mode=self.border_mode,\n                value=self.fill,\n            )\n        return BaseCrop.apply(self, img, crop_coords, **params)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_coords: Any,\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        if pad_params is not None:\n            mask = fgeometric.pad_with_params(\n                mask,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                border_mode=self.border_mode,\n                value=self.fill_mask,\n            )\n        # Note' that super().apply would apply the padding twice as it is looped to this.apply\n        return BaseCrop.apply(self, mask, crop_coords=crop_coords, **params)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = 
params.get(\"pad_params\")\n        image_shape = params[\"shape\"][:2]\n\n        if pad_params is not None:\n            # First denormalize bboxes to absolute coordinates\n            bboxes_np = denormalize_bboxes(bboxes, image_shape)\n\n            # Apply padding to bboxes (already works with absolute coordinates)\n            bboxes_np = fgeometric.pad_bboxes(\n                bboxes_np,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                self.border_mode,\n                image_shape=image_shape,\n            )\n\n            # Update shape to padded dimensions\n            padded_height = image_shape[0] + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            padded_width = image_shape[1] + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n            padded_shape = (padded_height, padded_width)\n\n            bboxes_np = normalize_bboxes(bboxes_np, padded_shape)\n\n            params[\"shape\"] = padded_shape\n\n            return BaseCrop.apply_to_bboxes(self, bboxes_np, crop_coords, **params)\n\n        # If no padding, use original function behavior\n        return BaseCrop.apply_to_bboxes(self, bboxes, crop_coords, **params)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        image_shape = params[\"shape\"][:2]\n\n        if pad_params is not None:\n            # Calculate padded dimensions\n            padded_height = image_shape[0] + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            padded_width = image_shape[1] + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n            # First apply padding to keypoints using original image shape\n            keypoints = fgeometric.pad_keypoints(\n                keypoints,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                self.border_mode,\n                image_shape=image_shape,\n            )\n\n            # Update image shape for subsequent crop operation\n            params = {**params, \"shape\": (padded_height, padded_width)}\n\n        return BaseCrop.apply_to_keypoints(self, keypoints, crop_coords, **params)\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseRandomSizedCropInitSchema","title":"class BaseRandomSizedCropInitSchema ","text":"


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseRandomSizedCropInitSchema(BaseTransformInitSchema):\n    size: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CenterCrop","title":"class CenterCrop (height, width, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0.0, fill_mask=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop the central part of the input.

This transform crops the center of the input image, mask, bounding boxes, and keypoints to the specified dimensions. It's useful when you want to focus on the central region of the input, discarding peripheral information.

Parameters:

height int

The height of the crop. Must be greater than 0.

width int

The width of the crop. Must be greater than 0.

pad_if_needed bool

Whether to pad if crop size exceeds image size. Default: False.

border_mode OpenCV flag

OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill ColorType

Padding value for images if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fill_mask ColorType

Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0.

pad_position Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']

Position of padding. Default: 'center'.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If pad_if_needed is False and crop size exceeds image dimensions, it will raise a CropSizeError.
  • If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded.
  • For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.
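A minimal usage sketch (sizes are illustrative); because the requested crop is taller than the input, pad_if_needed=True pads the image before the central region is taken:

Python
>>> import numpy as np
>>> import cv2
>>> import albumentations as A
>>> image = np.zeros((100, 150, 3), dtype=np.uint8)
>>> transform = A.CenterCrop(height=128, width=128, pad_if_needed=True,
...                          border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)
>>> cropped = transform(image=image)['image']
>>> assert cropped.shape[:2] == (128, 128)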


Source code in albumentations/augmentations/crops/transforms.py Python
class CenterCrop(BaseCropAndPad):\n    \"\"\"Crop the central part of the input.\n\n    This transform crops the center of the input image, mask, bounding boxes, and keypoints to the specified dimensions.\n    It's useful when you want to focus on the central region of the input, discarding peripheral information.\n\n    Args:\n        height (int): The height of the crop. Must be greater than 0.\n        width (int): The width of the crop. Must be greater than 0.\n        pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value for images if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If pad_if_needed is False and crop size exceeds image dimensions, it will raise a CropSizeError.\n        - If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded.\n        - For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_dimensions(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n            return self\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        pad_if_needed: bool = False,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0.0,\n        fill_mask: ColorType = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.height = height\n        self.width = width\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n          
  \"height\",\n            \"width\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        if not self.pad_if_needed and (self.height > image_height or self.width > image_width):\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_shape[:2]}\",\n            )\n\n        # Get padding params first if needed\n        pad_params = self._get_pad_params(image_shape, (self.height, self.width))\n\n        # If padding is needed, adjust the image shape for crop calculation\n        if pad_params is not None:\n            pad_top = pad_params[\"pad_top\"]\n            pad_bottom = pad_params[\"pad_bottom\"]\n            pad_left = pad_params[\"pad_left\"]\n            pad_right = pad_params[\"pad_right\"]\n\n            padded_height = image_height + pad_top + pad_bottom\n            padded_width = image_width + pad_left + pad_right\n            padded_shape = (padded_height, padded_width)\n\n            # Get crop coordinates based on padded dimensions\n            crop_coords = fcrops.get_center_crop_coords(padded_shape, (self.height, self.width))\n        else:\n            # Get crop coordinates based on original dimensions\n            crop_coords = fcrops.get_center_crop_coords(image_shape, (self.height, self.width))\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.Crop","title":"class Crop (x_min=0, y_min=0, x_max=1024, y_max=1024, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a specific region from the input image.

This transform crops a rectangular region from the input image, mask, bounding boxes, and keypoints based on specified coordinates. It's useful when you want to extract a specific area of interest from your inputs.

Parameters:

x_min int

Minimum x-coordinate of the crop region (left edge). Must be >= 0. Default: 0.

y_min int

Minimum y-coordinate of the crop region (top edge). Must be >= 0. Default: 0.

x_max int

Maximum x-coordinate of the crop region (right edge). Must be > x_min. Default: 1024.

y_max int

Maximum y-coordinate of the crop region (bottom edge). Must be > y_min. Default: 1024.

pad_if_needed bool

Whether to pad if crop coordinates exceed image dimensions. Default: False.

border_mode OpenCV flag

OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fill_mask ColorType

Padding value for masks. Default: 0.

pad_position Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']

Position of padding. Default: 'center'.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The crop coordinates are applied as follows: x_min <= x < x_max and y_min <= y < y_max.
  • If pad_if_needed is False and crop region extends beyond image boundaries, it will be clipped.
  • If pad_if_needed is True, image will be padded to accommodate the full crop region.
  • For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.
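A minimal usage sketch (coordinates are illustrative); the fixed region with x in [20, 120) and y in [40, 140) is extracted:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((200, 200, 3), dtype=np.uint8)
>>> transform = A.Crop(x_min=20, y_min=40, x_max=120, y_max=140, p=1.0)
>>> cropped = transform(image=image)['image']
>>> assert cropped.shape[:2] == (100, 100)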


Source code in albumentations/augmentations/crops/transforms.py Python
class Crop(BaseCropAndPad):\n    \"\"\"Crop a specific region from the input image.\n\n    This transform crops a rectangular region from the input image, mask, bounding boxes, and keypoints\n    based on specified coordinates. It's useful when you want to extract a specific area of interest\n    from your inputs.\n\n    Args:\n        x_min (int): Minimum x-coordinate of the crop region (left edge). Must be >= 0. Default: 0.\n        y_min (int): Minimum y-coordinate of the crop region (top edge). Must be >= 0. Default: 0.\n        x_max (int): Maximum x-coordinate of the crop region (right edge). Must be > x_min. Default: 1024.\n        y_max (int): Maximum y-coordinate of the crop region (bottom edge). Must be > y_min. Default: 1024.\n        pad_if_needed (bool): Whether to pad if crop coordinates exceed image dimensions. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The crop coordinates are applied as follows: x_min <= x < x_max and y_min <= y < y_max.\n        - If pad_if_needed is False and crop region extends beyond image boundaries, it will be clipped.\n        - If pad_if_needed is True, image will be padded to accommodate the full crop region.\n        - For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        x_min: Annotated[int, Field(ge=0)]\n        y_min: Annotated[int, Field(ge=0)]\n        x_max: Annotated[int, Field(gt=0)]\n        y_max: Annotated[int, Field(gt=0)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_coordinates(self) -> Self:\n            if not self.x_min < self.x_max:\n                msg = \"x_max must be greater than x_min\"\n                raise ValueError(msg)\n            if not self.y_min < self.y_max:\n                msg = \"y_max must be greater than y_min\"\n                raise ValueError(msg)\n\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n\n            return self\n\n    def __init__(\n        self,\n        x_min: int = 0,\n        y_min: int = 0,\n        x_max: int = 1024,\n        y_max: int = 1024,\n        pad_if_needed: bool = False,\n        
pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.x_min = x_min\n        self.y_min = y_min\n        self.x_max = x_max\n        self.y_max = y_max\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        crop_height = self.y_max - self.y_min\n        crop_width = self.x_max - self.x_min\n\n        if not self.pad_if_needed:\n            # If no padding, clip coordinates to image boundaries\n            x_min = np.clip(self.x_min, 0, image_width)\n            y_min = np.clip(self.y_min, 0, image_height)\n            x_max = np.clip(self.x_max, x_min, image_width)\n            y_max = np.clip(self.y_max, y_min, image_height)\n            return {\"crop_coords\": (x_min, y_min, x_max, y_max)}\n\n        # Calculate padding if needed\n        pad_params = self._get_pad_params(\n            image_shape=image_shape,\n            target_shape=(max(crop_height, image_height), max(crop_width, image_width)),\n        )\n\n        if pad_params is not None:\n            # Adjust crop coordinates based on padding\n            x_min = self.x_min + pad_params[\"pad_left\"]\n            y_min = self.y_min + pad_params[\"pad_top\"]\n            x_max = self.x_max + pad_params[\"pad_left\"]\n            y_max = self.y_max + pad_params[\"pad_top\"]\n            crop_coords = (x_min, y_min, x_max, y_max)\n        else:\n            crop_coords = (self.x_min, self.y_min, self.x_max, self.y_max)\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"x_min\",\n            \"y_min\",\n            \"x_max\",\n            \"y_max\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CropAndPad","title":"class CropAndPad (px=None, percent=None, pad_mode=None, pad_cval=None, pad_cval_mask=None, keep_size=True, sample_independently=True, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop and pad images by pixel amounts or fractions of image sizes.

This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of cropping/padding can be specified either in absolute pixels or as a fraction of the image size.

Parameters:

px int, tuple of int, tuple of tuples of int, or None

The number of pixels to crop (negative values) or pad (positive values) on each side of the image. Either this or the parameter percent may be set, not both at the same time.

  • If int: crop/pad all sides by this value.
  • If tuple of 2 ints: crop/pad by (top/bottom, left/right).
  • If tuple of 4 ints: crop/pad by (top, right, bottom, left).
  • Each int can also be a tuple of 2 ints for a range, or a list of ints for discrete choices.

Default: None.

percent float, tuple of float, tuple of tuples of float, or None

The fraction of the image size to crop (negative values) or pad (positive values) on each side. Either this or the parameter px may be set, not both at the same time.

  • If float: crop/pad all sides by this fraction.
  • If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.
  • If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.
  • Each float can also be a tuple of 2 floats for a range, or a list of floats for discrete choices.

Default: None.

border_mode int

OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill ColorType

The constant value to use for padding if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fill_mask ColorType

Same as fill but used for mask padding. Default: 0.

keep_size bool

If True, the output image will be resized to the input image size after cropping/padding. Default: True.

sample_independently bool

If True and ranges are used for px/percent, sample a value for each side independently. If False, sample one value and use it for all sides. Default: True.

interpolation int

OpenCV interpolation flag used for resizing if keep_size is True. Default: cv2.INTER_LINEAR.

mask_interpolation int

OpenCV interpolation flag used for resizing if keep_size is True. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform will never crop images below a height or width of 1.
  • When using pixel values (px), the image will be cropped/padded by exactly that many pixels.
  • When using percentages (percent), the amount of crop/pad will be calculated based on the image size.
  • Bounding boxes that end up fully outside the image after cropping will be removed.
  • Keypoints that end up outside the image after cropping will be removed.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.CropAndPad(px=(-10, 20, 30, -40), border_mode=cv2.BORDER_REFLECT, fill=128, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n
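The example above uses pixel offsets (px); a percent-based sketch is shown below, where a positive fraction pads every side by 10% of the image size and keep_size=True resizes the result back to the input resolution (values are illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.CropAndPad(percent=0.1, keep_size=True, p=1.0),
... ])
>>> padded = transform(image=image)['image']
>>> assert padded.shape[:2] == (100, 100)  # resized back because keep_size=True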


Source code in albumentations/augmentations/crops/transforms.py Python
class CropAndPad(DualTransform):\n    \"\"\"Crop and pad images by pixel amounts or fractions of image sizes.\n\n    This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides\n    (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of\n    cropping/padding can be specified either in absolute pixels or as a fraction of the image size.\n\n    Args:\n        px (int, tuple of int, tuple of tuples of int, or None):\n            The number of pixels to crop (negative values) or pad (positive values) on each side of the image.\n            Either this or the parameter `percent` may be set, not both at the same time.\n            - If int: crop/pad all sides by this value.\n            - If tuple of 2 ints: crop/pad by (top/bottom, left/right).\n            - If tuple of 4 ints: crop/pad by (top, right, bottom, left).\n            - Each int can also be a tuple of 2 ints for a range, or a list of ints for discrete choices.\n            Default: None.\n\n        percent (float, tuple of float, tuple of tuples of float, or None):\n            The fraction of the image size to crop (negative values) or pad (positive values) on each side.\n            Either this or the parameter `px` may be set, not both at the same time.\n            - If float: crop/pad all sides by this fraction.\n            - If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.\n            - If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.\n            - Each float can also be a tuple of 2 floats for a range, or a list of floats for discrete choices.\n            Default: None.\n\n        border_mode (int):\n            OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n\n        fill (ColorType):\n            The constant value to use for padding if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n\n        fill_mask (ColorType):\n            Same as fill but used for mask padding. Default: 0.\n\n        keep_size (bool):\n            If True, the output image will be resized to the input image size after cropping/padding.\n            Default: True.\n\n        sample_independently (bool):\n            If True and ranges are used for px/percent, sample a value for each side independently.\n            If False, sample one value and use it for all sides. Default: True.\n\n        interpolation (int):\n            OpenCV interpolation flag used for resizing if keep_size is True.\n            Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (int):\n            OpenCV interpolation flag used for resizing if keep_size is True.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float):\n            Probability of applying the transform. 
Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform will never crop images below a height or width of 1.\n        - When using pixel values (px), the image will be cropped/padded by exactly that many pixels.\n        - When using percentages (percent), the amount of crop/pad will be calculated based on the image size.\n        - Bounding boxes that end up fully outside the image after cropping will be removed.\n        - Keypoints that end up outside the image after cropping will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.CropAndPad(px=(-10, 20, 30, -40), border_mode=cv2.BORDER_REFLECT, fill=128, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        px: PxType | None\n        percent: PercentType | None\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n        keep_size: bool\n        sample_independently: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def check_px_percent(self) -> Self:\n            if self.px is None and self.percent is None:\n                msg = \"Both px and percent parameters cannot be None simultaneously.\"\n                raise ValueError(msg)\n            if self.px is not None and self.percent is not None:\n                msg = \"Only px or percent may be set!\"\n                raise ValueError(msg)\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n\n            return self\n\n    def __init__(\n        self,\n        px: int | list[int] | None = None,\n        percent: float | list[float] | None = None,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        keep_size: bool = True,\n        sample_independently: bool = True,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.px = px\n        self.percent = percent\n\n        self.border_mode = border_mode\n        self.fill = 
fill\n        self.fill_mask = fill_mask\n\n        self.keep_size = keep_size\n        self.sample_independently = sample_independently\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_params: Sequence[int],\n        pad_params: Sequence[int],\n        fill: ColorType,\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad(\n            img,\n            crop_params,\n            pad_params,\n            fill,\n            params[\"shape\"][:2],\n            self.interpolation,\n            self.border_mode,\n            self.keep_size,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_params: Sequence[int],\n        pad_params: Sequence[int],\n        fill_mask: ColorType,\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad(\n            mask,\n            crop_params,\n            pad_params,\n            fill_mask,\n            params[\"shape\"][:2],\n            self.mask_interpolation,\n            self.border_mode,\n            self.keep_size,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_params: tuple[int, int, int, int],\n        pad_params: tuple[int, int, int, int],\n        result_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad_bboxes(bboxes, crop_params, pad_params, params[\"shape\"][:2], result_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_params: tuple[int, int, int, int],\n        pad_params: tuple[int, int, int, int],\n        result_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad_keypoints(\n            keypoints,\n            crop_params,\n            pad_params,\n            params[\"shape\"][:2],\n            result_shape,\n            self.keep_size,\n        )\n\n    @staticmethod\n    def __prevent_zero(val1: int, val2: int, max_val: int) -> tuple[int, int]:\n        regain = abs(max_val) + 1\n        regain1 = regain // 2\n        regain2 = regain // 2\n        if regain1 + regain2 < regain:\n            regain1 += 1\n\n        if regain1 > val1:\n            diff = regain1 - val1\n            regain1 = val1\n            regain2 += diff\n        elif regain2 > val2:\n            diff = regain2 - val2\n            regain2 = val2\n            regain1 += diff\n\n        return val1 - regain1, val2 - regain2\n\n    @staticmethod\n    def _prevent_zero(crop_params: list[int], height: int, width: int) -> list[int]:\n        top, right, bottom, left = crop_params\n\n        remaining_height = height - (top + bottom)\n        remaining_width = width - (left + right)\n\n        if remaining_height < 1:\n            top, bottom = CropAndPad.__prevent_zero(top, bottom, height)\n        if remaining_width < 1:\n            left, right = CropAndPad.__prevent_zero(left, right, width)\n\n        return [max(top, 0), max(right, 0), max(bottom, 0), max(left, 0)]\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        if self.px is not None:\n            new_params = self._get_px_params()\n        else:\n            percent_params = self._get_percent_params()\n            new_params = [\n                int(percent_params[0] * 
height),\n                int(percent_params[1] * width),\n                int(percent_params[2] * height),\n                int(percent_params[3] * width),\n            ]\n\n        pad_params = [max(i, 0) for i in new_params]\n\n        crop_params = self._prevent_zero([-min(i, 0) for i in new_params], height, width)\n\n        top, right, bottom, left = crop_params\n        crop_params = [left, top, width - right, height - bottom]\n        result_rows = crop_params[3] - crop_params[1]\n        result_cols = crop_params[2] - crop_params[0]\n        if result_cols == width and result_rows == height:\n            crop_params = []\n\n        top, right, bottom, left = pad_params\n        pad_params = [top, bottom, left, right]\n        if any(pad_params):\n            result_rows += top + bottom\n            result_cols += left + right\n        else:\n            pad_params = []\n\n        return {\n            \"crop_params\": crop_params or None,\n            \"pad_params\": pad_params or None,\n            \"fill\": None if pad_params is None else self._get_pad_value(cast(ColorType, self.fill)),\n            \"fill_mask\": None if pad_params is None else self._get_pad_value(cast(ColorType, self.fill_mask)),\n            \"result_shape\": (result_rows, result_cols),\n        }\n\n    def _get_px_params(self) -> list[int]:\n        if self.px is None:\n            msg = \"px is not set\"\n            raise ValueError(msg)\n\n        if isinstance(self.px, int):\n            params = [self.px] * 4\n        elif len(self.px) == PAIR:\n            if self.sample_independently:\n                params = [self.py_random.randrange(*self.px) for _ in range(4)]\n            else:\n                px = self.py_random.randrange(*self.px)\n                params = [px] * 4\n        elif isinstance(self.px[0], int):\n            params = self.px\n        elif len(self.px[0]) == PAIR:\n            params = [self.py_random.randrange(*i) for i in self.px]\n        else:\n            params = [self.py_random.choice(i) for i in self.px]\n\n        return params\n\n    def _get_percent_params(self) -> list[float]:\n        if self.percent is None:\n            msg = \"percent is not set\"\n            raise ValueError(msg)\n\n        if isinstance(self.percent, float):\n            params = [self.percent] * 4\n        elif len(self.percent) == PAIR:\n            if self.sample_independently:\n                params = [self.py_random.uniform(*self.percent) for _ in range(4)]\n            else:\n                px = self.py_random.uniform(*self.percent)\n                params = [px] * 4\n        elif isinstance(self.percent[0], (int, float)):\n            params = self.percent\n        elif len(self.percent[0]) == PAIR:\n            params = [self.py_random.uniform(*i) for i in self.percent]\n        else:\n            params = [self.py_random.choice(i) for i in self.percent]\n\n        return params  # params = [top, right, bottom, left]\n\n    def _get_pad_value(\n        self,\n        fill: ColorType,\n    ) -> int | float:\n        if isinstance(fill, (list, tuple)):\n            if len(fill) == PAIR:\n                a, b = fill\n                if isinstance(a, int) and isinstance(b, int):\n                    return self.py_random.randint(a, b)\n                return self.py_random.uniform(a, b)\n            return self.py_random.choice(fill)\n\n        if isinstance(fill, Real):\n            return fill\n\n        msg = \"fill should be a number or list, or tuple of two numbers.\"\n        raise 
ValueError(msg)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"px\",\n            \"percent\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"keep_size\",\n            \"sample_independently\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CropNonEmptyMaskIfExists","title":"class CropNonEmptyMaskIfExists (height, width, ignore_values=None, ignore_channels=None, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop area with mask if mask is non-empty, else make random crop.

This transform attempts to crop a region containing a mask (non-zero pixels). If the mask is empty or not provided, it falls back to a random crop. This is particularly useful for segmentation tasks where you want to focus on regions of interest defined by the mask.

Parameters:

height int

Vertical size of crop in pixels. Must be > 0.

width int

Horizontal size of crop in pixels. Must be > 0.

ignore_values list of int

Values to ignore in the mask; 0 values are always ignored. For example, if the background value is 5, set ignore_values=[5] to ignore it. Default: None.

ignore_channels list of int

Channels to ignore in the mask. For example, if the background is the first channel, set ignore_channels=[0] to ignore it. Default: None.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If a mask is provided, the transform will try to crop an area containing non-zero (or non-ignored) pixels.
  • If no suitable area is found in the mask or no mask is provided, it will perform a random crop.
  • The crop size (height, width) must not exceed the original image dimensions.
  • Bounding boxes and keypoints are also cropped along with the image and mask.

Exceptions:

ValueError

If the specified crop size is larger than the input image dimensions.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.zeros((100, 100), dtype=np.uint8)\n>>> mask[25:75, 25:75] = 1  # Create a non-empty region in the mask\n>>> transform = A.Compose([\n...     A.CropNonEmptyMaskIfExists(height=50, width=50, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n# The resulting crop will likely include part of the non-zero region in the mask\n


Source code in albumentations/augmentations/crops/transforms.py Python
class CropNonEmptyMaskIfExists(BaseCrop):\n    \"\"\"Crop area with mask if mask is non-empty, else make random crop.\n\n    This transform attempts to crop a region containing a mask (non-zero pixels). If the mask is empty or not provided,\n    it falls back to a random crop. This is particularly useful for segmentation tasks where you want to focus on\n    regions of interest defined by the mask.\n\n    Args:\n        height (int): Vertical size of crop in pixels. Must be > 0.\n        width (int): Horizontal size of crop in pixels. Must be > 0.\n        ignore_values (list of int, optional): Values to ignore in mask, `0` values are always ignored.\n            For example, if background value is 5, set `ignore_values=[5]` to ignore it. Default: None.\n        ignore_channels (list of int, optional): Channels to ignore in mask.\n            For example, if background is the first channel, set `ignore_channels=[0]` to ignore it. Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If a mask is provided, the transform will try to crop an area containing non-zero (or non-ignored) pixels.\n        - If no suitable area is found in the mask or no mask is provided, it will perform a random crop.\n        - The crop size (height, width) must not exceed the original image dimensions.\n        - Bounding boxes and keypoints are also cropped along with the image and mask.\n\n    Raises:\n        ValueError: If the specified crop size is larger than the input image dimensions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.zeros((100, 100), dtype=np.uint8)\n        >>> mask[25:75, 25:75] = 1  # Create a non-empty region in the mask\n        >>> transform = A.Compose([\n        ...     A.CropNonEmptyMaskIfExists(height=50, width=50, p=1.0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        # The resulting crop will likely include part of the non-zero region in the mask\n    \"\"\"\n\n    class InitSchema(BaseCrop.InitSchema):\n        ignore_values: list[int] | None\n        ignore_channels: list[int] | None\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        ignore_values: list[int] | None = None,\n        ignore_channels: list[int] | None = None,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p)\n\n        self.height = height\n        self.width = width\n        self.ignore_values = ignore_values\n        self.ignore_channels = ignore_channels\n\n    def _preprocess_mask(self, mask: np.ndarray) -> np.ndarray:\n        mask_height, mask_width = mask.shape[:2]\n\n        if self.ignore_values is not None:\n            ignore_values_np = np.array(self.ignore_values)\n            mask = np.where(np.isin(mask, ignore_values_np), 0, mask)\n\n        if mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS and self.ignore_channels is not None:\n            target_channels = np.array([ch for ch in range(mask.shape[-1]) if ch not in self.ignore_channels])\n            mask = np.take(mask, target_channels, axis=-1)\n\n        if self.height > mask_height or self.width > mask_width:\n            raise ValueError(\n                f\"Crop size ({self.height},{self.width}) is larger than image ({mask_height},{mask_width})\",\n            )\n\n        return mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        \"\"\"Get crop coordinates based on mask content.\"\"\"\n        if \"mask\" in data:\n            mask = self._preprocess_mask(data[\"mask\"])\n        elif \"masks\" in data and len(data[\"masks\"]):\n            masks = data[\"masks\"]\n            mask = self._preprocess_mask(np.copy(masks[0]))\n            for m in masks[1:]:\n                mask |= self._preprocess_mask(m)\n        else:\n            msg = \"Can not find mask for CropNonEmptyMaskIfExists\"\n            raise RuntimeError(msg)\n\n        mask_height, mask_width = mask.shape[:2]\n\n        if mask.any():\n            # Find non-zero regions in mask\n            mask_sum = mask.sum(axis=-1) if mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS else mask\n            non_zero_yx = np.argwhere(mask_sum)\n            y, x = self.py_random.choice(non_zero_yx)\n\n            # Calculate crop coordinates centered around chosen point\n            x_min = x - self.py_random.randint(0, self.width - 1)\n            y_min = y - self.py_random.randint(0, self.height - 1)\n            x_min = np.clip(x_min, 0, mask_width - self.width)\n            y_min = np.clip(y_min, 0, mask_height - self.height)\n        else:\n            # Random crop if no non-zero regions\n            x_min = self.py_random.randint(0, mask_width - self.width)\n            y_min = self.py_random.randint(0, mask_height - self.height)\n\n        x_max = x_min + self.width\n        y_max = y_min + self.height\n\n        return {\"crop_coords\": (x_min, y_min, x_max, y_max)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"ignore_values\", 
\"ignore_channels\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCrop","title":"class RandomCrop (height, width, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0.0, fill_mask=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input.

Parameters:

  • height (int): Height of the crop.
  • width (int): Width of the crop.
  • pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.
  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): Padding value for images if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of padding. Default: 'center'.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded before applying the random crop.
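A minimal usage sketch of this padding fallback (input size and fill value chosen arbitrarily for illustration):

Python
import cv2
import numpy as np
import albumentations as A

# The input is smaller than the requested crop, so with pad_if_needed=True the
# image is first padded to at least 128x128 and then a random 128x128 crop is taken.
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
transform = A.RandomCrop(
    height=128,
    width=128,
    pad_if_needed=True,
    border_mode=cv2.BORDER_CONSTANT,
    fill=0,
    p=1.0,
)
result = transform(image=image)
assert result["image"].shape == (128, 128, 3)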


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCrop(BaseCropAndPad):\n    \"\"\"Crop a random part of the input.\n\n    Args:\n        height: height of the crop.\n        width: width of the crop.\n        pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value for images if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p: probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded\n        before applying the random crop.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_dimensions(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n            return self\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        pad_if_needed: bool = False,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0.0,\n        fill_mask: ColorType = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.height = height\n        self.width = width\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:  # Changed return type to be more flexible\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        if not self.pad_if_needed and (self.height > image_height or self.width > image_width):\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_shape[:2]}\",\n            
)\n\n        # Get padding params first if needed\n        pad_params = self._get_pad_params(image_shape, (self.height, self.width))\n\n        # If padding is needed, adjust the image shape for crop calculation\n        if pad_params is not None:\n            pad_top = pad_params[\"pad_top\"]\n            pad_bottom = pad_params[\"pad_bottom\"]\n            pad_left = pad_params[\"pad_left\"]\n            pad_right = pad_params[\"pad_right\"]\n\n            padded_height = image_height + pad_top + pad_bottom\n            padded_width = image_width + pad_left + pad_right\n            padded_shape = (padded_height, padded_width)\n\n            # Get random crop coordinates based on padded dimensions\n            h_start = self.py_random.random()\n            w_start = self.py_random.random()\n            crop_coords = fcrops.get_crop_coords(padded_shape, (self.height, self.width), h_start, w_start)\n        else:\n            # Get random crop coordinates based on original dimensions\n            h_start = self.py_random.random()\n            w_start = self.py_random.random()\n            crop_coords = fcrops.get_crop_coords(image_shape, (self.height, self.width), h_start, w_start)\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"height\",\n            \"width\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCropFromBorders","title":"class RandomCropFromBorders (crop_left=0.1, crop_right=0.1, crop_top=0.1, crop_bottom=0.1, always_apply=None, p=1.0) [view source on GitHub]","text":"

Randomly crops the input from its borders without resizing.

This transform randomly crops parts of the input (image, mask, bounding boxes, or keypoints) from each of its borders. The amount of cropping is specified as a fraction of the input's dimensions for each side independently.

Parameters:

  • crop_left (float): The maximum fraction of width to crop from the left side. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_right (float): The maximum fraction of width to crop from the right side. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_top (float): The maximum fraction of height to crop from the top. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_bottom (float): The maximum fraction of height to crop from the bottom. Must be in the range [0.0, 1.0]. Default: 0.1
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The actual amount of cropping for each side is randomly chosen between 0 and the specified maximum for each application of the transform.
  • The sum of crop_left and crop_right must not exceed 1.0, and the sum of crop_top and crop_bottom must not exceed 1.0. Otherwise, a ValueError will be raised.
  • This transform does not resize the input after cropping, so the output dimensions will be smaller than the input dimensions.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomCropFromBorders(\n...     crop_left=0.1, crop_right=0.2, crop_top=0.2, crop_bottom=0.1, p=1.0\n... )\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image will have random crops from each border, with the maximum\n# possible crops being 10% from the left, 20% from the right, 20% from the top,\n# and 10% from the bottom. The image size will be reduced accordingly.\n
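For intuition, the per-side fractions only bound the sampled crop window; a rough standalone sketch of the sampling, simplified from the coordinate logic shown in the source code below:

Python
import random

height, width = 100, 100
crop_left, crop_right, crop_top, crop_bottom = 0.1, 0.2, 0.2, 0.1

# Each border is cropped by an amount drawn between 0 and its maximum fraction.
x_min = random.randint(0, int(crop_left * width))                # 0..10
x_max = random.randint(int((1 - crop_right) * width), width)     # 80..100
y_min = random.randint(0, int(crop_top * height))                # 0..20
y_max = random.randint(int((1 - crop_bottom) * height), height)  # 90..100

print((y_max - y_min, x_max - x_min))  # resulting crop size, e.g. (75, 85)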


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCropFromBorders(BaseCrop):\n    \"\"\"Randomly crops the input from its borders without resizing.\n\n    This transform randomly crops parts of the input (image, mask, bounding boxes, or keypoints)\n    from each of its borders. The amount of cropping is specified as a fraction of the input's\n    dimensions for each side independently.\n\n    Args:\n        crop_left (float): The maximum fraction of width to crop from the left side.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_right (float): The maximum fraction of width to crop from the right side.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_top (float): The maximum fraction of height to crop from the top.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_bottom (float): The maximum fraction of height to crop from the bottom.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual amount of cropping for each side is randomly chosen between 0 and\n          the specified maximum for each application of the transform.\n        - The sum of crop_left and crop_right must not exceed 1.0, and the sum of\n          crop_top and crop_bottom must not exceed 1.0. Otherwise, a ValueError will be raised.\n        - This transform does not resize the input after cropping, so the output dimensions\n          will be smaller than the input dimensions.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomCropFromBorders(\n        ...     crop_left=0.1, crop_right=0.2, crop_top=0.2, crop_bottom=0.1, p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image will have random crops from each border, with the maximum\n        # possible crops being 10% from the left, 20% from the right, 20% from the top,\n        # and 10% from the bottom. 
The image size will be reduced accordingly.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        crop_left: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_right: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_top: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_bottom: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n\n        @model_validator(mode=\"after\")\n        def validate_crop_values(self) -> Self:\n            if self.crop_left + self.crop_right > 1.0:\n                msg = \"The sum of crop_left and crop_right must be <= 1.\"\n                raise ValueError(msg)\n            if self.crop_top + self.crop_bottom > 1.0:\n                msg = \"The sum of crop_top and crop_bottom must be <= 1.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        crop_left: float = 0.1,\n        crop_right: float = 0.1,\n        crop_top: float = 0.1,\n        crop_bottom: float = 0.1,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p)\n        self.crop_left = crop_left\n        self.crop_right = crop_right\n        self.crop_top = crop_top\n        self.crop_bottom = crop_bottom\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        height, width = params[\"shape\"][:2]\n\n        x_min = self.py_random.randint(0, int(self.crop_left * width))\n        x_max = self.py_random.randint(max(x_min + 1, int((1 - self.crop_right) * width)), width)\n\n        y_min = self.py_random.randint(0, int(self.crop_top * height))\n        y_max = self.py_random.randint(max(y_min + 1, int((1 - self.crop_bottom) * height)), height)\n\n        crop_coords = x_min, y_min, x_max, y_max\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"crop_left\", \"crop_right\", \"crop_top\", \"crop_bottom\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCropNearBBox","title":"class RandomCropNearBBox (max_part_shift=(0, 0.3), cropping_bbox_key='cropping_bbox', cropping_box_key=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

Crop a bounding box region from the image, applying a random shift along the x and y coordinates.

Parameters:

  • max_part_shift (float, (float, float)): Max shift in height and width dimensions relative to cropping_bbox dimension. If max_part_shift is a single float, the range will be (0, max_part_shift). Default: (0, 0.3).
  • cropping_bbox_key (str): Additional target key for cropping box. Default: cropping_bbox.
  • cropping_box_key (str): [Deprecated] Use cropping_bbox_key instead.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_bbox_key='test_bbox')],\n>>>              bbox_params=BboxParams(\"pascal_voc\"))\n>>> result = aug(image=image, bboxes=bboxes, test_bbox=[0, 5, 10, 20])\n
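The maximum shifts scale with the size of the reference box; a small worked example using the numbers from the snippet above, mirroring the shift computation in the source code below:

Python
# cropping_bbox of (x1, y1, x2, y2) = (0, 5, 10, 20) and max_part_shift=(0.1, 0.5)
x1, y1, x2, y2 = 0, 5, 10, 20
max_part_shift = (0.1, 0.5)

h_max_shift = round((y2 - y1) * max_part_shift[0])  # round(15 * 0.1) -> 2 pixels
w_max_shift = round((x2 - x1) * max_part_shift[1])  # round(10 * 0.5) -> 5 pixels
# Each edge of the crop is then shifted by a random amount in [-shift, +shift]
# and the result is clipped to the image bounds.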


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCropNearBBox(BaseCrop):\n    \"\"\"Crop bbox from image with random shift by x,y coordinates\n\n    Args:\n        max_part_shift (float, (float, float)): Max shift in `height` and `width` dimensions relative\n            to `cropping_bbox` dimension.\n            If max_part_shift is a single float, the range will be (0, max_part_shift).\n            Default (0, 0.3).\n        cropping_bbox_key (str): Additional target key for cropping box. Default `cropping_bbox`.\n        cropping_box_key (str): [Deprecated] Use `cropping_bbox_key` instead.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Examples:\n        >>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_bbox_key='test_bbox')],\n        >>>              bbox_params=BboxParams(\"pascal_voc\"))\n        >>> result = aug(image=image, bboxes=bboxes, test_bbox=[0, 5, 10, 20])\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_part_shift: ZeroOneRangeType\n        cropping_bbox_key: str\n\n    def __init__(\n        self,\n        max_part_shift: ScaleFloatType = (0, 0.3),\n        cropping_bbox_key: str = \"cropping_bbox\",\n        cropping_box_key: str | None = None,  # Deprecated\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p)\n        # Check for deprecated parameter and issue warning\n        if cropping_box_key is not None:\n            warn(\n                \"The parameter 'cropping_box_key' is deprecated and will be removed in future versions. \"\n                \"Use 'cropping_bbox_key' instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            # Ensure the new parameter is used even if the old one is passed\n            cropping_bbox_key = cropping_box_key\n\n        self.max_part_shift = cast(tuple[float, float], max_part_shift)\n        self.cropping_bbox_key = cropping_bbox_key\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[float, ...]]:\n        bbox = data[self.cropping_bbox_key]\n\n        image_shape = params[\"shape\"][:2]\n\n        bbox = self._clip_bbox(bbox, image_shape)\n\n        h_max_shift = round((bbox[3] - bbox[1]) * self.max_part_shift[0])\n        w_max_shift = round((bbox[2] - bbox[0]) * self.max_part_shift[1])\n\n        x_min = bbox[0] - self.py_random.randint(-w_max_shift, w_max_shift)\n        x_max = bbox[2] + self.py_random.randint(-w_max_shift, w_max_shift)\n\n        y_min = bbox[1] - self.py_random.randint(-h_max_shift, h_max_shift)\n        y_max = bbox[3] + self.py_random.randint(-h_max_shift, h_max_shift)\n\n        crop_coords = self._clip_bbox((x_min, y_min, x_max, y_max), image_shape)\n\n        if crop_coords[0] == crop_coords[2] or crop_coords[1] == crop_coords[3]:\n            crop_shape = (bbox[3] - bbox[1], bbox[2] - bbox[0])\n            crop_coords = fcrops.get_center_crop_coords(image_shape, crop_shape)\n\n        return {\"crop_coords\": crop_coords}\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [self.cropping_bbox_key]\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_part_shift\", \"cropping_bbox_key\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomResizedCrop","title":"class RandomResizedCrop (size=None, width=None, height=None, *, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specified size.

This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints) and then resizes the crop to a specified size. It's particularly useful for training neural networks on images of varying sizes and aspect ratios.

Parameters:

  • size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.
  • scale (tuple[float, float]): Range of the random size of the crop relative to the input size. For example, (0.08, 1.0) means the crop size will be between 8% and 100% of the input size. Default: (0.08, 1.0)
  • ratio (tuple[float, float]): Range of aspect ratios of the random crop. For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3. Default: (0.75, 1.3333333333333333)
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform attempts to crop a random area with an aspect ratio and relative size specified by 'ratio' and 'scale' parameters. If it fails to find a suitable crop after 10 attempts, it will return a crop from the center of the image.
  • The crop's aspect ratio is defined as width / height.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.
  • After cropping, the result is resized to the specified size.

Mathematical Details:

  1. A target area A is sampled from the range [scale[0] * input_area, scale[1] * input_area].
  2. A target aspect ratio r is sampled from the range [ratio[0], ratio[1]].
  3. The crop width and height are computed as w = sqrt(A * r) and h = sqrt(A / r).
  4. If w and h fit within the input image dimensions, the crop is accepted. Otherwise, steps 1-3 are repeated (up to 10 times).
  5. If no valid crop is found after 10 attempts, a centered crop is taken.
  6. The crop is then resized to the specified size.
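A minimal standalone sketch of one iteration of this sampling loop, mirroring the source code below (aspect ratios are drawn log-uniformly):

Python
import math
import random

image_height, image_width = 100, 100
scale, ratio = (0.08, 1.0), (0.75, 1.3333333333333333)
area = image_height * image_width

target_area = random.uniform(*scale) * area                   # step 1
aspect_ratio = math.exp(random.uniform(math.log(ratio[0]),
                                       math.log(ratio[1])))   # step 2

w = int(round(math.sqrt(target_area * aspect_ratio)))         # step 3
h = int(round(math.sqrt(target_area / aspect_ratio)))
fits = 0 < w <= image_width and 0 < h <= image_height         # step 4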

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomResizedCrop(size=(80, 80), scale=(0.5, 1.0), ratio=(0.75, 1.33), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# transformed_image will be an 80x80 crop from a random location in the original image,\n# with the crop's size between 50% and 100% of the original image size,\n# and the crop's aspect ratio between 3:4 and 4:3.\n


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomResizedCrop(_BaseRandomSizedCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specified size.\n\n    This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints)\n    and then resizes the crop to a specified size. It's particularly useful for training neural networks\n    on images of varying sizes and aspect ratios.\n\n    Args:\n        size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.\n        scale (tuple[float, float]): Range of the random size of the crop relative to the input size.\n            For example, (0.08, 1.0) means the crop size will be between 8% and 100% of the input size.\n            Default: (0.08, 1.0)\n        ratio (tuple[float, float]): Range of aspect ratios of the random crop.\n            For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3.\n            Default: (0.75, 1.3333333333333333)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform attempts to crop a random area with an aspect ratio and relative size\n          specified by 'ratio' and 'scale' parameters. If it fails to find a suitable crop after\n          10 attempts, it will return a crop from the center of the image.\n        - The crop's aspect ratio is defined as width / height.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n        - After cropping, the result is resized to the specified size.\n\n    Mathematical Details:\n        1. A target area A is sampled from the range [scale[0] * input_area, scale[1] * input_area].\n        2. A target aspect ratio r is sampled from the range [ratio[0], ratio[1]].\n        3. The crop width and height are computed as:\n           w = sqrt(A * r)\n           h = sqrt(A / r)\n        4. If w and h are within the input image dimensions, the crop is accepted.\n           Otherwise, steps 1-3 are repeated (up to 10 times).\n        5. If no valid crop is found after 10 attempts, a centered crop is taken.\n        6. 
The crop is then resized to the specified size.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomResizedCrop(size=80, scale=(0.5, 1.0), ratio=(0.75, 1.33), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # transformed_image will be a 80x80 crop from a random location in the original image,\n        # with the crop's size between 50% and 100% of the original image size,\n        # and the crop's aspect ratio between 3:4 and 4:3.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1)), AfterValidator(nondecreasing)]\n        ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        width: int | None\n        height: int | None\n        size: ScaleIntType | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def process(self) -> Self:\n            if isinstance(self.size, int):\n                if isinstance(self.width, int):\n                    warn(\n                        \"Initializing with 'size' as an integer and a separate 'width', `height` are deprecated. \"\n                        \"Please use a tuple (height, width) for the 'size' argument.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                    self.size = (self.size, self.width)\n                else:\n                    msg = \"If size is an integer, width as integer must be specified.\"\n                    raise TypeError(msg)\n\n            if self.size is None:\n                if self.height is None or self.width is None:\n                    message = \"If 'size' is not provided, both 'height' and 'width' must be specified.\"\n                    raise ValueError(message)\n                self.size = (self.height, self.width)\n\n            return self\n\n    def __init__(\n        self,\n        # NOTE @zetyquickly: when (width, height) are deprecated, make 'size' non optional\n        size: ScaleIntType | None = None,\n        width: int | None = None,\n        height: int | None = None,\n        *,\n        scale: tuple[float, float] = (0.08, 1.0),\n        ratio: tuple[float, float] = (0.75, 1.3333333333333333),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            size=cast(tuple[int, int], size),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale = scale\n        self.ratio = ratio\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        area = image_height * image_width\n\n        for _ in range(10):\n            target_area = self.py_random.uniform(*self.scale) * area\n            log_ratio = (math.log(self.ratio[0]), 
math.log(self.ratio[1]))\n            aspect_ratio = math.exp(self.py_random.uniform(*log_ratio))\n\n            width = int(round(math.sqrt(target_area * aspect_ratio)))\n            height = int(round(math.sqrt(target_area / aspect_ratio)))\n\n            if 0 < width <= image_width and 0 < height <= image_height:\n                i = self.py_random.randint(0, image_height - height)\n                j = self.py_random.randint(0, image_width - width)\n\n                h_start = i * 1.0 / (image_height - height + 1e-10)\n                w_start = j * 1.0 / (image_width - width + 1e-10)\n\n                crop_shape = (height, width)\n\n                crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n                return {\"crop_coords\": crop_coords}\n\n        # Fallback to central crop\n        in_ratio = image_width / image_height\n        if in_ratio < min(self.ratio):\n            width = image_width\n            height = int(round(image_width / min(self.ratio)))\n        elif in_ratio > max(self.ratio):\n            height = image_height\n            width = int(round(height * max(self.ratio)))\n        else:  # whole image\n            width = image_width\n            height = image_height\n\n        i = (image_height - height) // 2\n        j = (image_width - width) // 2\n\n        h_start = i * 1.0 / (image_height - height + 1e-10)\n        w_start = j * 1.0 / (image_width - width + 1e-10)\n\n        crop_shape = (height, width)\n\n        crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"scale\", \"ratio\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomSizedBBoxSafeCrop","title":"class RandomSizedBBoxSafeCrop (height, width, erosion_rate=0.0, interpolation=1, mask_interpolation=0, always_apply=None, p=1.0) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specific size without loss of bounding boxes.

This transform first attempts to crop a random portion of the input image while ensuring that all bounding boxes remain within the cropped area. It then resizes the crop to the specified size. This is particularly useful for object detection tasks where preserving all objects in the image is crucial while also standardizing the image size.

Parameters:

  • height (int): Height of the output image after resizing.
  • width (int): Width of the output image after resizing.
  • erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be at least 80% of the original image height and width. Default: 0.0 (no minimum size).
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform ensures that all bounding boxes in the original image are fully contained within the cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out), it will default to cropping the entire image.
  • After cropping, the result is resized to the specified (height, width) size.
  • Bounding box coordinates are adjusted to match the new image size.
  • Keypoints are moved along with the crop and scaled to the new image size.
  • If there are no bounding boxes in the image, it will fall back to a random crop.

Mathematical Details:

  1. A crop region is selected that includes all bounding boxes.
  2. The crop size is determined by the erosion_rate: min_crop_size = (1 - erosion_rate) * original_size.
  3. If the selected crop is smaller than min_crop_size, it is expanded to meet this requirement.
  4. The crop is then resized to the specified (height, width) size.
  5. Bounding box coordinates are transformed to match the new image size: new_coord = (old_coord - crop_start) * (new_size / crop_size).
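A quick numeric check of the coordinate mapping in step 5 (all numbers are purely illustrative):

Python
# Suppose the bbox-safe crop spans x in [20, 270) on a 300-pixel-wide image
# and the output width is 224.
crop_start, crop_size, new_size = 20, 250, 224

old_x = 100                                    # a bbox corner in original coordinates
new_x = (old_x - crop_start) * (new_size / crop_size)
print(round(new_x, 1))                         # -> 71.7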

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)\n>>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n>>> transform = A.Compose([\n...     A.RandomSizedBBoxSafeCrop(height=224, width=224, erosion_rate=0.2, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n>>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n>>> transformed_image = transformed['image']\n>>> transformed_bboxes = transformed['bboxes']\n# transformed_image will be a 224x224 image containing all original bounding boxes,\n# with their coordinates adjusted to the new image size.\n


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomSizedBBoxSafeCrop(BBoxSafeRandomCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specific size without loss of bounding boxes.\n\n    This transform first attempts to crop a random portion of the input image while ensuring that all bounding boxes\n    remain within the cropped area. It then resizes the crop to the specified size. This is particularly useful for\n    object detection tasks where preserving all objects in the image is crucial while also standardizing the image size.\n\n    Args:\n        height (int): Height of the output image after resizing.\n        width (int): Width of the output image after resizing.\n        erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop\n            as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be\n            at least 80% of the original image height and width. Default: 0.0 (no minimum size).\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform ensures that all bounding boxes in the original image are fully contained within the\n          cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out),\n          it will default to cropping the entire image.\n        - After cropping, the result is resized to the specified (height, width) size.\n        - Bounding box coordinates are adjusted to match the new image size.\n        - Keypoints are moved along with the crop and scaled to the new image size.\n        - If there are no bounding boxes in the image, it will fall back to a random crop.\n\n    Mathematical Details:\n        1. A crop region is selected that includes all bounding boxes.\n        2. The crop size is determined by the erosion_rate:\n           min_crop_size = (1 - erosion_rate) * original_size\n        3. If the selected crop is smaller than min_crop_size, it's expanded to meet this requirement.\n        4. The crop is then resized to the specified (height, width) size.\n        5. Bounding box coordinates are transformed to match the new image size:\n           new_coord = (old_coord - crop_start) * (new_size / crop_size)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)\n        >>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n        >>> transform = A.Compose([\n        ...     A.RandomSizedBBoxSafeCrop(height=224, width=224, erosion_rate=0.2, p=1.0),\n        ... 
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n        >>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n        >>> transformed_image = transformed['image']\n        >>> transformed_bboxes = transformed['bboxes']\n        # transformed_image will be a 224x224 image containing all original bounding boxes,\n        # with their coordinates adjusted to the new image size.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        erosion_rate: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        erosion_rate: float = 0.0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(erosion_rate=erosion_rate, p=p)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        crop = fcrops.crop(img, *crop_coords)\n        return fgeometric.resize(crop, (self.height, self.width), self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        crop = fcrops.crop(mask, *crop_coords)\n        return fgeometric.resize(crop, (self.height, self.width), self.mask_interpolation)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints = fcrops.crop_keypoints_by_coords(keypoints, crop_coords)\n\n        crop_height = crop_coords[3] - crop_coords[1]\n        crop_width = crop_coords[2] - crop_coords[0]\n\n        scale_y = self.height / crop_height\n        scale_x = self.width / crop_width\n        return fgeometric.keypoints_scale(keypoints, scale_x=scale_x, scale_y=scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"height\", \"width\", \"interpolation\", \"mask_interpolation\")\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomSizedCrop","title":"class RandomSizedCrop (min_max_height, size=None, width=None, height=None, *, w2h_ratio=1.0, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specific size.

This transform first crops a random portion of the input and then resizes it to a specified size. The size of the random crop is controlled by the 'min_max_height' parameter.

Parameters:

  • min_max_height (tuple[int, int]): Minimum and maximum height of the crop in pixels.
  • size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.
  • w2h_ratio (float): Aspect ratio (width/height) of crop. Default: 1.0
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The crop size is randomly selected for each execution within the range specified by 'min_max_height'.
  • The aspect ratio of the crop is determined by the 'w2h_ratio' parameter.
  • After cropping, the result is resized to the specified 'size'.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.
  • This transform differs from RandomResizedCrop in that it allows more control over the crop size through the 'min_max_height' parameter, rather than using a scale parameter.

Mathematical Details:

  1. A random crop height h is sampled from the range [min_max_height[0], min_max_height[1]].
  2. The crop width w is calculated as w = h * w2h_ratio.
  3. A random location for the crop is selected within the input image.
  4. The image is cropped to the size (h, w).
  5. The crop is then resized to the specified 'size'.
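Steps 1-2 reduce to the following minimal standalone sketch; the placement and resize in steps 3-5 are handled by the transform itself:

Python
import random

min_max_height, w2h_ratio = (50, 80), 1.0

crop_height = random.randint(*min_max_height)  # step 1, e.g. 64
crop_width = int(crop_height * w2h_ratio)      # step 2, e.g. 64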

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomSizedCrop(\n...     min_max_height=(50, 80),\n...     size=(64, 64),\n...     w2h_ratio=1.0,\n...     interpolation=cv2.INTER_LINEAR,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# transformed_image will be a 64x64 image, resulting from a crop with height\n# between 50 and 80 pixels, and the same aspect ratio as specified by w2h_ratio,\n# taken from a random location in the original image and then resized.\n


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomSizedCrop(_BaseRandomSizedCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specific size.\n\n    This transform first crops a random portion of the input and then resizes it to a specified size.\n    The size of the random crop is controlled by the 'min_max_height' parameter.\n\n    Args:\n        min_max_height (tuple[int, int]): Minimum and maximum height of the crop in pixels.\n        size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.\n        w2h_ratio (float): Aspect ratio (width/height) of crop. Default: 1.0\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The crop size is randomly selected for each execution within the range specified by 'min_max_height'.\n        - The aspect ratio of the crop is determined by the 'w2h_ratio' parameter.\n        - After cropping, the result is resized to the specified 'size'.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n        - This transform differs from RandomResizedCrop in that it allows more control over the crop size\n          through the 'min_max_height' parameter, rather than using a scale parameter.\n\n    Mathematical Details:\n        1. A random crop height h is sampled from the range [min_max_height[0], min_max_height[1]].\n        2. The crop width w is calculated as: w = h * w2h_ratio\n        3. A random location for the crop is selected within the input image.\n        4. The image is cropped to the size (h, w).\n        5. The crop is then resized to the specified 'size'.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomSizedCrop(\n        ...     min_max_height=(50, 80),\n        ...     size=(64, 64),\n        ...     w2h_ratio=1.0,\n        ...     interpolation=cv2.INTER_LINEAR,\n        ...     p=1.0\n        ... 
)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # transformed_image will be a 64x64 image, resulting from a crop with height\n        # between 50 and 80 pixels, and the same aspect ratio as specified by w2h_ratio,\n        # taken from a random location in the original image and then resized.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        min_max_height: OnePlusIntRangeType\n        w2h_ratio: Annotated[float, Field(gt=0)]\n        width: int | None\n        height: int | None\n        size: ScaleIntType | None\n\n        @model_validator(mode=\"after\")\n        def process(self) -> Self:\n            if isinstance(self.size, int):\n                if isinstance(self.width, int):\n                    warn(\n                        \"Initializing with 'size' as an integer and a separate 'width', `height` are deprecated. \"\n                        \"Please use a tuple (height, width) for the 'size' argument.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                    self.size = (self.size, self.width)\n                else:\n                    msg = \"If size is an integer, width as integer must be specified.\"\n                    raise TypeError(msg)\n\n            if self.size is None:\n                if self.height is None or self.width is None:\n                    message = \"If 'size' is not provided, both 'height' and 'width' must be specified.\"\n                    raise ValueError(message)\n                self.size = (self.height, self.width)\n            return self\n\n    def __init__(\n        self,\n        min_max_height: tuple[int, int],\n        # NOTE @zetyquickly: when (width, height) are deprecated, make 'size' non optional\n        size: ScaleIntType | None = None,\n        width: int | None = None,\n        height: int | None = None,\n        *,\n        w2h_ratio: float = 1.0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            size=cast(tuple[int, int], size),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.min_max_height = min_max_height\n        self.w2h_ratio = w2h_ratio\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n\n        crop_height = self.py_random.randint(*self.min_max_height)\n        crop_width = int(crop_height * self.w2h_ratio)\n\n        crop_shape = (crop_height, crop_width)\n\n        h_start = self.py_random.random()\n        w_start = self.py_random.random()\n\n        crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"min_max_height\", \"w2h_ratio\")\n
"},{"location":"api_reference/augmentations/domain_adaptation/","title":"Index","text":"
  • Domain Adaptation functional transforms (albumentations.augmentations.domain_adaptation.functional)
  • Domain Adaptation transforms (albumentations.augmentations.domain_adaptation.transforms)
"},{"location":"api_reference/augmentations/domain_adaptation/functional/","title":"Domain Adaptation functional transforms (augmentations.domain_adaptation.functional)","text":""},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.apply_histogram","title":"def apply_histogram (img, reference_image, blend_ratio) [view source on GitHub]","text":"

Apply histogram matching to an input image using a reference image and blend the result.

This function performs histogram matching between the input image and a reference image, then blends the result with the original input image based on the specified blend ratio.

Parameters:

  • img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB. Supported dtypes: uint8, float32 (values should be in [0, 1] range).
  • reference_image (np.ndarray): The reference image used for histogram matching. Should have the same number of channels as the input image. Supported dtypes: uint8, float32 (values should be in [0, 1] range).
  • blend_ratio (float): The ratio for blending the matched image with the original image. Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.

Returns:

  • np.ndarray: The transformed image after histogram matching and blending. The output will have the same shape and dtype as the input image.

Supported image types:

  • Grayscale images: 2D arrays
  • RGB images: 3D arrays with 3 channels
  • Multispectral images: 3D arrays with more than 3 channels

Note

  • If the input and reference images have different sizes, the reference image will be resized to match the input image's dimensions.
  • The function uses a custom implementation of histogram matching based on OpenCV and NumPy.
  • The @clipped and @preserve_channel_dim decorators ensure the output is within the valid range and maintains the original number of dimensions.
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef apply_histogram(img: np.ndarray, reference_image: np.ndarray, blend_ratio: float) -> np.ndarray:\n    \"\"\"Apply histogram matching to an input image using a reference image and blend the result.\n\n    This function performs histogram matching between the input image and a reference image,\n    then blends the result with the original input image based on the specified blend ratio.\n\n    Args:\n        img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        reference_image (np.ndarray): The reference image used for histogram matching.\n            Should have the same number of channels as the input image.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        blend_ratio (float): The ratio for blending the matched image with the original image.\n            Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.\n\n    Returns:\n        np.ndarray: The transformed image after histogram matching and blending.\n            The output will have the same shape and dtype as the input image.\n\n    Supported image types:\n        - Grayscale images: 2D arrays\n        - RGB images: 3D arrays with 3 channels\n        - Multispectral images: 3D arrays with more than 3 channels\n\n    Note:\n        - If the input and reference images have different sizes, the reference image\n          will be resized to match the input image's dimensions.\n        - The function uses a custom implementation of histogram matching based on OpenCV and NumPy.\n        - The @clipped and @preserve_channel_dim decorators ensure the output is within\n          the valid range and maintains the original number of dimensions.\n    \"\"\"\n    # Resize reference image only if necessary\n    if img.shape[:2] != reference_image.shape[:2]:\n        reference_image = cv2.resize(reference_image, dsize=(img.shape[1], img.shape[0]))\n\n    img = np.squeeze(img)\n    reference_image = np.squeeze(reference_image)\n\n    # Match histograms between the images\n    matched = match_histograms(img, reference_image)\n\n    # Blend the original image and the matched image\n    return add_weighted(matched, blend_ratio, img, 1 - blend_ratio)\n
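The function can also be called directly from the functional module; a short usage sketch (random arrays used only for illustration):

Python
import numpy as np
from albumentations.augmentations.domain_adaptation.functional import apply_histogram

img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
reference = np.random.randint(0, 256, (80, 120, 3), dtype=np.uint8)  # different size is fine

matched = apply_histogram(img, reference, blend_ratio=0.5)
assert matched.shape == img.shape and matched.dtype == img.dtype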
"},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.fourier_domain_adaptation","title":"def fourier_domain_adaptation (img, target_img, beta) [view source on GitHub]","text":"

Apply Fourier Domain Adaptation to the input image using a target image.

This function performs domain adaptation in the frequency domain by modifying the amplitude spectrum of the source image based on the target image's amplitude spectrum. It preserves the phase information of the source image, which helps maintain its content while adapting its style to match the target image.

Parameters:

  • img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.
  • target_img (np.ndarray): The target image used as a reference for adaptation. Should have the same dimensions as the source image.
  • beta (float): The adaptation strength, typically in the range [0, 1]. Higher values result in stronger adaptation towards the target image's style.

Returns:

  • np.ndarray: The adapted image with the same shape and type as the input image.

Exceptions:

  • ValueError: If the source and target images have different shapes.

Note

  • Both input images are converted to float32 for processing.
  • The function handles both grayscale (2D) and color (3D) images.
  • For grayscale images, an extra dimension is added to facilitate uniform processing.
  • The adaptation is performed channel-wise for color images.
  • The output is clipped to the valid range and preserves the original number of channels.

The adaptation process involves the following steps for each channel:

  1. Compute the 2D Fourier Transform of both source and target images.
  2. Shift the zero frequency component to the center of the spectrum.
  3. Extract amplitude and phase information from the source image's spectrum.
  4. Mutate the source amplitude using the target amplitude and the beta parameter.
  5. Combine the mutated amplitude with the original phase.
  6. Perform the inverse Fourier Transform to obtain the adapted channel.

The low_freq_mutate function (not shown here) is responsible for the actual amplitude mutation, focusing on low-frequency components which carry style information.
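A single-channel sketch of the steps above; the centered square swap here is only a stand-in for low_freq_mutate, whose exact behaviour is defined in the source code below:

Python
import numpy as np

src = np.random.rand(64, 64).astype(np.float32)
trg = np.random.rand(64, 64).astype(np.float32)
beta = 0.1

# Steps 1-3: FFT, shift the spectrum, split into amplitude and phase.
fft_src = np.fft.fftshift(np.fft.fft2(src))
fft_trg = np.fft.fftshift(np.fft.fft2(trg))
amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
amp_trg = np.abs(fft_trg)

# Step 4 (simplified): replace a centered low-frequency block of the source
# amplitude with the target amplitude.
h, w = src.shape
b = int(beta * min(h, w))
cy, cx = h // 2, w // 2
amp_src[cy - b:cy + b, cx - b:cx + b] = amp_trg[cy - b:cy + b, cx - b:cx + b]

# Steps 5-6: recombine with the original phase and invert the FFT.
adapted = np.real(np.fft.ifft2(np.fft.ifftshift(amp_src * np.exp(1j * pha_src))))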

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n>>> assert adapted_img.shape == source_img.shape\n

References

  • \"FDA: Fourier Domain Adaptation for Semantic Segmentation\" (Yang and Soatto, 2020, CVPR) https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef fourier_domain_adaptation(img: np.ndarray, target_img: np.ndarray, beta: float) -> np.ndarray:\n    \"\"\"Apply Fourier Domain Adaptation to the input image using a target image.\n\n    This function performs domain adaptation in the frequency domain by modifying the amplitude\n    spectrum of the source image based on the target image's amplitude spectrum. It preserves\n    the phase information of the source image, which helps maintain its content while adapting\n    its style to match the target image.\n\n    Args:\n        img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.\n        target_img (np.ndarray): The target image used as a reference for adaptation.\n            Should have the same dimensions as the source image.\n        beta (float): The adaptation strength, typically in the range [0, 1].\n            Higher values result in stronger adaptation towards the target image's style.\n\n    Returns:\n        np.ndarray: The adapted image with the same shape and type as the input image.\n\n    Raises:\n        ValueError: If the source and target images have different shapes.\n\n    Note:\n        - Both input images are converted to float32 for processing.\n        - The function handles both grayscale (2D) and color (3D) images.\n        - For grayscale images, an extra dimension is added to facilitate uniform processing.\n        - The adaptation is performed channel-wise for color images.\n        - The output is clipped to the valid range and preserves the original number of channels.\n\n    The adaptation process involves the following steps for each channel:\n    1. Compute the 2D Fourier Transform of both source and target images.\n    2. Shift the zero frequency component to the center of the spectrum.\n    3. Extract amplitude and phase information from the source image's spectrum.\n    4. Mutate the source amplitude using the target amplitude and the beta parameter.\n    5. Combine the mutated amplitude with the original phase.\n    6. 
Perform the inverse Fourier Transform to obtain the adapted channel.\n\n    The `low_freq_mutate` function (not shown here) is responsible for the actual\n    amplitude mutation, focusing on low-frequency components which carry style information.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n        >>> assert adapted_img.shape == source_img.shape\n\n    References:\n        - \"FDA: Fourier Domain Adaptation for Semantic Segmentation\"\n          (Yang and Soatto, 2020, CVPR)\n          https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n    \"\"\"\n    src_img = img.astype(np.float32)\n    trg_img = target_img.astype(np.float32)\n\n    if src_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        src_img = np.expand_dims(src_img, axis=-1)\n    if trg_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        trg_img = np.expand_dims(trg_img, axis=-1)\n\n    num_channels = src_img.shape[-1]\n\n    # Prepare container for the output image\n    src_in_trg = np.zeros_like(src_img)\n\n    for channel_id in range(num_channels):\n        # Perform FFT on each channel\n        fft_src = np.fft.fft2(src_img[:, :, channel_id])\n        fft_trg = np.fft.fft2(trg_img[:, :, channel_id])\n\n        # Shift the zero frequency component to the center\n        fft_src_shifted = np.fft.fftshift(fft_src)\n        fft_trg_shifted = np.fft.fftshift(fft_trg)\n\n        # Extract amplitude and phase\n        amp_src, pha_src = np.abs(fft_src_shifted), np.angle(fft_src_shifted)\n        amp_trg = np.abs(fft_trg_shifted)\n\n        # Mutate the amplitude part of the source with the target\n        mutated_amp = low_freq_mutate(amp_src.copy(), amp_trg, beta)\n\n        # Combine the mutated amplitude with the original phase\n        fft_src_mutated = np.fft.ifftshift(mutated_amp * np.exp(1j * pha_src))\n\n        # Perform inverse FFT\n        src_in_trg_channel = np.fft.ifft2(fft_src_mutated)\n\n        # Store the result in the corresponding channel of the output image\n        src_in_trg[:, :, channel_id] = np.real(src_in_trg_channel)\n\n    return src_in_trg\n
"},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.match_histograms","title":"def match_histograms (image, reference) [view source on GitHub]","text":"

Adjust an image so that its cumulative histogram matches that of another.

The adjustment is applied separately for each channel.
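Conceptually, the per-channel adjustment is a cumulative-CDF lookup: each source intensity is mapped to the reference intensity that occupies the same quantile. A minimal single-channel sketch under that assumption (illustrative only, not the library's internal helper used in the source code below):

Python
import numpy as np

def match_cdf_channel(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    # Empirical CDFs of the unique intensity values in each image.
    src_values, src_counts = np.unique(source.ravel(), return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size

    # For each source quantile, look up the reference intensity at the same quantile.
    mapped_values = np.interp(src_cdf, ref_cdf, ref_values)

    # Replace every pixel by the mapped value of its intensity (float result;
    # round/cast back to the original dtype as needed).
    idx = np.searchsorted(src_values, source.ravel())
    return mapped_values[idx].reshape(source.shape)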

Parameters:

Name Type Description image np.ndarray

Input image. Can be gray-scale or in color.

reference np.ndarray

Image to match histogram of. Must have the same number of channels as image.

Returns:

Type Description np.ndarray

Transformed input image.

Exceptions:

Type Description ValueError

Thrown when the number of channels in the input image and the reference differ.

Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef match_histograms(image: np.ndarray, reference: np.ndarray) -> np.ndarray:\n    \"\"\"Adjust an image so that its cumulative histogram matches that of another.\n\n    The adjustment is applied separately for each channel.\n\n    Args:\n        image: Input image. Can be gray-scale or in color.\n        reference: Image to match histogram of. Must have the same number of channels as image.\n        channel_axis: If None, the image is assumed to be a grayscale (single channel) image.\n            Otherwise, this parameter indicates which axis of the array corresponds to channels.\n\n    Returns:\n        np.ndarray: Transformed input image.\n\n    Raises:\n        ValueError: Thrown when the number of channels in the input image and the reference differ.\n    \"\"\"\n    if reference.dtype != np.uint8:\n        reference = from_float(reference, np.uint8)\n\n    if image.ndim != reference.ndim:\n        raise ValueError(\"Image and reference must have the same number of dimensions.\")\n\n    # Expand dimensions for grayscale images\n    if image.ndim == 2:\n        image = np.expand_dims(image, axis=-1)\n    if reference.ndim == 2:\n        reference = np.expand_dims(reference, axis=-1)\n\n    matched = np.empty(image.shape, dtype=np.uint8)\n\n    num_channels = image.shape[-1]\n\n    for channel in range(num_channels):\n        matched_channel = _match_cumulative_cdf(image[..., channel], reference[..., channel]).astype(np.uint8)\n        matched[..., channel] = matched_channel\n\n    return matched\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/","title":"Domain Adaptation transforms (augmentations.domain_adaptation.transforms)","text":""},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.FDA","title":"class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Fourier Domain Adaptation (FDA) for simple "style transfer" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has been shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.

Parameters:

Name Type Description reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the swapping extent of frequency components. If a single value is provided, beta will be sampled from the uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Reference

  • https://github.com/YanchaoYang/FDA
  • https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n>>> result = aug(image=image)\n

Note

FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target domain samples are unavailable. It enables significant improvements in model generalization by aligning the low-level statistics of source and target images through a simple yet effective Fourier-based method.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class FDA(ImageOnlyTransform):\n    \"\"\"Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation\n    (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source\n    and target datasets, effectively adapting images from one domain to closely resemble those from another without\n    altering their semantic content.\n\n    This transform is particularly beneficial in scenarios where the training (source) and testing (target) images\n    come from different distributions, such as synthetic versus real images, or day versus night scenes.\n    Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain\n    alignment by swapping low-frequency components of the Fourier transform between the source and target images.\n    This technique has shown to improve the performance of models on the target domain, particularly for tasks\n    like semantic segmentation, without additional training for domain invariance.\n\n    The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more\n    of the original image's characteristics and higher values leading to more pronounced adaptation effects.\n    It is recommended to use beta values less than 0.3 to avoid introducing artifacts.\n\n    Args:\n        reference_images (Sequence[Any]): Sequence of objects to be converted into images by `read_fn`. This typically\n            involves paths to images that serve as target domain examples for adaptation.\n        beta_limit (tuple[float, float] | float): Coefficient beta from the paper, controlling the swapping extent of\n            frequency components. If one value is provided beta will be sampled from uniform\n            distribution [0, beta_limit]. Values should be less than 0.5.\n        read_fn (Callable): User-defined function for reading images. It takes an element from `reference_images` and\n            returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a\n            numpy array.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        - https://github.com/YanchaoYang/FDA\n        - https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n        >>> result = aug(image=image)\n\n    Note:\n        FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target\n        domain samples are unavailable. 
It enables significant improvements in model generalization by aligning\n        the low-level statistics of source and target images through a simple yet effective Fourier-based method.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        read_fn: Callable[[Any], np.ndarray]\n        beta_limit: ZeroOneRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_ranges(cls, value: tuple[float, float]) -> tuple[float, float]:\n            bounds = 0, MAX_BETA_LIMIT\n            if not bounds[0] <= value[0] <= value[1] <= bounds[1]:\n                raise ValueError(f\"Values should be in the range {bounds} got {value} \")\n            return value\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        beta_limit: ScaleFloatType = (0, 0.1),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        target_image: np.ndarray,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fourier_domain_adaptation(img, target_image, beta)\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n        target_img = self.read_fn(self.py_random.choice(self.reference_images))\n        target_img = cv2.resize(target_img, dsize=(width, height))\n\n        return {\"target_image\": target_img, \"beta\": self.py_random.uniform(*self.beta_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"reference_images\", \"beta_limit\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"FDA can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.HistogramMatching","title":"class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for:
  • Normalizing images from different sources or captured under varying conditions.
  • Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important.
  • Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where:
  • 0 means no blending (original image is returned)
  • 1 means full histogram matching
A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.HistogramMatching(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     read_fn=lambda x: x,\n...     p=1\n... )\n>>> result = transform(image=image)\n>>> matched_image = result[\"image\"]\n

References

  • Histogram Matching in scikit-image: https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class HistogramMatching(ImageOnlyTransform):\n    \"\"\"Adjust the pixel values of an input image to match the histogram of a reference image.\n\n    This transform applies histogram matching, a technique that modifies the distribution of pixel\n    intensities in the input image to closely resemble that of a reference image. This process is\n    performed independently for each channel in multi-channel images, provided both the input and\n    reference images have the same number of channels.\n\n    Histogram matching is particularly useful for:\n    - Normalizing images from different sources or captured under varying conditions.\n    - Preparing images for feature matching or other computer vision tasks where consistent\n      tone and contrast are important.\n    - Simulating different lighting or camera conditions in a controlled manner.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of reference image sources. These can be\n            file paths, URLs, or any objects that can be converted to images by the `read_fn`.\n        blend_ratio (tuple[float, float]): Range for the blending factor between the original\n            and the matched image. Must be two floats between 0 and 1, where:\n            - 0 means no blending (original image is returned)\n            - 1 means full histogram matching\n            A random value within this range is chosen for each application.\n            Default: (0.5, 1.0)\n        read_fn (Callable[[Any], np.ndarray]): A function that takes an element from\n            `reference_images` and returns a numpy array representing the image.\n            Default: read_rgb_image (reads image file from disk)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform cannot be directly serialized due to its dependency on external image data.\n        - The effectiveness of the matching depends on the similarity between the input and reference images.\n        - For best results, choose reference images that represent the desired tone and contrast.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.HistogramMatching(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     read_fn=lambda x: x,\n        ...     p=1\n        ... 
)\n        >>> result = transform(image=image)\n        >>> matched_image = result[\"image\"]\n\n    References:\n        - Histogram Matching in scikit-image:\n          https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.5, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n\n    def apply(\n        self: np.ndarray,\n        img: np.ndarray,\n        reference_image: np.ndarray,\n        blend_ratio: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return apply_histogram(img, reference_image, blend_ratio)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"HistogramMatching can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.PixelDistributionAdaptation","title":"class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7f9061366d40>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub]","text":"

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps:
  1. Adjusting the original image to a standard distribution space using a selected transform.
  2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.
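A sketch of these two steps, assuming the "standard" transform type (scikit-learn's StandardScaler); blending with the original image is omitted for brevity, and this is not the library's exact implementation:

Python
import numpy as np
from sklearn.preprocessing import StandardScaler

def adapt_pixels_sketch(img: np.ndarray, ref: np.ndarray) -> np.ndarray:
    # Flatten HWC images into (num_pixels, channels) matrices of floats.
    h, w, c = img.shape
    src_pixels = img.reshape(-1, c).astype(np.float32)
    ref_pixels = ref.reshape(-1, c).astype(np.float32)

    # Step 1: fit a scaler on the source pixels and map them to a standardized space.
    src_scaler = StandardScaler().fit(src_pixels)
    standardized = src_scaler.transform(src_pixels)

    # Step 2: invert with a scaler fitted on the reference pixels, landing in the
    # reference image's pixel-value distribution.
    ref_scaler = StandardScaler().fit(ref_pixels)
    adapted = ref_scaler.inverse_transform(standardized)

    return adapted.reshape(h, w, c)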

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal[\"pca\", \"standard\", \"minmax\"]

Specifies the type of statistical transformation to apply.
  • "pca": Principal Component Analysis
  • "standard": StandardScaler (zero mean and unit variance)
  • "minmax": MinMaxScaler (scales to a fixed range, usually [0, 1])
Default: "pca"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.PixelDistributionAdaptation(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     transform_type=\"standard\",\n...     read_fn=lambda x: x,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> adapted_image = result[\"image\"]\n

References

  • https://github.com/arsenyinfo/qudida
  • https://arxiv.org/abs/1911.11483

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class PixelDistributionAdaptation(ImageOnlyTransform):\n    \"\"\"Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image\n    with that of a reference image. This process involves fitting a simple statistical transformation\n    (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images,\n    transforming the original image with the transformation trained on it, and then applying the inverse\n    transformation using the transform fitted on the reference image. The result is an adapted image\n    that retains the original content while mimicking the pixel value distribution of the reference domain.\n\n    The process can be visualized as two main steps:\n    1. Adjusting the original image to a standard distribution space using a selected transform.\n    2. Moving the adjusted image into the distribution space of the reference image by applying the inverse\n       of the transform fitted on the reference image.\n\n    This technique is especially useful in scenarios where images from different domains (e.g., synthetic\n    vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in\n    image processing tasks.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of objects (typically image paths) that will be\n            converted into images by `read_fn`. These images serve as references for the domain adaptation.\n        blend_ratio (tuple[float, float]): Specifies the minimum and maximum blend ratio for mixing\n            the adapted image with the original. This enhances the diversity of the output images.\n            Values should be in the range [0, 1]. Default: (0.25, 1.0)\n        read_fn (Callable): A user-defined function for reading and converting the objects in\n            `reference_images` into numpy arrays. By default, it assumes these objects are image paths.\n        transform_type (Literal[\"pca\", \"standard\", \"minmax\"]): Specifies the type of statistical\n            transformation to apply.\n            - \"pca\": Principal Component Analysis\n            - \"standard\": StandardScaler (zero mean and unit variance)\n            - \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])\n            Default: \"pca\"\n        p (float): The probability of applying the transform to any given image. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effectiveness of the adaptation depends on the similarity between the input and reference domains.\n        - PCA transformation may alter color relationships more significantly than other methods.\n        - StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.\n        - The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.\n        - This transform cannot be directly serialized due to its dependency on external image data.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PixelDistributionAdaptation(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     
transform_type=\"standard\",\n        ...     read_fn=lambda x: x,\n        ...     p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> adapted_image = result[\"image\"]\n\n    References:\n        - https://github.com/arsenyinfo/qudida\n        - https://arxiv.org/abs/1911.11483\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.25, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"] = \"pca\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n        self.transform_type = transform_type\n\n    def apply(self, img: np.ndarray, reference_image: np.ndarray, blend_ratio: float, **params: Any) -> np.ndarray:\n        return adapt_pixel_distribution(\n            img,\n            ref=reference_image,\n            weight=blend_ratio,\n            transform_type=self.transform_type,\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\", \"transform_type\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"PixelDistributionAdaptation can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.TemplateTransform","title":"class TemplateTransform (templates, img_weight=(0.5, 0.5), template_weight=None, template_transform=None, name=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply blending of input image with specified templates.

This transform overlays one or more template images onto the input image using alpha blending. It allows for creating complex composite images or simulating various visual effects.

Parameters:

Name Type Description templates numpy array | list[np.ndarray]

Images to use as templates for the transform. If a single numpy array is provided, it will be used as the only template. If a list of numpy arrays is provided, one will be randomly chosen for each application.

img_weight tuple[float, float] | float

Weight of the original image in the blend. If a single float, that value will always be used. If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application. To use a fixed weight, use (weight, weight). Default: (0.5, 0.5).

template_transform A.Compose | None

A composition of Albumentations transforms to apply to the template before blending. This should be an instance of A.Compose containing one or more Albumentations transforms. Default: None.

name str | None

Name of the transform instance. Used for serialization purposes. Default: None.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The template(s) must have the same number of channels as the input image or be single-channel.
  • If a single-channel template is used with a multi-channel image, the template will be replicated across all channels.
  • The template(s) will be resized to match the input image size if they differ.
  • To make this transform serializable, provide a name when initializing it.

Mathematical Formulation: Given:
  • I: Input image
  • T: Template image
  • w_i: Weight of input image (sampled from img_weight)

The blended image B is computed as:

B = w_i * I + (1 - w_i) * T
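A tiny numeric illustration of this blend (assuming float images in [0, 1]; real usage goes through the transform as in the examples below):

Python
import numpy as np

I = np.full((2, 2), 0.8, dtype=np.float32)   # input image
T = np.full((2, 2), 0.2, dtype=np.float32)   # template
w_i = 0.6                                    # weight sampled from img_weight

B = w_i * I + (1 - w_i) * T                  # 0.6 * 0.8 + 0.4 * 0.2 = 0.56 everywhere
assert np.allclose(B, 0.56)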

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-a-single-template","title":"Apply template transform with a single template","text":"Python
>>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-multiple-templates-and-custom-weights","title":"Apply template transform with multiple templates and custom weights","text":"Python
>>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n>>> transform = A.TemplateTransform(\n...     templates=templates,\n...     img_weight=(0.3, 0.7),\n...     name=\"multi_template_transform\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-additional-transforms-on-the-template","title":"Apply template transform with additional transforms on the template","text":"Python
>>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n>>> transform = A.TemplateTransform(\n...     templates=template,\n...     img_weight=0.6,\n...     template_transform=template_transform,\n...     name=\"transformed_template\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Image blending: https://en.wikipedia.org/wiki/Image_blending

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class TemplateTransform(ImageOnlyTransform):\n    \"\"\"Apply blending of input image with specified templates.\n\n    This transform overlays one or more template images onto the input image using alpha blending.\n    It allows for creating complex composite images or simulating various visual effects.\n\n    Args:\n        templates (numpy array | list[np.ndarray]): Images to use as templates for the transform.\n            If a single numpy array is provided, it will be used as the only template.\n            If a list of numpy arrays is provided, one will be randomly chosen for each application.\n\n        img_weight (tuple[float, float]  | float): Weight of the original image in the blend.\n            If a single float, that value will always be used.\n            If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application.\n            To use a fixed weight, use (weight, weight).\n            Default: (0.5, 0.5).\n\n        template_transform (A.Compose | None): A composition of Albumentations transforms to apply to the template\n            before blending.\n            This should be an instance of A.Compose containing one or more Albumentations transforms.\n            Default: None.\n\n        name (str | None): Name of the transform instance. Used for serialization purposes.\n            Default: None.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The template(s) must have the same number of channels as the input image or be single-channel.\n        - If a single-channel template is used with a multi-channel image, the template will be replicated across\n          all channels.\n        - The template(s) will be resized to match the input image size if they differ.\n        - To make this transform serializable, provide a name when initializing it.\n\n    Mathematical Formulation:\n        Given:\n        - I: Input image\n        - T: Template image\n        - w_i: Weight of input image (sampled from img_weight)\n\n        The blended image B is computed as:\n\n        B = w_i * I + (1 - w_i) * T\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply template transform with a single template\n        >>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with multiple templates and custom weights\n        >>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n        >>> transform = A.TemplateTransform(\n        ...     templates=templates,\n        ...     img_weight=(0.3, 0.7),\n        ...     name=\"multi_template_transform\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with additional transforms on the template\n        >>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n        >>> transform = A.TemplateTransform(\n        ...     templates=template,\n        ...     img_weight=0.6,\n        ...     template_transform=template_transform,\n        ...     
name=\"transformed_template\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Image blending: https://en.wikipedia.org/wiki/Image_blending\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        templates: np.ndarray | Sequence[np.ndarray]\n        img_weight: ZeroOneRangeType\n        template_weight: ZeroOneRangeType | None = Field(\n            deprecated=\"Template_weight is deprecated. Computed automatically as (1 - img_weight)\",\n        )\n        template_transform: Compose | BasicTransform | None = None\n        name: str | None\n\n        @field_validator(\"templates\")\n        @classmethod\n        def validate_templates(cls, v: np.ndarray | list[np.ndarray]) -> list[np.ndarray]:\n            if isinstance(v, np.ndarray):\n                return [v]\n            if isinstance(v, list):\n                if not all(isinstance(item, np.ndarray) for item in v):\n                    msg = \"All templates must be numpy arrays.\"\n                    raise ValueError(msg)\n                return v\n            msg = \"Templates must be a numpy array or a list of numpy arrays.\"\n            raise TypeError(msg)\n\n    def __init__(\n        self,\n        templates: np.ndarray | list[np.ndarray],\n        img_weight: ScaleFloatType = (0.5, 0.5),\n        template_weight: None = None,\n        template_transform: Compose | BasicTransform | None = None,\n        name: str | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.templates = templates\n        self.img_weight = cast(tuple[float, float], img_weight)\n        self.template_transform = template_transform\n        self.name = name\n\n    def apply(\n        self,\n        img: np.ndarray,\n        template: np.ndarray,\n        img_weight: float,\n        **params: Any,\n    ) -> np.ndarray:\n        if img_weight == 0:\n            return template\n        if img_weight == 1:\n            return img\n\n        return add_weighted(img, img_weight, template, 1 - img_weight)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"img_weight\": self.py_random.uniform(*self.img_weight),\n        }\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        template = self.py_random.choice(self.templates)\n\n        if self.template_transform is not None:\n            template = self.template_transform(image=template)[\"image\"]\n\n        if get_num_channels(template) not in [1, get_num_channels(image)]:\n            msg = (\n                \"Template must be a single channel or \"\n                \"has the same number of channels as input \"\n                f\"image ({get_num_channels(image)}), got {get_num_channels(template)}\"\n            )\n            raise ValueError(msg)\n\n        if template.dtype != image.dtype:\n            msg = \"Image and template must be the same image type\"\n            raise ValueError(msg)\n\n        if image.shape[:2] != template.shape[:2]:\n            template = fgeometric.resize(template, image.shape[:2], interpolation=cv2.INTER_AREA)\n\n        if get_num_channels(template) == 1 and get_num_channels(image) > 1:\n            # Replicate single 
channel template across all channels to match input image\n            template = cv2.merge([template] * get_num_channels(image))\n        # in order to support grayscale image with dummy dim\n        template = template.reshape(image.shape)\n\n        return {\"template\": template}\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a TemplateTransform serializable you should provide the `name` argument, \"\n                \"e.g. `TemplateTransform(name='my_transform', ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n
"},{"location":"api_reference/augmentations/dropout/","title":"Index","text":"
  • ChannelDropout augmentation (albumentations.augmentations.dropout.channel_dropout)
  • CoarseDropout augmentation (albumentations.augmentations.dropout.coarse_dropout)
  • GridDropout augmentation (albumentations.augmentations.dropout.grid_dropout)
  • MaskDropout augmentation (albumentations.augmentations.dropout.mask_dropout)
"},{"location":"api_reference/augmentations/dropout/channel_dropout/","title":"ChannelDropout augmentation (augmentations.dropout.channel_dropout)","text":""},{"location":"api_reference/augmentations/dropout/channel_dropout/#albumentations.augmentations.dropout.channel_dropout.ChannelDropout","title":"class ChannelDropout (channel_drop_range=(1, 1), fill_value=None, fill=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly drop channels in the input image.

This transform randomly selects a number of channels to drop from the input image and replaces them with a specified fill value. This can improve model robustness to missing or corrupted channels.

The technique is conceptually similar to:
  • Dropout layers in neural networks, which randomly set input units to 0 during training.
  • CoarseDropout augmentation, which drops out regions in the spatial dimensions of the image.

However, ChannelDropout operates on the channel dimension, effectively "dropping out" entire color channels or feature maps.
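The core operation is simple to picture: overwrite the selected channels of an HWC image with the fill value. An illustrative sketch (the function name here is hypothetical, not the library's channel_dropout helper):

Python
import numpy as np

def drop_channels_sketch(img: np.ndarray, channels_to_drop: tuple[int, ...], fill: float = 0) -> np.ndarray:
    # Copy so the original image is left untouched, then overwrite the chosen channels.
    out = img.copy()
    out[..., list(channels_to_drop)] = fill
    return out

img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
dropped = drop_channels_sketch(img, channels_to_drop=(1,), fill=128)
assert np.all(dropped[..., 1] == 128)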

Parameters:

Name Type Description channel_drop_range tuple[int, int]

Range from which to choose the number of channels to drop. The actual number will be randomly selected from the inclusive range [min, max]. Default: (1, 1).

fill float

Pixel value used to fill the dropped channels. Default: 0.

p float

Probability of applying the transform. Must be in the range [0, 1]. Default: 0.5.

Exceptions:

Type Description NotImplementedError

If the input image has only one channel.

ValueError

If the upper bound of channel_drop_range is greater than or equal to the number of channels in the input image.

Targets

image, volume

Image types: uint8, float32

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ChannelDropout(channel_drop_range=(1, 2), fill=128, p=1.0)\n>>> result = transform(image=image)\n>>> dropped_image = result['image']\n>>> assert dropped_image.shape == image.shape\n>>> assert np.any(dropped_image != image)  # Some channels should be different\n

Note

  • The number of channels to drop is randomly chosen within the specified range.
  • Channels are randomly selected for dropping.
  • This transform is not applicable to single-channel (grayscale) images.
  • The transform will raise an error if it's not possible to drop the specified number of channels (e.g., trying to drop 3 channels from an RGB image).
  • This augmentation can be particularly useful for training models to be robust against missing or corrupted channel data in multi-spectral or hyperspectral imagery.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/dropout/channel_dropout.py Python
class ChannelDropout(ImageOnlyTransform):\n    \"\"\"Randomly drop channels in the input image.\n\n    This transform randomly selects a number of channels to drop from the input image\n    and replaces them with a specified fill value. This can improve model robustness\n    to missing or corrupted channels.\n\n    The technique is conceptually similar to:\n    - Dropout layers in neural networks, which randomly set input units to 0 during training.\n    - CoarseDropout augmentation, which drops out regions in the spatial dimensions of the image.\n\n    However, ChannelDropout operates on the channel dimension, effectively \"dropping out\"\n    entire color channels or feature maps.\n\n    Args:\n        channel_drop_range (tuple[int, int]): Range from which to choose the number\n            of channels to drop. The actual number will be randomly selected from\n            the inclusive range [min, max]. Default: (1, 1).\n        fill (float): Pixel value used to fill the dropped channels.\n            Default: 0.\n        p (float): Probability of applying the transform. Must be in the range\n            [0, 1]. Default: 0.5.\n\n    Raises:\n        NotImplementedError: If the input image has only one channel.\n        ValueError: If the upper bound of channel_drop_range is greater than or\n            equal to the number of channels in the input image.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ChannelDropout(channel_drop_range=(1, 2), fill=128, p=1.0)\n        >>> result = transform(image=image)\n        >>> dropped_image = result['image']\n        >>> assert dropped_image.shape == image.shape\n        >>> assert np.any(dropped_image != image)  # Some channels should be different\n\n    Note:\n        - The number of channels to drop is randomly chosen within the specified range.\n        - Channels are randomly selected for dropping.\n        - This transform is not applicable to single-channel (grayscale) images.\n        - The transform will raise an error if it's not possible to drop the specified\n          number of channels (e.g., trying to drop 3 channels from an RGB image).\n        - This augmentation can be particularly useful for training models to be robust\n          against missing or corrupted channel data in multi-spectral or hyperspectral imagery.\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        channel_drop_range: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        fill_value: float | None\n        fill: float\n\n        @model_validator(mode=\"after\")\n        def validate_fill(self) -> Self:\n            if self.fill_value is not None:\n                self.fill = self.fill_value\n                warn(\n                    \"`fill_value` deprecated. 
Use `fill` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n            return self\n\n    def __init__(\n        self,\n        channel_drop_range: tuple[int, int] = (1, 1),\n        fill_value: float | None = None,\n        fill: float = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.channel_drop_range = channel_drop_range\n        self.fill = fill\n\n    def apply(self, img: np.ndarray, channels_to_drop: tuple[int, ...], **params: Any) -> np.ndarray:\n        return channel_dropout(img, channels_to_drop, self.fill)\n\n    def get_params_dependent_on_data(self, params: Mapping[str, Any], data: Mapping[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if num_channels == 1:\n            msg = \"Images has one channel. ChannelDropout is not defined.\"\n            raise NotImplementedError(msg)\n\n        if self.channel_drop_range[1] >= num_channels:\n            msg = \"Can not drop all channels in ChannelDropout.\"\n            raise ValueError(msg)\n\n        num_drop_channels = self.py_random.randint(*self.channel_drop_range)\n\n        channels_to_drop = self.py_random.sample(range(num_channels), k=num_drop_channels)\n\n        return {\"channels_to_drop\": channels_to_drop}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"channel_drop_range\", \"fill\"\n
"},{"location":"api_reference/augmentations/dropout/coarse_dropout/","title":"CoarseDropout augmentation (augmentations.dropout.coarse_dropout)","text":""},{"location":"api_reference/augmentations/dropout/coarse_dropout/#albumentations.augmentations.dropout.coarse_dropout.CoarseDropout","title":"class CoarseDropout (max_holes=None, max_height=None, max_width=None, min_holes=None, min_height=None, min_width=None, fill_value=None, mask_fill_value=None, num_holes_range=(1, 1), hole_height_range=(8, 8), hole_width_range=(8, 8), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

CoarseDropout randomly drops out rectangular regions from the image and, optionally, the corresponding regions in an associated mask, to simulate occlusion and varied object sizes found in real-world settings.

This transformation is an evolution of CutOut and RandomErasing, offering more flexibility in the size, number of dropout regions, and fill values.
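The essence of the augmentation is cutting rectangular holes and filling them. An illustrative single-hole sketch (an assumption about the idea only, not the library's internals, which also handle masks, bboxes, keypoints, and the various fill modes):

Python
import numpy as np

rng = np.random.default_rng(0)
img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Sample a hole size, then a top-left corner so the hole fits inside the image.
hole_h, hole_w = 16, 16
y1 = rng.integers(0, img.shape[0] - hole_h + 1)
x1 = rng.integers(0, img.shape[1] - hole_w + 1)

# Constant fill, analogous to fill=0.
img[y1:y1 + hole_h, x1:x1 + hole_w] = 0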

Parameters:

Name Type Description num_holes_range tuple[int, int]

Range (min, max) for the number of rectangular regions to drop out. Default: (1, 1)

hole_height_range tuple[Real, Real]

Range (min, max) for the height of dropout regions. If int, specifies absolute pixel values. If float, interpreted as a fraction of the image height. Default: (8, 8)

hole_width_range tuple[Real, Real]

Range (min, max) for the width of dropout regions. If int, specifies absolute pixel values. If float, interpreted as a fraction of the image width. Default: (8, 8)

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be:
  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel
  • 'random': each pixel is filled with random values
  • 'random_uniform': each hole is filled with a single random color
  • 'inpaint_telea': uses OpenCV Telea inpainting method
  • 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method
Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the mask. If None, mask regions corresponding to image dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges for each application.
  • When using float values for hole_height_range and hole_width_range, ensure they are between 0 and 1.
  • This implementation includes deprecation warnings for older parameter names (min_holes, max_holes, etc.).
  • Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.
  • For 'random_uniform' fill, each hole gets a single random color, unlike 'random' where each pixel gets its own random value.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> # Example with random uniform fill\n>>> aug_random = A.CoarseDropout(\n...     num_holes_range=(3, 6),\n...     hole_height_range=(10, 20),\n...     hole_width_range=(10, 20),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> # Example with inpainting\n>>> aug_inpaint = A.CoarseDropout(\n...     num_holes_range=(3, 6),\n...     hole_height_range=(10, 20),\n...     hole_width_range=(10, 20),\n...     fill=\"inpaint_ns\",\n...     p=1.0\n... )\n>>> transformed = aug_random(image=image, mask=mask)\n>>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n

References

  • CutOut: https://arxiv.org/abs/1708.04552
  • Random Erasing: https://arxiv.org/abs/1708.04896
  • OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html

Source code in albumentations/augmentations/dropout/coarse_dropout.py Python
class CoarseDropout(BaseDropout):\n    \"\"\"CoarseDropout randomly drops out rectangular regions from the image and optionally,\n    the corresponding regions in an associated mask, to simulate occlusion and\n    varied object sizes found in real-world settings.\n\n    This transformation is an evolution of CutOut and RandomErasing, offering more\n    flexibility in the size, number of dropout regions, and fill values.\n\n    Args:\n        num_holes_range (tuple[int, int]): Range (min, max) for the number of rectangular\n            regions to drop out. Default: (1, 1)\n        hole_height_range (tuple[Real, Real]): Range (min, max) for the height\n            of dropout regions. If int, specifies absolute pixel values. If float,\n            interpreted as a fraction of the image height. Default: (8, 8)\n        hole_width_range (tuple[Real, Real]): Range (min, max) for the width\n            of dropout regions. If int, specifies absolute pixel values. If float,\n            interpreted as a fraction of the image width. Default: (8, 8)\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        fill_mask (ColorType | None): Fill value for dropout regions in the mask.\n            If None, mask regions corresponding to image dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual number and size of dropout regions are randomly chosen within the specified ranges for each\n            application.\n        - When using float values for hole_height_range and hole_width_range, ensure they are between 0 and 1.\n        - This implementation includes deprecation warnings for older parameter names (min_holes, max_holes, etc.).\n        - Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.\n        - For 'random_uniform' fill, each hole gets a single random color, unlike 'random' where each pixel\n            gets its own random value.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> # Example with random uniform fill\n        >>> aug_random = A.CoarseDropout(\n        ...     num_holes_range=(3, 6),\n        ...     hole_height_range=(10, 20),\n        ...     hole_width_range=(10, 20),\n        ...     fill=\"random_uniform\",\n        ...     p=1.0\n        ... )\n        >>> # Example with inpainting\n        >>> aug_inpaint = A.CoarseDropout(\n        ...     num_holes_range=(3, 6),\n        ...     hole_height_range=(10, 20),\n        ...     hole_width_range=(10, 20),\n        ...     fill=\"inpaint_ns\",\n        ...     p=1.0\n        ... 
)\n        >>> transformed = aug_random(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n\n    References:\n        - CutOut: https://arxiv.org/abs/1708.04552\n        - Random Erasing: https://arxiv.org/abs/1708.04896\n        - OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        min_holes: int | None = Field(ge=0)\n        max_holes: int | None = Field(ge=0)\n        num_holes_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n\n        min_height: ScalarType | None = Field(ge=0)\n        max_height: ScalarType | None = Field(ge=0)\n        hole_height_range: Annotated[\n            tuple[ScalarType, ScalarType],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(1, None)),\n        ]\n\n        min_width: ScalarType | None = Field(ge=0)\n        max_width: ScalarType | None = Field(ge=0)\n        hole_width_range: Annotated[\n            tuple[ScalarType, ScalarType],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(1, None)),\n        ]\n\n        @staticmethod\n        def update_range(\n            min_value: Number | None,\n            max_value: Number | None,\n            default_range: tuple[Number, Number],\n        ) -> tuple[Number, Number]:\n            return (min_value or max_value, max_value) if max_value is not None else default_range\n\n        @staticmethod\n        def validate_range(range_value: tuple[float, float], range_name: str, minimum: float = 0) -> None:\n            if not minimum <= range_value[0] <= range_value[1]:\n                raise ValueError(\n                    f\"First value in {range_name} should be less or equal than the second value \"\n                    f\"and at least {minimum}. Got: {range_value}\",\n                )\n            if isinstance(range_value[0], float) and not all(0 <= x <= 1 for x in range_value):\n                raise ValueError(f\"All values in {range_name} should be in [0, 1] range. Got: {range_value}\")\n\n        @model_validator(mode=\"after\")\n        def check_num_holes_and_dimensions(self) -> Self:\n            if self.min_holes is not None:\n                warn(\"`min_holes` is deprecated. Use num_holes_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_holes is not None:\n                warn(\"`max_holes` is deprecated. Use num_holes_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.min_height is not None:\n                warn(\"`min_height` is deprecated. Use hole_height_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_height is not None:\n                warn(\"`max_height` is deprecated. Use hole_height_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.min_width is not None:\n                warn(\"`min_width` is deprecated. Use hole_width_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_width is not None:\n                warn(\"`max_width` is deprecated. 
Use hole_width_range instead.\", DeprecationWarning, stacklevel=2)\n\n            if self.max_holes is not None:\n                self.num_holes_range = self.update_range(self.min_holes, self.max_holes, self.num_holes_range)\n\n            self.validate_range(self.num_holes_range, \"num_holes_range\", minimum=1)\n\n            if self.max_height is not None:\n                self.hole_height_range = self.update_range(self.min_height, self.max_height, self.hole_height_range)\n            self.validate_range(self.hole_height_range, \"hole_height_range\")\n\n            if self.max_width is not None:\n                self.hole_width_range = self.update_range(self.min_width, self.max_width, self.hole_width_range)\n            self.validate_range(self.hole_width_range, \"hole_width_range\")\n\n            return self\n\n    def __init__(\n        self,\n        max_holes: int | None = None,\n        max_height: ScalarType | None = None,\n        max_width: ScalarType | None = None,\n        min_holes: int | None = None,\n        min_height: ScalarType | None = None,\n        min_width: ScalarType | None = None,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        num_holes_range: tuple[int, int] = (1, 1),\n        hole_height_range: tuple[ScalarType, ScalarType] = (8, 8),\n        hole_width_range: tuple[ScalarType, ScalarType] = (8, 8),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.num_holes_range = num_holes_range\n        self.hole_height_range = hole_height_range\n        self.hole_width_range = hole_width_range\n\n    def calculate_hole_dimensions(\n        self,\n        image_shape: tuple[int, int],\n        height_range: tuple[float, float],\n        width_range: tuple[float, float],\n        size: int,\n    ) -> tuple[np.ndarray, np.ndarray]:\n        \"\"\"Calculate random hole dimensions based on the provided ranges.\"\"\"\n        height, width = image_shape[:2]\n\n        if isinstance(height_range[0], int):\n            min_height = height_range[0]\n            max_height = min(height_range[1], height)\n\n            min_width = width_range[0]\n            max_width = min(width_range[1], width)\n\n            hole_heights = self.random_generator.integers(int(min_height), int(max_height + 1), size=size)\n            hole_widths = self.random_generator.integers(int(min_width), int(max_width + 1), size=size)\n\n        else:  # Assume float\n            hole_heights = (height * self.random_generator.uniform(*height_range, size=size)).astype(int)\n            hole_widths = (width * self.random_generator.uniform(*width_range, size=size)).astype(int)\n\n        return hole_heights, hole_widths\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        num_holes = self.py_random.randint(*self.num_holes_range)\n\n        hole_heights, hole_widths = self.calculate_hole_dimensions(\n            image_shape,\n            self.hole_height_range,\n            self.hole_width_range,\n            size=num_holes,\n        )\n\n        height, width = image_shape[:2]\n\n        y_min = self.random_generator.integers(0, height - hole_heights + 1, size=num_holes)\n        x_min = self.random_generator.integers(0, width - hole_widths + 1, 
size=num_holes)\n        y_max = y_min + hole_heights\n        x_max = x_min + hole_widths\n\n        holes = np.stack([x_min, y_min, x_max, y_max], axis=-1)\n\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"num_holes_range\", \"hole_height_range\", \"hole_width_range\")\n
"},{"location":"api_reference/augmentations/dropout/coarse_dropout/#albumentations.augmentations.dropout.coarse_dropout.Erasing","title":"class Erasing (scale=(0.02, 0.33), ratio=(0.3, 3.3), fill=0, fill_mask=None, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.

This augmentation helps improve model robustness by randomly masking out rectangular regions in the image, simulating occlusions and encouraging the model to learn from partial information. It's particularly effective for image classification and person re-identification tasks.

Parameters:

Name Type Description scale tuple[float, float]

Range for the proportion of image area to erase. The actual area will be randomly sampled from (scale[0] * image_area, scale[1] * image_area). Default: (0.02, 0.33)

ratio tuple[float, float]

Range for the aspect ratio (width/height) of the erased region. The actual ratio will be randomly sampled from (ratio[0], ratio[1]). Default: (0.3, 3.3)

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value used to fill the erased regions. Can be: - int or float: fills all channels with this value - tuple: fills each channel with corresponding value - \"random\": fills each pixel with random values - \"random_uniform\": fills entire erased region with a single random color - \"inpaint_telea\": uses OpenCV Telea inpainting method - \"inpaint_ns\": uses OpenCV Navier-Stokes inpainting method Default: 0

fill_mask ColorType | None

Value used to fill erased regions in the mask. If None, mask regions are not modified. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If no region satisfying the sampled area and aspect ratio constraints can be found, no erasing is performed.
  • The actual erased area and aspect ratio are randomly sampled within the specified ranges for each application.
  • When using inpainting methods, only grayscale or RGB images are supported.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Basic usage with default parameters\n>>> transform = A.Erasing()\n>>> transformed = transform(image=image)\n>>> # Custom configuration\n>>> transform = A.Erasing(\n...     scale=(0.1, 0.4),\n...     ratio=(0.5, 2.0),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> transformed = transform(image=image)\n

References

  • Paper: https://arxiv.org/abs/1708.04896
  • Implementation inspired by torchvision: https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.RandomErasing

Source code in albumentations/augmentations/dropout/coarse_dropout.py Python
class Erasing(BaseDropout):\n    \"\"\"Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.\n\n    This augmentation helps improve model robustness by randomly masking out rectangular regions in the image,\n    simulating occlusions and encouraging the model to learn from partial information. It's particularly\n    effective for image classification and person re-identification tasks.\n\n    Args:\n        scale (tuple[float, float]): Range for the proportion of image area to erase.\n            The actual area will be randomly sampled from (scale[0] * image_area, scale[1] * image_area).\n            Default: (0.02, 0.33)\n        ratio (tuple[float, float]): Range for the aspect ratio (width/height) of the erased region.\n            The actual ratio will be randomly sampled from (ratio[0], ratio[1]).\n            Default: (0.3, 3.3)\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value used to fill the erased regions. Can be:\n            - int or float: fills all channels with this value\n            - tuple: fills each channel with corresponding value\n            - \"random\": fills each pixel with random values\n            - \"random_uniform\": fills entire erased region with a single random color\n            - \"inpaint_telea\": uses OpenCV Telea inpainting method\n            - \"inpaint_ns\": uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        mask_fill (ColorType | None): Value used to fill erased regions in the mask.\n            If None, mask regions are not modified. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform attempts to find valid erasing parameters up to 10 times.\n          If unsuccessful, no erasing is performed.\n        - The actual erased area and aspect ratio are randomly sampled within\n          the specified ranges for each application.\n        - When using inpainting methods, only grayscale or RGB images are supported.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Basic usage with default parameters\n        >>> transform = A.Erasing()\n        >>> transformed = transform(image=image)\n        >>> # Custom configuration\n        >>> transform = A.Erasing(\n        ...     scale=(0.1, 0.4),\n        ...     ratio=(0.5, 2.0),\n        ...     fill_value=\"random_uniform\",\n        ...     p=1.0\n        ... 
)\n        >>> transformed = transform(image=image)\n\n    References:\n        - Paper: https://arxiv.org/abs/1708.04896\n        - Implementation inspired by torchvision:\n          https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.RandomErasing\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        scale: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n        ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n\n    def __init__(\n        self,\n        scale: tuple[float, float] = (0.02, 0.33),\n        ratio: tuple[float, float] = (0.3, 3.3),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n\n        self.scale = scale\n        self.ratio = ratio\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Calculate erasing parameters using direct mathematical derivation.\n\n        Given:\n        - Image dimensions (H, W)\n        - Target area (A)\n        - Aspect ratio (r = w/h)\n\n        We know:\n        - h * w = A (area equation)\n        - w = r * h (aspect ratio equation)\n\n        Therefore:\n        - h * (r * h) = A\n        - h\u00b2 = A/r\n        - h = sqrt(A/r)\n        - w = r * sqrt(A/r) = sqrt(A*r)\n        \"\"\"\n        height, width = params[\"shape\"][:2]\n        total_area = height * width\n\n        # Calculate maximum valid area based on dimensions and aspect ratio\n        max_area = total_area * self.scale[1]\n        min_area = total_area * self.scale[0]\n\n        # For each aspect ratio r, the maximum area is constrained by:\n        # h = sqrt(A/r) \u2264 H and w = sqrt(A*r) \u2264 W\n        # Therefore: A \u2264 min(r*H\u00b2, W\u00b2/r)\n        r_min, r_max = self.ratio\n\n        def area_constraint_h(r: float) -> float:\n            return r * height * height\n\n        def area_constraint_w(r: float) -> float:\n            return width * width / r\n\n        # Find maximum valid area considering aspect ratio constraints\n        max_area_h = min(area_constraint_h(r_min), area_constraint_h(r_max))\n        max_area_w = min(area_constraint_w(r_min), area_constraint_w(r_max))\n        max_valid_area = min(max_area, max_area_h, max_area_w)\n\n        if max_valid_area < min_area:\n            return {\"holes\": np.array([], dtype=np.int32).reshape((0, 4))}\n\n        # Sample valid area and aspect ratio\n        erase_area = self.py_random.uniform(min_area, max_valid_area)\n\n        # Calculate valid aspect ratio range for this area\n        max_r = min(r_max, width * width / erase_area)\n        min_r = max(r_min, erase_area / (height * height))\n\n        if min_r > max_r:\n            return {\"holes\": np.array([], dtype=np.int32).reshape((0, 4))}\n\n        aspect_ratio = self.py_random.uniform(min_r, max_r)\n\n        # Calculate dimensions\n        h = int(round(np.sqrt(erase_area / aspect_ratio)))\n        w = int(round(np.sqrt(erase_area * aspect_ratio)))\n\n        # Sample position\n        top = self.py_random.randint(0, height - h)\n        left = self.py_random.randint(0, width - w)\n\n        holes = np.array([[left, top, left + w, top 
+ h]], dtype=np.int32)\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"scale\", \"ratio\", \"fill\", \"fill_mask\"\n
"},{"location":"api_reference/augmentations/dropout/functional/","title":"Geometric functional transforms (augmentations.dropout.functional)","text":""},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.apply_inpainting","title":"def apply_inpainting (img, holes, method) [view source on GitHub]","text":"

Apply OpenCV inpainting to fill the holes in the image.

Parameters:

Name Type Description img np.ndarray

Input image (grayscale or BGR)

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

method InpaintMethod

Inpainting method to use (\"inpaint_telea\" or \"inpaint_ns\")

Returns:

Type Description np.ndarray

Inpainted image

Exceptions:

Type Description NotImplementedError

If image has more than 3 channels
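
For illustration, a minimal usage sketch (the image and hole coordinates are arbitrary; the function is assumed importable from albumentations.augmentations.dropout.functional, the module listed below):

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import apply_inpainting
>>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> holes = np.array([[10, 10, 30, 30]])  # one [x1, y1, x2, y2] region to inpaint
>>> result = apply_inpainting(img, holes, "inpaint_telea")
>>> result.shape
(100, 100, 3)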

Source code in albumentations/augmentations/dropout/functional.py Python
@uint8_io\ndef apply_inpainting(img: np.ndarray, holes: np.ndarray, method: InpaintMethod) -> np.ndarray:\n    \"\"\"Apply OpenCV inpainting to fill the holes in the image.\n\n    Args:\n        img: Input image (grayscale or BGR)\n        holes: Array of [x1, y1, x2, y2] coordinates\n        method: Inpainting method to use (\"inpaint_telea\" or \"inpaint_ns\")\n\n    Returns:\n        np.ndarray: Inpainted image\n\n    Raises:\n        NotImplementedError: If image has more than 3 channels\n    \"\"\"\n    num_channels = get_num_channels(img)\n    # Create inpainting mask\n    mask = np.zeros(img.shape[:2], dtype=np.uint8)\n    for x_min, y_min, x_max, y_max in holes:\n        mask[y_min:y_max, x_min:x_max] = 255\n\n    inpaint_method = cv2.INPAINT_TELEA if method == \"inpaint_telea\" else cv2.INPAINT_NS\n\n    # Handle grayscale images by converting to 3 channels and back\n    if num_channels == 1:\n        if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n            img = img.squeeze()\n        img_3ch = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)\n        result = cv2.inpaint(img_3ch, mask, 3, inpaint_method)\n        return (\n            cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)[..., None]\n            if num_channels == NUM_MULTI_CHANNEL_DIMENSIONS\n            else cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)\n        )\n\n    return cv2.inpaint(img, mask, 3, inpaint_method)\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.calculate_grid_dimensions","title":"def calculate_grid_dimensions (image_shape, unit_size_range, holes_number_xy, random_generator) [view source on GitHub]","text":"

Calculate the dimensions of grid units for GridDropout.

This function determines the size of grid units based on the input parameters. It supports three modes of operation:

  1. Using a range of unit sizes
  2. Using a specified number of holes in x and y directions
  3. Falling back to a default calculation

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

unit_size_range tuple[int, int] | None

A range of possible unit sizes. If provided, a random size within this range will be chosen for both height and width.

holes_number_xy tuple[int, int] | None

The number of holes in the x and y directions. If provided, the grid dimensions will be calculated to fit this number of holes.

random_generator np.random.Generator

The random generator to use for generating random values.

Returns:

Type Description tuple[int, int]

The calculated grid unit dimensions as (unit_height, unit_width).

Exceptions:

Type Description ValueError

If the upper limit of unit_size_range is greater than the shortest image edge.

Notes

  • If both unit_size_range and holes_number_xy are None, the function falls back to a default calculation, where the grid unit size is set to max(2, image_dimension // 10) for both height and width.
  • The function prioritizes unit_size_range over holes_number_xy if both are provided.
  • When using holes_number_xy, the actual number of holes may be slightly different due to integer division.

Examples:

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(42)\n>>> image_shape = (100, 200)\n>>> calculate_grid_dimensions(image_shape, unit_size_range=(10, 20), holes_number_xy=None, random_generator=rng)\n(15, 15)  # Random value between 10 and 20\n
Python
>>> calculate_grid_dimensions(image_shape, unit_size_range=None, holes_number_xy=(5, 10), random_generator=rng)\n(10, 40)  # height // holes_number_y and width // holes_number_x\n
Python
>>> calculate_grid_dimensions(image_shape, unit_size_range=None, holes_number_xy=None, random_generator=rng)\n(10, 20)  # Default calculation: max(2, dimension // 10)\n
Source code in albumentations/augmentations/dropout/functional.py Python
def calculate_grid_dimensions(\n    image_shape: tuple[int, int],\n    unit_size_range: tuple[int, int] | None,\n    holes_number_xy: tuple[int, int] | None,\n    random_generator: np.random.Generator,\n) -> tuple[int, int]:\n    \"\"\"Calculate the dimensions of grid units for GridDropout.\n\n    This function determines the size of grid units based on the input parameters.\n    It supports three modes of operation:\n    1. Using a range of unit sizes\n    2. Using a specified number of holes in x and y directions\n    3. Falling back to a default calculation\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        unit_size_range (tuple[int, int] | None, optional): A range of possible unit sizes.\n            If provided, a random size within this range will be chosen for both height and width.\n        holes_number_xy (tuple[int, int] | None, optional): The number of holes in the x and y directions.\n            If provided, the grid dimensions will be calculated to fit this number of holes.\n        random_generator (np.random.Generator): The random generator to use for generating random values.\n\n    Returns:\n        tuple[int, int]: The calculated grid unit dimensions as (unit_height, unit_width).\n\n    Raises:\n        ValueError: If the upper limit of unit_size_range is greater than the shortest image edge.\n\n    Notes:\n        - If both unit_size_range and holes_number_xy are None, the function falls back to a default calculation,\n          where the grid unit size is set to max(2, image_dimension // 10) for both height and width.\n        - The function prioritizes unit_size_range over holes_number_xy if both are provided.\n        - When using holes_number_xy, the actual number of holes may be slightly different due to integer division.\n\n    Examples:\n        >>> image_shape = (100, 200)\n        >>> calculate_grid_dimensions(image_shape, unit_size_range=(10, 20))\n        (15, 15)  # Random value between 10 and 20\n\n        >>> calculate_grid_dimensions(image_shape, holes_number_xy=(5, 10))\n        (20, 20)  # 100 // 5 and 200 // 10\n\n        >>> calculate_grid_dimensions(image_shape)\n        (10, 20)  # Default calculation: max(2, dimension // 10)\n    \"\"\"\n    height, width = image_shape[:2]\n\n    if unit_size_range is not None:\n        if unit_size_range[1] > min(image_shape[:2]):\n            raise ValueError(\"Grid size limits must be within the shortest image edge.\")\n        unit_size = random_generator.integers(*unit_size_range)\n        return unit_size, unit_size\n\n    if holes_number_xy:\n        holes_number_x, holes_number_y = holes_number_xy\n        unit_width = width // holes_number_x\n        unit_height = height // holes_number_y\n        return unit_height, unit_width\n\n    # Default fallback\n    unit_width = max(2, width // 10)\n    unit_height = max(2, height // 10)\n    return unit_height, unit_width\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.cutout","title":"def cutout (img, holes, fill_value, random_generator) [view source on GitHub]","text":"

Apply cutout augmentation to the image by cutting out holes and filling them.

Parameters:

Name Type Description img np.ndarray

The image to augment

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

fill_value DropoutFillValue

Value to fill holes with. Can be: - number (int/float): Will be broadcast to all channels - sequence (tuple/list/ndarray): Must match number of channels - \"random\": Different random values for each pixel - \"random_uniform\": Same random value for entire hole - \"inpaint_telea\"/\"inpaint_ns\": OpenCV inpainting methods

random_generator np.random.Generator

Random number generator for random fills

Exceptions:

Type Description ValueError

If fill_value length doesn't match number of channels
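
A minimal sketch of a constant fill (coordinates and seed are arbitrary; the import path follows the source file listed below). The random generator is only consulted for the "random" and "random_uniform" fills:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import cutout
>>> rng = np.random.default_rng(42)
>>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> holes = np.array([[20, 20, 40, 40], [60, 10, 80, 50]])  # [x1, y1, x2, y2] per hole
>>> out = cutout(img, holes, fill_value=0, random_generator=rng)
>>> bool((out[20:40, 20:40] == 0).all())  # first hole filled with 0
True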

Source code in albumentations/augmentations/dropout/functional.py Python
def cutout(\n    img: np.ndarray,\n    holes: np.ndarray,\n    fill_value: DropoutFillValue,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply cutout augmentation to the image by cutting out holes and filling them.\n\n    Args:\n        img: The image to augment\n        holes: Array of [x1, y1, x2, y2] coordinates\n        fill_value: Value to fill holes with. Can be:\n            - number (int/float): Will be broadcast to all channels\n            - sequence (tuple/list/ndarray): Must match number of channels\n            - \"random\": Different random values for each pixel\n            - \"random_uniform\": Same random value for entire hole\n            - \"inpaint_telea\"/\"inpaint_ns\": OpenCV inpainting methods\n        random_generator: Random number generator for random fills\n\n    Raises:\n        ValueError: If fill_value length doesn't match number of channels\n    \"\"\"\n    img = img.copy()\n\n    # Handle inpainting methods\n    if isinstance(fill_value, str):\n        if fill_value in {\"inpaint_telea\", \"inpaint_ns\"}:\n            return apply_inpainting(img, holes, cast(InpaintMethod, fill_value))\n        if fill_value == \"random\":\n            return fill_holes_with_random(img, holes, random_generator, uniform=False)\n        if fill_value == \"random_uniform\":\n            return fill_holes_with_random(img, holes, random_generator, uniform=True)\n        raise ValueError(f\"Unsupported string fill_value: {fill_value}\")\n\n    # Convert numeric fill values to numpy array\n    if isinstance(fill_value, (int, float)):\n        fill_array = np.array(fill_value, dtype=img.dtype)\n        return fill_holes_with_value(img, holes, fill_array)\n\n    # Handle sequence fill values\n    fill_array = np.array(fill_value, dtype=img.dtype)\n\n    # For multi-channel images, verify fill_value matches number of channels\n    if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n        fill_array = fill_array.ravel()\n        if fill_array.size != img.shape[2]:\n            raise ValueError(\n                f\"Fill value must have same number of channels as image. \"\n                f\"Got {fill_array.size}, expected {img.shape[2]}\",\n            )\n\n    return fill_holes_with_value(img, holes, fill_array)\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.fill_holes_with_random","title":"def fill_holes_with_random (img, holes, random_generator, uniform) [view source on GitHub]","text":"

Fill holes with random values.

Parameters:

Name Type Description img np.ndarray

Input image

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

random_generator np.random.Generator

Random number generator

uniform bool

If True, use same random value for entire hole
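
A small sketch (arbitrary hole and seed; import path as in the source file listed below). With uniform=True every pixel inside a hole receives the same random color:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import fill_holes_with_random
>>> rng = np.random.default_rng(0)
>>> img = np.zeros((50, 50, 3), dtype=np.uint8)
>>> holes = np.array([[5, 5, 15, 15]])
>>> out = fill_holes_with_random(img, holes, rng, uniform=True)
>>> len(np.unique(out[5:15, 5:15].reshape(-1, 3), axis=0))  # one color inside the hole
1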

Source code in albumentations/augmentations/dropout/functional.py Python
def fill_holes_with_random(\n    img: np.ndarray,\n    holes: np.ndarray,\n    random_generator: np.random.Generator,\n    uniform: bool,\n) -> np.ndarray:\n    \"\"\"Fill holes with random values.\n\n    Args:\n        img: Input image\n        holes: Array of [x1, y1, x2, y2] coordinates\n        random_generator: Random number generator\n        uniform: If True, use same random value for entire hole\n    \"\"\"\n    for x_min, y_min, x_max, y_max in holes:\n        shape = (1,) if uniform else (y_max - y_min, x_max - x_min)\n        if img.ndim != MONO_CHANNEL_DIMENSIONS:\n            shape = (1, img.shape[2]) if uniform else (*shape, img.shape[2])\n\n        random_fill = generate_random_fill(img.dtype, shape, random_generator)\n        img[y_min:y_max, x_min:x_max] = random_fill\n    return img\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.fill_holes_with_value","title":"def fill_holes_with_value (img, holes, fill_value) [view source on GitHub]","text":"

Fill holes with a constant value.

Parameters:

Name Type Description img np.ndarray

Input image

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

fill_value np.ndarray

Value to fill the holes with
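
A small sketch (arbitrary values; import path as in the source file listed below). The fill value is passed as an array matching the image channels:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import fill_holes_with_value
>>> img = np.zeros((50, 50, 3), dtype=np.uint8)
>>> holes = np.array([[10, 10, 20, 20]])
>>> out = fill_holes_with_value(img, holes, np.array([255, 0, 0], dtype=np.uint8))
>>> out[15, 15].tolist()  # pixel inside the hole
[255, 0, 0]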

Source code in albumentations/augmentations/dropout/functional.py Python
def fill_holes_with_value(img: np.ndarray, holes: np.ndarray, fill_value: np.ndarray) -> np.ndarray:\n    \"\"\"Fill holes with a constant value.\n\n    Args:\n        img: Input image\n        holes: Array of [x1, y1, x2, y2] coordinates\n        fill_value: Value to fill the holes with\n    \"\"\"\n    for x_min, y_min, x_max, y_max in holes:\n        img[y_min:y_max, x_min:x_max] = fill_value\n    return img\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.filter_bboxes_by_holes","title":"def filter_bboxes_by_holes (bboxes, holes, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Filter bounding boxes based on their remaining visible area and visibility ratio after intersection with holes.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes, each represented as [x_min, y_min, x_max, y_max].

holes np.ndarray

Array of holes, each represented as [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image (height, width).

min_area float

Minimum remaining visible area to keep the bounding box.

min_visibility float

Minimum visibility ratio to keep the bounding box. Calculated as 1 - (intersection_area / bbox_area).

Returns:

Type Description np.ndarray

Filtered array of bounding boxes.
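
A worked sketch with arbitrary boxes (import path as in the source file listed below): the first box is fully covered by the hole and is removed, while the second is untouched and kept:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import filter_bboxes_by_holes
>>> bboxes = np.array([[10, 10, 50, 50], [60, 60, 90, 90]])  # pixel coordinates
>>> holes = np.array([[0, 0, 55, 55]])  # fully covers the first box
>>> filter_bboxes_by_holes(bboxes, holes, image_shape=(100, 100), min_area=100, min_visibility=0.3)
array([[60, 60, 90, 90]])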

Source code in albumentations/augmentations/dropout/functional.py Python
def filter_bboxes_by_holes(\n    bboxes: np.ndarray,\n    holes: np.ndarray,\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Filter bounding boxes based on their remaining visible area and visibility ratio after intersection with holes.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes, each represented as [x_min, y_min, x_max, y_max].\n        holes (np.ndarray): Array of holes, each represented as [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        min_area (int): Minimum remaining visible area to keep the bounding box.\n        min_visibility (float): Minimum visibility ratio to keep the bounding box.\n            Calculated as 1 - (intersection_area / bbox_area).\n\n    Returns:\n        np.ndarray: Filtered array of bounding boxes.\n    \"\"\"\n    if len(bboxes) == 0 or len(holes) == 0:\n        return bboxes\n\n    # Create a blank mask for holes\n    hole_mask = np.zeros(image_shape, dtype=np.uint8)\n\n    # Fill in the holes on the mask\n    for hole in holes:\n        x_min, y_min, x_max, y_max = hole.astype(int)\n        hole_mask[y_min:y_max, x_min:x_max] = 1\n\n    # Vectorized calculation\n    bboxes_int = bboxes.astype(int)\n    x_min, y_min, x_max, y_max = bboxes_int[:, 0], bboxes_int[:, 1], bboxes_int[:, 2], bboxes_int[:, 3]\n\n    # Calculate box areas\n    box_areas = (x_max - x_min) * (y_max - y_min)\n\n    # Create a mask of the same shape as bboxes\n    mask = np.zeros(len(bboxes), dtype=bool)\n\n    for i in range(len(bboxes)):\n        intersection_area = np.sum(hole_mask[y_min[i] : y_max[i], x_min[i] : x_max[i]])\n        remaining_area = box_areas[i] - intersection_area\n        visibility_ratio = 1 - (intersection_area / box_areas[i])\n        mask[i] = (remaining_area >= min_area) and (visibility_ratio >= min_visibility)\n\n    return bboxes[mask]\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.filter_keypoints_in_holes","title":"def filter_keypoints_in_holes (keypoints, holes) [view source on GitHub]","text":"

Filter out keypoints that are inside any of the holes.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (num_keypoints, 2+). The first two columns are x and y coordinates.

holes np.ndarray

Array of holes with shape (num_holes, 4). Each hole is represented as [x1, y1, x2, y2].

Returns:

Type Description np.ndarray

Array of keypoints that are not inside any hole.
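
A small sketch with arbitrary coordinates (import path as in the source file listed below): the keypoint that falls inside the hole is dropped:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import filter_keypoints_in_holes
>>> keypoints = np.array([[25, 25], [75, 75]])  # (x, y) pairs
>>> holes = np.array([[20, 20, 40, 40]])
>>> filter_keypoints_in_holes(keypoints, holes)
array([[75, 75]])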

Source code in albumentations/augmentations/dropout/functional.py Python
@handle_empty_array(\"keypoints\")\ndef filter_keypoints_in_holes(keypoints: np.ndarray, holes: np.ndarray) -> np.ndarray:\n    \"\"\"Filter out keypoints that are inside any of the holes.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (num_keypoints, 2+).\n                                The first two columns are x and y coordinates.\n        holes (np.ndarray): Array of holes with shape (num_holes, 4).\n                            Each hole is represented as [x1, y1, x2, y2].\n\n    Returns:\n        np.ndarray: Array of keypoints that are not inside any hole.\n    \"\"\"\n    # Broadcast keypoints and holes for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    hole_x1 = holes[:, 0]  # Shape: (num_holes,)\n    hole_y1 = holes[:, 1]  # Shape: (num_holes,)\n    hole_x2 = holes[:, 2]  # Shape: (num_holes,)\n    hole_y2 = holes[:, 3]  # Shape: (num_holes,)\n\n    # Check if each keypoint is inside each hole\n    inside_hole = (kp_x >= hole_x1) & (kp_x < hole_x2) & (kp_y >= hole_y1) & (kp_y < hole_y2)\n\n    # A keypoint is valid if it's not inside any hole\n    valid_keypoints = ~np.any(inside_hole, axis=1)\n\n    return keypoints[valid_keypoints]\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.generate_grid_holes","title":"def generate_grid_holes (image_shape, grid, ratio, random_offset, shift_xy, random_generator) [view source on GitHub]","text":"

Generate a list of holes for GridDropout using a uniform grid.

This function creates a grid of holes for use in the GridDropout augmentation technique. It allows for customization of the grid size, hole size ratio, and positioning of holes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns). This determines the number of cells in the grid, where each cell may contain a hole.

ratio float

The ratio of the hole size to the grid cell size. Should be between 0 and 1. A ratio of 1 means the hole will fill the entire grid cell.

random_offset bool

If True, applies random offsets to each hole within its grid cell. If False, uses the global shift specified by shift_xy.

shift_xy tuple[int, int]

The global shift to apply to all holes as (shift_x, shift_y). Only used when random_offset is False.

random_generator np.random.Generator

The random generator for generating random offsets and shuffling. If None, a new Generator will be created.

Returns:

Type Description np.ndarray

An array of hole coordinates, where each hole is represented as [x1, y1, x2, y2]. The shape of the array is (n_holes, 4), where n_holes is determined by the grid size.

Notes

  • The function first creates a uniform grid based on the image shape and specified grid size.
  • Hole sizes are calculated based on the provided ratio and grid cell sizes.
  • If random_offset is True, each hole is randomly positioned within its grid cell.
  • If random_offset is False, all holes are shifted by the global shift_xy value.
  • The function ensures that all holes remain within the image boundaries.

Examples:

Python
>>> import numpy as np\n>>> image_shape = (100, 100)\n>>> grid = (5, 5)\n>>> ratio = 0.5\n>>> random_offset = True\n>>> rng = np.random.default_rng(42)\n>>> shift_xy = (0, 0)\n>>> holes = generate_grid_holes(image_shape, grid, ratio, random_offset, shift_xy, rng)\n>>> print(holes.shape)\n(25, 4)\n>>> print(holes[0])  # Example output: [x1, y1, x2, y2] of the first hole\n[ 1 21 11 31]\n
Source code in albumentations/augmentations/dropout/functional.py Python
def generate_grid_holes(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    ratio: float,\n    random_offset: bool,\n    shift_xy: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a list of holes for GridDropout using a uniform grid.\n\n    This function creates a grid of holes for use in the GridDropout augmentation technique.\n    It allows for customization of the grid size, hole size ratio, and positioning of holes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns). This determines the number of cells\n            in the grid, where each cell may contain a hole.\n        ratio (float): The ratio of the hole size to the grid cell size. Should be between 0 and 1.\n            A ratio of 1 means the hole will fill the entire grid cell.\n        random_offset (bool): If True, applies random offsets to each hole within its grid cell.\n            If False, uses the global shift specified by shift_xy.\n        shift_xy (tuple[int, int]): The global shift to apply to all holes as (shift_x, shift_y).\n            Only used when random_offset is False.\n        random_generator (np.random.Generator): The random generator for generating random offsets\n            and shuffling. If None, a new Generator will be created.\n\n    Returns:\n        np.ndarray: An array of hole coordinates, where each hole is represented as\n            [x1, y1, x2, y2]. The shape of the array is (n_holes, 4), where n_holes\n            is determined by the grid size.\n\n    Notes:\n        - The function first creates a uniform grid based on the image shape and specified grid size.\n        - Hole sizes are calculated based on the provided ratio and grid cell sizes.\n        - If random_offset is True, each hole is randomly positioned within its grid cell.\n        - If random_offset is False, all holes are shifted by the global shift_xy value.\n        - The function ensures that all holes remain within the image boundaries.\n\n    Examples:\n        >>> image_shape = (100, 100)\n        >>> grid = (5, 5)\n        >>> ratio = 0.5\n        >>> random_offset = True\n        >>> random_state = np.random.RandomState(42)\n        >>> shift_xy = (0, 0)\n        >>> holes = generate_grid_holes(image_shape, grid, ratio, random_offset, random_state, shift_xy)\n        >>> print(holes.shape)\n        (25, 4)\n        >>> print(holes[0])  # Example output: [x1, y1, x2, y2] of the first hole\n        [ 1 21 11 31]\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Generate the uniform grid\n    cells = split_uniform_grid(image_shape, grid, random_generator)\n\n    # Calculate hole sizes based on the ratio\n    cell_heights = cells[:, 2] - cells[:, 0]\n    cell_widths = cells[:, 3] - cells[:, 1]\n    hole_heights = np.clip(cell_heights * ratio, 1, cell_heights - 1).astype(int)\n    hole_widths = np.clip(cell_widths * ratio, 1, cell_widths - 1).astype(int)\n\n    # Calculate maximum possible offsets\n    max_offset_y = cell_heights - hole_heights\n    max_offset_x = cell_widths - hole_widths\n\n    if random_offset:\n        # Generate random offsets for each hole\n        offset_y = random_generator.integers(0, max_offset_y + 1)\n        offset_x = random_generator.integers(0, max_offset_x + 1)\n    else:\n        # Use global shift\n        offset_y = np.full_like(max_offset_y, shift_xy[1])\n        offset_x = np.full_like(max_offset_x, 
shift_xy[0])\n\n    # Calculate hole coordinates\n    x_min = np.clip(cells[:, 1] + offset_x, 0, width - hole_widths)\n    y_min = np.clip(cells[:, 0] + offset_y, 0, height - hole_heights)\n    x_max = np.minimum(x_min + hole_widths, width)\n    y_max = np.minimum(y_min + hole_heights, height)\n\n    return np.column_stack((x_min, y_min, x_max, y_max))\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.generate_random_fill","title":"def generate_random_fill (dtype, shape, random_generator) [view source on GitHub]","text":"

Generate a random fill array based on the given dtype and target shape.

This function creates a numpy array filled with random values. The range and type of these values depend on the input dtype. For integer dtypes, it generates random integers. For floating-point dtypes, it generates random floats.

Parameters:

Name Type Description dtype np.dtype

The data type of the array to be generated.

shape tuple[int, ...]

The shape of the array to be generated.

random_generator np.random.Generator

The random generator to use for generating values. If None, the default numpy random generator is used.

Returns:

Type Description np.ndarray

A numpy array of the specified shape and dtype, filled with random values.

Exceptions:

Type Description ValueError

If the input dtype is neither integer nor floating-point.

Examples:

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(42)\n>>> result = generate_random_fill(np.dtype('uint8'), (2, 2), rng)\n>>> print(result)  # Example output; exact values depend on the generator\n[[172 251]\n [ 80 141]]\n
Source code in albumentations/augmentations/dropout/functional.py Python
def generate_random_fill(\n    dtype: np.dtype,\n    shape: tuple[int, ...],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a random fill array based on the given dtype and target shape.\n\n    This function creates a numpy array filled with random values. The range and type of these values\n    depend on the input dtype. For integer dtypes, it generates random integers. For floating-point\n    dtypes, it generates random floats.\n\n    Args:\n        dtype (np.dtype): The data type of the array to be generated.\n        shape (tuple[int, ...]): The shape of the array to be generated.\n        random_generator (np.random.Generator): The random generator to use for generating values.\n            If None, the default numpy random generator is used.\n\n    Returns:\n        np.ndarray: A numpy array of the specified shape and dtype, filled with random values.\n\n    Raises:\n        ValueError: If the input dtype is neither integer nor floating-point.\n\n    Examples:\n        >>> import numpy as np\n        >>> random_state = np.random.RandomState(42)\n        >>> result = generate_random_fill(np.dtype('uint8'), (2, 2), random_state)\n        >>> print(result)\n        [[172 251]\n         [ 80 141]]\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[dtype]\n    if np.issubdtype(dtype, np.integer):\n        return random_generator.integers(0, max_value + 1, size=shape, dtype=dtype)\n    if np.issubdtype(dtype, np.floating):\n        return random_generator.uniform(0, max_value, size=shape).astype(dtype)\n    raise ValueError(f\"Unsupported dtype: {dtype}\")\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.label","title":"def label (mask, return_num=False, connectivity=2) [view source on GitHub]","text":"

Label connected regions of an integer array.

This function uses OpenCV's connectedComponents under the hood but mimics the behavior of scikit-image's label function.

Parameters:

Name Type Description mask np.ndarray

The array to label. Must be of integer type.

return_num bool

If True, return the number of labels (default: False).

connectivity int

Maximum number of orthogonal hops to consider a pixel/voxel as a neighbor. Accepted values are 1 or 2. Default is 2.

Returns:

Type Description np.ndarray | tuple[np.ndarray, int]

Labeled array, where all connected regions are assigned the same integer value. If return_num is True, it also returns the number of labels.
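
A small sketch (import path as in the source file listed below): two disjoint foreground blobs receive separate labels:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import label
>>> mask = np.array([
...     [1, 1, 0, 0],
...     [1, 1, 0, 0],
...     [0, 0, 0, 1],
...     [0, 0, 0, 1],
... ], dtype=np.uint8)
>>> labeled, num = label(mask, return_num=True)
>>> num
2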

Source code in albumentations/augmentations/dropout/functional.py Python
def label(mask: np.ndarray, return_num: bool = False, connectivity: int = 2) -> np.ndarray | tuple[np.ndarray, int]:\n    \"\"\"Label connected regions of an integer array.\n\n    This function uses OpenCV's connectedComponents under the hood but mimics\n    the behavior of scikit-image's label function.\n\n    Args:\n        mask (np.ndarray): The array to label. Must be of integer type.\n        return_num (bool): If True, return the number of labels (default: False).\n        connectivity (int): Maximum number of orthogonal hops to consider a pixel/voxel\n                            as a neighbor. Accepted values are 1 or 2. Default is 2.\n\n    Returns:\n        np.ndarray | tuple[np.ndarray, int]: Labeled array, where all connected regions are\n        assigned the same integer value. If return_num is True, it also returns the number of labels.\n    \"\"\"\n    # Create a copy of the original mask\n    labeled = np.zeros_like(mask, dtype=np.int32)\n\n    # Get unique non-zero values from the original mask\n    unique_values = np.unique(mask[mask != 0])\n\n    # Label each unique value separately\n    next_label = 1\n    for value in unique_values:\n        binary_mask = (mask == value).astype(np.uint8)\n\n        # Set connectivity for OpenCV (4 or 8)\n        cv2_connectivity = 4 if connectivity == 1 else 8\n\n        # Use OpenCV's connectedComponents\n        num_labels, labels = cv2.connectedComponents(binary_mask, connectivity=cv2_connectivity)\n\n        # Assign new labels\n        for i in range(1, num_labels):\n            labeled[labels == i] = next_label\n            next_label += 1\n\n    num_labels = next_label - 1\n\n    return (labeled, num_labels) if return_num else labeled\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.mask_dropout_bboxes","title":"def mask_dropout_bboxes (bboxes, dropout_mask, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Filter out bounding boxes based on their intersection with the dropout mask.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) in format [x_min, y_min, x_max, y_max, ...].

dropout_mask np.ndarray

Boolean mask of shape (height, width) where True values indicate dropped out regions.

image_shape tuple[int, int]

The shape of the original image as (height, width).

min_area float

Minimum area of the bounding box to be kept.

min_visibility float

Minimum visibility ratio of the bounding box to be kept.

Returns:

Type Description np.ndarray

Filtered array of bounding boxes.
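
A worked sketch with arbitrary boxes (import path as in the source file listed below): the box fully covered by the dropout mask falls below the visibility threshold and is removed:

Python
>>> import numpy as np
>>> from albumentations.augmentations.dropout.functional import mask_dropout_bboxes
>>> bboxes = np.array([[10, 10, 30, 30], [50, 50, 90, 90]])
>>> dropout_mask = np.zeros((100, 100), dtype=bool)
>>> dropout_mask[0:40, 0:40] = True  # covers the first box entirely
>>> mask_dropout_bboxes(bboxes, dropout_mask, (100, 100), min_area=1, min_visibility=0.5)
array([[50, 50, 90, 90]])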

Source code in albumentations/augmentations/dropout/functional.py Python
@handle_empty_array(\"bboxes\")\ndef mask_dropout_bboxes(\n    bboxes: np.ndarray,\n    dropout_mask: np.ndarray,\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Filter out bounding boxes based on their intersection with the dropout mask.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (N, 4+) in format [x_min, y_min, x_max, y_max, ...].\n        dropout_mask (np.ndarray): Boolean mask of shape (height, width) where True values indicate dropped out regions.\n        image_shape (Tuple[int, int]): The shape of the original image as (height, width).\n        min_area (float): Minimum area of the bounding box to be kept.\n        min_visibility (float): Minimum visibility ratio of the bounding box to be kept.\n\n    Returns:\n        np.ndarray: Filtered array of bounding boxes.\n    \"\"\"\n    height, width = image_shape\n\n    # Create binary masks for each bounding box\n    y, x = np.ogrid[:height, :width]\n    box_masks = (\n        (x[None, :] >= bboxes[:, 0, None, None])\n        & (x[None, :] <= bboxes[:, 2, None, None])\n        & (y[None, :] >= bboxes[:, 1, None, None])\n        & (y[None, :] <= bboxes[:, 3, None, None])\n    )\n\n    # Calculate the area of each bounding box\n    box_areas = (bboxes[:, 2] - bboxes[:, 0]) * (bboxes[:, 3] - bboxes[:, 1])\n\n    # Calculate the visible area of each box (non-intersecting area with dropout mask)\n    visible_areas = np.sum(box_masks & ~dropout_mask.squeeze(), axis=(1, 2))\n\n    # Calculate visibility ratio (visible area / total box area)\n    visibility_ratio = visible_areas / box_areas\n\n    # Create a boolean mask for boxes to keep\n    keep_mask = (visible_areas >= min_area) & (visibility_ratio >= min_visibility)\n\n    return bboxes[keep_mask]\n
"},{"location":"api_reference/augmentations/dropout/grid_dropout/","title":"GridDropout augmentation (augmentations.dropout.grid_dropout)","text":""},{"location":"api_reference/augmentations/dropout/grid_dropout/#albumentations.augmentations.dropout.grid_dropout.GridDropout","title":"class GridDropout (ratio=0.5, unit_size_min=None, unit_size_max=None, holes_number_x=None, holes_number_y=None, shift_x=None, shift_y=None, random_offset=True, fill_value=None, mask_fill_value=None, unit_size_range=None, holes_number_xy=None, shift_xy=(0, 0), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply GridDropout augmentation to images, masks, bounding boxes, and keypoints.

GridDropout drops out rectangular regions of an image and the corresponding mask in a grid fashion. This technique can help improve model robustness by forcing the network to rely on a broader context rather than specific local features.

Parameters:

Name Type Description ratio float

The ratio of the mask holes to the unit size (same for horizontal and vertical directions). Must be between 0 and 1. Default: 0.5.

unit_size_range tuple[int, int] | None

Range from which to sample grid size. Default: None. Must be between 2 and the image's shorter edge. If None, grid size is calculated based on image size.

holes_number_xy tuple[int, int] | None

The number of grid units in x and y directions. First value should be between 1 and image width//2, Second value should be between 1 and image height//2. Default: None. If provided, overrides unit_size_range.

random_offset bool

Whether to offset the grid randomly between 0 and (grid unit size - hole size). If True, the provided shift_xy is ignored and a random offset is used instead. Default: True.

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be: - int or float: all channels are filled with this value - tuple: tuple of values for each channel - 'random': each pixel is filled with random values - 'random_uniform': each hole is filled with a single random color - 'inpaint_telea': uses OpenCV Telea inpainting method - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method Default: 0

fill_mask ColorType | None

Value for the dropped pixels in mask. If None, the mask is not modified. Default: None.

shift_xy tuple[int, int]

Offsets of the grid start in x and y directions from (0,0) coordinate. Only used when random_offset is False. Default: (0, 0).

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If both unit_size_range and holes_number_xy are None, the grid size is calculated based on the image size.
  • The actual number of dropped regions may differ slightly from holes_number_xy due to rounding.
  • Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.
  • For 'random_uniform' fill, each grid cell gets a single random color, unlike 'random' where each pixel gets its own random value.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> # Example with standard fill value\n>>> aug_basic = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     random_offset=True,\n...     p=1.0\n... )\n>>> # Example with random uniform fill\n>>> aug_random = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> # Example with inpainting\n>>> aug_inpaint = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     fill=\"inpaint_ns\",\n...     p=1.0\n... )\n>>> transformed = aug_random(image=image, mask=mask)\n>>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n

Reference

  • Paper: https://arxiv.org/abs/2001.04086
  • OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html


Source code in albumentations/augmentations/dropout/grid_dropout.py Python
class GridDropout(BaseDropout):\n    \"\"\"Apply GridDropout augmentation to images, masks, bounding boxes, and keypoints.\n\n    GridDropout drops out rectangular regions of an image and the corresponding mask in a grid fashion.\n    This technique can help improve model robustness by forcing the network to rely on a broader context\n    rather than specific local features.\n\n    Args:\n        ratio (float): The ratio of the mask holes to the unit size (same for horizontal and vertical directions).\n            Must be between 0 and 1. Default: 0.5.\n        unit_size_range (tuple[int, int] | None): Range from which to sample grid size. Default: None.\n            Must be between 2 and the image's shorter edge. If None, grid size is calculated based on image size.\n        holes_number_xy (tuple[int, int] | None): The number of grid units in x and y directions.\n            First value should be between 1 and image width//2,\n            Second value should be between 1 and image height//2.\n            Default: None. If provided, overrides unit_size_range.\n        random_offset (bool): Whether to offset the grid randomly between 0 and (grid unit size - hole size).\n            If True, entered shift_xy is ignored and set randomly. Default: True.\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        fill_mask (ColorType | None): Value for the dropped pixels in mask.\n            If None, the mask is not modified. Default: None.\n        shift_xy (tuple[int, int]): Offsets of the grid start in x and y directions from (0,0) coordinate.\n            Only used when random_offset is False. Default: (0, 0).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If both unit_size_range and holes_number_xy are None, the grid size is calculated based on the image size.\n        - The actual number of dropped regions may differ slightly from holes_number_xy due to rounding.\n        - Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.\n        - For 'random_uniform' fill, each grid cell gets a single random color, unlike 'random' where each pixel\n            gets its own random value.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> # Example with standard fill value\n        >>> aug_basic = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     random_offset=True,\n        ...     p=1.0\n        ... )\n        >>> # Example with random uniform fill\n        >>> aug_random = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     fill=\"random_uniform\",\n        ...     
p=1.0\n        ... )\n        >>> # Example with inpainting\n        >>> aug_inpaint = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     fill=\"inpaint_ns\",\n        ...     p=1.0\n        ... )\n        >>> transformed = aug_random(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n\n    Reference:\n        - Paper: https://arxiv.org/abs/2001.04086\n        - OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        ratio: float = Field(gt=0, le=1)\n\n        unit_size_min: int | None = Field(ge=2)\n        unit_size_max: int | None = Field(ge=2)\n\n        holes_number_x: int | None = Field(ge=1)\n        holes_number_y: int | None = Field(ge=1)\n\n        shift_x: int | None = Field(ge=0)\n        shift_y: int | None = Field(ge=0)\n\n        random_offset: bool\n        fill_value: DropoutFillValue | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_fill_value: ColorType | None = Field(deprecated=\"Deprecated use fill_mask instead\")\n\n        unit_size_range: (\n            Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None)), AfterValidator(nondecreasing)]\n            | None\n        )\n        shift_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(0, None))]\n\n        holes_number_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))] | None\n\n        @model_validator(mode=\"after\")\n        def validate_normalization(self) -> Self:\n            if self.unit_size_min is not None and self.unit_size_max is not None:\n                self.unit_size_range = self.unit_size_min, self.unit_size_max\n                warn(\n                    \"unit_size_min and unit_size_max are deprecated. Use unit_size_range instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.shift_x is not None and self.shift_y is not None:\n                self.shift_xy = self.shift_x, self.shift_y\n                warn(\"shift_x and shift_y are deprecated. Use shift_xy instead.\", DeprecationWarning, stacklevel=2)\n\n            if self.holes_number_x is not None and self.holes_number_y is not None:\n                self.holes_number_xy = self.holes_number_x, self.holes_number_y\n                warn(\n                    \"holes_number_x and holes_number_y are deprecated. 
Use holes_number_xy instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.unit_size_range and not MIN_UNIT_SIZE <= self.unit_size_range[0] <= self.unit_size_range[1]:\n                raise ValueError(\"Max unit size should be >= min size, both at least 2 pixels.\")\n\n            return self\n\n    def __init__(\n        self,\n        ratio: float = 0.5,\n        unit_size_min: int | None = None,\n        unit_size_max: int | None = None,\n        holes_number_x: int | None = None,\n        holes_number_y: int | None = None,\n        shift_x: int | None = None,\n        shift_y: int | None = None,\n        random_offset: bool = True,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        unit_size_range: tuple[int, int] | None = None,\n        holes_number_xy: tuple[int, int] | None = None,\n        shift_xy: tuple[int, int] = (0, 0),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.ratio = ratio\n        self.unit_size_range = unit_size_range\n        self.holes_number_xy = holes_number_xy\n        self.random_offset = random_offset\n        self.shift_xy = shift_xy\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image_shape = params[\"shape\"]\n        if self.holes_number_xy:\n            grid = self.holes_number_xy\n        else:\n            # Calculate grid based on unit_size_range or default\n            unit_height, unit_width = fdropout.calculate_grid_dimensions(\n                image_shape,\n                self.unit_size_range,\n                self.holes_number_xy,\n                self.random_generator,\n            )\n            grid = (image_shape[0] // unit_height, image_shape[1] // unit_width)\n\n        holes = fdropout.generate_grid_holes(\n            image_shape,\n            grid,\n            self.ratio,\n            self.random_offset,\n            self.shift_xy,\n            self.random_generator,\n        )\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"ratio\",\n            \"unit_size_range\",\n            \"holes_number_xy\",\n            \"shift_xy\",\n            \"random_offset\",\n        )\n
"},{"location":"api_reference/augmentations/dropout/mask_dropout/","title":"MaskDropout augmentation (augmentations.dropout.mask_dropout)","text":""},{"location":"api_reference/augmentations/dropout/mask_dropout/#albumentations.augmentations.dropout.mask_dropout.MaskDropout","title":"class MaskDropout (max_objects=(1, 1), image_fill_value=None, mask_fill_value=None, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply dropout to random objects in a mask, zeroing out the corresponding regions in both the image and mask.

This transform identifies objects in the mask (where each unique non-zero value represents a distinct object), randomly selects a number of these objects, and sets their corresponding regions to zero in both the image and mask. It can also handle bounding boxes and keypoints, removing or adjusting them based on the dropout regions.

Parameters:

max_objects int | tuple[int, int]

Maximum number of objects to dropout. If a single int is provided, it's treated as the upper bound. If a tuple of two ints is provided, it's treated as a range [min, max].

fill float | str | Literal[\"inpaint\"]

Value to fill the dropped out regions in the image. If set to 'inpaint', it applies inpainting to the dropped out regions (works only for 3-channel images).

fill_mask float | int

Value to fill the dropped out regions in the mask.

min_area float

Minimum area (in pixels) of a bounding box that must remain visible after dropout to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0

min_visibility float

Minimum visibility ratio (visible area / total area) of a bounding box after dropout to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The mask should be a single-channel image where 0 represents the background and non-zero values represent different object instances.
  • For bounding box and keypoint augmentation, make sure to set up the corresponding processors in the pipeline.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n>>> # Define a sample image, mask, and bounding boxes\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.zeros((100, 100), dtype=np.uint8)\n>>> mask[20:40, 20:40] = 1  # Object 1\n>>> mask[60:80, 60:80] = 2  # Object 2\n>>> bboxes = np.array([[20, 20, 40, 40], [60, 60, 80, 80]])\n>>>\n>>> # Define the transform\n>>> transform = A.Compose([\n...     A.MaskDropout(max_objects=1, mask_fill_value=0, min_area=100, min_visibility=0.5, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', min_area=1, min_visibility=0.1))\n>>>\n>>> # Apply the transform\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes)\n>>>\n>>> # The result will have one of the objects dropped out in both image and mask,\n>>> # and the corresponding bounding box removed if it doesn't meet the area and visibility criteria\n


Source code in albumentations/augmentations/dropout/mask_dropout.py Python
class MaskDropout(DualTransform):\n    \"\"\"Apply dropout to random objects in a mask, zeroing out the corresponding regions in both the image and mask.\n\n    This transform identifies objects in the mask (where each unique non-zero value represents a distinct object),\n    randomly selects a number of these objects, and sets their corresponding regions to zero in both the image and mask.\n    It can also handle bounding boxes and keypoints, removing or adjusting them based on the dropout regions.\n\n    Args:\n        max_objects (int | tuple[int, int]): Maximum number of objects to dropout. If a single int is provided,\n            it's treated as the upper bound. If a tuple of two ints is provided, it's treated as a range [min, max].\n        fill (float | str | Literal[\"inpaint\"]): Value to fill the dropped out regions in the image.\n            If set to 'inpaint', it applies inpainting to the dropped out regions (works only for 3-channel images).\n        fill_mask (float | int): Value to fill the dropped out regions in the mask.\n        min_area (float): Minimum area (in pixels) of a bounding box that must remain visible after dropout to be kept.\n            Only applicable if bounding box augmentation is enabled. Default: 0.0\n        min_visibility (float): Minimum visibility ratio (visible area / total area) of a bounding box after dropout\n            to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The mask should be a single-channel image where 0 represents the background and non-zero values represent\n          different object instances.\n        - For bounding box and keypoint augmentation, make sure to set up the corresponding processors in the pipeline.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        >>> # Define a sample image, mask, and bounding boxes\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.zeros((100, 100), dtype=np.uint8)\n        >>> mask[20:40, 20:40] = 1  # Object 1\n        >>> mask[60:80, 60:80] = 2  # Object 2\n        >>> bboxes = np.array([[20, 20, 40, 40], [60, 60, 80, 80]])\n        >>>\n        >>> # Define the transform\n        >>> transform = A.Compose([\n        ...     A.MaskDropout(max_objects=1, mask_fill_value=0, min_area=100, min_visibility=0.5, p=1.0),\n        ... 
], bbox_params=A.BboxParams(format='pascal_voc', min_area=1, min_visibility=0.1))\n        >>>\n        >>> # Apply the transform\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes)\n        >>>\n        >>> # The result will have one of the objects dropped out in both image and mask,\n        >>> # and the corresponding bounding box removed if it doesn't meet the area and visibility criteria\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_objects: OnePlusIntRangeType\n\n        image_fill_value: float | Literal[\"inpaint\"] | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_fill_value: float | None = Field(deprecated=\"Deprecated use fill_mask instead\")\n\n        fill: float | Literal[\"inpaint\"]\n        fill_mask: float\n\n    def __init__(\n        self,\n        max_objects: ScaleIntType = (1, 1),\n        image_fill_value: float | Literal[\"inpaint\"] | None = None,\n        mask_fill_value: float | None = None,\n        fill: float | Literal[\"inpaint\"] = 0,\n        fill_mask: float = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_objects = cast(tuple[int, int], max_objects)\n        self.fill = fill  # type: ignore[assignment]\n        self.fill_mask = fill_mask\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [\"mask\"]\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        mask = data[\"mask\"]\n\n        label_image, num_labels = fdropout.label(mask, return_num=True)\n\n        if num_labels == 0:\n            dropout_mask = None\n        else:\n            objects_to_drop = self.py_random.randint(*self.max_objects)\n            objects_to_drop = min(num_labels, objects_to_drop)\n\n            if objects_to_drop == num_labels:\n                dropout_mask = mask > 0\n            else:\n                labels_index = self.py_random.sample(range(1, num_labels + 1), objects_to_drop)\n                dropout_mask = np.zeros(mask.shape[:2], dtype=bool)\n                for label_index in labels_index:\n                    dropout_mask |= label_image == label_index\n\n        return {\"dropout_mask\": dropout_mask}\n\n    def apply(self, img: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return img\n\n        if self.fill == \"inpaint\":\n            dropout_mask = dropout_mask.astype(np.uint8)\n            _, _, width, height = cv2.boundingRect(dropout_mask)\n            radius = min(3, max(width, height) // 2)\n            return cv2.inpaint(img, dropout_mask, radius, cv2.INPAINT_NS)\n\n        img = img.copy()\n        img[dropout_mask] = self.fill\n\n        return img\n\n    def apply_to_mask(self, mask: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return mask\n\n        mask = mask.copy()\n        mask[dropout_mask] = self.fill_mask\n        return mask\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return bboxes\n\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n\n        image_shape = params[\"shape\"][:2]\n\n        
denormalized_bboxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fdropout.mask_dropout_bboxes(\n            denormalized_bboxes,\n            dropout_mask,\n            image_shape,\n            processor.params.min_area,\n            processor.params.min_visibility,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return keypoints\n\n        processor = cast(KeypointsProcessor, self.get_processor(\"keypoints\"))\n\n        if processor is None or not processor.params.remove_invisible:\n            return keypoints\n\n        return fdropout.mask_dropout_keypoints(keypoints, dropout_mask)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_objects\", \"fill\", \"fill_mask\"\n
"},{"location":"api_reference/augmentations/dropout/xy_masking/","title":"XYMasking augmentation (augmentations.dropout.xy_masking)","text":""},{"location":"api_reference/augmentations/dropout/xy_masking/#albumentations.augmentations.dropout.xy_masking.XYMasking","title":"class XYMasking (num_masks_x=0, num_masks_y=0, mask_x_length=0, mask_y_length=0, fill_value=None, mask_fill_value=None, fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Applies masking strips to an image, either horizontally (X axis) or vertically (Y axis), simulating occlusions. This transform is useful for training models to recognize images with varied visibility conditions. It is particularly effective for spectrogram images, where it enables time and frequency masking to improve model robustness.

At least one of mask_x_length or mask_y_length must be specified, dictating the mask's maximum size along each axis.

Parameters:

num_masks_x int | tuple[int, int]

Number or range of horizontal regions to mask. Defaults to 0.

num_masks_y int | tuple[int, int]

Number or range of vertical regions to mask. Defaults to 0.

mask_x_length int | tuple[int, int]

Specifies the length of the masks along the X (horizontal) axis. If an integer is provided, it sets a fixed mask length. If a tuple of two integers (min, max) is provided, the mask length is randomly chosen within this range for each mask. This allows for variable-length masks in the horizontal direction.

mask_y_length int | tuple[int, int]

Specifies the height of the masks along the Y (vertical) axis. Similar to mask_x_length, an integer sets a fixed mask height, while a tuple (min, max) allows for variable-height masks, chosen randomly within the specified range for each mask. This flexibility facilitates creating masks of various sizes in the vertical direction.

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be:

  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel
  • 'random': each pixel is filled with random values
  • 'random_uniform': each hole is filled with a single random color
  • 'inpaint_telea': uses OpenCV Telea inpainting method
  • 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method

Default: 0

mask_fill_value ColorType | None

Fill value for dropout regions in the mask. If None, mask regions corresponding to image dropouts are unchanged. Default: None

p float

Probability of applying the transform. Defaults to 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note: At least one of mask_x_length or mask_y_length (or both) must be defined.
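This page does not include a usage example, so here is a minimal sketch applying XYMasking to a spectrogram-like array. The parameter values are chosen arbitrarily for illustration; the arguments themselves are those documented above.

Python
>>> import numpy as np
>>> import albumentations as A
>>> # A spectrogram-like single-channel image: 128 frequency bins x 256 time steps
>>> spec = np.random.rand(128, 256, 1).astype(np.float32)
>>> transform = A.XYMasking(
...     num_masks_x=(1, 3),     # number of masks along the X axis
...     num_masks_y=(1, 2),     # number of masks along the Y axis
...     mask_x_length=(5, 20),  # extent of each X-axis mask, sampled from this range
...     mask_y_length=(5, 10),  # extent of each Y-axis mask, sampled from this range
...     fill=0,
...     p=1.0,
... )
>>> masked = transform(image=spec)["image"]  # same shape as the input, with strips filled with 0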


Source code in albumentations/augmentations/dropout/xy_masking.py Python
class XYMasking(BaseDropout):\n    \"\"\"Applies masking strips to an image, either horizontally (X axis) or vertically (Y axis),\n    simulating occlusions. This transform is useful for training models to recognize images\n    with varied visibility conditions. It's particularly effective for spectrogram images,\n    allowing spectral and frequency masking to improve model robustness.\n\n    At least one of `max_x_length` or `max_y_length` must be specified, dictating the mask's\n    maximum size along each axis.\n\n    Args:\n        num_masks_x (int | tuple[int, int]): Number or range of horizontal regions to mask. Defaults to 0.\n        num_masks_y (int | tuple[int, int]): Number or range of vertical regions to mask. Defaults to 0.\n        mask_x_length (int | tuple[int, int]): Specifies the length of the masks along\n            the X (horizontal) axis. If an integer is provided, it sets a fixed mask length.\n            If a tuple of two integers (min, max) is provided,\n            the mask length is randomly chosen within this range for each mask.\n            This allows for variable-length masks in the horizontal direction.\n        mask_y_length (int | tuple[int, int]): Specifies the height of the masks along\n            the Y (vertical) axis. Similar to `mask_x_length`, an integer sets a fixed mask height,\n            while a tuple (min, max) allows for variable-height masks, chosen randomly\n            within the specified range for each mask. This flexibility facilitates creating masks of various\n            sizes in the vertical direction.\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        mask_fill_value (ColorType | None): Fill value for dropout regions in the mask.\n            If None, mask regions corresponding to image dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. 
Defaults to 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note: Either `max_x_length` or `max_y_length` or both must be defined.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_masks_x: NonNegativeIntRangeType\n        num_masks_y: NonNegativeIntRangeType\n        mask_x_length: NonNegativeIntRangeType\n        mask_y_length: NonNegativeIntRangeType\n\n        fill_value: DropoutFillValue | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_fill_value: ColorType | None = Field(deprecated=\"Deprecated use fill_mask instead\")\n\n        fill: DropoutFillValue\n        fill_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def check_mask_length(self) -> Self:\n            if (\n                isinstance(self.mask_x_length, int)\n                and self.mask_x_length <= 0\n                and isinstance(self.mask_y_length, int)\n                and self.mask_y_length <= 0\n            ):\n                msg = \"At least one of `mask_x_length` or `mask_y_length` Should be a positive number.\"\n                raise ValueError(msg)\n\n            if self.fill_value is not None:\n                self.fill = self.fill_value\n\n            if self.mask_fill_value is not None:\n                self.fill_mask = self.mask_fill_value\n\n            return self\n\n    def __init__(\n        self,\n        num_masks_x: ScaleIntType = 0,\n        num_masks_y: ScaleIntType = 0,\n        mask_x_length: ScaleIntType = 0,\n        mask_y_length: ScaleIntType = 0,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, fill=fill, fill_mask=fill_mask)\n        self.num_masks_x = cast(tuple[int, int], num_masks_x)\n        self.num_masks_y = cast(tuple[int, int], num_masks_y)\n\n        self.mask_x_length = cast(tuple[int, int], mask_x_length)\n        self.mask_y_length = cast(tuple[int, int], mask_y_length)\n\n    def validate_mask_length(\n        self,\n        mask_length: tuple[int, int] | None,\n        dimension_size: int,\n        dimension_name: str,\n    ) -> None:\n        \"\"\"Validate the mask length against the corresponding image dimension size.\"\"\"\n        if mask_length is not None:\n            if isinstance(mask_length, (tuple, list)):\n                if mask_length[0] < 0 or mask_length[1] > dimension_size:\n                    raise ValueError(\n                        f\"{dimension_name} range {mask_length} is out of valid range [0, {dimension_size}]\",\n                    )\n            elif mask_length < 0 or mask_length > dimension_size:\n                raise ValueError(f\"{dimension_name} {mask_length} exceeds image {dimension_name} {dimension_size}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        height, width = image_shape\n\n        self.validate_mask_length(self.mask_x_length, width, \"mask_x_length\")\n        self.validate_mask_length(self.mask_y_length, height, \"mask_y_length\")\n\n        masks_x = self.generate_masks(self.num_masks_x, image_shape, self.mask_x_length, axis=\"x\")\n        masks_y = 
self.generate_masks(self.num_masks_y, image_shape, self.mask_y_length, axis=\"y\")\n\n        holes = np.array(masks_x + masks_y)\n\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def generate_mask_size(self, mask_length: tuple[int, int]) -> int:\n        return self.py_random.randint(*mask_length)\n\n    def generate_masks(\n        self,\n        num_masks: tuple[int, int],\n        image_shape: tuple[int, int],\n        max_length: tuple[int, int] | None,\n        axis: str,\n    ) -> list[tuple[int, int, int, int]]:\n        if max_length is None or max_length == 0 or (isinstance(num_masks, (int, float)) and num_masks == 0):\n            return []\n\n        masks = []\n        num_masks_integer = (\n            num_masks if isinstance(num_masks, int) else self.py_random.randint(num_masks[0], num_masks[1])\n        )\n\n        height, width = image_shape\n\n        for _ in range(num_masks_integer):\n            length = self.generate_mask_size(max_length)\n\n            if axis == \"x\":\n                x_min = self.py_random.randint(0, width - length)\n                y_min = 0\n                x_max, y_max = x_min + length, height\n            else:  # axis == 'y'\n                y_min = self.py_random.randint(0, height - length)\n                x_min = 0\n                x_max, y_max = width, y_min + length\n\n            masks.append((x_min, y_min, x_max, y_max))\n        return masks\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"num_masks_x\",\n            \"num_masks_y\",\n            \"mask_x_length\",\n            \"mask_y_length\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/","title":"Index","text":"
  • Geometric functional transforms (albumentations.augmentations.geometric.functional)
  • Resizing transforms (augmentations.geometric.resize)
  • Rotation transforms (augmentations.geometric.rotate)
  • Geometric transforms (augmentations.geometric.transforms)
"},{"location":"api_reference/augmentations/geometric/functional/","title":"Geometric functional transforms (augmentations.geometric.functional)","text":""},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.adjust_padding_by_position","title":"def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position.
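For intuition, a small sketch (assuming the function is importable from the module path shown below): for position="top_left" all vertical padding is pushed to the bottom and all horizontal padding to the right, so the original content stays anchored at the top-left corner. The py_random argument is only consulted when position="random"; passing random.Random(0) here merely satisfies the signature.

Python
>>> import random
>>> from albumentations.augmentations.geometric.functional import adjust_padding_by_position
>>> # 10 px of vertical and 6 px of horizontal padding, redistributed for "top_left"
>>> adjust_padding_by_position(h_top=5, h_bottom=5, w_left=3, w_right=3,
...                            position="top_left", py_random=random.Random(0))
(0, 10, 0, 6)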

Source code in albumentations/augmentations/geometric/functional.py Python
def adjust_padding_by_position(\n    h_top: int,\n    h_bottom: int,\n    w_left: int,\n    w_right: int,\n    position: PositionType,\n    py_random: np.random.RandomState,\n) -> tuple[int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position.\"\"\"\n    if position == \"center\":\n        return h_top, h_bottom, w_left, w_right\n\n    if position == \"top_left\":\n        return 0, h_top + h_bottom, 0, w_left + w_right\n\n    if position == \"top_right\":\n        return 0, h_top + h_bottom, w_left + w_right, 0\n\n    if position == \"bottom_left\":\n        return h_top + h_bottom, 0, 0, w_left + w_right\n\n    if position == \"bottom_right\":\n        return h_top + h_bottom, 0, w_left + w_right, 0\n\n    if position == \"random\":\n        h_pad = h_top + h_bottom\n        w_pad = w_left + w_right\n        h_top = py_random.randint(0, h_pad)\n        h_bottom = h_pad - h_top\n        w_left = py_random.randint(0, w_pad)\n        w_right = w_pad - w_left\n        return h_top, h_bottom, w_left, w_right\n\n    raise ValueError(f\"Unknown position: {position}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.almost_equal_intervals","title":"def almost_equal_intervals (n, parts) [view source on GitHub]","text":"

Generates an array of nearly equal integer intervals that sum up to n.

This function divides the number n into parts nearly equal parts. It ensures that the sum of all parts equals n, and the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)\narray([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n>>> almost_equal_intervals(16, 4)\narray([4, 4, 4, 4])  # Splits 16 into four equal parts\n
Source code in albumentations/augmentations/geometric/functional.py Python
def almost_equal_intervals(n: int, parts: int) -> np.ndarray:\n    \"\"\"Generates an array of nearly equal integer intervals that sum up to `n`.\n\n    This function divides the number `n` into `parts` nearly equal parts. It ensures that\n    the sum of all parts equals `n`, and the difference between any two parts is at most one.\n    This is useful for distributing a total amount into nearly equal discrete parts.\n\n    Args:\n        n (int): The total value to be split.\n        parts (int): The number of parts to split into.\n\n    Returns:\n        np.ndarray: An array of integers where each integer represents the size of a part.\n\n    Example:\n        >>> almost_equal_intervals(20, 3)\n        array([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n        >>> almost_equal_intervals(16, 4)\n        array([4, 4, 4, 4])  # Splits 16 into four equal parts\n    \"\"\"\n    part_size, remainder = divmod(n, parts)\n    # Create an array with the base part size and adjust the first `remainder` parts by adding 1\n    return np.array(\n        [part_size + 1 if i < remainder else part_size for i in range(parts)],\n    )\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.apply_affine_to_points","title":"def apply_affine_to_points (points, matrix) [view source on GitHub]","text":"

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

np.ndarray

Transformed points with shape (N, 2).
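For intuition, here is a small NumPy sketch (not part of the library source) of the same homogeneous-coordinate math: points are extended with a 1, multiplied by the transposed matrix, and divided by the third coordinate.

Python
>>> import numpy as np
>>> points = np.array([[10.0, 20.0], [30.0, 40.0]])
>>> # Affine matrix: scale by 2 and translate by (5, 5)
>>> matrix = np.array([[2.0, 0.0, 5.0],
...                    [0.0, 2.0, 5.0],
...                    [0.0, 0.0, 1.0]])
>>> homogeneous = np.column_stack([points, np.ones(len(points))])
>>> transformed = homogeneous @ matrix.T
>>> transformed[:, :2] / transformed[:, 2:]
array([[25., 45.],
       [65., 85.]])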

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"points\")\ndef apply_affine_to_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply affine transformation to a set of points.\n\n    This function handles potential division by zero by replacing zero values\n    in the homogeneous coordinate with a small epsilon value.\n\n    Args:\n        points (np.ndarray): Array of points with shape (N, 2).\n        matrix (np.ndarray): 3x3 affine transformation matrix.\n\n    Returns:\n        np.ndarray: Transformed points with shape (N, 2).\n    \"\"\"\n    homogeneous_points = np.column_stack([points, np.ones(points.shape[0])])\n    transformed_points = homogeneous_points @ matrix.T\n\n    # Handle potential division by zero\n    epsilon = np.finfo(transformed_points.dtype).eps\n    transformed_points[:, 2] = np.where(\n        np.abs(transformed_points[:, 2]) < epsilon,\n        np.sign(transformed_points[:, 2]) * epsilon,\n        transformed_points[:, 2],\n    )\n\n    return transformed_points[:, :2] / transformed_points[:, 2:]\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine","title":"def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:

  1. Calculates necessary padding to avoid information loss
  2. Applies padding to the bounding boxes
  3. Adjusts the transformation matrix to account for padding
  4. Applies the affine transformation
  5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

np.ndarray

Transformed and normalized bounding boxes

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine(\n    bboxes: np.ndarray,\n    matrix: np.ndarray,\n    rotate_method: Literal[\"largest_box\", \"ellipse\"],\n    image_shape: tuple[int, int],\n    border_mode: int,\n    output_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes.\n\n    For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:\n    1. Calculates necessary padding to avoid information loss\n    2. Applies padding to the bounding boxes\n    3. Adjusts the transformation matrix to account for padding\n    4. Applies the affine transformation\n    5. Validates the transformed bounding boxes\n\n    For other border modes, it directly applies the affine transformation without padding.\n\n    Args:\n        bboxes (np.ndarray): Input bounding boxes\n        matrix (np.ndarray): Affine transformation matrix\n        rotate_method (str): Method for rotating bounding boxes ('largest_box' or 'ellipse')\n        image_shape (Sequence[int]): Shape of the input image\n        border_mode (int): OpenCV border mode\n        output_shape (Sequence[int]): Shape of the output image\n\n    Returns:\n        np.ndarray: Transformed and normalized bounding boxes\n    \"\"\"\n    if is_identity_matrix(matrix):\n        return bboxes\n\n    bboxes = denormalize_bboxes(bboxes, image_shape)\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        bboxes = generate_reflected_bboxes(\n            bboxes,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Apply affine transform\n    if rotate_method == \"largest_box\":\n        transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n    elif rotate_method == \"ellipse\":\n        transformed_bboxes = bboxes_affine_ellipse(bboxes, matrix)\n    else:\n        raise ValueError(f\"Method {rotate_method} is not a valid rotation method.\")\n\n    # Validate and normalize bboxes\n    validated_bboxes = validate_bboxes(transformed_bboxes, output_shape)\n\n    return normalize_bboxes(validated_bboxes, output_shape)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine_ellipse","title":"def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
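A compact NumPy sketch (illustrative only, not the library implementation) of the ellipse idea for a single box under a 45-degree rotation about the origin: sample points on the ellipse inscribed in the box, transform them, and take the axis-aligned box of the result.

Python
>>> import numpy as np
>>> x_min, y_min, x_max, y_max = 10.0, 10.0, 30.0, 20.0
>>> half_w, half_h = (x_max - x_min) / 2, (y_max - y_min) / 2
>>> cx, cy = x_min + half_w, y_min + half_h
>>> angles = np.radians(np.arange(0, 360, dtype=np.float32))
>>> # 360 points on the ellipse inscribed in the box
>>> pts = np.stack([cx + half_w * np.cos(angles), cy + half_h * np.sin(angles)], axis=1)
>>> theta = np.radians(45.0)
>>> rotation = np.array([[np.cos(theta), -np.sin(theta)],
...                      [np.sin(theta),  np.cos(theta)]])
>>> rotated = pts @ rotation.T
>>> enclosing_box = (rotated[:, 0].min(), rotated[:, 1].min(),
...                  rotated[:, 0].max(), rotated[:, 1].max())

Because only points on the ellipse (rather than the box corners) are transformed, the resulting enclosing box is usually tighter under rotation than the largest-box variant.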
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_ellipse(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes using an ellipse approximation method.\n\n    This function transforms bounding boxes by approximating each box with an ellipse,\n    transforming points along the ellipse's circumference, and then computing the\n    new bounding box that encloses the transformed ellipse.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The ellipse approximation method can provide a tighter bounding box compared to the\n          largest box method, especially for rotations.\n        - 360 points are used to approximate each ellipse, which provides a good balance between\n          accuracy and computational efficiency.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method may be more suitable for objects that are roughly elliptical in shape.\n    \"\"\"\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n    bbox_width = (x_max - x_min) / 2\n    bbox_height = (y_max - y_min) / 2\n    center_x = x_min + bbox_width\n    center_y = y_min + bbox_height\n\n    angles = np.arange(0, 360, dtype=np.float32)\n    cos_angles = np.cos(np.radians(angles))\n    sin_angles = np.sin(np.radians(angles))\n\n    # Generate points for all ellipses at once\n    x = bbox_width[:, np.newaxis] * sin_angles + center_x[:, np.newaxis]\n    y = bbox_height[:, np.newaxis] * cos_angles + center_y[:, np.newaxis]\n    points = np.stack([x, y], axis=-1).reshape(-1, 2)\n\n    # Transform all points at once using the helper function\n    transformed_points = apply_affine_to_points(points, matrix)\n\n    transformed_points = transformed_points.reshape(len(bboxes), -1, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_points[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_points[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_points[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_points[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine_largest_box","title":"def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called \"largest box\" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n>>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n>>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n>>> print(transformed_bboxes)\n[[ 25.  25.  45.  45.   1.]\n [ 65.  65.  85.  85.   2.]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_largest_box(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes and return the largest enclosing boxes.\n\n    This function transforms each corner of every bounding box using the given affine transformation\n    matrix, then computes the new bounding boxes that fully enclose the transformed corners.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The resulting bounding boxes are the smallest axis-aligned boxes that completely\n          enclose the transformed original boxes. They may be larger than the minimal possible\n          bounding box if the original box becomes rotated.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method is called \"largest box\" because it returns the largest axis-aligned box\n          that encloses all corners of the transformed bounding box.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n        >>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n        >>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n        >>> print(transformed_bboxes)\n        [[ 25.  25.  45.  45.   1.]\n         [ 65.  65.  85.  85.   2.]]\n    \"\"\"\n    # Extract corners of all bboxes\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    corners = (\n        np.array([[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]).transpose(2, 0, 1).reshape(-1, 2)\n    )\n\n    # Transform all corners at once\n    transformed_corners = apply_affine_to_points(corners, matrix).reshape(-1, 4, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_corners[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_corners[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_corners[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_corners[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_d4","title":"def bboxes_d4 (bboxes, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to bounding boxes.

The function transforms bounding boxes according to the specified group member from the D_4 group. These transformations include rotations and reflections, defined with respect to the image frame.

Parameters:

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.

Returns:

  • np.ndarray: The transformed bounding boxes.

Raises:

  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bboxes_d4(bboxes, 'r90') rotates the bounding boxes 90 degrees within the image frame.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_d4(\n    bboxes: np.ndarray,\n    group_member: D4Type,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a bounding box.\n\n    The function transforms a bounding box according to the specified group member from the `D_4` group.\n    These transformations include rotations and reflections, specified to work on an image's bounding box given\n    its dimensions.\n\n    Parameters:\n    -  bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n    - group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.\n\n    Returns:\n    - BoxInternalType: The transformed bounding box.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Applying a 90-degree rotation:\n      `bbox_d4((10, 20, 110, 120), 'r90')`\n      This would rotate the bounding box 90 degrees within a 100x100 image.\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: bboxes_rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: bboxes_rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: bboxes_rot90(x, 3),  # Rotate 270 degrees\n        \"v\": lambda x: bboxes_vflip(x),  # Vertical flip\n        \"hvt\": lambda x: bboxes_transpose(\n            bboxes_rot90(x, 2),\n        ),  # Reflect over anti-diagonal\n        \"h\": lambda x: bboxes_hflip(x),  # Horizontal flip\n        \"t\": lambda x: bboxes_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](bboxes)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_grid_shuffle","title":"def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n>>> tiles = np.array([\n...     [0, 0, 50, 50],    # top-left tile\n...     [0, 50, 50, 100],  # top-right tile\n...     [50, 0, 100, 50],  # bottom-left tile\n...     [50, 50, 100, 100] # bottom-right tile\n... ])\n>>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n>>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n>>> # Result may contain multiple boxes if the original box was split\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_grid_shuffle(\n    bboxes: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int],\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Apply grid shuffle transformation to bounding boxes.\n\n    This function transforms bounding boxes according to a grid shuffle operation. It handles cases\n    where bounding boxes may be split into multiple components after shuffling and applies\n    filtering based on minimum area and visibility requirements.\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4+) where N is the number of boxes.\n               Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents\n               optional additional fields (e.g., class_id, score).\n        tiles: Array of tile coordinates with shape (M, 4) where M is the number of tiles.\n               Each tile is in format [start_y, start_x, end_y, end_x].\n        mapping: List of indices defining how tiles should be rearranged. Each index i in the list\n                contains the index of the tile that should be moved to position i.\n        image_shape: Shape of the image as (height, width).\n        min_area: Minimum area threshold in pixels. If a component's area after shuffling is\n                 smaller than this value, it will be filtered out. If None, no area filtering\n                 is applied.\n        min_visibility: Minimum visibility ratio threshold in range [0, 1]. Calculated as\n                       (component_area / original_area). If a component's visibility is lower\n                       than this value, it will be filtered out. If None, no visibility\n                       filtering is applied.\n\n    Returns:\n        np.ndarray: Array of transformed bounding boxes with shape (K, 4+) where K is the\n                   number of valid components after shuffling and filtering. The format of\n                   each box matches the input format, preserving any additional fields.\n                   If no valid components remain after filtering, returns an empty array\n                   with shape (0, C) where C matches the input column count.\n\n    Note:\n        - The function converts bboxes to masks before applying the transformation to handle\n          cases where boxes may be split into multiple components.\n        - After shuffling, each component is validated against min_area and min_visibility\n          requirements independently.\n        - Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and\n          copied to all components derived from the same original bbox.\n        - Empty input arrays are handled gracefully and return empty arrays of the\n          appropriate shape.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n        >>> tiles = np.array([\n        ...     [0, 0, 50, 50],    # top-left tile\n        ...     [0, 50, 50, 100],  # top-right tile\n        ...     [50, 0, 100, 50],  # bottom-left tile\n        ...     [50, 50, 100, 100] # bottom-right tile\n        ... 
])\n        >>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n        >>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n        >>> # Result may contain multiple boxes if the original box was split\n    \"\"\"\n    # Convert bboxes to masks\n    masks = masks_from_bboxes(bboxes, image_shape)\n\n    # Apply grid shuffle to each mask and handle split components\n    all_component_masks = []\n    extra_bbox_data = []  # Store additional bbox data for each component\n\n    for idx, mask in enumerate(masks):\n        original_area = np.sum(mask)  # Get original mask area\n\n        # Shuffle the mask\n        shuffled_mask = swap_tiles_on_image(mask, tiles, mapping)\n\n        # Find connected components\n        num_components, components = cv2.connectedComponents(\n            shuffled_mask.astype(np.uint8),\n        )\n\n        # For each component, create a separate binary mask\n        for comp_idx in range(1, num_components):  # Skip background (0)\n            component_mask = (components == comp_idx).astype(np.uint8)\n\n            # Calculate area and visibility ratio\n            component_area = np.sum(component_mask)\n            # Check if component meets minimum requirements\n            if is_valid_component(\n                component_area,\n                original_area,\n                min_area,\n                min_visibility,\n            ):\n                all_component_masks.append(component_mask)\n                # Append additional bbox data for this component\n                if bboxes.shape[1] > NUM_BBOXES_COLUMNS_IN_ALBUMENTATIONS:\n                    extra_bbox_data.append(bboxes[idx, 4:])\n\n    # Convert all component masks to bboxes\n    if all_component_masks:\n        all_component_masks = np.array(all_component_masks)\n        shuffled_bboxes = bboxes_from_masks(all_component_masks)\n\n        # Add back additional bbox data if present\n        if extra_bbox_data:\n            extra_bbox_data = np.array(extra_bbox_data)\n            return np.column_stack([shuffled_bboxes, extra_bbox_data])\n    else:\n        # Handle case where no valid components were found\n        return np.zeros((0, bboxes.shape[1]), dtype=bboxes.dtype)\n\n    return shuffled_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_hflip","title":"def bboxes_hflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.
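
A minimal usage sketch (illustrative, not part of the upstream docstring; it assumes the normalized [0, 1] box format implied by the source below):

Python
>>> import numpy as np\n>>> bboxes = np.array([[0.1, 0.2, 0.4, 0.5]])\n>>> bboxes_hflip(bboxes)  # expected [[0.6, 0.2, 0.9, 0.5]]: x_min -> 1 - 0.4, x_max -> 1 - 0.1\n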

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_hflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally around the y-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of horizontally flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 0] = 1 - bboxes[:, 2]  # new x_min = 1 - x_max\n    flipped_bboxes[:, 2] = 1 - bboxes[:, 0]  # new x_max = 1 - x_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_rot90","title":"def bboxes_rot90 (bboxes, factor) [view source on GitHub]","text":"

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor int

Number of CCW rotations. Must be in the set {0, 1, 2, 3}. See np.rot90.

Returns:

Type Description np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.

Exceptions:

Type Description ValueError

If factor is not in set {0, 1, 2, 3}.
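
A minimal usage sketch (illustrative; assumes the normalized box format used by the other bbox helpers here):

Python
>>> import numpy as np\n>>> bboxes = np.array([[0.1, 0.2, 0.4, 0.5]])\n>>> bboxes_rot90(bboxes, factor=1)  # one 90-degree CCW turn, expected [[0.2, 0.6, 0.5, 0.9]]\n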

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_rot90(bboxes: np.ndarray, factor: int) -> np.ndarray:\n    \"\"\"Rotates bounding boxes by 90 degrees CCW (see np.rot90)\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n        factor: Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.\n\n    Returns:\n        np.ndarray: A numpy array of rotated bounding boxes with the same shape as input.\n\n    Raises:\n        ValueError: If factor is not in set {0, 1, 2, 3}.\n    \"\"\"\n    if factor not in {0, 1, 2, 3}:\n        raise ValueError(\"Parameter factor must be in set {0, 1, 2, 3}\")\n\n    if factor == 0:\n        return bboxes\n\n    rotated_bboxes = bboxes.copy()\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    if factor == 1:\n        rotated_bboxes[:, 0] = y_min\n        rotated_bboxes[:, 1] = 1 - x_max\n        rotated_bboxes[:, 2] = y_max\n        rotated_bboxes[:, 3] = 1 - x_min\n    elif factor == ROT90_180_FACTOR:\n        rotated_bboxes[:, 0] = 1 - x_max\n        rotated_bboxes[:, 1] = 1 - y_max\n        rotated_bboxes[:, 2] = 1 - x_min\n        rotated_bboxes[:, 3] = 1 - y_min\n    elif factor == ROT90_270_FACTOR:\n        rotated_bboxes[:, 0] = 1 - y_max\n        rotated_bboxes[:, 1] = x_min\n        rotated_bboxes[:, 2] = 1 - y_min\n        rotated_bboxes[:, 3] = x_max\n\n    return rotated_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_transpose","title":"def bboxes_transpose (bboxes) [view source on GitHub]","text":"

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.
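
A minimal usage sketch (illustrative; assumes normalized box coordinates):

Python
>>> import numpy as np\n>>> bboxes = np.array([[0.1, 0.2, 0.4, 0.5]])\n>>> bboxes_transpose(bboxes)  # x/y swapped, expected [[0.2, 0.1, 0.5, 0.4]]\n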

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_transpose(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Transpose bounding boxes by swapping x and y coordinates.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of transposed bounding boxes with the same shape as input.\n    \"\"\"\n    transposed_bboxes = bboxes.copy()\n    transposed_bboxes[:, [0, 1, 2, 3]] = bboxes[:, [1, 0, 3, 2]]\n\n    return transposed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_vflip","title":"def bboxes_vflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.
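
A minimal usage sketch (illustrative; assumes normalized box coordinates):

Python
>>> import numpy as np\n>>> bboxes = np.array([[0.1, 0.2, 0.4, 0.5]])\n>>> bboxes_vflip(bboxes)  # y_min -> 1 - 0.5 = 0.5, y_max -> 1 - 0.2 = 0.8\n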

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_vflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes vertically around the x-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of vertically flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 1] = 1 - bboxes[:, 3]  # new y_min = 1 - y_max\n    flipped_bboxes[:, 3] = 1 - bboxes[:, 1]  # new y_max = 1 - y_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.calculate_affine_transform_padding","title":"def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]","text":"

Calculate the necessary padding for an affine transformation to avoid empty spaces.
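
An illustrative sketch (assumes a plain 3x3 homogeneous matrix; the expected result follows from the source below):

Python
>>> import numpy as np\n>>> matrix = np.array([[1.0, 0.0, 30.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # translate 30 px to the right\n>>> calculate_affine_transform_padding(matrix, (100, 100))  # expected (30, 0, 0, 0) as (pad_left, pad_right, pad_top, pad_bottom)\n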

Source code in albumentations/augmentations/geometric/functional.py Python
def calculate_affine_transform_padding(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate the necessary padding for an affine transformation to avoid empty spaces.\"\"\"\n    height, width = image_shape[:2]\n\n    # Check for identity transform\n    if is_identity_matrix(matrix):\n        return (0, 0, 0, 0)\n\n    # Original corners\n    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]])\n\n    # Transform corners\n    transformed_corners = apply_affine_to_points(corners, matrix)\n\n    # Ensure transformed_corners is 2D\n    transformed_corners = transformed_corners.reshape(-1, 2)\n\n    # Find box that includes both original and transformed corners\n    all_corners = np.vstack((corners, transformed_corners))\n    min_x, min_y = all_corners.min(axis=0)\n    max_x, max_y = all_corners.max(axis=0)\n\n    # Compute the inverse transform\n    inverse_matrix = np.linalg.inv(matrix)\n\n    # Apply inverse transform to all corners of the bounding box\n    bbox_corners = np.array(\n        [[min_x, min_y], [max_x, min_y], [max_x, max_y], [min_x, max_y]],\n    )\n    inverse_corners = apply_affine_to_points(bbox_corners, inverse_matrix).reshape(\n        -1,\n        2,\n    )\n\n    min_x, min_y = inverse_corners.min(axis=0)\n    max_x, max_y = inverse_corners.max(axis=0)\n\n    pad_left = max(0, math.ceil(0 - min_x))\n    pad_right = max(0, math.ceil(max_x - width))\n    pad_top = max(0, math.ceil(0 - min_y))\n    pad_bottom = max(0, math.ceil(max_y - height))\n\n    return pad_left, pad_right, pad_top, pad_bottom\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.center","title":"def center (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image. Used by images, masks, and keypoints.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y
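
A minimal illustrative call (image_shape is (height, width)):

Python
>>> center((100, 200))  # height=100, width=200 -> (99.5, 49.5)\n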

Source code in albumentations/augmentations/geometric/functional.py Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates if image. Used by images, masks and keypoints.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2 - 0.5, height / 2 - 0.5\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.center_bbox","title":"def center_bbox (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y
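
A minimal illustrative call (image_shape is (height, width)):

Python
>>> center_bbox((100, 200))  # height=100, width=200 -> (100.0, 50.0)\n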

Source code in albumentations/augmentations/geometric/functional.py Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates for of image for bounding boxes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2, height / 2\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.compute_tps_weights","title":"def compute_tps_weights (src_points, dst_points) [view source on GitHub]","text":"

Compute Thin Plate Spline weights.

Parameters:

Name Type Description src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into: 1. Nonlinear part (controlled by kernel weights) 2. Affine part (global scaling, rotation, translation)
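
A shape-level sketch (illustrative; the control points are arbitrary and only the returned shapes are noted):

Python
>>> import numpy as np\n>>> src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])\n>>> dst = src + 0.1\n>>> nonlinear, affine = compute_tps_weights(src, dst)\n>>> nonlinear.shape, affine.shape  # ((4, 2), (3, 2))\n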

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_tps_weights(\n    src_points: np.ndarray,\n    dst_points: np.ndarray,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute Thin Plate Spline weights.\n\n    Args:\n        src_points: Source control points with shape (num_points, 2)\n        dst_points: Destination control points with shape (num_points, 2)\n\n    Returns:\n        tuple of:\n        - nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)\n        - affine_weights: Weights for affine transformation (3, 2)\n            [constant term, x scale/shear, y scale/shear]\n\n    Note:\n        The TPS interpolation is decomposed into:\n        1. Nonlinear part (controlled by kernel weights)\n        2. Affine part (global scaling, rotation, translation)\n    \"\"\"\n    num_points = src_points.shape[0]\n\n    # Compute pairwise distances\n    distances = np.linalg.norm(src_points[:, None] - src_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    # Add small epsilon to avoid log(0)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Construct affine terms matrix [1, x, y]\n    affine_terms = np.ones((num_points, 3))\n    affine_terms[:, 1:] = src_points\n\n    # Build system matrix\n    system_matrix = np.zeros((num_points + 3, num_points + 3))\n    system_matrix[:num_points, :num_points] = kernel_matrix\n    system_matrix[:num_points, num_points:] = affine_terms\n    system_matrix[num_points:, :num_points] = affine_terms.T\n\n    # Right-hand side of the system\n    target_coords = np.zeros((num_points + 3, 2))\n    target_coords[:num_points] = dst_points\n\n    # Solve the system for both x and y coordinates\n    all_weights = np.linalg.solve(system_matrix, target_coords)\n\n    # Split weights into nonlinear and affine components\n    nonlinear_weights = all_weights[:num_points]\n    affine_weights = all_weights[num_points:]\n\n    return nonlinear_weights, affine_weights\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.compute_transformed_image_bounds","title":"def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]","text":"

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing: - min_coords: An array with the minimum x and y coordinates. - max_coords: An array with the maximum x and y coordinates.
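
A minimal illustrative call (identity matrix, so the bounds are just the original image corners):

Python
>>> import numpy as np\n>>> compute_transformed_image_bounds(np.eye(3), (100, 200))  # (array([0, 0]), array([200, 100]))\n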

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_transformed_image_bounds(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute the bounds of an image after applying an affine transformation.\n\n    Args:\n        matrix (np.ndarray): The 3x3 affine transformation matrix.\n        image_shape (Tuple[int, int]): The shape of the image as (height, width).\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing:\n            - min_coords: An array with the minimum x and y coordinates.\n            - max_coords: An array with the maximum x and y coordinates.\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Define the corners of the image\n    corners = np.array([[0, 0, 1], [width, 0, 1], [width, height, 1], [0, height, 1]])\n\n    # Transform the corners\n    transformed_corners = corners @ matrix.T\n    transformed_corners = transformed_corners[:, :2] / transformed_corners[:, 2:]\n\n    # Calculate the bounding box of the transformed corners\n    min_coords = np.floor(transformed_corners.min(axis=0)).astype(int)\n    max_coords = np.ceil(transformed_corners.max(axis=0)).astype(int)\n\n    return min_coords, max_coords\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_affine_transformation_matrix","title":"def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]","text":"

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description np.ndarray

The resulting 3x3 affine transformation matrix.
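
An illustrative sketch (pure translation; the x/y dictionaries follow the signature shown in the source below):

Python
>>> m = create_affine_transformation_matrix(\n...     translate={\"x\": 10, \"y\": 20},\n...     shear={\"x\": 0, \"y\": 0},\n...     scale={\"x\": 1.0, \"y\": 1.0},\n...     rotate=0,\n...     shift=(0, 0),\n... )\n>>> m  # expected [[1, 0, 10], [0, 1, 20], [0, 0, 1]]\n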

Source code in albumentations/augmentations/geometric/functional.py Python
def create_affine_transformation_matrix(\n    translate: XYInt,\n    shear: XYFloat,\n    scale: XYFloat,\n    rotate: float,\n    shift: tuple[float, float],\n) -> np.ndarray:\n    \"\"\"Create an affine transformation matrix combining translation, shear, scale, and rotation.\n\n    Args:\n        translate (dict[str, float]): Translation in x and y directions.\n        shear (dict[str, float]): Shear in x and y directions (in degrees).\n        scale (dict[str, float]): Scale factors for x and y directions.\n        rotate (float): Rotation angle in degrees.\n        shift (tuple[float, float]): Shift to apply before and after transformations.\n\n    Returns:\n        np.ndarray: The resulting 3x3 affine transformation matrix.\n    \"\"\"\n    # Convert angles to radians\n    rotate_rad = np.deg2rad(rotate % 360)\n\n    shear_x_rad = np.deg2rad(shear[\"x\"])\n    shear_y_rad = np.deg2rad(shear[\"y\"])\n\n    # Create individual transformation matrices\n    # 1. Shift to top-left\n    m_shift_topleft = np.array([[1, 0, -shift[0]], [0, 1, -shift[1]], [0, 0, 1]])\n\n    # 2. Scale\n    m_scale = np.array([[scale[\"x\"], 0, 0], [0, scale[\"y\"], 0], [0, 0, 1]])\n\n    # 3. Rotation\n    m_rotate = np.array(\n        [\n            [np.cos(rotate_rad), np.sin(rotate_rad), 0],\n            [-np.sin(rotate_rad), np.cos(rotate_rad), 0],\n            [0, 0, 1],\n        ],\n    )\n\n    # 4. Shear\n    m_shear = np.array(\n        [[1, np.tan(shear_x_rad), 0], [np.tan(shear_y_rad), 1, 0], [0, 0, 1]],\n    )\n\n    # 5. Translation\n    m_translate = np.array([[1, 0, translate[\"x\"]], [0, 1, translate[\"y\"]], [0, 0, 1]])\n\n    # 6. Shift back to center\n    m_shift_center = np.array([[1, 0, shift[0]], [0, 1, shift[1]], [0, 0, 1]])\n\n    # Combine all transformations\n    # The order is important: transformations are applied from right to left\n    m = m_shift_center @ m_translate @ m_shear @ m_rotate @ m_scale @ m_shift_topleft\n\n    # Ensure the last row is exactly [0, 0, 1]\n    m[2] = [0, 0, 1]\n\n    return m\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_piecewise_affine_maps","title":"def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]","text":"

Create maps for piecewise affine transformation using OpenCV's remap function.
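
A usage sketch (illustrative; the parameter values are arbitrary assumptions, and the returned maps are fed straight into cv2.remap):

Python
>>> import cv2\n>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> image = np.zeros((64, 64, 3), dtype=np.uint8)\n>>> map_x, map_y = create_piecewise_affine_maps((64, 64), grid=(4, 4), scale=0.03, absolute_scale=False, random_generator=rng)\n>>> warped = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n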

Source code in albumentations/augmentations/geometric/functional.py Python
def create_piecewise_affine_maps(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    scale: float,\n    absolute_scale: bool,\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray | None, np.ndarray | None]:\n    \"\"\"Create maps for piecewise affine transformation using OpenCV's remap function.\"\"\"\n    height, width = image_shape[:2]\n    nb_rows, nb_cols = grid\n\n    # Input validation\n    if height <= 0 or width <= 0 or nb_rows <= 0 or nb_cols <= 0:\n        raise ValueError(\"Dimensions must be positive\")\n    if scale <= 0:\n        return None, None\n\n    # Create source points grid\n    y = np.linspace(0, height - 1, nb_rows, dtype=np.float32)\n    x = np.linspace(0, width - 1, nb_cols, dtype=np.float32)\n    xx_src, yy_src = np.meshgrid(x, y)\n\n    # Initialize destination maps at full resolution\n    map_x = np.zeros((height, width), dtype=np.float32)\n    map_y = np.zeros((height, width), dtype=np.float32)\n\n    # Generate jitter for control points\n    jitter_scale = scale / 3 if absolute_scale else scale * min(width, height) / 3\n\n    jitter = random_generator.normal(0, jitter_scale, (nb_rows, nb_cols, 2)).astype(\n        np.float32,\n    )\n\n    # Create control points with jitter\n    control_points = np.zeros((nb_rows * nb_cols, 4), dtype=np.float32)\n    for i in range(nb_rows):\n        for j in range(nb_cols):\n            idx = i * nb_cols + j\n            # Source points\n            control_points[idx, 0] = xx_src[i, j]\n            control_points[idx, 1] = yy_src[i, j]\n            # Destination points with jitter\n            control_points[idx, 2] = np.clip(\n                xx_src[i, j] + jitter[i, j, 1],\n                0,\n                width - 1,\n            )\n            control_points[idx, 3] = np.clip(\n                yy_src[i, j] + jitter[i, j, 0],\n                0,\n                height - 1,\n            )\n\n    # Create full resolution maps\n    for i in range(height):\n        for j in range(width):\n            # Find nearest control points and interpolate\n            dx = j - control_points[:, 0]\n            dy = i - control_points[:, 1]\n            dist = dx * dx + dy * dy\n            weights = 1 / (dist + 1e-8)\n            weights = weights / np.sum(weights)\n\n            map_x[i, j] = np.sum(weights * control_points[:, 2])\n            map_y[i, j] = np.sum(weights * control_points[:, 3])\n\n    # Ensure output is within bounds\n    map_x = np.clip(map_x, 0, width - 1, out=map_x)\n    map_y = np.clip(map_y, 0, height - 1, out=map_y)\n\n    return map_x, map_y\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_shape_groups","title":"def create_shape_groups (tiles) [view source on GitHub]","text":"

Groups tiles by their shape and stores the indices for each shape.
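
A minimal illustrative call (tiles are (start_y, start_x, end_y, end_x), as elsewhere in this module):

Python
>>> import numpy as np\n>>> tiles = np.array([[0, 0, 50, 50], [0, 50, 50, 100], [50, 0, 100, 100]])\n>>> create_shape_groups(tiles)  # roughly {(50, 50): [0, 1], (50, 100): [2]}, returned as a defaultdict\n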

Source code in albumentations/augmentations/geometric/functional.py Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:\n    \"\"\"Groups tiles by their shape and stores the indices for each shape.\"\"\"\n    shape_groups = defaultdict(list)\n    for index, (start_y, start_x, end_y, end_x) in enumerate(tiles):\n        shape = (end_y - start_y, end_x - start_x)\n        shape_groups[shape].append(index)\n    return shape_groups\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.d4","title":"def d4 (img, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
  • 'e': Identity (no transformation).
  • 'r90': Rotate 90 degrees counterclockwise.
  • 'r180': Rotate 180 degrees.
  • 'r270': Rotate 270 degrees counterclockwise.
  • 'v': Vertical flip.
  • 'hvt': Transpose over the second diagonal (anti-diagonal).
  • 'h': Horizontal flip.
  • 't': Transpose (reflect over the main diagonal).
  • np.ndarray: The transformed image array.
  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to an image array.\n\n    This function manipulates an image using transformations such as rotations and flips,\n    corresponding to the `D_4` dihedral group symmetry operations.\n    Each transformation is identified by a unique group member code.\n\n    Parameters:\n    - img (np.ndarray): The input image array to transform.\n    - group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:\n      - 'e': Identity (no transformation).\n      - 'r90': Rotate 90 degrees counterclockwise.\n      - 'r180': Rotate 180 degrees.\n      - 'r270': Rotate 270 degrees counterclockwise.\n      - 'v': Vertical flip.\n      - 'hvt': Transpose over second diagonal\n      - 'h': Horizontal flip.\n      - 't': Transpose (reflect over the main diagonal).\n\n    Returns:\n    - np.ndarray: The transformed image array.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Rotating an image by 90 degrees:\n      `transformed_image = d4(original_image, 'r90')`\n    - Applying a horizontal flip to an image:\n      `transformed_image = d4(original_image, 'h')`\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: rot90(x, 3),  # Rotate 270 degrees\n        \"v\": vflip,  # Vertical flip\n        \"hvt\": lambda x: transpose(rot90(x, 2)),  # Reflect over anti-diagonal\n        \"h\": hflip,  # Horizontal flip\n        \"t\": transpose,  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](img)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.distort_image","title":"def distort_image (image, generated_mesh, interpolation) [view source on GitHub]","text":"

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

Name Type Description image np.ndarray

The input image to be distorted. Can be a 2D grayscale image or a 3D color image.

generated_mesh np.ndarray

A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.

interpolation int

Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

Type Description np.ndarray

The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n>>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n>>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n>>> distorted.shape\n(100, 100, 3)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@preserve_channel_dim\ndef distort_image(\n    image: np.ndarray,\n    generated_mesh: np.ndarray,\n    interpolation: int,\n) -> np.ndarray:\n    \"\"\"Apply perspective distortion to an image based on a generated mesh.\n\n    This function applies a perspective transformation to each cell of the image defined by the\n    generated mesh. The distortion is applied using OpenCV's perspective transformation and\n    blending techniques.\n\n    Args:\n        image (np.ndarray): The input image to be distorted. Can be a 2D grayscale image or a\n                            3D color image.\n        generated_mesh (np.ndarray): A 2D array where each row represents a quadrilateral cell\n                                    as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4].\n                                    The first four values define the source rectangle, and the last eight values\n                                    define the destination quadrilateral.\n        interpolation (int): Interpolation method to be used in the perspective transformation.\n                             Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).\n\n    Returns:\n        np.ndarray: The distorted image with the same shape and dtype as the input image.\n\n    Note:\n        - The function preserves the channel dimension of the input image.\n        - Each cell of the generated mesh is transformed independently and then blended into the output image.\n        - The distortion is applied using perspective transformation, which allows for more complex\n          distortions compared to affine transformations.\n\n    Example:\n        >>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n        >>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n        >>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n        >>> distorted.shape\n        (100, 100, 3)\n    \"\"\"\n    distorted_image = np.zeros_like(image)\n\n    for mesh in generated_mesh:\n        # Extract source rectangle and destination quadrilateral\n        x1, y1, x2, y2 = mesh[:4]  # Source rectangle\n        dst_quad = mesh[4:].reshape(4, 2)  # Destination quadrilateral\n\n        # Convert source rectangle to quadrilateral\n        src_quad = np.array(\n            [\n                [x1, y1],  # Top-left\n                [x2, y1],  # Top-right\n                [x2, y2],  # Bottom-right\n                [x1, y2],  # Bottom-left\n            ],\n            dtype=np.float32,\n        )\n\n        # Calculate Perspective transformation matrix\n        perspective_mat = cv2.getPerspectiveTransform(src_quad, dst_quad)\n\n        # Apply Perspective transformation\n        warped = cv2.warpPerspective(\n            image,\n            perspective_mat,\n            (image.shape[1], image.shape[0]),\n            flags=interpolation,\n        )\n\n        # Create mask for the transformed region\n        mask = np.zeros(image.shape[:2], dtype=np.uint8)\n        cv2.fillConvexPoly(mask, np.int32(dst_quad), 255)\n\n        # Copy only the warped quadrilateral area to the output image\n        distorted_image = cv2.copyTo(warped, mask, distorted_image)\n\n    return distorted_image\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.find_keypoint","title":"def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]","text":"

Determine if a valid keypoint can be found at the given position.
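
A minimal illustrative sketch (position is (y, x); the threshold value is arbitrary):

Python
>>> import numpy as np\n>>> distance_map = np.full((10, 10), 5.0)\n>>> distance_map[2, 7] = 1.0\n>>> find_keypoint((2, 7), distance_map, threshold=3.0, inverted=False)  # (7.0, 2.0), value below threshold\n>>> find_keypoint((0, 0), distance_map, threshold=3.0, inverted=False)  # None, value 5.0 >= threshold\n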

Source code in albumentations/augmentations/geometric/functional.py Python
def find_keypoint(\n    position: tuple[int, int],\n    distance_map: np.ndarray,\n    threshold: float | None,\n    inverted: bool,\n) -> tuple[float, float] | None:\n    \"\"\"Determine if a valid keypoint can be found at the given position.\"\"\"\n    y, x = position\n    value = distance_map[y, x]\n    if not inverted and threshold is not None and value >= threshold:\n        return None\n    if inverted and threshold is not None and value <= threshold:\n        return None\n    return float(x), float(y)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.flip_bboxes","title":"def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]","text":"

Flip bounding boxes horizontally and/or vertically.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].

flip_horizontal bool

Whether to flip horizontally.

flip_vertical bool

Whether to flip vertically.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Flipped bounding boxes.
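
A minimal usage sketch (boxes in absolute pixel coordinates; image_shape is (height, width)):

Python
>>> import numpy as np\n>>> bboxes = np.array([[10, 20, 40, 50]])\n>>> flip_bboxes(bboxes, flip_horizontal=True, image_shape=(100, 200))  # expected [[160, 20, 190, 50]]\n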

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef flip_bboxes(\n    bboxes: np.ndarray,\n    flip_horizontal: bool = False,\n    flip_vertical: bool = False,\n    image_shape: tuple[int, int] = (0, 0),\n) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally and/or vertically.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where each row is\n            [x_min, y_min, x_max, y_max, ...].\n        flip_horizontal (bool): Whether to flip horizontally.\n        flip_vertical (bool): Whether to flip vertically.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Flipped bounding boxes.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    flipped_bboxes = bboxes.copy()\n    if flip_horizontal:\n        flipped_bboxes[:, [0, 2]] = cols - flipped_bboxes[:, [2, 0]]\n    if flip_vertical:\n        flipped_bboxes[:, [1, 3]] = rows - flipped_bboxes[:, [3, 1]]\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.from_distance_maps","title":"def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]","text":"

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

Name Type Description distance_maps np.ndarray

A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.

inverted bool

If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).

if_not_found_coords Sequence[int] | dict[str, Any] | None

Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: - None: Drop keypoints that are not found. - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints. - Dict with 'x' and 'y' keys: Use these values for not found keypoints. Defaults to None.

threshold float | None

A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

Type Description np.ndarray

A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

Type Description ValueError

If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n>>> inverted = True\n>>> if_not_found_coords = [0, 0]\n>>> threshold = 0.5\n>>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n>>> print(keypoints.shape)\n(3, 2)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def from_distance_maps(\n    distance_maps: np.ndarray,\n    inverted: bool,\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None = None,\n    threshold: float | None = None,\n) -> np.ndarray:\n    \"\"\"Convert distance maps back to keypoints coordinates.\n\n    This function is the inverse of `to_distance_maps`. It takes distance maps generated for a set of keypoints\n    and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps,\n    and can handle cases where keypoints are not found or fall outside a specified threshold.\n\n    Args:\n        distance_maps (np.ndarray): A 3D numpy array of shape (height, width, nb_keypoints) containing\n            distance maps for each keypoint. Each channel represents the distance map for one keypoint.\n        inverted (bool): If True, treats the distance maps as inverted (where higher values indicate\n            closer proximity to keypoints). If False, treats them as regular distance maps (where lower\n            values indicate closer proximity).\n        if_not_found_coords (Sequence[int] | dict[str, Any] | None, optional): Coordinates to use for\n            keypoints that are not found or fall outside the threshold. Can be:\n            - None: Drop keypoints that are not found.\n            - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints.\n            - Dict with 'x' and 'y' keys: Use these values for not found keypoints.\n            Defaults to None.\n        threshold (float | None, optional): A threshold value to determine valid keypoints. For inverted\n            maps, values >= threshold are considered valid. For regular maps, values <= threshold are\n            considered valid. If None, all keypoints are considered valid. Defaults to None.\n\n    Returns:\n        np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates\n        of the reconstructed keypoints. 
If `drop_if_not_found` is True (derived from if_not_found_coords),\n        the output may have fewer rows than input keypoints.\n\n    Raises:\n        ValueError: If the input `distance_maps` is not a 3D array.\n\n    Notes:\n        - The function uses vectorized operations for improved performance, especially with large numbers of keypoints.\n        - When `threshold` is None, all keypoints are considered valid, and `if_not_found_coords` is not used.\n        - The function assumes that the input distance maps are properly normalized and scaled according to the\n          original image dimensions.\n\n    Example:\n        >>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n        >>> inverted = True\n        >>> if_not_found_coords = [0, 0]\n        >>> threshold = 0.5\n        >>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n        >>> print(keypoints.shape)\n        (3, 2)\n    \"\"\"\n    if distance_maps.ndim != NUM_MULTI_CHANNEL_DIMENSIONS:\n        msg = f\"Expected three-dimensional input, got {distance_maps.ndim} dimensions and shape {distance_maps.shape}.\"\n        raise ValueError(msg)\n    height, width, nb_keypoints = distance_maps.shape\n\n    drop_if_not_found, if_not_found_x, if_not_found_y = validate_if_not_found_coords(\n        if_not_found_coords,\n    )\n\n    # Find the indices of max/min values for all keypoints at once\n    if inverted:\n        hitidx_flat = np.argmax(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n    else:\n        hitidx_flat = np.argmin(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n\n    # Convert flat indices to 2D coordinates\n    hitidx_y, hitidx_x = np.unravel_index(hitidx_flat, (height, width))\n\n    # Create keypoints array\n    keypoints = np.column_stack((hitidx_x, hitidx_y)).astype(float)\n\n    if threshold is not None:\n        # Check threshold condition\n        if inverted:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] >= threshold\n        else:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] <= threshold\n\n        if not drop_if_not_found:\n            # Replace invalid keypoints with if_not_found_coords\n            keypoints[~valid_mask] = [if_not_found_x, if_not_found_y]\n        else:\n            # Keep only valid keypoints\n            return keypoints[valid_mask]\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_displacement_fields","title":"def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]","text":"

Generate displacement fields for elastic transform.

Parameters:

Name Type Description image_shape tuple[int, int]

Shape of the image (height, width)

alpha float

Scaling factor for displacement

sigma float

Standard deviation for Gaussian blur

same_dxdy bool

Whether to use same displacement field for both directions

kernel_size tuple[int, int]

Size of Gaussian blur kernel

random_generator np.random.Generator

NumPy random number generator

noise_distribution Literal['gaussian', 'uniform']

Type of noise distribution to use (\"gaussian\" or \"uniform\")

Returns:

Type Description tuple

(dx, dy) displacement fields
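
A usage sketch (illustrative; the alpha, sigma, and kernel_size values are arbitrary assumptions):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> dx, dy = generate_displacement_fields(\n...     image_shape=(100, 100), alpha=50.0, sigma=5.0, same_dxdy=False,\n...     kernel_size=(17, 17), random_generator=rng, noise_distribution=\"gaussian\",\n... )\n>>> dx.shape, dy.shape  # ((100, 100), (100, 100))\n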

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_displacement_fields(\n    image_shape: tuple[int, int],\n    alpha: float,\n    sigma: float,\n    same_dxdy: bool,\n    kernel_size: tuple[int, int],\n    random_generator: np.random.Generator,\n    noise_distribution: Literal[\"gaussian\", \"uniform\"],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate displacement fields for elastic transform.\n\n    Args:\n        image_shape: Shape of the image (height, width)\n        alpha: Scaling factor for displacement\n        sigma: Standard deviation for Gaussian blur\n        same_dxdy: Whether to use same displacement field for both directions\n        kernel_size: Size of Gaussian blur kernel\n        random_generator: NumPy random number generator\n        noise_distribution: Type of noise distribution to use (\"gaussian\" or \"uniform\")\n\n    Returns:\n        tuple: (dx, dy) displacement fields\n    \"\"\"\n\n    def generate_noise_field() -> np.ndarray:\n        # Generate noise based on distribution type\n        if noise_distribution == \"gaussian\":\n            field = random_generator.standard_normal(size=image_shape[:2])\n        else:  # uniform\n            field = random_generator.uniform(low=-1, high=1, size=image_shape[:2])\n\n        # Common operations for both distributions\n        field = field.astype(np.float32)\n        cv2.GaussianBlur(field, kernel_size, sigma, dst=field)\n        return field * alpha\n\n    # Generate first displacement field\n    dx = generate_noise_field()\n\n    # Generate or copy second displacement field\n    dy = dx if same_dxdy else generate_noise_field()\n\n    return dx, dy\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_distorted_grid_polygons","title":"def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]","text":"

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

Name Type Description dimensions np.ndarray

A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.

magnitude int

Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].

random_generator np.random.Generator

A random number generator.

Returns:

Type Description np.ndarray

A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square in the grid diagram represents a cell, and the X marks indicate the internal grid points where displacement occurs (the full ASCII diagram is preserved in the source docstring below).
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n...                        [[0, 50, 50, 100], [50, 50, 100, 100]]])\n>>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n>>> distorted.shape\n(4, 8)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_distorted_grid_polygons(\n    dimensions: np.ndarray,\n    magnitude: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate distorted grid polygons based on input dimensions and magnitude.\n\n    This function creates a grid of polygons and applies random distortions to the internal vertices,\n    while keeping the boundary vertices fixed. The distortion is applied consistently across shared\n    vertices to avoid gaps or overlaps in the resulting grid.\n\n    Args:\n        dimensions (np.ndarray): A 3D array of shape (grid_height, grid_width, 4) where each element\n                                 is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.\n        magnitude (int): Maximum pixel-wise displacement for distortion. The actual displacement\n                         will be randomly chosen in the range [-magnitude, magnitude].\n        random_generator (np.random.Generator): A random number generator.\n\n    Returns:\n        np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon\n                    as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.\n\n    Note:\n        - Only internal grid points are distorted; boundary points remain fixed.\n        - The function ensures consistent distortion across shared vertices of adjacent cells.\n        - The distortion is applied to the following points of each internal cell:\n            * Bottom-right of the cell above and to the left\n            * Bottom-left of the cell above\n            * Top-right of the cell to the left\n            * Top-left of the current cell\n        - Each square represents a cell, and the X marks indicate the coordinates where displacement occurs.\n            +--+--+--+--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--+--+--+--+\n        - For each X, the coordinates of the left, right, top, and bottom edges\n          in the four adjacent cells are displaced.\n\n    Example:\n        >>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n        ...                        
[[0, 50, 50, 100], [50, 50, 100, 100]]])\n        >>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n        >>> distorted.shape\n        (4, 8)\n    \"\"\"\n    grid_height, grid_width = dimensions.shape[:2]\n    total_cells = grid_height * grid_width\n\n    # Initialize polygons\n    polygons = np.zeros((total_cells, 8), dtype=np.float32)\n    polygons[:, 0:2] = dimensions.reshape(-1, 4)[:, [0, 1]]  # x1, y1\n    polygons[:, 2:4] = dimensions.reshape(-1, 4)[:, [2, 1]]  # x2, y1\n    polygons[:, 4:6] = dimensions.reshape(-1, 4)[:, [2, 3]]  # x2, y2\n    polygons[:, 6:8] = dimensions.reshape(-1, 4)[:, [0, 3]]  # x1, y2\n\n    # Generate displacements for internal grid points only\n    internal_points_height, internal_points_width = grid_height - 1, grid_width - 1\n    displacements = random_generator.integers(\n        -magnitude,\n        magnitude + 1,\n        size=(internal_points_height, internal_points_width, 2),\n    ).astype(np.float32)\n\n    # Apply displacements to internal polygon vertices\n    for i in range(1, grid_height):\n        for j in range(1, grid_width):\n            dx, dy = displacements[i - 1, j - 1]\n\n            # Bottom-right of cell (i-1, j-1)\n            polygons[(i - 1) * grid_width + (j - 1), 4:6] += [dx, dy]\n\n            # Bottom-left of cell (i-1, j)\n            polygons[(i - 1) * grid_width + j, 6:8] += [dx, dy]\n\n            # Top-right of cell (i, j-1)\n            polygons[i * grid_width + (j - 1), 2:4] += [dx, dy]\n\n            # Top-left of cell (i, j)\n            polygons[i * grid_width + j, 0:2] += [dx, dy]\n\n    return polygons\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_grid","title":"def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]","text":"

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

steps_x list[float]

List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.

steps_y list[float]

List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.

num_steps int

The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing two 2D numpy arrays: - map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid. - map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)\n>>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n>>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n>>> num_steps = 5\n>>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n>>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_grid(\n    image_shape: tuple[int, int],\n    steps_x: list[float],\n    steps_y: list[float],\n    num_steps: int,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate a distorted grid for image transformation based on given step sizes.\n\n    This function creates two 2D arrays (map_x and map_y) that represent a distorted version\n    of the original image grid. These arrays can be used with OpenCV's remap function to\n    apply grid distortion to an image.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        steps_x (list[float]): List of step sizes for the x-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the x direction.\n        steps_y (list[float]): List of step sizes for the y-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the y direction.\n        num_steps (int): The number of steps to divide each axis into. This determines\n            the granularity of the distortion grid.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays:\n            - map_x: A 2D array of float32 values representing the x-coordinates\n              of the distorted grid.\n            - map_y: A 2D array of float32 values representing the y-coordinates\n              of the distorted grid.\n\n    Note:\n        - The function generates a grid where each cell can be distorted independently.\n        - The distortion is controlled by the steps_x and steps_y parameters, which\n          determine how much each grid line is shifted.\n        - The resulting map_x and map_y can be used directly with cv2.remap() to\n          apply the distortion to an image.\n        - The distortion is applied smoothly across each grid cell using linear\n          interpolation.\n\n    Example:\n        >>> image_shape = (100, 100)\n        >>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n        >>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n        >>> num_steps = 5\n        >>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n        >>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n    \"\"\"\n    height, width = image_shape[:2]\n    x_step = width // num_steps\n    xx = np.zeros(width, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_x):\n        x = idx * x_step\n        start = int(x)\n        end = min(int(x) + x_step, width)\n        cur = prev + x_step * step\n        xx[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    y_step = height // num_steps\n    yy = np.zeros(height, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_y):\n        y = idx * y_step\n        start = int(y)\n        end = min(int(y) + y_step, height)\n        cur = prev + y_step * step\n        yy[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    return np.meshgrid(xx, yy)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_reflected_bboxes","title":"def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

Name Type Description bboxes np.ndarray

Original bounding boxes.

grid_dims dict[str, tuple[int, int]]

Grid dimensions and original position.

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted bounding boxes for the entire grid.
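Example (illustrative sketch; assumes numpy is imported as np, bounding boxes given in absolute pixel coordinates, and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> bboxes = np.array([[10.0, 10.0, 40.0, 40.0]])\n>>> grid_dims = {'grid_shape': (3, 3), 'original_position': (1, 1)}\n>>> reflected = generate_reflected_bboxes(bboxes, grid_dims, image_shape=(100, 100))  # shape (9, 4): one reflected copy per grid cell\n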

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_bboxes(\n    bboxes: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected bounding boxes for the entire reflection grid.\n\n    Args:\n        bboxes (np.ndarray): Original bounding boxes.\n        grid_dims (dict[str, tuple[int, int]]): Grid dimensions and original position.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of bboxes\n    bboxes_hflipped = flip_bboxes(bboxes, flip_horizontal=True, image_shape=image_shape)\n    bboxes_vflipped = flip_bboxes(bboxes, flip_vertical=True, image_shape=image_shape)\n    bboxes_hvflipped = flip_bboxes(\n        bboxes,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [\n            original_col * cols,\n            original_row * rows,\n            original_col * cols,\n            original_row * rows,\n        ],\n    )\n    bboxes = shift_bboxes(bboxes, shift_vector)\n    bboxes_hflipped = shift_bboxes(bboxes_hflipped, shift_vector)\n    bboxes_vflipped = shift_bboxes(bboxes_vflipped, shift_vector)\n    bboxes_hvflipped = shift_bboxes(bboxes_hvflipped, shift_vector)\n\n    new_bboxes = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of bboxes to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes\n            elif (grid_row - original_row) % 2 == 0:\n                current_bboxes = bboxes_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes_vflipped\n            else:\n                current_bboxes = bboxes_hvflipped\n\n            # Shift to the current grid cell\n            cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                ],\n            )\n            shifted_bboxes = shift_bboxes(current_bboxes, cell_shift)\n\n            new_bboxes.append(shifted_bboxes)\n\n    result = np.vstack(new_bboxes)\n\n    return shift_bboxes(result, -shift_vector) if center_in_origin else result\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_reflected_keypoints","title":"def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

Name Type Description keypoints np.ndarray

Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).

grid_dims dict[str, tuple[int, int]]

A dictionary containing grid dimensions and original position. It should have the following keys:
  • "grid_shape": tuple[int, int] representing (grid_rows, grid_cols)
  • "original_position": tuple[int, int] representing (original_row, original_col)

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
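Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[20.0, 30.0, 0.0, 1.0]])\n>>> grid_dims = {'grid_shape': (3, 3), 'original_position': (1, 1)}\n>>> reflected = generate_reflected_keypoints(keypoints, grid_dims, image_shape=(100, 100))  # shape (9, 4): one reflected copy per grid cell\n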
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_keypoints(\n    keypoints: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected keypoints for the entire reflection grid.\n\n    This function creates a grid of keypoints by reflecting and shifting the original keypoints.\n    It handles both centered and non-centered grids based on the `center_in_origin` parameter.\n\n    Args:\n        keypoints (np.ndarray): Original keypoints array of shape (N, 4+), where N is the number of keypoints,\n                                and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).\n        grid_dims (dict[str, tuple[int, int]]): A dictionary containing grid dimensions and original position.\n            It should have the following keys:\n            - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols)\n            - \"original_position\": tuple[int, int] representing (original_row, original_col)\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool, optional): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is\n                    (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.\n\n    Note:\n        - The function handles keypoint flipping and shifting to create a grid of reflected keypoints.\n        - It preserves the angle and scale information of the keypoints during transformations.\n        - The resulting grid can be either centered at the origin or positioned based on the original grid.\n    \"\"\"\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of keypoints\n    keypoints_hflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        image_shape=image_shape,\n    )\n    keypoints_vflipped = flip_keypoints(\n        keypoints,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n    keypoints_hvflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    rows, cols = image_shape[:2]\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [original_col * cols, original_row * rows, 0, 0],\n    )  # Only shift x and y\n    keypoints = shift_keypoints(keypoints, shift_vector)\n    keypoints_hflipped = shift_keypoints(keypoints_hflipped, shift_vector)\n    keypoints_vflipped = shift_keypoints(keypoints_vflipped, shift_vector)\n    keypoints_hvflipped = shift_keypoints(keypoints_hvflipped, shift_vector)\n\n    new_keypoints = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of keypoints to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints\n            elif (grid_row - original_row) % 2 == 0:\n                current_keypoints = keypoints_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints_vflipped\n            else:\n                current_keypoints = keypoints_hvflipped\n\n            # Shift to the current grid cell\n         
   cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    0,\n                    0,\n                ],\n            )\n            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)\n\n            new_keypoints.append(shifted_keypoints)\n\n    result = np.vstack(new_keypoints)\n\n    return shift_keypoints(result, -shift_vector) if center_in_origin else result\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_shuffled_splits","title":"def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]","text":"

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

Name Type Description size int

Total size of the dimension (height or width).

divisions int

Number of divisions (rows or columns).

random_generator np.random.Generator | None

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description np.ndarray

Cumulative edges of the shuffled intervals.
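Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(42)\n>>> edges = generate_shuffled_splits(size=100, divisions=4, random_generator=rng)  # cumulative edges from 0 up to 100 over 4 intervals\n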

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_shuffled_splits(\n    size: int,\n    divisions: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate shuffled splits for a given dimension size and number of divisions.\n\n    Args:\n        size (int): Total size of the dimension (height or width).\n        divisions (int): Number of divisions (rows or columns).\n        random_generator (np.random.Generator | None): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: Cumulative edges of the shuffled intervals.\n    \"\"\"\n    intervals = almost_equal_intervals(size, divisions)\n    random_generator.shuffle(intervals)\n    return np.insert(np.cumsum(intervals), 0, 0)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_camera_matrix_distortion_maps","title":"def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using camera matrix model.

Parameters:

Name Type Description image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description tuple[np.ndarray, np.ndarray]
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
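Example (illustrative sketch; assumes numpy as np, OpenCV as cv2, and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import cv2\n>>> import numpy as np\n>>> image = np.zeros((100, 200, 3), dtype=np.uint8)\n>>> map_x, map_y = get_camera_matrix_distortion_maps(image_shape=(100, 200), k=0.05, center_xy=(100.0, 50.0))\n>>> distorted = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)  # both maps have shape (100, 200)\n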
Source code in albumentations/augmentations/geometric/functional.py Python
def get_camera_matrix_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using camera matrix model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n    camera_matrix = np.array(\n        [[width, 0, center_xy[0]], [0, height, center_xy[1]], [0, 0, 1]],\n        dtype=np.float32,\n    )\n    distortion = np.array([k, k, 0, 0, 0], dtype=np.float32)\n    return cv2.initUndistortRectifyMap(\n        camera_matrix,\n        distortion,\n        None,\n        None,\n        (width, height),\n        cv2.CV_32FC1,\n    )\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_dimension_padding","title":"def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]","text":"

Calculate padding for a single dimension.

Parameters:

Name Type Description current_size int

Current size of the dimension

min_size int | None

Minimum size requirement, if any

divisor int | None

Divisor for padding to make size divisible, if any

Returns:

Type Description tuple[int, int]

(pad_before, pad_after)
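Example (illustrative sketch; assumes the function is available from albumentations.augmentations.geometric.functional):

Python
>>> get_dimension_padding(current_size=100, min_size=105, divisor=None)  # (2, 3): pads up to 105\n>>> get_dimension_padding(current_size=100, min_size=None, divisor=32)   # (14, 14): pads up to 128, the next multiple of 32\n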

Source code in albumentations/augmentations/geometric/functional.py Python
def get_dimension_padding(\n    current_size: int,\n    min_size: int | None,\n    divisor: int | None,\n) -> tuple[int, int]:\n    \"\"\"Calculate padding for a single dimension.\n\n    Args:\n        current_size: Current size of the dimension\n        min_size: Minimum size requirement, if any\n        divisor: Divisor for padding to make size divisible, if any\n\n    Returns:\n        tuple[int, int]: (pad_before, pad_after)\n    \"\"\"\n    if min_size is not None:\n        if current_size < min_size:\n            pad_before = int((min_size - current_size) / 2.0)\n            pad_after = min_size - current_size - pad_before\n            return pad_before, pad_after\n    elif divisor is not None:\n        remainder = current_size % divisor\n        if remainder > 0:\n            total_pad = divisor - remainder\n            pad_before = total_pad // 2\n            pad_after = total_pad - pad_before\n            return pad_before, pad_after\n\n    return 0, 0\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_fisheye_distortion_maps","title":"def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using fisheye model.

Parameters:

Name Type Description image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description tuple[np.ndarray, np.ndarray]
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
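Example (illustrative sketch; assumes numpy as np, OpenCV as cv2, and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import cv2\n>>> import numpy as np\n>>> image = np.zeros((100, 200, 3), dtype=np.uint8)\n>>> map_x, map_y = get_fisheye_distortion_maps(image_shape=(100, 200), k=0.2, center_xy=(100.0, 50.0))\n>>> distorted = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)  # both maps have shape (100, 200)\n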
Source code in albumentations/augmentations/geometric/functional.py Python
def get_fisheye_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using fisheye model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n\n    center_x, center_y = center_xy\n\n    # Create coordinate grid\n    y, x = np.mgrid[:height, :width].astype(np.float32)\n\n    x = x - center_x\n    y = y - center_y\n\n    # Calculate polar coordinates\n    r = np.sqrt(x * x + y * y)\n    theta = np.arctan2(y, x)\n\n    # Normalize radius by the maximum possible radius to keep distortion in check\n    max_radius = math.sqrt(max(center_x, width - center_x) ** 2 + max(center_y, height - center_y) ** 2)\n    r_norm = r / max_radius\n\n    # Apply fisheye distortion to normalized radius\n    r_dist = r * (1 + k * r_norm * r_norm)\n\n    # Convert back to cartesian coordinates\n    map_x = r_dist * np.cos(theta) + center_x\n    map_y = r_dist * np.sin(theta) + center_y\n\n    return map_x, map_y\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_pad_grid_dimensions","title":"def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]","text":"

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

Name Type Description pad_top int

Number of pixels to pad above the image.

pad_bottom int

Number of pixels to pad below the image.

pad_left int

Number of pixels to pad to the left of the image.

pad_right int

Number of pixels to pad to the right of the image.

image_shape tuple[int, int]

Shape of the original image as (height, width).

Returns:

Type Description dict[str, tuple[int, int]]

A dictionary containing:
  • 'grid_shape': A tuple (grid_rows, grid_cols) where:
    - grid_rows (int): Number of times the image needs to be repeated vertically.
    - grid_cols (int): Number of times the image needs to be repeated horizontally.
  • 'original_position': A tuple (original_row, original_col) where:
    - original_row (int): Row index of the original image in the grid.
    - original_col (int): Column index of the original image in the grid.
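Example (illustrative sketch; assumes the function is available from albumentations.augmentations.geometric.functional):

Python
>>> get_pad_grid_dimensions(pad_top=30, pad_bottom=30, pad_left=0, pad_right=0, image_shape=(100, 200))  # -> {'grid_shape': (3, 1), 'original_position': (1, 0)}\n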

Source code in albumentations/augmentations/geometric/functional.py Python
def get_pad_grid_dimensions(\n    pad_top: int,\n    pad_bottom: int,\n    pad_left: int,\n    pad_right: int,\n    image_shape: tuple[int, int],\n) -> dict[str, tuple[int, int]]:\n    \"\"\"Calculate the dimensions of the grid needed for reflection padding and the position of the original image.\n\n    Args:\n        pad_top (int): Number of pixels to pad above the image.\n        pad_bottom (int): Number of pixels to pad below the image.\n        pad_left (int): Number of pixels to pad to the left of the image.\n        pad_right (int): Number of pixels to pad to the right of the image.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n\n    Returns:\n        dict[str, tuple[int, int]]: A dictionary containing:\n            - 'grid_shape': A tuple (grid_rows, grid_cols) where:\n                - grid_rows (int): Number of times the image needs to be repeated vertically.\n                - grid_cols (int): Number of times the image needs to be repeated horizontally.\n            - 'original_position': A tuple (original_row, original_col) where:\n                - original_row (int): Row index of the original image in the grid.\n                - original_col (int): Column index of the original image in the grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    grid_rows = 1 + math.ceil(pad_top / rows) + math.ceil(pad_bottom / rows)\n    grid_cols = 1 + math.ceil(pad_left / cols) + math.ceil(pad_right / cols)\n    original_row = math.ceil(pad_top / rows)\n    original_col = math.ceil(pad_left / cols)\n\n    return {\n        \"grid_shape\": (grid_rows, grid_cols),\n        \"original_position\": (original_row, original_col),\n    }\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_padding_params","title":"def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]","text":"

Calculate padding parameters based on target dimensions.

Parameters:

Name Type Description image_shape tuple[int, int]

(height, width) of the image

min_height int | None

Minimum height requirement, if any

min_width int | None

Minimum width requirement, if any

pad_height_divisor int | None

Divisor for height padding, if any

pad_width_divisor int | None

Divisor for width padding, if any

Returns:

Type Description tuple[int, int, int, int]

(pad_top, pad_bottom, pad_left, pad_right)
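Example (illustrative sketch; assumes the function is available from albumentations.augmentations.geometric.functional):

Python
>>> get_padding_params(image_shape=(100, 150), min_height=128, min_width=None, pad_height_divisor=None, pad_width_divisor=32)  # -> (14, 14, 5, 5)\n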

Source code in albumentations/augmentations/geometric/functional.py Python
def get_padding_params(\n    image_shape: tuple[int, int],\n    min_height: int | None,\n    min_width: int | None,\n    pad_height_divisor: int | None,\n    pad_width_divisor: int | None,\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate padding parameters based on target dimensions.\n\n    Args:\n        image_shape: (height, width) of the image\n        min_height: Minimum height requirement, if any\n        min_width: Minimum width requirement, if any\n        pad_height_divisor: Divisor for height padding, if any\n        pad_width_divisor: Divisor for width padding, if any\n\n    Returns:\n        tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    h_pad_top, h_pad_bottom = get_dimension_padding(\n        rows,\n        min_height,\n        pad_height_divisor,\n    )\n    w_pad_left, w_pad_right = get_dimension_padding(cols, min_width, pad_width_divisor)\n\n    return h_pad_top, h_pad_bottom, w_pad_left, w_pad_right\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.is_identity_matrix","title":"def is_identity_matrix (matrix) [view source on GitHub]","text":"

Check if the given matrix is an identity matrix.

Parameters:

Name Type Description matrix np.ndarray

A 3x3 affine transformation matrix.

Returns:

Type Description bool

True if the matrix is an identity matrix, False otherwise.
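Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> is_identity_matrix(np.eye(3))  # True\n>>> is_identity_matrix(np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))  # False: translation component present\n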

Source code in albumentations/augmentations/geometric/functional.py Python
def is_identity_matrix(matrix: np.ndarray) -> bool:\n    \"\"\"Check if the given matrix is an identity matrix.\n\n    Args:\n        matrix (np.ndarray): A 3x3 affine transformation matrix.\n\n    Returns:\n        bool: True if the matrix is an identity matrix, False otherwise.\n    \"\"\"\n    return np.allclose(matrix, np.eye(3, dtype=matrix.dtype))\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.is_valid_component","title":"def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]","text":"

Validate if a component meets the minimum requirements.
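Example (illustrative sketch; assumes the function is available from albumentations.augmentations.geometric.functional):

Python
>>> is_valid_component(component_area=50.0, original_area=200.0, min_area=None, min_visibility=0.2)  # True: visibility 0.25 >= 0.2\n>>> is_valid_component(component_area=50.0, original_area=200.0, min_area=100.0, min_visibility=None)  # False: area 50 < 100\n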

Source code in albumentations/augmentations/geometric/functional.py Python
def is_valid_component(\n    component_area: float,\n    original_area: float,\n    min_area: float | None,\n    min_visibility: float | None,\n) -> bool:\n    \"\"\"Validate if a component meets the minimum requirements.\"\"\"\n    visibility = component_area / original_area\n    return (min_area is None or component_area >= min_area) and (min_visibility is None or visibility >= min_visibility)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_affine","title":"def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]","text":"

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...].

matrix np.ndarray

The 2x3 or 3x3 affine transformation matrix.

image_shape tuple[int, int]

Shape of the image (height, width).

scale dict[str, float]

Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.

border_mode int

Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

Type Description np.ndarray

Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2π] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 1]])\n>>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n>>> scale = {'x': 1.5, 'y': 1.2}\n>>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_affine(\n    keypoints: np.ndarray,\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n    scale: XYFloat,\n    border_mode: int,\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to keypoints.\n\n    This function transforms keypoints using the given affine transformation matrix.\n    It handles reflection padding if necessary, updates coordinates, angles, and scales.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints.\n                                Each keypoint is represented as [x, y, angle, scale, ...].\n        matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        scale (dict[str, float]): Dictionary containing scale factors for x and y directions.\n                                  Expected keys are 'x' and 'y'.\n        border_mode (int): Border mode for handling keypoints near image edges.\n                            Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.\n\n    Returns:\n        np.ndarray: Transformed keypoints array with the same shape as input.\n\n    Notes:\n        - The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.\n        - Coordinates (x, y) are transformed using the affine matrix.\n        - Angles are adjusted based on the rotation component of the affine transformation.\n        - Scales are multiplied by the maximum of x and y scale factors.\n        - The @angle_2pi_range decorator ensures angles remain in the [0, 2\u03c0] range.\n\n    Example:\n        >>> keypoints = np.array([[100, 100, 0, 1]])\n        >>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n        >>> scale = {'x': 1.5, 'y': 1.2}\n        >>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    if is_identity_matrix(matrix):\n        return keypoints\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        keypoints = generate_reflected_keypoints(\n            keypoints,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Extract x, y coordinates\n    xy = keypoints[:, :2]\n\n    # Ensure matrix is 2x3\n    if matrix.shape == (3, 3):\n        matrix = matrix[:2]\n\n    # Transform x, y coordinates\n    xy_transformed = cv2.transform(xy.reshape(-1, 1, 2), matrix).squeeze()\n\n    # Calculate angle adjustment\n    angle_adjustment = rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=False)\n\n    # Update angles\n    keypoints[:, 2] = keypoints[:, 2] + angle_adjustment\n\n    # Update scales\n    max_scale = max(scale[\"x\"], scale[\"y\"])\n\n    keypoints[:, 3] *= max_scale\n\n    # Update x, y coordinates\n    keypoints[:, :2] = xy_transformed\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_d4","title":"def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • image_shape (tuple[int, int]): The shape of the image.
  • params (Any): Not used.
  • Returns np.ndarray: The transformed keypoints.
  • Raises ValueError: If an invalid group member is specified, indicating that the requested transformation does not exist.

Examples:

  • Rotating keypoints by 90 degrees CCW in a 100x100 image: keypoints_d4(np.array([[50, 30, 0, 0]]), 'r90', (100, 100)) maps the point (50, 30) to (30, 49) under the rotation convention used by keypoints_rot90.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_d4(\n    keypoints: np.ndarray,\n    group_member: D4Type,\n    image_shape: tuple[int, int],\n    **params: Any,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a keypoint.\n\n    This function adjusts a keypoint's coordinates according to the specified `D_4` group transformation,\n    which includes rotations and reflections suitable for image processing tasks. These transformations account\n    for the dimensions of the image to ensure the keypoint remains within its boundaries.\n\n    Parameters:\n    - keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n    -group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.\n    - image_shape (tuple[int, int]): The shape of the image.\n    - params (Any): Not used\n\n    Returns:\n    - KeypointInternalType: The transformed keypoint.\n\n    Raises:\n    - ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.\n\n    Examples:\n    - Rotating a keypoint by 90 degrees in a 100x100 image:\n      `keypoint_d4((50, 30), 'r90', 100, 100)`\n      This would move the keypoint from (50, 30) to (70, 50) assuming standard coordinate transformations.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: keypoints_rot90(x, 1, image_shape),  # Rotate 90 degrees\n        \"r180\": lambda x: keypoints_rot90(x, 2, image_shape),  # Rotate 180 degrees\n        \"r270\": lambda x: keypoints_rot90(x, 3, image_shape),  # Rotate 270 degrees\n        \"v\": lambda x: keypoints_vflip(x, rows),  # Vertical flip\n        \"hvt\": lambda x: keypoints_transpose(\n            keypoints_rot90(x, 2, image_shape),\n        ),  # Reflect over anti diagonal\n        \"h\": lambda x: keypoints_hflip(x, cols),  # Horizontal flip\n        \"t\": lambda x: keypoints_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](keypoints)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_hflip","title":"def keypoints_hflip (keypoints, cols) [view source on GitHub]","text":"

Flip keypoints horizontally around the y-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

cols int

Image width.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.
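Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 1.0]])\n>>> keypoints_hflip(keypoints, cols=100)  # x becomes (100 - 1) - 10 = 89; the angle is mirrored\n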

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_hflip(keypoints: np.ndarray, cols: int) -> np.ndarray:\n    \"\"\"Flip keypoints horizontally around the y-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        cols: Image width.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip x-coordinates\n    flipped_keypoints[:, 0] = (cols - 1) - keypoints[:, 0]\n\n    # Adjust angles\n    flipped_keypoints[:, 2] = np.pi - keypoints[:, 2]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_rot90","title":"def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]","text":"

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

factor int

The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].

image_shape tuple[int, int]

The shape of the image (height, width).

Returns:

Type Description np.ndarray

The rotated keypoints with the same shape as the input.

Exceptions:

Type Description ValueError

If the factor is not in the set {0, 1, 2, 3}.
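Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 1.0]])\n>>> keypoints_rot90(keypoints, factor=1, image_shape=(100, 200))  # (x, y) -> (y, width - 1 - x) = (20, 189)\n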

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_rot90(\n    keypoints: np.ndarray,\n    factor: int,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        factor (int): The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n\n    Returns:\n        np.ndarray: The rotated keypoints with the same shape as the input.\n\n    Raises:\n        ValueError: If the factor is not in the set {0, 1, 2, 3}.\n    \"\"\"\n    if factor not in {0, 1, 2, 3}:\n        raise ValueError(\"Parameter factor must be in set {0, 1, 2, 3}\")\n\n    if factor == 0:\n        return keypoints\n\n    height, width = image_shape[:2]\n    rotated_keypoints = keypoints.copy().astype(np.float32)\n\n    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2]\n\n    if factor == 1:\n        rotated_keypoints[:, 0] = y\n        rotated_keypoints[:, 1] = width - 1 - x\n        rotated_keypoints[:, 2] = angle - np.pi / 2\n    elif factor == ROT90_180_FACTOR:\n        rotated_keypoints[:, 0] = width - 1 - x\n        rotated_keypoints[:, 1] = height - 1 - y\n        rotated_keypoints[:, 2] = angle - np.pi\n    elif factor == ROT90_270_FACTOR:\n        rotated_keypoints[:, 0] = height - 1 - y\n        rotated_keypoints[:, 1] = x\n        rotated_keypoints[:, 2] = angle + np.pi / 2\n\n    return rotated_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_scale","title":"def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]","text":"

Scales keypoints by scale_x and scale_y.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

scale_x float

Scale coefficient x-axis.

scale_y float

Scale coefficient y-axis.

Returns:

Type Description np.ndarray

A numpy array of scaled keypoints with the same shape as input.
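Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 2.0]])\n>>> keypoints_scale(keypoints, scale_x=2.0, scale_y=0.5)  # -> [[20., 10., 0., 4.]]; keypoint scale uses max(scale_x, scale_y)\n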

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_scale(\n    keypoints: np.ndarray,\n    scale_x: float,\n    scale_y: float,\n) -> np.ndarray:\n    \"\"\"Scales keypoints by scale_x and scale_y.\n\n    Args:\n        keypoints: A numpy array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        scale_x: Scale coefficient x-axis.\n        scale_y: Scale coefficient y-axis.\n\n    Returns:\n        A numpy array of scaled keypoints with the same shape as input.\n    \"\"\"\n    # Extract x, y, angle, and scale\n    x, y, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n    )\n\n    # Scale x and y\n    x_scaled = x * scale_x\n    y_scaled = y * scale_y\n\n    # Scale the keypoint scale by the maximum of scale_x and scale_y\n    scale_scaled = scale * max(scale_x, scale_y)\n\n    # Create the output array\n    scaled_keypoints = np.column_stack([x_scaled, y_scaled, angle, scale_scaled])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],\n        )\n\n    return scaled_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_transpose","title":"def keypoints_transpose (keypoints) [view source on GitHub]","text":"

Transposes keypoints along the main diagonal.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

Returns:

Type Description np.ndarray

An array of transposed keypoints with the same shape as the input.
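Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 1.0]])\n>>> keypoints_transpose(keypoints)  # x and y are swapped: (10, 20) -> (20, 10); the angle is adjusted accordingly\n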

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_transpose(keypoints: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes keypoints along the main diagonal.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n\n    Returns:\n        np.ndarray: An array of transposed keypoints with the same shape as the input.\n    \"\"\"\n    transposed_keypoints = keypoints.copy()\n\n    # Swap x and y coordinates\n    transposed_keypoints[:, [0, 1]] = keypoints[:, [1, 0]]\n\n    # Adjust angles to reflect the coordinate swap\n    angles = keypoints[:, 2]\n    transposed_keypoints[:, 2] = np.where(\n        angles <= np.pi,\n        np.pi / 2 - angles,\n        3 * np.pi / 2 - angles,\n    )\n\n    return transposed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_vflip","title":"def keypoints_vflip (keypoints, rows) [view source on GitHub]","text":"

Flip keypoints vertically around the x-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

rows int

Image height.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.
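Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[10.0, 20.0, 0.0, 1.0]])\n>>> keypoints_vflip(keypoints, rows=100)  # y becomes (100 - 1) - 20 = 79; the angle is negated\n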

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_vflip(keypoints: np.ndarray, rows: int) -> np.ndarray:\n    \"\"\"Flip keypoints vertically around the x-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        rows: Image height.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip y-coordinates\n    flipped_keypoints[:, 1] = (rows - 1) - keypoints[:, 1]\n\n    # Negate angles\n    flipped_keypoints[:, 2] = -keypoints[:, 2]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.perspective_bboxes","title":"def perspective_bboxes (bboxes, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Applies perspective transformation to bounding boxes.

This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged.

image_shape tuple[int, int]

The shape of the image (height, width).

matrix np.ndarray

The perspective transformation matrix.

max_width int

The maximum width of the output image.

max_height int

The maximum height of the output image.

keep_size bool

If True, maintains the original image size after transformation.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as input. The first 4 columns contain the transformed coordinates, and any additional columns are preserved from the input.

Note

  • This function modifies only the coordinate columns (first 4) of the input bounding boxes.
  • Any additional attributes (columns beyond the first 4) are kept unchanged.
  • The function handles denormalization and renormalization of coordinates internally.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n>>> image_shape = (100, 100)\n>>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n>>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef perspective_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Applies perspective transformation to bounding boxes.\n\n    This function transforms bounding boxes using the given perspective transformation matrix.\n    It handles bounding boxes with additional attributes beyond the standard coordinates.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (num_bboxes, 4+).\n                             Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n                             Additional columns beyond the first 4 are preserved unchanged.\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        matrix (np.ndarray): The perspective transformation matrix.\n        max_width (int): The maximum width of the output image.\n        max_height (int): The maximum height of the output image.\n        keep_size (bool): If True, maintains the original image size after transformation.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as input.\n                    The first 4 columns contain the transformed coordinates, and any\n                    additional columns are preserved from the input.\n\n    Note:\n        - This function modifies only the coordinate columns (first 4) of the input bounding boxes.\n        - Any additional attributes (columns beyond the first 4) are kept unchanged.\n        - The function handles denormalization and renormalization of coordinates internally.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n        >>> image_shape = (100, 100)\n        >>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n        >>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n    \"\"\"\n    height, width = image_shape[:2]\n    transformed_bboxes = bboxes.copy()\n    denormalized_coords = denormalize_bboxes(bboxes[:, :4], image_shape)\n\n    x_min, y_min, x_max, y_max = denormalized_coords.T\n    points = np.array(\n        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],\n    ).transpose(2, 0, 1)\n    points_reshaped = points.reshape(-1, 1, 2)\n\n    transformed_points = cv2.perspectiveTransform(\n        points_reshaped.astype(np.float32),\n        matrix,\n    )\n    transformed_points = transformed_points.reshape(-1, 4, 2)\n\n    new_coords = np.array(\n        [[np.min(box[:, 0]), np.min(box[:, 1]), np.max(box[:, 0]), np.max(box[:, 1])] for box in transformed_points],\n    )\n\n    if keep_size:\n        scale_x, scale_y = width / max_width, height / max_height\n        new_coords[:, [0, 2]] *= scale_x\n        new_coords[:, [1, 3]] *= scale_y\n        output_shape = image_shape\n    else:\n        output_shape = (max_height, max_width)\n\n    normalized_coords = normalize_bboxes(new_coords, output_shape)\n    transformed_bboxes[:, :4] = normalized_coords\n\n    return transformed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.rotation2d_matrix_to_euler_angles","title":"def rotation2d_matrix_to_euler_angles (matrix, y_up) [view source on GitHub]","text":"

matrix (np.ndarray): Rotation matrix. y_up (bool): Whether the Y axis points up (True) or down (False).
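Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> angle = np.pi / 6\n>>> matrix = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])\n>>> rotation2d_matrix_to_euler_angles(matrix, y_up=True)  # ~0.5236, i.e. pi / 6\n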

Source code in albumentations/augmentations/geometric/functional.py Python
def rotation2d_matrix_to_euler_angles(matrix: np.ndarray, y_up: bool) -> float:\n    \"\"\"Args:\n    matrix (np.ndarray): Rotation matrix\n    y_up (bool): is Y axis looks up or down\n\n    \"\"\"\n    if y_up:\n        return np.arctan2(matrix[1, 0], matrix[0, 0])\n    return np.arctan2(-matrix[1, 0], matrix[0, 0])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.shift_bboxes","title":"def shift_bboxes (bboxes, shift_vector) [view source on GitHub]","text":"

Shift bounding boxes by a given vector.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].

shift_vector np.ndarray

Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y].

Returns:

Type Description np.ndarray

Shifted bounding boxes with the same shape as input.
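Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> bboxes = np.array([[10.0, 20.0, 30.0, 40.0, 1.0]])\n>>> shift_bboxes(bboxes, np.array([5.0, 5.0, 5.0, 5.0]))  # -> [[15., 25., 35., 45., 1.]]; extra columns are untouched\n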

Source code in albumentations/augmentations/geometric/functional.py Python
def shift_bboxes(bboxes: np.ndarray, shift_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Shift bounding boxes by a given vector.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where n is the number of bboxes\n                             and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].\n        shift_vector (np.ndarray): Vector to shift the bounding boxes by, with shape (4,) for\n                                   [shift_x, shift_y, shift_x, shift_y].\n\n    Returns:\n        np.ndarray: Shifted bounding boxes with the same shape as input.\n    \"\"\"\n    # Create a copy of the input array to avoid modifying it in-place\n    shifted_bboxes = bboxes.copy()\n\n    # Add the shift vector to the first 4 columns\n    shifted_bboxes[:, :4] += shift_vector\n\n    return shifted_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.shuffle_tiles_within_shape_groups","title":"def shuffle_tiles_within_shape_groups (shape_groups, random_generator) [view source on GitHub]","text":"

Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.

Parameters:

Name Type Description shape_groups dict[tuple[int, int], list[int]]

Groups of tile indices categorized by shape.

random_generator np.random.Generator

The random generator to use for shuffling the indices. If None, a new random generator will be used.

Returns:

Type Description list[int]

A list where each index is mapped to the new index of the tile after shuffling.
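Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> shape_groups = {(50, 50): [0, 1, 2, 3]}\n>>> shuffle_tiles_within_shape_groups(shape_groups, rng)  # a permutation of [0, 1, 2, 3]\n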

Source code in albumentations/augmentations/geometric/functional.py Python
def shuffle_tiles_within_shape_groups(\n    shape_groups: dict[tuple[int, int], list[int]],\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Shuffles indices within each group of similar shapes and creates a list where each\n    index points to the index of the tile it should be mapped to.\n\n    Args:\n        shape_groups (dict[tuple[int, int], list[int]]): Groups of tile indices categorized by shape.\n        random_generator (np.random.Generator): The random generator to use for shuffling the indices.\n            If None, a new random generator will be used.\n\n    Returns:\n        list[int]: A list where each index is mapped to the new index of the tile after shuffling.\n    \"\"\"\n    # Initialize the output list with the same size as the total number of tiles, filled with -1\n    num_tiles = sum(len(indices) for indices in shape_groups.values())\n    mapping = [-1] * num_tiles\n\n    # Prepare the random number generator\n\n    for indices in shape_groups.values():\n        shuffled_indices = indices.copy()\n        random_generator.shuffle(shuffled_indices)\n\n        for old, new in zip(indices, shuffled_indices):\n            mapping[old] = new\n\n    return mapping\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.split_uniform_grid","title":"def split_uniform_grid (image_shape, grid, random_generator) [view source on GitHub]","text":"

Splits an image shape into a uniform grid specified by the grid dimensions.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns).

random_generator np.random.Generator

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description np.ndarray

An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).

Note

The function uses generate_shuffled_splits to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.
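Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> tiles = split_uniform_grid(image_shape=(100, 200), grid=(2, 2), random_generator=rng)  # shape (4, 4): one (start_y, start_x, end_y, end_x) row per tile\n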

Source code in albumentations/augmentations/geometric/functional.py Python
def split_uniform_grid(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Splits an image shape into a uniform grid specified by the grid dimensions.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns).\n        random_generator (np.random.Generator): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).\n\n    Note:\n        The function uses `generate_shuffled_splits` to generate the splits for the height and width of the image.\n        The splits are then used to calculate the coordinates of the tiles.\n    \"\"\"\n    n_rows, n_cols = grid\n\n    height_splits = generate_shuffled_splits(\n        image_shape[0],\n        grid[0],\n        random_generator=random_generator,\n    )\n    width_splits = generate_shuffled_splits(\n        image_shape[1],\n        grid[1],\n        random_generator=random_generator,\n    )\n\n    # Calculate tiles coordinates\n    tiles = [\n        (height_splits[i], width_splits[j], height_splits[i + 1], width_splits[j + 1])\n        for i in range(n_rows)\n        for j in range(n_cols)\n    ]\n\n    return np.array(tiles, dtype=np.int16)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.swap_tiles_on_image","title":"def swap_tiles_on_image (image, tiles, mapping=None) [view source on GitHub]","text":"

Swap tiles on the image according to the new format.

Parameters:

Name Type Description image np.ndarray

Input image.

tiles np.ndarray

Array of tiles with each tile as [start_y, start_x, end_y, end_x].

mapping list[int] | None

list of new tile indices.

Returns:

Type Description np.ndarray

Output image with tiles swapped according to the random shuffle.
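Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> image = np.arange(16, dtype=np.uint8).reshape(4, 4)\n>>> tiles = np.array([[0, 0, 2, 2], [0, 2, 2, 4], [2, 0, 4, 2], [2, 2, 4, 4]])\n>>> swapped = swap_tiles_on_image(image, tiles, mapping=[1, 0, 3, 2])  # the left and right 2x2 tiles trade places\n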

Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_image(\n    image: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int] | None = None,\n) -> np.ndarray:\n    \"\"\"Swap tiles on the image according to the new format.\n\n    Args:\n        image: Input image.\n        tiles: Array of tiles with each tile as [start_y, start_x, end_y, end_x].\n        mapping: list of new tile indices.\n\n    Returns:\n        np.ndarray: Output image with tiles swapped according to the random shuffle.\n    \"\"\"\n    # If no tiles are provided, return a copy of the original image\n    if tiles.size == 0 or mapping is None:\n        return image.copy()\n\n    # Create a copy of the image to retain original for reference\n    new_image = np.empty_like(image)\n    for num, new_index in enumerate(mapping):\n        start_y, start_x, end_y, end_x = tiles[new_index]\n        start_y_orig, start_x_orig, end_y_orig, end_x_orig = tiles[num]\n        # Assign the corresponding tile from the original image to the new image\n        new_image[start_y:end_y, start_x:end_x] = image[\n            start_y_orig:end_y_orig,\n            start_x_orig:end_x_orig,\n        ]\n\n    return new_image\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.swap_tiles_on_keypoints","title":"def swap_tiles_on_keypoints (keypoints, tiles, mapping) [view source on GitHub]","text":"

Swap the positions of keypoints based on a tile mapping.

This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.

Parameters:

Name Type Description keypoints np.ndarray

A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

tiles np.ndarray

A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.

mapping np.ndarray

A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with.

Returns:

Type Description np.ndarray

A 2D numpy array of the same shape as the input keypoints, containing the new positions of the keypoints after the tile swap.

Exceptions:

Type Description RuntimeWarning

If any keypoint is not found within any tile.

Notes

  • Keypoints that do not fall within any tile will remain unchanged.
  • The function assumes that the tiles do not overlap and cover the entire image space.
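Example (illustrative sketch; assumes numpy is imported as np and the function available from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np\n>>> keypoints = np.array([[1.0, 1.0], [3.0, 3.0]])\n>>> tiles = np.array([[0, 0, 2, 2], [0, 2, 2, 4], [2, 0, 4, 2], [2, 2, 4, 4]])\n>>> swap_tiles_on_keypoints(keypoints, tiles, mapping=np.array([1, 0, 3, 2]))  # -> [[3., 1.], [1., 3.]]\n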
Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_keypoints(\n    keypoints: np.ndarray,\n    tiles: np.ndarray,\n    mapping: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Swap the positions of keypoints based on a tile mapping.\n\n    This function takes a set of keypoints and repositions them according to a mapping of tile swaps.\n    Keypoints are moved from their original tiles to new positions in the swapped tiles.\n\n    Args:\n        keypoints (np.ndarray): A 2D numpy array of shape (N, 2) where N is the number of keypoints.\n                                Each row represents a keypoint's (x, y) coordinates.\n        tiles (np.ndarray): A 2D numpy array of shape (M, 4) where M is the number of tiles.\n                            Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.\n        mapping (np.ndarray): A 1D numpy array of shape (M,) where M is the number of tiles.\n                              Each element i contains the index of the tile that tile i should be swapped with.\n\n    Returns:\n        np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions\n                    of the keypoints after the tile swap.\n\n    Raises:\n        RuntimeWarning: If any keypoint is not found within any tile.\n\n    Notes:\n        - Keypoints that do not fall within any tile will remain unchanged.\n        - The function assumes that the tiles do not overlap and cover the entire image space.\n    \"\"\"\n    if not keypoints.size:\n        return keypoints\n\n    # Broadcast keypoints and tiles for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    start_y, start_x, end_y, end_x = tiles.T  # Each shape: (num_tiles,)\n\n    # Check if each keypoint is inside each tile\n    in_tile = (kp_y >= start_y) & (kp_y < end_y) & (kp_x >= start_x) & (kp_x < end_x)\n\n    # Find which tile each keypoint belongs to\n    tile_indices = np.argmax(in_tile, axis=1)\n\n    # Check if any keypoint is not in any tile\n    not_in_any_tile = ~np.any(in_tile, axis=1)\n    if np.any(not_in_any_tile):\n        warn(\n            \"Some keypoints are not in any tile. They will be returned unchanged. This is unexpected and should be \"\n            \"investigated.\",\n            RuntimeWarning,\n            stacklevel=2,\n        )\n\n    # Get the new tile indices\n    new_tile_indices = np.array(mapping)[tile_indices]\n\n    # Calculate the offsets\n    old_start_x = tiles[tile_indices, 1]\n    old_start_y = tiles[tile_indices, 0]\n    new_start_x = tiles[new_tile_indices, 1]\n    new_start_y = tiles[new_tile_indices, 0]\n\n    # Apply the transformation\n    new_keypoints = keypoints.copy()\n    new_keypoints[:, 0] = (keypoints[:, 0] - old_start_x) + new_start_x\n    new_keypoints[:, 1] = (keypoints[:, 1] - old_start_y) + new_start_y\n\n    # Keep original coordinates for keypoints not in any tile\n    new_keypoints[not_in_any_tile] = keypoints[not_in_any_tile]\n\n    return new_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.to_distance_maps","title":"def to_distance_maps (keypoints, image_shape, inverted=False) [view source on GitHub]","text":"

Generate a (H,W,N) array of distance maps for N keypoints.

The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.

This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.

Parameters:

  • keypoints (np.ndarray): A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.
  • image_shape (tuple[int, int]): Shape of the image as (height, width).
  • inverted (bool): If True, inverted distance maps are returned where each distance value d is replaced by d/(d+1), i.e. the distance maps have values in the range (0.0, 1.0] with 1.0 denoting exactly the position of the respective keypoint.

Returns:

  • np.ndarray: A float32 array of shape (H, W, N) containing N distance maps for N keypoints. Each location (y, x, n) in the array denotes the euclidean distance at (y, x) to the n-th keypoint. If inverted is True, the distance d is replaced by d/(d+1). The height and width of the array match the height and width in image_shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def to_distance_maps(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    inverted: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate a ``(H,W,N)`` array of distance maps for ``N`` keypoints.\n\n    The ``n``-th distance map contains at every location ``(y, x)`` the\n    euclidean distance to the ``n``-th keypoint.\n\n    This function can be used as a helper when augmenting keypoints with a\n    method that only supports the augmentation of images.\n\n    Args:\n        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.\n                   Each row represents a keypoint's (x, y) coordinates.\n        image_shape: tuple[int, int] shape of the image (height, width)\n        inverted (bool): If ``True``, inverted distance maps are returned where each\n            distance value d is replaced by ``d/(d+1)``, i.e. the distance\n            maps have values in the range ``(0.0, 1.0]`` with ``1.0`` denoting\n            exactly the position of the respective keypoint.\n\n    Returns:\n        np.ndarray: A ``float32`` array of shape (H, W, N) containing ``N`` distance maps for ``N``\n            keypoints. Each location ``(y, x, n)`` in the array denotes the\n            euclidean distance at ``(y, x)`` to the ``n``-th keypoint.\n            If `inverted` is ``True``, the distance ``d`` is replaced\n            by ``d/(d+1)``. The height and width of the array match the\n            height and width in ``image_shape``.\n    \"\"\"\n    height, width = image_shape[:2]\n    if len(keypoints) == 0:\n        return np.zeros((height, width, 0), dtype=np.float32)\n\n    # Create coordinate grids\n    yy, xx = np.mgrid[:height, :width]\n\n    # Convert keypoints to numpy array\n    keypoints_array = np.array(keypoints)\n\n    # Compute distances for all keypoints at once\n    distances = np.sqrt(\n        (xx[..., np.newaxis] - keypoints_array[:, 0]) ** 2 + (yy[..., np.newaxis] - keypoints_array[:, 1]) ** 2,\n    )\n\n    if inverted:\n        return (1 / (distances + 1)).astype(np.float32)\n    return distances.astype(np.float32)\n
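
A short sketch of the output for two keypoints on a tiny 4x5 image (illustrative values, assuming the module path shown above): each map is zero at its keypoint's own pixel, and the inverted map peaks at 1.0 there.

Python
import numpy as np
from albumentations.augmentations.geometric.functional import to_distance_maps

keypoints = np.array([[2.0, 1.0], [0.0, 0.0]])          # (x, y)
maps = to_distance_maps(keypoints, image_shape=(4, 5))
print(maps.shape)                                        # (4, 5, 2): one H x W map per keypoint
print(maps[1, 2, 0])                                     # 0.0 at the first keypoint's own pixel

inverted = to_distance_maps(keypoints, image_shape=(4, 5), inverted=True)
print(inverted[1, 2, 0])                                 # 1.0: inverted maps peak at the keypoint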
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.tps_transform","title":"def tps_transform (target_points, control_points, nonlinear_weights, affine_weights) [view source on GitHub]","text":"

Apply Thin Plate Spline transformation to points.

Parameters:

  • target_points (np.ndarray): Points to transform with shape (num_targets, 2).
  • control_points (np.ndarray): Original control points with shape (num_controls, 2).
  • nonlinear_weights (np.ndarray): TPS kernel weights with shape (num_controls, 2).
  • affine_weights (np.ndarray): Affine transformation weights with shape (3, 2).

Returns:

  • np.ndarray: Transformed points with shape (num_targets, 2).

Note

The transformation combines (1) nonlinear warping based on distances to control points and (2) a global affine transformation (scale, rotation, translation).

Source code in albumentations/augmentations/geometric/functional.py Python
def tps_transform(\n    target_points: np.ndarray,\n    control_points: np.ndarray,\n    nonlinear_weights: np.ndarray,\n    affine_weights: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply Thin Plate Spline transformation to points.\n\n    Args:\n        target_points: Points to transform with shape (num_targets, 2)\n        control_points: Original control points with shape (num_controls, 2)\n        nonlinear_weights: TPS kernel weights with shape (num_controls, 2)\n        affine_weights: Affine transformation weights with shape (3, 2)\n\n    Returns:\n        Transformed points with shape (num_targets, 2)\n\n    Note:\n        The transformation combines:\n        1. Nonlinear warping based on distances to control points\n        2. Global affine transformation (scale, rotation, translation)\n    \"\"\"\n    # Compute all pairwise distances at once: (num_targets, num_controls)\n    distances = np.linalg.norm(target_points[:, None] - control_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Prepare affine terms [1, x, y] for each point\n    affine_terms = np.c_[np.ones(len(target_points)), target_points]\n\n    # Combine nonlinear and affine transformations\n    return kernel_matrix @ nonlinear_weights + affine_terms @ affine_weights\n
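
A small sanity-check sketch (illustrative weights, assuming the module path shown above): with zero nonlinear weights and an identity affine part, the transform maps each target point to itself, which isolates the affine term [1, x, y] @ affine_weights.

Python
import numpy as np
from albumentations.augmentations.geometric.functional import tps_transform

target_points = np.array([[10.0, 20.0], [30.0, 40.0]])
control_points = np.array([[0.0, 0.0], [50.0, 0.0], [0.0, 50.0]])

nonlinear_weights = np.zeros((3, 2))      # disable the warp component
affine_weights = np.array([[0.0, 0.0],    # bias row
                           [1.0, 0.0],    # x coefficient
                           [0.0, 1.0]])   # y coefficient -> identity affine map

print(tps_transform(target_points, control_points, nonlinear_weights, affine_weights))
# Each target point maps to itself: [[10. 20.], [30. 40.]]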
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.transpose","title":"def transpose (img) [view source on GitHub]","text":"

Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.

Parameters:

  • img (np.ndarray): Input array.

Returns:

  • np.ndarray: Transposed array.

Source code in albumentations/augmentations/geometric/functional.py Python
def transpose(img: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes the first two dimensions of an array of any dimensionality.\n    Retains the order of any additional dimensions.\n\n    Args:\n        img (np.ndarray): Input array.\n\n    Returns:\n        np.ndarray: Transposed array.\n    \"\"\"\n    # Generate the new axes order\n    new_axes = list(range(img.ndim))\n    new_axes[0], new_axes[1] = 1, 0  # Swap the first two dimensions\n\n    # Transpose the array using the new axes order\n    return img.transpose(new_axes)\n
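
For example (a minimal sketch, assuming the module path shown above), only the first two axes are swapped, so the channel axis of an H x W x C image stays last:

Python
import numpy as np
from albumentations.augmentations.geometric.functional import transpose

img = np.zeros((100, 200, 3), dtype=np.uint8)
print(transpose(img).shape)   # (200, 100, 3)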
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_bboxes","title":"def validate_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Validate bounding boxes and remove invalid ones.

Parameters:

  • bboxes (np.ndarray): Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].
  • image_shape (tuple[int, int]): Shape of the image as (height, width).

Returns:

  • np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.

Examples:

Python
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n>>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n>>> print(valid_bboxes)\n[[10 20 30 40]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
def validate_bboxes(bboxes: np.ndarray, image_shape: Sequence[int]) -> np.ndarray:\n    \"\"\"Validate bounding boxes and remove invalid ones.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.\n\n    Example:\n        >>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n        >>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n        >>> print(valid_bboxes)\n        [[10 20 30 40]]\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    valid_indices = (x_max > 0) & (y_max > 0) & (x_min < cols) & (y_min < rows)\n\n    return bboxes[valid_indices]\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_if_not_found_coords","title":"def validate_if_not_found_coords (if_not_found_coords) [view source on GitHub]","text":"

Validate and process the if_not_found_coords parameter, which may be None, a two-element tuple/list, or a dict with 'x' and 'y' keys.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_if_not_found_coords(\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None,\n) -> tuple[bool, float, float]:\n    \"\"\"Validate and process `if_not_found_coords` parameter.\"\"\"\n    if if_not_found_coords is None:\n        return True, -1, -1\n    if isinstance(if_not_found_coords, (tuple, list)):\n        if len(if_not_found_coords) != PAIR:\n            msg = \"Expected tuple/list 'if_not_found_coords' to contain exactly two entries.\"\n            raise ValueError(msg)\n        return False, if_not_found_coords[0], if_not_found_coords[1]\n    if isinstance(if_not_found_coords, dict):\n        return False, if_not_found_coords[\"x\"], if_not_found_coords[\"y\"]\n\n    msg = \"Expected if_not_found_coords to be None, tuple, list, or dict.\"\n    raise ValueError(msg)\n
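
A brief sketch of the three accepted forms (assuming the module path shown above):

Python
from albumentations.augmentations.geometric.functional import validate_if_not_found_coords

print(validate_if_not_found_coords(None))                # (True, -1, -1)
print(validate_if_not_found_coords((5, 7)))              # (False, 5, 7)
print(validate_if_not_found_coords({"x": 3, "y": 4}))    # (False, 3, 4)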
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_keypoints","title":"def validate_keypoints (keypoints, image_shape) [view source on GitHub]","text":"

Validate keypoints and remove those that fall outside the image boundaries.

Parameters:

  • keypoints (np.ndarray): Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates.
  • image_shape (tuple[int, int]): Shape of the image as (height, width).

Returns:

  • np.ndarray: Array of valid keypoints that fall within the image boundaries.

Note

This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Validate keypoints and remove those that fall outside the image boundaries.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, M) where N is the number of keypoints\n                                and M >= 2. The first two columns represent x and y coordinates.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid keypoints that fall within the image boundaries.\n\n    Note:\n        This function only checks the x and y coordinates (first two columns) of the keypoints.\n        Any additional columns (e.g., angle, scale) are preserved for valid keypoints.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x, y = keypoints[:, 0], keypoints[:, 1]\n\n    valid_indices = (x >= 0) & (x < cols) & (y >= 0) & (y < rows)\n\n    return keypoints[valid_indices]\n
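
A minimal sketch (illustrative values, assuming the module path shown above): only keypoints whose (x, y) fall inside a 100x100 frame are kept, and any extra columns are preserved.

Python
import numpy as np
from albumentations.augmentations.geometric.functional import validate_keypoints

keypoints = np.array([[10.0, 20.0, 0.5], [-5.0, 3.0, 0.5], [99.0, 120.0, 0.5]])
print(validate_keypoints(keypoints, (100, 100)))
# Only (10, 20) survives; its extra column is kept: [[10. 20. 0.5]]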
"},{"location":"api_reference/augmentations/geometric/resize/","title":"Resizing transforms (augmentations.geometric.resize)","text":""},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.LongestMaxSize","title":"class LongestMaxSize [view source on GitHub]","text":"

Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

  • max_size (int, Sequence[int]): Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.
  • max_size_hw (tuple[int | None, int | None]): Maximum (height, width) constraints. Supports (height, width), where both dimensions must fit within these bounds; (height, None), where only height is constrained and width scales proportionally; and (None, width), where only width is constrained and height scales proportionally. If specified, max_size must be None. Default: None.
  • interpolation (OpenCV flag): Interpolation method. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the longest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / max(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = min(H_target/H, W_target/W)\n       This ensures both dimensions fit within the specified bounds.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> # Using max_size\n>>> transform1 = A.LongestMaxSize(max_size=1024)\n>>> # Input image (1500, 800) -> Output (1024, 546)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>> # Input (800, 1500) -> Output (546, 1024)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>>\n>>> # Common use case with padding\n>>> transform4 = A.Compose([\n...     A.LongestMaxSize(max_size=1024),\n...     A.PadIfNeeded(min_height=1024, min_width=1024),\n... ])\n

Source code in albumentations/augmentations/geometric/resize.py Python
class LongestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints,\n        keeping the aspect ratio.\n\n    Args:\n        max_size (int, Sequence[int], optional): Maximum size of the longest side after the transformation.\n            When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must fit within these bounds\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): interpolation method. Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the longest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / max(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = min(H_target/H, W_target/W)\n               This ensures both dimensions fit within the specified bounds.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> # Using max_size\n        >>> transform1 = A.LongestMaxSize(max_size=1024)\n        >>> # Input image (1500, 800) -> Output (1024, 546)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>> # Input (800, 1500) -> Output (546, 1024)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>>\n        >>> # Common use case with padding\n        >>> transform4 = A.Compose([\n        ...     A.LongestMaxSize(max_size=1024),\n        ...     
A.PadIfNeeded(min_height=1024, min_width=1024),\n        ... ])\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / max(img_h, img_w)\n        elif self.max_size_hw is not None:\n            # We know max_size_hw is not None here due to model validator\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on longest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = min(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
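
A quick shape check (a sketch, not taken from the official examples) matching the scaling formula above: with max_size=1024 and a 1500x800 input, s = 1024 / 1500, so the output should be 1024 x 546.

Python
import numpy as np
import albumentations as A

image = np.zeros((1500, 800, 3), dtype=np.uint8)
out = A.LongestMaxSize(max_size=1024, p=1.0)(image=image)["image"]
print(out.shape)   # expected (1024, 546, 3): 800 * 1024 / 1500 rounds to 546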
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.MaxSizeTransform","title":"class MaxSizeTransform (max_size=1024, max_size_hw=None, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that resize based on maximum size constraints.

Source code in albumentations/augmentations/geometric/resize.py Python
class MaxSizeTransform(DualTransform):\n    \"\"\"Base class for transforms that resize based on maximum size constraints.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_size: int | list[int] | None\n        max_size_hw: tuple[int | None, int | None] | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def validate_size_parameters(self) -> Self:\n            if self.max_size is None and self.max_size_hw is None:\n                raise ValueError(\"Either max_size or max_size_hw must be specified\")\n            if self.max_size is not None and self.max_size_hw is not None:\n                raise ValueError(\"Only one of max_size or max_size_hw should be specified\")\n            return self\n\n    def __init__(\n        self,\n        max_size: int | Sequence[int] | None = 1024,\n        max_size_hw: tuple[int | None, int | None] | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_size = max_size\n        self.max_size_hw = max_size_hw\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = img.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(img, (new_height, new_width), interpolation=self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = mask.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(mask, (new_height, new_width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(images, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volume, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volumes, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, 
*args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(masks3d, *args, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_size\", \"max_size_hw\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.RandomScale","title":"class RandomScale (scale_limit=(-0.1, 0.1), interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly resize the input. Output image size is different from the input image size.

Parameters:

  • scale_limit (float or tuple[float, float]): Scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The output image size is different from the input image size.
  • A single scale factor is sampled and applied to both width and height, so the aspect ratio is preserved.
  • Bounding box coordinates are scaled accordingly.
  • Keypoint coordinates are scaled accordingly.

Mathematical formulation: Let (W, H) be the original image dimensions and (W', H') be the output dimensions. The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]]. Then, W' = W * s and H' = H * s.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n>>> result = transform(image=image)\n>>> scaled_image = result['image']\n# scaled_image will have dimensions in the range [90, 110] x [90, 110]\n# (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n

Source code in albumentations/augmentations/geometric/resize.py Python
class RandomScale(DualTransform):\n    \"\"\"Randomly resize the input. Output image size is different from the input image size.\n\n    Args:\n        scale_limit (float or tuple[float, float]): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The output image size is different from the input image size.\n        - Scale factor is sampled independently per image side (width and height).\n        - Bounding box coordinates are scaled accordingly.\n        - Keypoint coordinates are scaled accordingly.\n\n    Mathematical formulation:\n        Let (W, H) be the original image dimensions and (W', H') be the output dimensions.\n        The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]].\n        Then, W' = W * s and H' = H * s.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> scaled_image = result['image']\n        # scaled_image will have dimensions in the range [90, 110] x [90, 110]\n        # (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_limit: ScaleFloatType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(cls, v: ScaleFloatType) -> tuple[float, float]:\n            return to_tuple(v, bias=1.0)\n\n    def __init__(\n        self,\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def get_params(self) -> dict[str, float]:\n        return {\"scale\": self.py_random.uniform(*self.scale_limit)}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(img, scale, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n  
      scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(mask, scale, self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"interpolation\": self.interpolation,\n            \"mask_interpolation\": self.mask_interpolation,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n        }\n
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.Resize","title":"class Resize (height, width, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Resize the input to the given height and width.

Parameters:

  • height (int): Desired height of the output.
  • width (int): Desired width of the output.
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Source code in albumentations/augmentations/geometric/resize.py Python
class Resize(DualTransform):\n    \"\"\"Resize the input to the given height and width.\n\n    Args:\n        height (int): desired height of the output.\n        width (int): desired width of the output.\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: int = Field(ge=1)\n        width: int = Field(ge=1)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(img, (self.height, self.width), interpolation=self.interpolation)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(mask, (self.height, self.width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        height, width = params[\"shape\"][:2]\n        scale_x = self.width / width\n        scale_y = self.height / height\n        return fgeometric.keypoints_scale(keypoints, scale_x, scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"interpolation\", \"mask_interpolation\"\n
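
A short sketch (illustrative values) showing that keypoints are rescaled by the width and height factors independently when the target aspect ratio differs from the input:

Python
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.Resize(height=256, width=512, p=1.0)],
    keypoint_params=A.KeypointParams(format="xy"),
)
image = np.zeros((100, 200, 3), dtype=np.uint8)
out = transform(image=image, keypoints=[(50, 50)])
print(out["image"].shape)   # (256, 512, 3)
print(out["keypoints"])     # x scaled by 512/200, y by 256/100 -> approximately (128, 128)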
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.SmallestMaxSize","title":"class SmallestMaxSize [view source on GitHub]","text":"

Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

  • max_size (int, list of int): Maximum size of the smallest side of the image after the transformation. When using a list, the max size will be randomly selected from the values in the list. Default: 1024.
  • max_size_hw (tuple[int | None, int | None]): Maximum (height, width) constraints. Supports (height, width), where both dimensions must be at least these values; (height, None), where only height is constrained and width scales proportionally; and (None, width), where only width is constrained and height scales proportionally. If specified, max_size must be None. Default: None.
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the smallest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / min(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = max(H_target/H, W_target/W)\n       This ensures both dimensions are at least as large as specified.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> # Using max_size\n>>> transform1 = A.SmallestMaxSize(max_size=120)\n>>> # Input image (100, 150) -> Output (120, 180)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n>>> # Input (80, 160) -> Output (100, 200)\n>>> # Input (160, 80) -> Output (400, 200)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n>>> # Input (80, 160) -> Output (100, 200)\n

Source code in albumentations/augmentations/geometric/resize.py Python
class SmallestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints,\n    keeping the aspect ratio.\n\n    Args:\n        max_size (int, list of int, optional): Maximum size of smallest side of the image after the transformation.\n            When using a list, max size will be randomly selected from the values in the list. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must be at least these values\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the smallest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be larger than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / min(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = max(H_target/H, W_target/W)\n               This ensures both dimensions are at least as large as specified.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. 
The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Using max_size\n        >>> transform1 = A.SmallestMaxSize(max_size=120)\n        >>> # Input image (100, 150) -> Output (120, 180)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n        >>> # Input (80, 160) -> Output (100, 200)\n        >>> # Input (160, 80) -> Output (400, 200)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n        >>> # Input (80, 160) -> Output (100, 200)\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / min(img_h, img_w)\n        elif self.max_size_hw is not None:\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on smallest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = max(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/rotate/","title":"Rotation transforms (augmentations.geometric.functional)","text":""},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.RandomRotate90","title":"class RandomRotate90 [view source on GitHub]","text":"

Randomly rotate the input by 90 degrees zero or more times.

Parameters:

  • p: Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Source code in albumentations/augmentations/geometric/rotate.py Python
class RandomRotate90(DualTransform):\n    \"\"\"Randomly rotate the input by 90 degrees zero or more times.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, factor: int, **params: Any) -> np.ndarray:\n        return fgeometric.rot90(img, factor)\n\n    def get_params(self) -> dict[str, int]:\n        # Random int in the range [0, 3]\n        return {\"factor\": self.py_random.randint(0, 3)}\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        factor: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_rot90(bboxes, factor)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        factor: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_rot90(keypoints, factor, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
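
A minimal usage sketch: with p=1.0 a rotation factor of 0-3 quarter turns is sampled, so the output is either the original orientation or a 90/180/270-degree rotation.

Python
import numpy as np
import albumentations as A

image = np.zeros((100, 200, 3), dtype=np.uint8)
out = A.RandomRotate90(p=1.0)(image=image)["image"]
print(out.shape)   # (100, 200, 3) or (200, 100, 3), depending on the sampled factor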
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.Rotate","title":"class Rotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

  • limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
  • fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
  • rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'
  • crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ from the input. Default: False
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation angle is randomly selected for each execution within the range specified by 'limit'.
  • When 'crop_border' is False, the output image will have the same size as the input, potentially introducing black triangles in the corners.
  • When 'crop_border' is True, the output image is cropped to remove black triangles, which may result in a smaller image.
  • Bounding boxes are rotated and may change size or shape.
  • Keypoints are rotated around the center of the image.

Mathematical Details:
  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
     [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
  5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Rotate(limit=45, p=1.0)\n>>> result = transform(image=image)\n>>> rotated_image = result['image']\n# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n

Source code in albumentations/augmentations/geometric/rotate.py Python
class Rotate(DualTransform):\n    \"\"\"Rotate the input by an angle selected randomly from the uniform distribution.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.\n            Default: 'largest_box'\n        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ\n            from the input. Default: False\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.\n        - When 'crop_border' is False, the output image will have the same size as the input, potentially\n          introducing black triangles in the corners.\n        - When 'crop_border' is True, the output image is cropped to remove black triangles, which may result\n          in a smaller image.\n        - Bounding boxes are rotated and may change size or shape.\n        - Keypoints are rotated around the center of the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [cos(\u03b8)  -sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [sin(\u03b8)   cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n        5. 
If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Rotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        crop_border: bool\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_value(self) -> Self:\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        crop_border: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.crop_border = crop_border\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            img,\n            matrix,\n            self.interpolation,\n            self.fill,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            mask,\n            matrix,\n            self.mask_interpolation,\n            self.fill_mask,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        x_min: 
int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_out = fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            image_shape,\n            self.border_mode,\n            image_shape,\n        )\n        if self.crop_border:\n            return fcrops.crop_bboxes_by_coords(\n                bboxes_out,\n                (x_min, y_min, x_max, y_max),\n                image_shape,\n            )\n        return bboxes_out\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints_out = fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"][:2],\n            scale={\"x\": 1, \"y\": 1},\n            border_mode=self.border_mode,\n        )\n        if self.crop_border:\n            return fcrops.crop_keypoints_by_coords(\n                keypoints_out,\n                (x_min, y_min, x_max, y_max),\n            )\n        return keypoints_out\n\n    @staticmethod\n    def _rotated_rect_with_max_area(\n        height: int,\n        width: int,\n        angle: float,\n    ) -> dict[str, int]:\n        \"\"\"Given a rectangle of size wxh that has been rotated by 'angle' (in\n        degrees), computes the width and height of the largest possible\n        axis-aligned rectangle (maximal area) within the rotated rectangle.\n\n        Reference:\n            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders\n        \"\"\"\n        angle = math.radians(angle)\n        width_is_longer = width >= height\n        side_long, side_short = (width, height) if width_is_longer else (height, width)\n\n        # since the solutions for angle, -angle and 180-angle are all the same,\n        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:\n        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))\n        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:\n            # half constrained case: two crop corners touch the longer side,\n            # the other two corners are on the mid-line parallel to the longer line\n            x = 0.5 * side_short\n            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)\n        else:\n            # fully constrained case: crop touches all 4 sides\n            cos_2a = cos_a * cos_a - sin_a * sin_a\n            wr, hr = (\n                (width * cos_a - height * sin_a) / cos_2a,\n                (height * cos_a - width * sin_a) / cos_2a,\n            )\n\n        return {\n            \"x_min\": max(0, int(width / 2 - wr / 2)),\n            \"x_max\": min(width, int(width / 2 + wr / 2)),\n            \"y_min\": max(0, int(height / 2 - hr / 2)),\n            \"y_max\": min(height, int(height / 2 + hr / 2)),\n        }\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        angle = self.py_random.uniform(*self.limit)\n\n        if self.crop_border:\n            height, width = params[\"shape\"][:2]\n            out_params = self._rotated_rect_with_max_area(height, width, angle)\n        else:\n            
out_params = {\"x_min\": -1, \"x_max\": -1, \"y_min\": -1, \"y_max\": -1}\n\n        center = fgeometric.center(params[\"shape\"][:2])\n        bbox_center = fgeometric.center_bbox(params[\"shape\"][:2])\n\n        translate: fgeometric.XYInt = {\"x\": 0, \"y\": 0}\n        shear: fgeometric.XYFloat = {\"x\": 0, \"y\": 0}\n        scale: fgeometric.XYFloat = {\"x\": 1, \"y\": 1}\n        rotate = angle\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            center,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_center,\n        )\n        out_params[\"matrix\"] = matrix\n        out_params[\"bbox_matrix\"] = bbox_matrix\n\n        return out_params\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"crop_border\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.RotateInitSchema","title":"class RotateInitSchema ","text":"


Source code in albumentations/augmentations/geometric/rotate.py Python
class RotateInitSchema(BaseTransformInitSchema):\n    limit: SymmetricRangeType\n\n    interpolation: InterpolationType\n    mask_interpolation: InterpolationType\n\n    border_mode: BorderModeType\n\n    fill: ColorType | None\n    fill_mask: ColorType | None\n
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.SafeRotate","title":"class SafeRotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.

Parameters:

Name Type Description limit float | tuple[float, float]

Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

rotate_method Literal[\"largest_box\", \"ellipse\"]

Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

mask_interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm for masks. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation is performed around the center of the image.
  • After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
  • The output image will always have the same dimensions as the input image.
  • Bounding boxes and keypoints are transformed along with the image.

Mathematical Details:

  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
     R = [cos(θ)  -sin(θ)]
         [sin(θ)   cos(θ)]
  4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
     s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
             height / (width * |sin(θ)| + height * |cos(θ)|))
  5. The combined transformation matrix T is:
     T = [s*cos(θ)  -s*sin(θ)  tx]
         [s*sin(θ)   s*cos(θ)  ty]
     where tx and ty are translation factors to keep the image centered.
  6. Each point (x, y) in the image is transformed to (x', y') by:
     [x']   [ s*cos(θ)   s*sin(θ)] [x - cx]   [cx]
     [y'] = [-s*sin(θ)   s*cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
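The scaling factor s above can be sanity-checked directly; this is only a sketch of the formula, not the library's internal code, and the width, height and angle are illustrative.

Python
>>> import math
>>> height, width, angle = 100, 200, 30
>>> cos_a, sin_a = abs(math.cos(math.radians(angle))), abs(math.sin(math.radians(angle)))
>>> s = min(width / (width * cos_a + height * sin_a),
...         height / (width * sin_a + height * cos_a))
>>> round(s, 3)
0.536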

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.SafeRotate(limit=45, p=1.0)
>>> result = transform(image=image)
>>> rotated_image = result['image']
# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,
# scaled to fit within the original 100x100 frame


Source code in albumentations/augmentations/geometric/rotate.py Python
class SafeRotate(Affine):\n    \"\"\"Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.\n\n    This transformation ensures that the entire rotated image fits within the original frame by scaling it\n    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the\n    rotation and scaling process.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied\n            for masks.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): Method to rotate bounding boxes.\n            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation is performed around the center of the image.\n        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.\n        - The output image will always have the same dimensions as the input image.\n        - Bounding boxes and keypoints are transformed along with the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:\n           s = min(width / (width * |cos(\u03b8)| + height * |sin(\u03b8)|),\n                   height / (width * |sin(\u03b8)| + height * |cos(\u03b8)|))\n        5. The combined transformation matrix T is:\n           T = [s*cos(\u03b8)  -s*sin(\u03b8)  tx]\n               [s*sin(\u03b8)   s*cos(\u03b8)  ty]\n           where tx and ty are translation factors to keep the image centered.\n        6. 
Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [s*cos(\u03b8)   s*sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [-s*sin(\u03b8)  s*cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.SafeRotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n        # scaled to fit within the original 100x100 frame\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            rotate=limit,\n            interpolation=interpolation,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            rotate_method=rotate_method,\n            fit_output=True,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.mask_interpolation = mask_interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"mask_interpolation\",\n        )\n\n    def _create_safe_rotate_matrix(\n        self,\n        angle: float,\n        center: tuple[float, float],\n        image_shape: tuple[int, int],\n    ) -> tuple[np.ndarray, dict[str, float]]:\n        height, width = image_shape[:2]\n        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)\n\n        # Calculate new image size\n        abs_cos = abs(rotation_mat[0, 0])\n        abs_sin = abs(rotation_mat[0, 1])\n        new_w = int(height * abs_sin + width * abs_cos)\n        new_h = int(height * abs_cos + width * abs_sin)\n\n        # Adjust the rotation matrix to take into account the new size\n        rotation_mat[0, 2] += new_w / 2 - center[0]\n        rotation_mat[1, 2] += new_h / 2 - center[1]\n\n        # Calculate scaling factors\n        scale_x = width / new_w\n        scale_y = height / new_h\n\n        # Create scaling matrix\n        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])\n\n        # Combine rotation and scaling\n        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])\n\n        return matrix, {\"x\": scale_x, \"y\": scale_y}\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        angle = self.py_random.uniform(*self.limit)\n\n        # Calculate centers for image and bbox\n        image_center = fgeometric.center(image_shape)\n        bbox_center = fgeometric.center_bbox(image_shape)\n\n        # Create matrices for image and bbox\n        matrix, scale = self._create_safe_rotate_matrix(\n            angle,\n            image_center,\n            image_shape,\n        )\n        bbox_matrix, _ = self._create_safe_rotate_matrix(\n            angle,\n            bbox_center,\n            image_shape,\n        )\n\n        return {\n            \"rotate\": angle,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": image_shape,\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/","title":"Geometric transforms (augmentations.geometric.transforms)","text":""},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Affine","title":"class Affine (scale=1, translate_percent=None, translate_px=None, rotate=0, shear=0, interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, fit_output=False, keep_ratio=False, rotate_method='largest_box', balanced_scale=False, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Augmentation to apply affine transformations to images.

Affine transformations involve:

- Translation (\"move\" image on the x-/y-axis)\n- Rotation\n- Scaling (\"zoom\" in/out)\n- Shear (move one side of the image, turning a square into a trapezoid)\n

All such transformations can create \"new\" pixels in the image without a defined content, e.g. if the image is translated to the left, pixels are created on the right. A method has to be defined to deal with these pixel values. The parameters fill and fill_mask of this class deal with this.

Some transformations involve interpolations between several pixels of the input image to generate output pixel values. The parameters interpolation and mask_interpolation deal with the method of interpolation used for this.

Parameters:

Name Type Description scale number, tuple of number or dict

Scaling factor to use, where 1.0 denotes "no change" and 0.5 is zoomed out to 50 percent of the original size.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. The same range will be used for both the x- and y-axis. To keep the aspect ratio, set keep_ratio=True; then the same value will be used for both axes.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes. Note that when keep_ratio=True, the x- and y-axis ranges should be the same.

translate_percent None, number, tuple of number or dict

Translation as a fraction of the image height/width (x-translation, y-translation), where 0 denotes "no change" and 0.5 denotes "half of the axis size".
  • If None, then equivalent to 0.0 unless translate_px has a value other than None.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. That sampled fraction value will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

translate_px None, int, tuple of int or dict

Translation in pixels.
  • If None, then equivalent to 0 unless translate_percent has a value other than None.
  • If a single int, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the discrete interval [a..b]. That number will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

rotate number or tuple of number

Rotation in degrees (NOT radians), i.e. the expected value range is around [-360, 360]. Rotation happens around the center of the image, not the top left corner as in some other frameworks.
  • If a number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b] and used as the rotation value.

shear number, tuple of number or dict

Shear in degrees (NOT radians), i.e. the expected value range is around [-360, 360], with reasonable values being in the range [-45, 45].
  • If a number, then that value will be used for all images as the shear on the x-axis (no shear on the y-axis will be done).
  • If a tuple (a, b), then two values will be uniformly sampled per image from the interval [a, b] and used as the x- and y-shear values.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

interpolation int

OpenCV interpolation flag.

mask_interpolation int

OpenCV interpolation flag.

fill ColorType

The constant value to use when filling in newly created pixels. (E.g. translating by 1px to the right will create a new 1px-wide column of pixels on the left of the image.) The value is only used when border_mode=cv2.BORDER_CONSTANT. The expected value range is [0, 255] for uint8 images.

fill_mask ColorType

Same as fill but only for masks.

border_mode int

OpenCV border flag.

fit_output bool

If True, the image plane size and position will be adjusted to tightly capture the whole image after affine transformation (translate_percent and translate_px are ignored). Otherwise (False), parts of the transformed image may end up outside the image plane. Fitting the output shape can be useful to avoid corners of the image being outside the image plane after applying rotations. Default: False

keep_ratio bool

When True, the original aspect ratio will be kept when the random scale is applied. Default: False.

rotate_method Literal[\"largest_box\", \"ellipse\"]

Rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse" [1]. Default: "largest_box"

balanced_scale bool

When True, scaling factors are chosen to be either entirely below or above 1, ensuring balanced scaling. Default: False.

This is important because without it, scaling tends to lean towards upscaling. For example, if we want the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly from [0.5, 2]. With balanced_scale, the function ensures that half the time, the scaling factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in). This makes the zooming in and out process more balanced.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

[1] https://arxiv.org/abs/2109.13488
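The reference above does not include an end-to-end snippet, so here is a hedged usage sketch; every parameter value is illustrative and simply exercises the conventions described above (per-axis dictionaries, ranges, balanced_scale).

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Affine(
...     scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},          # sampled independently per axis
...     translate_percent={"x": (-0.1, 0.1), "y": (-0.1, 0.1)},
...     rotate=(-15, 15),
...     shear={"x": (-5, 5), "y": (-5, 5)},
...     balanced_scale=True,                               # zoom in and out with equal probability
...     p=1.0,
... )
>>> transformed_image = transform(image=image)['image']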


Source code in albumentations/augmentations/geometric/transforms.py Python
class Affine(DualTransform):\n    \"\"\"Augmentation to apply affine transformations to images.\n\n    Affine transformations involve:\n\n        - Translation (\"move\" image on the x-/y-axis)\n        - Rotation\n        - Scaling (\"zoom\" in/out)\n        - Shear (move one side of the image, turning a square into a trapezoid)\n\n    All such transformations can create \"new\" pixels in the image without a defined content, e.g.\n    if the image is translated to the left, pixels are created on the right.\n    A method has to be defined to deal with these pixel values.\n    The parameters `fill` and `fill_mask` of this class deal with this.\n\n    Some transformations involve interpolations between several pixels\n    of the input image to generate output pixel values. The parameters `interpolation` and\n    `mask_interpolation` deals with the method of interpolation used for this.\n\n    Args:\n        scale (number, tuple of number or dict): Scaling factor to use, where ``1.0`` denotes \"no change\" and\n            ``0.5`` is zoomed out to ``50`` percent of the original size.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That the same range will be used for both x- and y-axis. To keep the aspect ratio, set\n                  ``keep_ratio=True``, then the same value will be used for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes. Note that when\n                  the ``keep_ratio=True``, the x- and y-axis ranges should be the same.\n        translate_percent (None, number, tuple of number or dict): Translation as a fraction of the image height/width\n            (x-translation, y-translation), where ``0`` denotes \"no change\"\n            and ``0.5`` denotes \"half of the axis size\".\n                * If ``None`` then equivalent to ``0.0`` unless `translate_px` has a value other than ``None``.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That sampled fraction value will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        translate_px (None, int, tuple of int or dict): Translation in pixels.\n                * If ``None`` then equivalent to ``0`` unless `translate_percent` has a value other than ``None``.\n                * If a single int, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from\n                  the discrete interval ``[a..b]``. 
That number will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        rotate (number or tuple of number): Rotation in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``. Rotation happens around the *center* of the image,\n            not the top left corner as in some other frameworks.\n                * If a number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``\n                  and used as the rotation value.\n        shear (number, tuple of number or dict): Shear in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``, with reasonable values being in the range of ``[-45, 45]``.\n                * If a number, then that value will be used for all images as\n                  the shear on the x-axis (no shear on the y-axis will be done).\n                * If a tuple ``(a, b)``, then two value will be uniformly sampled per image\n                  from the interval ``[a, b]`` and be used as the x- and y-shear value.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        interpolation (int): OpenCV interpolation flag.\n        mask_interpolation (int): OpenCV interpolation flag.\n        fill (ColorType): The constant value to use when filling in newly created pixels.\n            (E.g. translating by 1px to the right will create a new 1px-wide column of pixels\n            on the left of the image).\n            The value is only used when `mode=constant`. The expected value range is ``[0, 255]`` for ``uint8`` images.\n        fill_mask (ColorType): Same as fill but only for masks.\n        border_mode (int): OpenCV border flag.\n        fit_output (bool): If True, the image plane size and position will be adjusted to tightly capture\n            the whole image after affine transformation (`translate_percent` and `translate_px` are ignored).\n            Otherwise (``False``),  parts of the transformed image may end up outside the image plane.\n            Fitting the output shape can be useful to avoid corners of the image being outside the image plane\n            after applying rotations. Default: False\n        keep_ratio (bool): When True, the original aspect ratio will be kept when the random scale is applied.\n            Default: False.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): rotation method used for the bounding boxes.\n            Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"\n        balanced_scale (bool): When True, scaling factors are chosen to be either entirely below or above 1,\n            ensuring balanced scaling. 
Default: False.\n\n            This is important because without it, scaling tends to lean towards upscaling. For example, if we want\n            the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is\n            three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly\n            from [0.5, 2]. With `balanced_scale`, the  function ensures that half the time, the scaling\n            factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in).\n            This makes the zooming in and out process more balanced.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        [1] https://arxiv.org/abs/2109.13488\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: ScaleFloatType | fgeometric.XYFloatScale\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None\n        rotate: ScaleFloatType\n        shear: ScaleFloatType | fgeometric.XYFloatScale\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        cval: ColorType | None\n        cval_mask: ColorType | None\n        mode: BorderModeType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        fit_output: bool\n        keep_ratio: bool\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        balanced_scale: bool\n\n        @field_validator(\"shear\", \"scale\")\n        @classmethod\n        def process_shear(\n            cls,\n            value: ScaleFloatType | fgeometric.XYFloatScale,\n            info: ValidationInfo,\n        ) -> fgeometric.XYFloatDict:\n            return cast(\n                fgeometric.XYFloatDict,\n                cls._handle_dict_arg(value, info.field_name),\n            )\n\n        @field_validator(\"rotate\")\n        @classmethod\n        def process_rotate(\n            cls,\n            value: ScaleFloatType,\n        ) -> tuple[float, float]:\n            return to_tuple(value, value)\n\n        @model_validator(mode=\"after\")\n        def handle_translate(self) -> Self:\n            if self.translate_percent is None and self.translate_px is None:\n                self.translate_px = 0\n\n            if self.translate_percent is not None and self.translate_px is not None:\n                msg = \"Expected either translate_percent or translate_px to be provided, but both were provided.\"\n                raise ValueError(msg)\n\n            if self.translate_percent is not None:\n                self.translate_percent = self._handle_dict_arg(\n                    self.translate_percent,\n                    \"translate_percent\",\n                    default=0.0,\n                )  # type: ignore[assignment]\n\n            if self.translate_px is not None:\n                self.translate_px = self._handle_dict_arg(\n                    self.translate_px,\n                    \"translate_px\",\n                    default=0,\n                )  # type: ignore[assignment]\n\n            return self\n\n        @staticmethod\n        def _handle_dict_arg(\n            val: ScaleType | fgeometric.XYFloatScale | fgeometric.XYIntScale,\n            name: str | None,\n            
default: float = 1.0,\n        ) -> dict[str, Any]:\n            if isinstance(val, dict):\n                if \"x\" not in val and \"y\" not in val:\n                    raise ValueError(\n                        f'Expected {name} dictionary to contain at least key \"x\" or key \"y\". Found neither of them.',\n                    )\n                x = val.get(\"x\", default)\n                y = val.get(\"y\", default)\n                return {\"x\": to_tuple(x, x), \"y\": to_tuple(y, y)}  # type: ignore[arg-type]\n            return {\"x\": to_tuple(val, val), \"y\": to_tuple(val, val)}\n\n        @model_validator(mode=\"after\")\n        def validate_fill_types(self) -> Self:\n            if self.cval is not None:\n                self.fill = self.cval\n                warn(\"cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n            if self.cval_mask is not None:\n                self.fill_mask = self.cval_mask\n                warn(\"cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n            if self.mode is not None:\n                self.border_mode = self.mode\n                warn(\"mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType | fgeometric.XYFloatScale = 1,\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None = None,\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None = None,\n        rotate: ScaleFloatType = 0,\n        shear: ScaleFloatType | fgeometric.XYFloatScale = 0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: ColorType | None = None,\n        cval_mask: ColorType | None = None,\n        mode: int | None = None,\n        fit_output: bool = False,\n        keep_ratio: bool = False,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        balanced_scale: bool = False,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n        self.scale = cast(fgeometric.XYFloatDict, scale)\n        self.translate_percent = cast(fgeometric.XYFloatDict, translate_percent)\n        self.translate_px = cast(fgeometric.XYIntDict, translate_px)\n        self.rotate = cast(tuple[float, float], rotate)\n        self.fit_output = fit_output\n        self.shear = cast(fgeometric.XYFloatDict, shear)\n        self.keep_ratio = keep_ratio\n        self.rotate_method = rotate_method\n        self.balanced_scale = balanced_scale\n\n        if self.keep_ratio and self.scale[\"x\"] != self.scale[\"y\"]:\n            raise ValueError(\n                f\"When keep_ratio is True, the x and y scale range should be identical. 
got {self.scale}\",\n            )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"fill\",\n            \"border_mode\",\n            \"scale\",\n            \"translate_percent\",\n            \"translate_px\",\n            \"rotate\",\n            \"fit_output\",\n            \"shear\",\n            \"fill_mask\",\n            \"keep_ratio\",\n            \"rotate_method\",\n            \"balanced_scale\",\n        )\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            img,\n            matrix,\n            interpolation=self.interpolation,\n            fill=self.fill,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            mask,\n            matrix,\n            interpolation=self.mask_interpolation,\n            fill=self.fill_mask,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            params[\"shape\"][:2],\n            self.border_mode,\n            output_shape,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        scale: fgeometric.XYFloat,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"],\n            scale,\n            self.border_mode,\n        )\n\n    @staticmethod\n    def get_scale(\n        scale: fgeometric.XYFloatDict,\n        keep_ratio: bool,\n        balanced_scale: bool,\n        random_state: random.Random,\n    ) -> fgeometric.XYFloat:\n        result_scale = {}\n        for key, value in scale.items():\n            if isinstance(value, (int, float)):\n                result_scale[key] = float(value)\n            elif isinstance(value, tuple):\n                if balanced_scale:\n                    lower_interval = (value[0], 1.0) if value[0] < 1 else None\n                    upper_interval = (1.0, value[1]) if value[1] > 1 else None\n\n                    if lower_interval is not None and upper_interval is not None:\n                        selected_interval = random_state.choice(\n                            [lower_interval, upper_interval],\n                        )\n                    elif lower_interval is not None:\n                        selected_interval = lower_interval\n                    elif upper_interval is not None:\n                        selected_interval = upper_interval\n                    else:\n                        result_scale[key] = 1.0\n                        continue\n\n                    result_scale[key] = random_state.uniform(*selected_interval)\n                else:\n                    result_scale[key] = 
random_state.uniform(*value)\n            else:\n                raise TypeError(\n                    f\"Invalid scale value for key {key}: {value}. Expected a float or a tuple of two floats.\",\n                )\n\n        if keep_ratio:\n            result_scale[\"y\"] = result_scale[\"x\"]\n\n        return cast(fgeometric.XYFloat, result_scale)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        translate = self._get_translate_params(image_shape)\n        shear = self._get_shear_params()\n        scale = self.get_scale(\n            self.scale,\n            self.keep_ratio,\n            self.balanced_scale,\n            self.py_random,\n        )\n        rotate = self.py_random.uniform(*self.rotate)\n\n        image_shift = fgeometric.center(image_shape)\n        bbox_shift = fgeometric.center_bbox(image_shape)\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            image_shift,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_shift,\n        )\n\n        if self.fit_output:\n            matrix, output_shape = fgeometric.compute_affine_warp_output_shape(\n                matrix,\n                image_shape,\n            )\n            bbox_matrix, _ = fgeometric.compute_affine_warp_output_shape(\n                bbox_matrix,\n                image_shape,\n            )\n        else:\n            output_shape = image_shape\n\n        return {\n            \"rotate\": rotate,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": output_shape,\n        }\n\n    def _get_translate_params(self, image_shape: tuple[int, int]) -> fgeometric.XYInt:\n        height, width = image_shape[:2]\n        if self.translate_px is not None:\n            return {\n                \"x\": self.py_random.randint(*self.translate_px[\"x\"]),\n                \"y\": self.py_random.randint(*self.translate_px[\"y\"]),\n            }\n        if self.translate_percent is not None:\n            translate = {key: self.py_random.uniform(*value) for key, value in self.translate_percent.items()}\n            return cast(\n                fgeometric.XYInt,\n                {\"x\": int(translate[\"x\"] * width), \"y\": int(translate[\"y\"] * height)},\n            )\n        return cast(fgeometric.XYInt, {\"x\": 0, \"y\": 0})\n\n    def _get_shear_params(self) -> fgeometric.XYFloat:\n        return {\n            \"x\": -self.py_random.uniform(*self.shear[\"x\"]),\n            \"y\": -self.py_random.uniform(*self.shear[\"y\"]),\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.BaseDistortion","title":"class BaseDistortion (interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Base class for distortion-based transformations.

This class provides a foundation for implementing various types of image distortions, such as optical distortions, grid distortions, and elastic transformations. It handles the common operations of applying distortions to images, masks, bounding boxes, and keypoints.

Parameters:

Name Type Description interpolation int

Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR, cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR

mask_interpolation int

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This is an abstract base class and should not be used directly.
  • Subclasses should implement the get_params_dependent_on_data method to generate the distortion maps (map_x and map_y).
  • The distortion is applied consistently across all targets (image, mask, bboxes, keypoints) to maintain coherence in the augmented data.

Example of a subclass:

Python
class CustomDistortion(BaseDistortion):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Add custom parameters here

    def get_params_dependent_on_data(self, params, data):
        # Generate and return map_x and map_y based on the distortion logic
        return {"map_x": map_x, "map_y": map_y}

    def get_transform_init_args_names(self):
        return super().get_transform_init_args_names() + ("custom_param1", "custom_param2")
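The snippet above leaves map_x and map_y as placeholders. The sketch below fills them in with a simple sine-wave displacement so the subclass is actually runnable; the class name and the amplitude/period parameters are invented for illustration, and the import path assumes BaseDistortion is taken from albumentations.augmentations.geometric.transforms.

Python
import numpy as np
from albumentations.augmentations.geometric.transforms import BaseDistortion


class SineWaveDistortion(BaseDistortion):
    """Toy distortion: shifts each row horizontally along a sine wave."""

    def __init__(self, amplitude=5.0, period=50.0, **kwargs):
        super().__init__(**kwargs)
        self.amplitude = amplitude
        self.period = period

    def get_params_dependent_on_data(self, params, data):
        height, width = params["shape"][:2]
        x, y = np.meshgrid(np.arange(width), np.arange(height))
        # Horizontal shift varies with the row index; the vertical map stays the identity.
        map_x = (x + self.amplitude * np.sin(2 * np.pi * y / self.period)).astype(np.float32)
        map_y = y.astype(np.float32)
        return {"map_x": map_x, "map_y": map_y}

    def get_transform_init_args_names(self):
        return (*super().get_transform_init_args_names(), "amplitude", "period")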


Source code in albumentations/augmentations/geometric/transforms.py Python
class BaseDistortion(DualTransform):\n    \"\"\"Base class for distortion-based transformations.\n\n    This class provides a foundation for implementing various types of image distortions,\n    such as optical distortions, grid distortions, and elastic transformations. It handles\n    the common operations of applying distortions to images, masks, bounding boxes, and keypoints.\n\n    Args:\n        interpolation (int): Interpolation method to be used for image transformation.\n            Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR,\n            cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This is an abstract base class and should not be used directly.\n        - Subclasses should implement the `get_params_dependent_on_data` method to generate\n          the distortion maps (map_x and map_y).\n        - The distortion is applied consistently across all targets (image, mask, bboxes, keypoints)\n          to maintain coherence in the augmented data.\n\n    Example of a subclass:\n        class CustomDistortion(BaseDistortion):\n            def __init__(self, *args, **kwargs):\n                super().__init__(*args, **kwargs)\n                # Add custom parameters here\n\n            def get_params_dependent_on_data(self, params, data):\n                # Generate and return map_x and map_y based on the distortion logic\n                return {\"map_x\": map_x, \"map_y\": map_y}\n\n            def get_transform_init_args_names(self):\n                return super().get_transform_init_args_names() + (\"custom_param1\", \"custom_param2\")\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            img,\n            map_x,\n            map_y,\n            self.interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            mask,\n            map_x,\n            map_y,\n            self.mask_interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n      
  bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        bboxes_returned = fgeometric.remap_bboxes(\n            bboxes_denorm,\n            map_x,\n            map_y,\n            image_shape,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap_keypoints(keypoints, map_x, map_y, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.D4","title":"class D4 (p=1, always_apply=None) [view source on GitHub]","text":"

Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.

The D4 group transformations include:
  • 'e' (identity): No transformation is applied.
  • 'r90' (rotation by 90 degrees counterclockwise)
  • 'r180' (rotation by 180 degrees)
  • 'r270' (rotation by 270 degrees counterclockwise)
  • 'v' (reflection across the vertical midline)
  • 'hvt' (reflection across the anti-diagonal)
  • 'h' (reflection across the horizontal midline)
  • 't' (reflection across the main diagonal)

Even if the probability (p) of applying the transform is set to 1, the identity transformation 'e' may still occur, which means the input will remain unchanged in one out of eight cases.
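To make the eight group elements concrete, the sketch below expresses the geometric descriptions above with plain NumPy; it mirrors the listed definitions rather than the library's internal fgeometric.d4 implementation.

Python
import numpy as np

a = np.arange(9).reshape(3, 3)   # a small square "image"

d4 = {
    "e": a,                    # identity
    "r90": np.rot90(a, 1),     # 90 degrees counterclockwise
    "r180": np.rot90(a, 2),    # 180 degrees
    "r270": np.rot90(a, 3),    # 270 degrees counterclockwise
    "v": np.fliplr(a),         # reflection across the vertical midline
    "h": np.flipud(a),         # reflection across the horizontal midline
    "t": a.T,                  # reflection across the main diagonal
    "hvt": np.rot90(a, 2).T,   # reflection across the anti-diagonal
}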

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
  • The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
  • When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
  • This transform preserves the aspect ratio and size of the input.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.D4(p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed['image']
# The resulting image will be one of the 8 possible D4 transformations of the input
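Since masks, bounding boxes and keypoints are adjusted alongside the image, passing additional targets can look like the hedged sketch below; the keypoint coordinates are arbitrary.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> keypoints = [(10, 20), (50, 50)]
>>> transform = A.Compose(
...     [A.D4(p=1.0)],
...     keypoint_params=A.KeypointParams(format='xy'),
... )
>>> out = transform(image=image, mask=mask, keypoints=keypoints)
>>> out['image'].shape, out['mask'].shape
((100, 100, 3), (100, 100))
# out['keypoints'] contains the coordinates moved by the sampled symmetry.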


Source code in albumentations/augmentations/geometric/transforms.py Python
class D4(DualTransform):\n    \"\"\"Applies one of the eight possible D4 dihedral group transformations to a square-shaped input,\n    maintaining the square shape. These transformations correspond to the symmetries of a square,\n    including rotations and reflections.\n\n    The D4 group transformations include:\n    - 'e' (identity): No transformation is applied.\n    - 'r90' (rotation by 90 degrees counterclockwise)\n    - 'r180' (rotation by 180 degrees)\n    - 'r270' (rotation by 270 degrees counterclockwise)\n    - 'v' (reflection across the vertical midline)\n    - 'hvt' (reflection across the anti-diagonal)\n    - 'h' (reflection across the horizontal midline)\n    - 't' (reflection across the main diagonal)\n\n    Even if the probability (`p`) of applying the transform is set to 1, the identity transformation\n    'e' may still occur, which means the input will remain unchanged in one out of eight cases.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for augmenting data that does not have a clear orientation,\n          such as top-view satellite or drone imagery, or certain types of medical images.\n        - The input image should be square-shaped for optimal results. Non-square inputs may lead to\n          unexpected behavior or distortions.\n        - When applied to bounding boxes or keypoints, their coordinates will be adjusted according\n          to the selected transformation.\n        - This transform preserves the aspect ratio and size of the input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.D4(p=1.0),\n        ... ])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed['image']\n        # The resulting image will be one of the 8 possible D4 transformations of the input\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.d4(img, group_element)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_d4(bboxes, group_element)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_d4(keypoints, group_element, params[\"shape\"])\n\n    def get_params(self) -> dict[str, D4Type]:\n        return {\n            \"group_element\": self.random_generator.choice(d4_group_elements),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ElasticTransform","title":"class ElasticTransform (alpha=1, sigma=50, interpolation=1, border_mode=4, value=None, mask_value=None, approximate=False, same_dxdy=False, mask_interpolation=0, noise_distribution='gaussian', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformation to images, masks, bounding boxes, and keypoints.

This transformation introduces random elastic distortions to the input data. It's particularly useful for data augmentation in training deep learning models, especially for tasks like image segmentation or object detection where you want to maintain the relative positions of features while introducing realistic deformations.

The transform works by generating random displacement fields and applying them to the input. These fields are smoothed using a Gaussian filter to create more natural-looking distortions.
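As an illustration of that idea only (not the library's exact generate_displacement_fields routine), a displacement field can be built by smoothing random noise with a Gaussian filter and scaling it by alpha:

Python
import numpy as np
import cv2

height, width = 256, 256
alpha, sigma = 40.0, 25.0     # illustrative values; the transform defaults to alpha=1, sigma=50
rng = np.random.default_rng(0)

def field():
    noise = rng.random((height, width)).astype(np.float32) * 2 - 1   # noise in [-1, 1]
    return cv2.GaussianBlur(noise, ksize=(0, 0), sigmaX=sigma) * alpha

dx, dy = field(), field()

# The remap coordinates are the identity grid plus the displacement.
x, y = np.meshgrid(np.arange(width), np.arange(height))
map_x = np.float32(x + dx)
map_y = np.float32(y + dy)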

Parameters:

Name Type Description alpha float

Scaling factor for the random displacement fields. Higher values result in more pronounced distortions. Default: 1.0

sigma float

Standard deviation of the Gaussian filter used to smooth the displacement fields. Higher values result in smoother, more global distortions. Default: 50.0

interpolation int

Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR

approximate bool

Whether to use an approximate version of the elastic transform. If True, uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially less accurate for large sigma values. Default: False

same_dxdy bool

Whether to use the same random displacement field for both x and y directions. Can speed up the transform at the cost of less diverse distortions. Default: False

mask_interpolation int

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

noise_distribution Literal[\"gaussian\", \"uniform\"]

Distribution used to generate the displacement fields. "gaussian" generates fields using a normal distribution (more natural deformations). "uniform" generates fields using a uniform distribution (more mechanical deformations). Default: "gaussian".

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints) by using the same displacement fields for all.
  • The 'approximate' parameter determines whether to use a precise or approximate method for generating displacement fields. The approximate method can be faster but may be less accurate for large sigma values.
  • Bounding boxes that end up outside the image after transformation will be removed.
  • Keypoints that end up outside the image after transformation will be removed.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> transformed_image = transformed['image']
>>> transformed_mask = transformed['mask']
>>> transformed_bboxes = transformed['bboxes']
>>> transformed_keypoints = transformed['keypoints']
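As a hedged follow-up, the alpha, sigma, approximate, same_dxdy and noise_distribution knobs described above can be combined like this; the specific values are only illustrative.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
>>> gentle = A.ElasticTransform(alpha=1, sigma=50, p=1.0)    # smooth, large-scale warps
>>> strong = A.ElasticTransform(alpha=120, sigma=8, p=1.0)   # pronounced, more local warps
>>> fast = A.ElasticTransform(alpha=1, sigma=50, approximate=True, same_dxdy=True,
...                           noise_distribution='uniform', p=1.0)
>>> warped = strong(image=image)['image']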


Source code in albumentations/augmentations/geometric/transforms.py Python
class ElasticTransform(BaseDistortion):\n    \"\"\"Apply elastic deformation to images, masks, bounding boxes, and keypoints.\n\n    This transformation introduces random elastic distortions to the input data. It's particularly\n    useful for data augmentation in training deep learning models, especially for tasks like\n    image segmentation or object detection where you want to maintain the relative positions of\n    features while introducing realistic deformations.\n\n    The transform works by generating random displacement fields and applying them to the input.\n    These fields are smoothed using a Gaussian filter to create more natural-looking distortions.\n\n    Args:\n        alpha (float): Scaling factor for the random displacement fields. Higher values result in\n            more pronounced distortions. Default: 1.0\n        sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement\n            fields. Higher values result in smoother, more global distortions. Default: 50.0\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        approximate (bool): Whether to use an approximate version of the elastic transform. If True,\n            uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially\n            less accurate for large sigma values. Default: False\n        same_dxdy (bool): Whether to use the same random displacement field for both x and y\n            directions. Can speed up the transform at the cost of less diverse distortions. Default: False\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        noise_distribution (Literal[\"gaussian\", \"uniform\"]): Distribution used to generate the displacement fields.\n            \"gaussian\" generates fields using normal distribution (more natural deformations).\n            \"uniform\" generates fields using uniform distribution (more mechanical deformations).\n            Default: \"gaussian\".\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints)\n          by using the same displacement fields for all.\n        - The 'approximate' parameter determines whether to use a precise or approximate method for\n          generating displacement fields. The approximate method can be faster but may be less\n          accurate for large sigma values.\n        - Bounding boxes that end up outside the image after transformation will be removed.\n        - Keypoints that end up outside the image after transformation will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        alpha: Annotated[float, Field(ge=0)]\n        sigma: Annotated[float, Field(ge=1)]\n        approximate: bool\n        same_dxdy: bool\n        noise_distribution: Literal[\"gaussian\", \"uniform\"]\n        border_mode: BorderModeType = Field(deprecated=\"Deprecated\")\n        value: ColorType | None = Field(deprecated=\"Deprecated\")\n        mask_value: ColorType | None = Field(deprecated=\"Deprecated\")\n\n    def __init__(\n        self,\n        alpha: float = 1,\n        sigma: float = 50,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        approximate: bool = False,\n        same_dxdy: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        noise_distribution: Literal[\"gaussian\", \"uniform\"] = \"gaussian\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.alpha = alpha\n        self.sigma = sigma\n        self.approximate = approximate\n        self.same_dxdy = same_dxdy\n        self.noise_distribution = noise_distribution\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        kernel_size = (0, 0) if self.approximate else (17, 17)\n\n        # Generate displacement fields\n        dx, dy = fgeometric.generate_displacement_fields(\n            (height, width),\n            self.alpha,\n            self.sigma,\n            same_dxdy=self.same_dxdy,\n            kernel_size=kernel_size,\n            random_generator=self.random_generator,\n            noise_distribution=self.noise_distribution,\n        )\n\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        map_x = np.float32(x + dx)\n        map_y = np.float32(y + dy)\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"alpha\",\n            \"sigma\",\n            \"approximate\",\n            \"same_dxdy\",\n            \"noise_distribution\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.GridDistortion","title":"class GridDistortion (num_steps=5, distort_limit=(-0.3, 0.3), interpolation=1, border_mode=4, value=None, mask_value=None, normalized=True, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply grid distortion to images, masks, bounding boxes, and keypoints.

This transformation divides the image into a grid and randomly distorts each cell, creating localized warping effects. It's particularly useful for data augmentation in tasks like medical image analysis, OCR, and other domains where local geometric variations are meaningful.

Parameters:

  • num_steps (int): Number of grid cells on each side of the image. Higher values create more granular distortions. Must be at least 1. Default: 5.
  • distort_limit (float or tuple[float, float]): Range of distortion. If a single float is provided, the range will be (-distort_limit, distort_limit). Higher values create stronger distortions. Should be in the range of -1 to 1. Default: (-0.3, 0.3).
  • interpolation (int): OpenCV interpolation method used for image transformation. Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.
  • normalized (bool): If True, ensures that the distortion does not move pixels outside the image boundaries. This can result in less extreme distortions but guarantees that no information is lost. Default: True.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The same distortion is applied to all targets (image, mask, bboxes, keypoints) to maintain consistency.
  • When normalized=True, the distortion is adjusted to ensure all pixels remain within the image boundaries.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> transformed_image = transformed['image']
>>> transformed_mask = transformed['mask']
>>> transformed_bboxes = transformed['bboxes']
>>> transformed_keypoints = transformed['keypoints']
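
As a further illustration (a sketch with a hypothetical input array, not part of the original docstring), disabling normalized allows stronger distortions that may push pixels outside the original boundaries, while a tuple distort_limit fixes the sampling range explicitly.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # hypothetical input

strong_grid = A.Compose([
    A.GridDistortion(
        num_steps=8,                 # finer grid than the default of 5
        distort_limit=(-0.4, 0.4),   # explicit symmetric range instead of a single float
        normalized=False,            # allow pixels to leave the original boundaries
        p=1.0,
    ),
])
warped = strong_grid(image=image)["image"]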


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridDistortion(BaseDistortion):\n    \"\"\"Apply grid distortion to images, masks, bounding boxes, and keypoints.\n\n    This transformation divides the image into a grid and randomly distorts each cell,\n    creating localized warping effects. It's particularly useful for data augmentation\n    in tasks like medical image analysis, OCR, and other domains where local geometric\n    variations are meaningful.\n\n    Args:\n        num_steps (int): Number of grid cells on each side of the image. Higher values\n            create more granular distortions. Must be at least 1. Default: 5.\n        distort_limit (float or tuple[float, float]): Range of distortion. If a single float\n            is provided, the range will be (-distort_limit, distort_limit). Higher values\n            create stronger distortions. Should be in the range of -1 to 1.\n            Default: (-0.3, 0.3).\n        interpolation (int): OpenCV interpolation method used for image transformation.\n            Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.\n        normalized (bool): If True, ensures that the distortion does not move pixels\n            outside the image boundaries. This can result in less extreme distortions\n            but guarantees that no information is lost. Default: True.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The same distortion is applied to all targets (image, mask, bboxes, keypoints)\n          to maintain consistency.\n        - When normalized=True, the distortion is adjusted to ensure all pixels remain\n          within the image boundaries.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        num_steps: Annotated[int, Field(ge=1)]\n        distort_limit: SymmetricRangeType\n        normalized: bool\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int = Field(deprecated=\"Deprecated. 
Does not have any effect.\")\n\n        @field_validator(\"distort_limit\")\n        @classmethod\n        def check_limits(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = -1, 1\n            result = to_tuple(v)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        num_steps: int = 5,\n        distort_limit: ScaleFloatType = (-0.3, 0.3),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        normalized: bool = True,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.num_steps = num_steps\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.normalized = normalized\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        steps_x = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n        steps_y = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n\n        if self.normalized:\n            normalized_params = fgeometric.normalize_grid_distortion_steps(\n                image_shape,\n                self.num_steps,\n                steps_x,\n                steps_y,\n            )\n            steps_x, steps_y = (\n                normalized_params[\"steps_x\"],\n                normalized_params[\"steps_y\"],\n            )\n\n        map_x, map_y = fgeometric.generate_grid(\n            image_shape,\n            steps_x,\n            steps_y,\n            self.num_steps,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"num_steps\",\n            \"distort_limit\",\n            \"normalized\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.GridElasticDeform","title":"class GridElasticDeform (num_grid_xy, magnitude, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.

This transformation overlays a grid on the input and applies random displacements to the grid points, resulting in local elastic distortions. The granularity and intensity of the distortions can be controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.

Parameters:

  • num_grid_xy (tuple[int, int]): Number of grid cells along the width and height. Specified as (grid_width, grid_height). Each value must be greater than 1.
  • magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.
  • interpolation (int): Interpolation method to be used for the image transformation. Default: cv2.INTER_LINEAR
  • mask_interpolation (int): Interpolation method to be used for mask transformation. Default: cv2.INTER_NEAREST
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)
>>> result = transform(image=image, mask=mask)
>>> transformed_image, transformed_mask = result['image'], result['mask']
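
Because bounding boxes and keypoints are also supported targets, a Compose with bbox_params can carry boxes through the same deformation. The sketch below assumes pascal_voc boxes and a hypothetical labels list; it is an illustration, not part of the library's docstring.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)   # hypothetical input
bboxes = [(32, 32, 128, 128)]                                      # pascal_voc format
labels = ["object"]                                                # hypothetical label field

transform = A.Compose(
    [A.GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
result = transform(image=image, bboxes=bboxes, labels=labels)
deformed_image, deformed_bboxes = result["image"], result["bboxes"]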

Note

This transformation is particularly useful for data augmentation in medical imaging and other domains where elastic deformations can simulate realistic variations.


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridElasticDeform(DualTransform):\n    \"\"\"Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.\n\n    This transformation overlays a grid on the input and applies random displacements to the grid points,\n    resulting in local elastic distortions. The granularity and intensity of the distortions can be\n    controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.\n\n\n    Args:\n        num_grid_xy (tuple[int, int]): Number of grid cells along the width and height.\n            Specified as (grid_width, grid_height). Each value must be greater than 1.\n        magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.\n        interpolation (int): Interpolation method to be used for the image transformation.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Interpolation method to be used for mask transformation.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = result['image'], result['mask']\n\n    Note:\n        This transformation is particularly useful for data augmentation in medical imaging\n        and other domains where elastic deformations can simulate realistic variations.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        num_grid_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        magnitude: int = Field(gt=0)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        num_grid_xy: tuple[int, int],\n        magnitude: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_grid_xy = num_grid_xy\n        self.magnitude = magnitude\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    @staticmethod\n    def generate_mesh(polygons: np.ndarray, dimensions: np.ndarray) -> np.ndarray:\n        return np.hstack((dimensions.reshape(-1, 4), polygons))\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        # Replace calculate_grid_dimensions with split_uniform_grid\n        tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.num_grid_xy,\n            self.random_generator,\n        )\n\n        # Convert tiles to the format expected by generate_distorted_grid_polygons\n        dimensions = np.array(\n            [\n                [\n                    tile[1],\n                    tile[0],\n                    tile[3],\n                    tile[2],\n                ]  # Reorder to [x_min, y_min, x_max, y_max]\n                for tile in tiles\n            ],\n        ).reshape(\n            self.num_grid_xy[::-1] + (4,),\n        )  # Reshape to (grid_height, grid_width, 4)\n\n 
       polygons = fgeometric.generate_distorted_grid_polygons(\n            dimensions,\n            self.magnitude,\n            self.random_generator,\n        )\n\n        generated_mesh = self.generate_mesh(polygons, dimensions)\n\n        return {\"generated_mesh\": generated_mesh}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(img, generated_mesh, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(mask, generated_mesh, self.mask_interpolation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        bboxes_denorm = denormalize_bboxes(bboxes, params[\"shape\"][:2])\n        return normalize_bboxes(\n            fgeometric.bbox_distort_image(\n                bboxes_denorm,\n                generated_mesh,\n                params[\"shape\"][:2],\n            ),\n            params[\"shape\"][:2],\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image_keypoints(\n            keypoints,\n            generated_mesh,\n            params[\"shape\"][:2],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_grid_xy\", \"magnitude\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.HorizontalFlip","title":"class HorizontalFlip [view source on GitHub]","text":"

Flip the input horizontally around the y-axis.

Parameters:

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
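
Since this entry ships without a usage example, here is a minimal sketch; the random image below is a hypothetical stand-in for your own data.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 200, 3), dtype=np.uint8)  # hypothetical input

flip = A.Compose([A.HorizontalFlip(p=1.0)])
flipped = flip(image=image)["image"]

# The flip mirrors the image around the vertical axis, so left and right
# columns swap places while height and channels stay unchanged.
assert flipped.shape == image.shape
assert np.array_equal(flipped, image[:, ::-1, :])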


Source code in albumentations/augmentations/geometric/transforms.py Python
class HorizontalFlip(DualTransform):\n    \"\"\"Flip the input horizontally around the y-axis.\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return hflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_hflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_hflip(keypoints, params[\"shape\"][1])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.OpticalDistortion","title":"class OpticalDistortion (distort_limit=(-0.05, 0.05), shift_limit=None, interpolation=1, border_mode=None, value=None, mask_value=None, mask_interpolation=0, mode='camera', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply optical distortion to images, masks, bounding boxes, and keypoints.

Supports two distortion models:

  1. Camera matrix model (original): Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients.
  2. Fisheye model: Direct radial distortion: r_dist = r * (1 + gamma * r²)

Parameters:

  • distort_limit (float | tuple[float, float]): Range of distortion coefficient. For the camera model the recommended range is (-0.05, 0.05); for the fisheye model it is (-0.3, 0.3). Default: (-0.05, 0.05)
  • mode (Literal['camera', 'fisheye']): Distortion model to use: 'camera' (original camera matrix model) or 'fisheye' (fisheye lens model). Default: 'camera'
  • interpolation (OpenCV flag): Interpolation method used for image transformation. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.
  • The distortion coefficient (k) is randomly sampled from the distort_limit range.
  • The distortion is centered on the image center; the deprecated shift_limit parameter no longer has any effect.
  • Bounding boxes and keypoints are transformed along with the image to maintain consistency.
  • The fisheye model applies radial distortion directly.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.OpticalDistortion(distort_limit=0.1, p=1.0),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> transformed_image = transformed['image']
>>> transformed_mask = transformed['mask']
>>> transformed_bboxes = transformed['bboxes']
>>> transformed_keypoints = transformed['keypoints']
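
The example above uses the default camera model. A sketch of the fisheye variant, where the recommended distort_limit range is wider, follows; the random input is a hypothetical placeholder.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # hypothetical input

fisheye = A.Compose([
    A.OpticalDistortion(distort_limit=(-0.3, 0.3), mode="fisheye", p=1.0),
])
distorted = fisheye(image=image)["image"]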


Source code in albumentations/augmentations/geometric/transforms.py Python
class OpticalDistortion(BaseDistortion):\n    \"\"\"Apply optical distortion to images, masks, bounding boxes, and keypoints.\n\n    Supports two distortion models:\n    1. Camera matrix model (original):\n       Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients\n\n    2. Fisheye model:\n       Direct radial distortion: r_dist = r * (1 + gamma * r\u00b2)\n\n    Args:\n        distort_limit (float | tuple[float, float]): Range of distortion coefficient.\n            For camera model: recommended range (-0.05, 0.05)\n            For fisheye model: recommended range (-0.3, 0.3)\n            Default: (-0.05, 0.05)\n\n        mode (Literal['camera', 'fisheye']): Distortion model to use:\n            - 'camera': Original camera matrix model\n            - 'fisheye': Fisheye lens model\n            Default: 'camera'\n\n        interpolation (OpenCV flag): Interpolation method used for image transformation.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC,\n            cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.\n        - The distortion coefficient (k) is randomly sampled from the distort_limit range.\n        - The image center is shifted by dx and dy, randomly sampled from the shift_limit range.\n        - Bounding boxes and keypoints are transformed along with the image to maintain consistency.\n        - Fisheye model directly applies radial distortion\n        - Both models use shift_limit to control distortion center\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        distort_limit: SymmetricRangeType\n        mode: Literal[\"camera\", \"fisheye\"]\n        shift_limit: SymmetricRangeType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int | None = Field(\n            deprecated=\"Deprecated. 
Does not have any effect.\",\n        )\n\n    def __init__(\n        self,\n        distort_limit: ScaleFloatType = (-0.05, 0.05),\n        shift_limit: ScaleFloatType | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int | None = None,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        mode: Literal[\"camera\", \"fisheye\"] = \"camera\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.mode = mode\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        height, width = image_shape\n\n        # Get distortion coefficient\n        k = self.py_random.uniform(*self.distort_limit)\n\n        # Calculate center shift\n        center_xy = fgeometric.center(image_shape)\n\n        # Get distortion maps based on mode\n        if self.mode == \"camera\":\n            map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n        else:  # fisheye\n            map_x, map_y = fgeometric.get_fisheye_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"distort_limit\",\n            \"mode\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Pad","title":"class Pad (padding=0, fill=0, fill_mask=0, border_mode=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of an image by a specified number of pixels.

Parameters:

  • padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be: an int to pad all sides by this value; a tuple[int, int] as (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y; or a tuple[int, int, int, int] as (left, top, right, bottom) for specific padding per side (see the usage sketch below the References).
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
  • fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT.
  • border_mode (OpenCV flag): OpenCV border mode.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

References

  • https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html
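
A short sketch (with a hypothetical input image) of how each of the three padding layouts expands the output shape; the shapes in the comments follow directly from the per-side padding amounts.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)  # hypothetical input

pad_all = A.Compose([A.Pad(padding=10, p=1.0)])             # 10 px on every side
pad_xy = A.Compose([A.Pad(padding=(5, 20), p=1.0)])         # 5 px left/right, 20 px top/bottom
pad_each = A.Compose([A.Pad(padding=(1, 2, 3, 4), p=1.0)])  # left, top, right, bottom

print(pad_all(image=image)["image"].shape)   # (120, 120, 3)
print(pad_xy(image=image)["image"].shape)    # (140, 110, 3)
print(pad_each(image=image)["image"].shape)  # (106, 104, 3)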


Source code in albumentations/augmentations/geometric/transforms.py Python
class Pad(DualTransform):\n    \"\"\"Pad the sides of an image by specified number of pixels.\n\n    Args:\n        padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y\n            * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT\n        fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT\n        border_mode (OpenCV flag): OpenCV border mode\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    References:\n        - https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        padding: int | tuple[int, int] | tuple[int, int, int, int]\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int] | tuple[int, int, int, int] = 0,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            img,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            mask,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill_mask,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_np = denormalize_bboxes(bboxes, params[\"shape\"])\n\n        result = fgeometric.pad_bboxes(\n            bboxes_np,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=image_shape,\n        )\n\n        rows, cols = params[\"shape\"][:2]\n        return normalize_bboxes(\n            result,\n            (rows + pad_top + pad_bottom, cols + pad_left + pad_right),\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n 
       **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_keypoints(\n            keypoints,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=params[\"shape\"][:2],\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if isinstance(self.padding, Real):\n            pad_top = pad_bottom = pad_left = pad_right = self.padding\n        elif isinstance(self.padding, (tuple, list)):\n            if len(self.padding) == NUM_PADS_XY:\n                pad_left = pad_right = self.padding[0]\n                pad_top = pad_bottom = self.padding[1]\n            elif len(self.padding) == NUM_PADS_ALL_SIDES:\n                pad_left, pad_top, pad_right, pad_bottom = self.padding  # type: ignore[misc]\n            else:\n                raise TypeError(\n                    \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n                )\n        else:\n            raise TypeError(\n                \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n            )\n\n        return {\n            \"pad_top\": pad_top,\n            \"pad_bottom\": pad_bottom,\n            \"pad_left\": pad_left,\n            \"pad_right\": pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"padding\",\n            \"fill\",\n            \"fill_mask\",\n            \"border_mode\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PadIfNeeded","title":"class PadIfNeeded (min_height=1024, min_width=1024, pad_height_divisor=None, pad_width_divisor=None, position='center', border_mode=4, value=None, mask_value=None, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of an image if the image dimensions are less than the specified minimum dimensions. If the pad_height_divisor or pad_width_divisor is specified, the function additionally ensures that the image dimensions are divisible by these values.

Parameters:

  • min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided. Default: 1024.
  • min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided. Default: 1024.
  • pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.
  • pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.
  • position (Literal["center", "top_left", "top_right", "bottom_left", "bottom_right", "random"]): Position where the image is to be placed after padding. Default: 'center'.
  • border_mode (int): Specifies the border mode to use if padding is required. Default: cv2.BORDER_REFLECT_101.
  • fill (ColorType | None): Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType | None): Similar to fill but used for padding masks. Default: 0.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • Either min_height or pad_height_divisor must be set, but not both.
  • Either min_width or pad_width_divisor must be set, but not both.
  • If border_mode is set to cv2.BORDER_CONSTANT, fill must be provided.
  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).
  • For bounding boxes, the coordinates will be adjusted to account for the padding.
  • For keypoints, their positions will be shifted according to the padding.

Examples:

Python
>>> import albumentations as A
>>> transform = A.Compose([
...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),
... ])
>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)
>>> padded_image = transformed['image']
>>> padded_mask = transformed['mask']
>>> adjusted_bboxes = transformed['bboxes']
>>> adjusted_keypoints = transformed['keypoints']
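
A sketch of the divisor-based mode, which pads up to the nearest multiple rather than to a fixed size. Note that min_height and min_width are explicitly set to None here because only one of the two modes may be active per dimension; the input image is hypothetical.

Python
import cv2
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (250, 370, 3), dtype=np.uint8)  # hypothetical input

pad_to_multiple = A.Compose([
    A.PadIfNeeded(
        min_height=None,
        min_width=None,
        pad_height_divisor=32,
        pad_width_divisor=32,
        border_mode=cv2.BORDER_CONSTANT,
        fill=0,
        p=1.0,
    ),
])
padded = pad_to_multiple(image=image)["image"]
print(padded.shape)  # (256, 384, 3): both sides rounded up to multiples of 32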


Source code in albumentations/augmentations/geometric/transforms.py Python
class PadIfNeeded(Pad):\n    \"\"\"Pads the sides of an image if the image dimensions are less than the specified minimum dimensions.\n    If the `pad_height_divisor` or `pad_width_divisor` is specified, the function additionally ensures\n    that the image dimensions are divisible by these values.\n\n    Args:\n        min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value.\n            If not specified, pad_height_divisor must be provided.\n        min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value.\n            If not specified, pad_width_divisor must be provided.\n        pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value.\n            If not specified, min_height must be provided.\n        pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value.\n            If not specified, min_width must be provided.\n        position (Literal[\"center\", \"top_left\", \"top_right\", \"bottom_left\", \"bottom_right\", \"random\"]):\n            Position where the image is to be placed after padding. Default is 'center'.\n        border_mode (int): Specifies the border mode to use if padding is required.\n            The default is `cv2.BORDER_REFLECT_101`.\n        fill (ColorType | None): Value to fill the border pixels if the border mode is `cv2.BORDER_CONSTANT`.\n            Default is None.\n        fill_mask (ColorType | None): Similar to `fill` but used for padding masks. Default is None.\n        p (float): Probability of applying the transform. Default is 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Either `min_height` or `pad_height_divisor` must be set, but not both.\n        - Either `min_width` or `pad_width_divisor` must be set, but not both.\n        - If `border_mode` is set to `cv2.BORDER_CONSTANT`, `value` must be provided.\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).\n        - For bounding boxes, the coordinates will be adjusted to account for the padding.\n        - For keypoints, their positions will be shifted according to the padding.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> padded_image = transformed['image']\n        >>> padded_mask = transformed['mask']\n        >>> adjusted_bboxes = transformed['bboxes']\n        >>> adjusted_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        min_height: int | None = Field(ge=1)\n        min_width: int | None = Field(ge=1)\n        pad_height_divisor: int | None = Field(ge=1)\n        pad_width_divisor: int | None = Field(ge=1)\n        position: PositionType\n        border_mode: BorderModeType\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        @model_validator(mode=\"after\")\n        def validate_divisibility(self) -> Self:\n            if (self.min_height is None) == (self.pad_height_divisor is None):\n                msg = \"Only one of 'min_height' and 'pad_height_divisor' parameters must be set\"\n                raise ValueError(msg)\n            if (self.min_width is None) == (self.pad_width_divisor is None):\n                msg = \"Only one of 'min_width' and 'pad_width_divisor' parameters must be set\"\n                raise ValueError(msg)\n\n            if self.border_mode == cv2.BORDER_CONSTANT and self.fill is None:\n                msg = \"If 'border_mode' is set to 'BORDER_CONSTANT', 'fill' must be provided.\"\n                raise ValueError(msg)\n\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n\n            return self\n\n    def __init__(\n        self,\n        min_height: int | None = 1024,\n        min_width: int | None = 1024,\n        pad_height_divisor: int | None = None,\n        pad_width_divisor: int | None = None,\n        position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        # Initialize with dummy padding that will be calculated later\n        super().__init__(\n            padding=0,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            p=p,\n        )\n        self.min_height = min_height\n        self.min_width = min_width\n        self.pad_height_divisor = pad_height_divisor\n        self.pad_width_divisor = pad_width_divisor\n        self.position = position\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=params[\"shape\"][:2],\n            min_height=self.min_height,\n            min_width=self.min_width,\n            pad_height_divisor=self.pad_height_divisor,\n            pad_width_divisor=self.pad_width_divisor,\n        )\n\n        h_pad_top, h_pad_bottom, 
w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_height\",\n            \"min_width\",\n            \"pad_height_divisor\",\n            \"pad_width_divisor\",\n            \"position\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Perspective","title":"class Perspective (scale=(0.05, 0.1), keep_size=True, pad_mode=None, pad_val=None, mask_pad_val=None, fit_output=False, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random four point perspective transformation to the input.

Parameters:

  • scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Default: (0.05, 0.1).
  • keep_size (bool): Whether to resize the image back to its original size after applying the perspective transform. If set to False, the resulting images may end up having different shapes. Default: True.
  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fit_output (bool): If True, the image plane size and position will be adjusted to still capture the whole image after perspective transformation. This is followed by image resizing if keep_size is set to True. If False, parts of the transformed image may be outside of the image plane. This setting should not be set to True when using large scale values as it could lead to very large images. Default: False.
  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR
  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

This transformation creates a perspective effect by randomly moving the four corners of the image. The amount of movement is controlled by the 'scale' parameter.

When 'keep_size' is True, the output image will have the same size as the input image, which may cause some parts of the transformed image to be cut off or padded.

When 'fit_output' is True, the transformation ensures that the entire transformed image is visible, which may result in a larger output image if keep_size is False.

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),
... ])
>>> result = transform(image=image)
>>> transformed_image = result['image']
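
A sketch of the fit_output behaviour, where keep_size=False lets the output grow so the whole warped image stays visible; the random input mirrors the one above and is only a placeholder.

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)  # hypothetical input

expanding = A.Compose([
    A.Perspective(scale=(0.05, 0.1), keep_size=False, fit_output=True, p=1.0),
])
warped = expanding(image=image)["image"]
# With keep_size=False the warped image is generally no longer 100x100.
print(warped.shape)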


Source code in albumentations/augmentations/geometric/transforms.py Python
class Perspective(DualTransform):\n    \"\"\"Apply random four point perspective transformation to the input.\n\n    Args:\n        scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Default: (0.05, 0.1).\n        keep_size (bool): Whether to resize image back to its original size after applying the perspective transform.\n            If set to False, the resulting images may end up having different shapes.\n            Default: True.\n        border_mode (OpenCV flag): OpenCV border mode used for padding.\n            Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n        fill_mask (ColorType): Padding value for mask if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fit_output (bool): If True, the image plane size and position will be adjusted to still capture\n            the whole image after perspective transformation. This is followed by image resizing if keep_size is set\n            to True. If False, parts of the transformed image may be outside of the image plane.\n            This setting should not be set to True when using large scale values as it could lead to very large images.\n            Default: False.\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transformation creates a perspective effect by randomly moving the four corners of the image.\n        The amount of movement is controlled by the 'scale' parameter.\n\n        When 'keep_size' is True, the output image will have the same size as the input image,\n        which may cause some parts of the transformed image to be cut off or padded.\n\n        When 'fit_output' is True, the transformation ensures that the entire transformed image is visible,\n        which may result in a larger output image if keep_size is False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n        ... 
])\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        keep_size: bool\n        pad_mode: BorderModeType | None = Field(\n            deprecated=\"Deprecated use border_mode instead\",\n        )\n        pad_val: ColorType | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_pad_val: ColorType | None = Field(\n            deprecated=\"Deprecated use fill_mask instead\",\n        )\n        fit_output: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def validate_deprecated_fields(self) -> Self:\n            if self.pad_mode is not None:\n                self.border_mode = self.pad_mode\n            if self.pad_val is not None:\n                self.fill = self.pad_val\n            if self.mask_pad_val is not None:\n                self.fill_mask = self.mask_pad_val\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.05, 0.1),\n        keep_size: bool = True,\n        pad_mode: int | None = None,\n        pad_val: ColorType | None = None,\n        mask_pad_val: ColorType | None = None,\n        fit_output: bool = False,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply=always_apply)\n        self.scale = cast(tuple[float, float], scale)\n        self.keep_size = keep_size\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.fit_output = fit_output\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            img,\n            matrix,\n            max_width,\n            max_height,\n            self.fill,\n            self.border_mode,\n            self.keep_size,\n            self.interpolation,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            mask,\n            matrix,\n            max_width,\n            max_height,\n            self.fill_mask,\n            self.border_mode,\n            self.keep_size,\n            self.mask_interpolation,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        matrix_bbox: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_bboxes(\n            bboxes,\n            params[\"shape\"],\n            matrix_bbox,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n   
     matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_keypoints(\n            keypoints,\n            params[\"shape\"],\n            matrix,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        scale = self.py_random.uniform(*self.scale)\n\n        points = fgeometric.generate_perspective_points(\n            image_shape,\n            scale,\n            self.random_generator,\n        )\n        points = fgeometric.order_points(points)\n\n        matrix, max_width, max_height = fgeometric.compute_perspective_params(\n            points,\n            image_shape,\n        )\n\n        if self.fit_output:\n            matrix, max_width, max_height = fgeometric.expand_transform(\n                matrix,\n                image_shape,\n            )\n\n        return {\n            \"matrix\": matrix,\n            \"max_height\": max_height,\n            \"max_width\": max_width,\n            \"matrix_bbox\": matrix,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"keep_size\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"fit_output\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PiecewiseAffine","title":"class PiecewiseAffine (scale=(0.03, 0.05), nb_rows=(4, 4), nb_cols=(4, 4), interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, absolute_scale=False, p=0.5, always_apply=None, keypoints_threshold=0.01) [view source on GitHub]","text":"

Apply piecewise affine transformations to the input image.

This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points around via affine transformations. This leads to local distortions in the image.

Parameters:

  • scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Recommended values are in the range (0.01, 0.05) for small distortions, and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).
  • nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of rows. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.
  • nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of columns. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute pixel value. If set to False, it will be treated as a fraction of the image height and width. Default: False.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This augmentation is very slow. Consider using ElasticTransform instead, which is at least 10x faster.
  • The augmentation may not always produce visible effects, especially with small scale values.
  • For keypoints and bounding boxes, the transformation might move them outside the image boundaries. In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed[\"image\"]\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class PiecewiseAffine(BaseDistortion):\n    \"\"\"Apply piecewise affine transformations to the input image.\n\n    This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points\n    around via affine transformations. This leads to local distortions in the image.\n\n    Args:\n        scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Recommended values are in the range (0.01, 0.05) for small distortions,\n            and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).\n        nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of rows.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of columns.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute\n            pixel value. If set to False, it will be treated as a fraction of the image height and width.\n            Default: False.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This augmentation is very slow. Consider using `ElasticTransform` instead, which is at least 10x faster.\n        - The augmentation may not always produce visible effects, especially with small scale values.\n        - For keypoints and bounding boxes, the transformation might move them outside the image boundaries.\n          In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed[\"image\"]\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        nb_rows: ScaleIntType\n        nb_cols: ScaleIntType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        cval: int | None = Field(deprecated=\"Deprecated. Does not have any effect.\")\n        cval_mask: int | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = Field(\n            deprecated=\"Deprecated. Does not have any effects.\",\n        )\n\n        absolute_scale: bool\n        keypoints_threshold: float = Field(\n            deprecated=\"This parameter is not used anymore\",\n        )\n\n        @field_validator(\"nb_rows\", \"nb_cols\")\n        @classmethod\n        def process_range(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 2, BIG_INTEGER\n            result = to_tuple(value, value)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.03, 0.05),\n        nb_rows: ScaleIntType = (4, 4),\n        nb_cols: ScaleIntType = (4, 4),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: int | None = None,\n        cval_mask: int | None = None,\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = None,\n        absolute_scale: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n        keypoints_threshold: float = 0.01,\n    ):\n        super().__init__(\n            p=p,\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n        )\n\n        warn(\n            \"This augmenter is very slow. Try to use ``ElasticTransform`` instead, which is at least 10x faster.\",\n            stacklevel=2,\n        )\n\n        self.scale = cast(tuple[float, float], scale)\n        self.nb_rows = cast(tuple[int, int], nb_rows)\n        self.nb_cols = cast(tuple[int, int], nb_cols)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.absolute_scale = absolute_scale\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"nb_rows\",\n            \"nb_cols\",\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"absolute_scale\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        nb_rows = np.clip(self.py_random.randint(*self.nb_rows), 2, None)\n        nb_cols = np.clip(self.py_random.randint(*self.nb_cols), 2, None)\n        scale = self.py_random.uniform(*self.scale)\n\n        map_x, map_y = fgeometric.create_piecewise_affine_maps(\n            image_shape=image_shape,\n            grid=(nb_rows, nb_cols),\n            scale=scale,\n            absolute_scale=self.absolute_scale,\n            random_generator=self.random_generator,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.RandomGridShuffle","title":"class RandomGridShuffle (grid=(3, 3), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shuffles the grid's cells on an image, mask, or keypoints, effectively rearranging patches within the image. This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.

Parameters:

Name Type Description grid tuple[int, int]

Size of the grid for splitting the image into cells. Each cell is shuffled randomly. For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled. Default: (3, 3)

p float

Probability that the transform will be applied. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This transform maintains consistency across all targets. If applied to an image and its corresponding mask or keypoints, the same shuffling will be applied to all.
  • The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2)) for the transform to have any effect.
  • Keypoints are moved along with their corresponding grid cell.
  • This transform could be useful when only micro features are important for the model, and memorizing the global structure could be harmful. For example:
  • Identifying the type of cell phone used to take a picture based on micro artifacts generated by phone post-processing algorithms, rather than the semantic features of the photo. See more at https://ieeexplore.ieee.org/abstract/document/8622031
  • Identifying stress, glucose, hydration levels based on skin images.

Mathematical Formulation:
  1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.
  2. A random permutation P of integers from 0 to (m*n - 1) is generated.
  3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.
  4. The cells are then rearranged according to the permutation P.
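To make the four steps above concrete, here is a minimal NumPy sketch of the shuffling idea for an image whose sides divide evenly by the grid. The helper grid_shuffle_sketch is hypothetical and for illustration only; the actual implementation below uses fgeometric.split_uniform_grid and also handles tiles of uneven size.

Python
import numpy as np

# Illustrative helper (not part of Albumentations): shuffle the cells of an
# evenly divisible image according to a random permutation P.
def grid_shuffle_sketch(image: np.ndarray, grid=(2, 2), seed=0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    m, n = grid
    h, w = image.shape[:2]
    ch, cw = h // m, w // n  # cell height and width (steps 1 and 3: row-major cells)
    cells = [image[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw]
             for r in range(m) for c in range(n)]
    perm = rng.permutation(m * n)  # step 2: random permutation P of 0..m*n-1
    # Step 4: rearrange the cells according to P and stitch the image back together.
    rows = [np.hstack([cells[perm[r * n + c]] for c in range(n)]) for r in range(m)]
    return np.vstack(rows)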

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4]\n... ])\n>>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image might look like this (one possible outcome):\n# [[4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1]]\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class RandomGridShuffle(DualTransform):\n    \"\"\"Randomly shuffles the grid's cells on an image, mask, or keypoints,\n    effectively rearranging patches within the image.\n    This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.\n\n    Args:\n        grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly.\n            For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled.\n            Default: (3, 3)\n        p (float): Probability that the transform will be applied. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform maintains consistency across all targets. If applied to an image and its corresponding\n          mask or keypoints, the same shuffling will be applied to all.\n        - The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2))\n          for the transform to have any effect.\n        - Keypoints are moved along with their corresponding grid cell.\n        - This transform could be useful when only micro features are important for the model, and memorizing\n          the global structure could be harmful. For example:\n          - Identifying the type of cell phone used to take a picture based on micro artifacts generated by\n            phone post-processing algorithms, rather than the semantic features of the photo.\n            See more at https://ieeexplore.ieee.org/abstract/document/8622031\n          - Identifying stress, glucose, hydration levels based on skin images.\n\n    Mathematical Formulation:\n        1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.\n        2. A random permutation P of integers from 0 to (m*n - 1) is generated.\n        3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.\n        4. The cells are then rearranged according to the permutation P.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4]\n        ... 
])\n        >>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image might look like this (one possible outcome):\n        # [[4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1]]\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        grid: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        grid: tuple[int, int] = (3, 3),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.grid = grid\n\n    def apply(\n        self,\n        img: np.ndarray,\n        tiles: np.ndarray,\n        mapping: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_image(img, tiles, mapping)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n        bboxes_returned = fgeometric.bboxes_grid_shuffle(\n            bboxes_denorm,\n            tiles,\n            mapping,\n            image_shape,\n            min_area=processor.params.min_area,\n            min_visibility=processor.params.min_visibility,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_keypoints(keypoints, tiles, mapping)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        original_tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.grid,\n            self.random_generator,\n        )\n        shape_groups = fgeometric.create_shape_groups(original_tiles)\n        mapping = fgeometric.shuffle_tiles_within_shape_groups(\n            shape_groups,\n            self.random_generator,\n        )\n\n        return {\"tiles\": original_tiles, \"mapping\": mapping}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"grid\",)\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ShiftScaleRotate","title":"class ShiftScaleRotate (shift_limit=(-0.0625, 0.0625), scale_limit=(-0.1, 0.1), rotate_limit=(-45, 45), interpolation=1, border_mode=4, value=None, mask_value=None, shift_limit_x=None, shift_limit_y=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:

Name Type Description shift_limit (float, float) or float

shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).

scale_limit (float, float) or float

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high), as shown in the usage sketch below. Default: (-0.1, 0.1).

rotate_limit (int, int) or int

rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

shift_limit_x (float, float) or float

shift factor range for width. If set, this value will be used for shifting width instead of shift_limit. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

shift_limit_y (float, float) or float

shift factor range for height. If set, this value will be used for shifting height instead of shift_limit. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

rotate_method str

rotation method used for the bounding boxes. Should be one of "largest_box" or "ellipse". Default: "largest_box"

mask_interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32
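Unlike the other transforms on this page, the ShiftScaleRotate entry has no Examples block, so the following is a minimal usage sketch with illustrative parameter values. Note that the source code below emits a deprecation warning recommending the Affine transform instead; ShiftScaleRotate is used here only as documented.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # scale_limit=0.1 is biased by 1, so the sampled scale factor lies in (0.9, 1.1)
>>> transform = A.Compose([
...     A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, p=1.0),
... ])
>>> transformed = transform(image=image)
>>> transformed_image = transformed["image"]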


Source code in albumentations/augmentations/geometric/transforms.py Python
class ShiftScaleRotate(Affine):\n    \"\"\"Randomly apply affine transforms: translate, scale and rotate the input.\n\n    Args:\n        shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit\n            is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and\n            upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).\n        scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the\n            range will be (-rotate_limit, rotate_limit). Default: (-45, 45).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        shift_limit_x ((float, float) or float): shift factor range for width. If it is set then this value\n            instead of shift_limit will be used for shifting width.  If shift_limit_x is a single float value,\n            the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in\n            the range [-1, 1]. Default: None.\n        shift_limit_y ((float, float) or float): shift factor range for height. If it is set then this value\n            instead of shift_limit will be used for shifting height.  If shift_limit_y is a single float value,\n            the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie\n            in the range [-, 1]. Default: None.\n        rotate_method (str): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\".\n            Default: \"largest_box\"\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        shift_limit: SymmetricRangeType = (-0.0625, 0.0625)\n        scale_limit: SymmetricRangeType = (-0.1, 0.1)\n        rotate_limit: SymmetricRangeType = (-45, 45)\n        interpolation: InterpolationType = cv2.INTER_LINEAR\n        border_mode: BorderModeType = cv2.BORDER_REFLECT_101\n\n        value: ColorType | None = Field(\n            default=None,\n            deprecated=\"Deprecated. Use fill instead.\",\n        )\n        mask_value: ColorType | None = Field(\n            default=None,\n            deprecated=\"Deprecated. Use fill_mask instead.\",\n        )\n\n        fill: ColorType = 0\n        fill_mask: ColorType = 0\n\n        shift_limit_x: ScaleFloatType | None = Field(default=None)\n        shift_limit_y: ScaleFloatType | None = Field(default=None)\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\"\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def check_shift_limit(self) -> Self:\n            bounds = -1, 1\n            self.shift_limit_x = to_tuple(\n                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_x, *bounds, \"shift_limit_x\")\n            self.shift_limit_y = to_tuple(\n                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_y, *bounds, \"shift_limit_y\")\n\n            return self\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> ScaleFloatType:\n            bounds = 0, float(\"inf\")\n            result = to_tuple(value, bias=1.0)\n            check_range(result, *bounds, str(info.field_name))\n            return result\n\n    def __init__(\n        self,\n        shift_limit: ScaleFloatType = (-0.0625, 0.0625),\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        rotate_limit: ScaleFloatType = (-45, 45),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        shift_limit_x: ScaleFloatType | None = None,\n        shift_limit_y: ScaleFloatType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: InterpolationType = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        shift_limit_x = cast(tuple[float, float], shift_limit_x)\n        shift_limit_y = cast(tuple[float, float], shift_limit_y)\n        super().__init__(\n            scale=scale_limit,\n            translate_percent={\"x\": shift_limit_x, \"y\": shift_limit_y},\n            rotate=rotate_limit,\n            shear=(0, 0),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            fit_output=False,\n            keep_ratio=False,\n            rotate_method=rotate_method,\n            
always_apply=always_apply,\n            p=p,\n        )\n        warn(\n            \"ShiftScaleRotate is deprecated. Please use Affine transform instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        self.shift_limit_x = shift_limit_x\n        self.shift_limit_y = shift_limit_y\n\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"shift_limit_x\": self.shift_limit_x,\n            \"shift_limit_y\": self.shift_limit_y,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n            \"rotate_limit\": self.rotate_limit,\n            \"interpolation\": self.interpolation,\n            \"border_mode\": self.border_mode,\n            \"fill\": self.fill,\n            \"fill_mask\": self.fill_mask,\n            \"rotate_method\": self.rotate_method,\n            \"mask_interpolation\": self.mask_interpolation,\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ThinPlateSpline","title":"class ThinPlateSpline (scale_range=(0.2, 0.4), num_control_points=4, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.

Imagine the image printed on a thin metal plate that can be bent and warped smoothly:
  • Control points act like pins pushing or pulling the plate
  • The plate resists sharp bending, creating smooth deformations
  • The transformation maintains continuity (no tears or folds)
  • Areas between control points are interpolated naturally

The transform works by:
  1. Creating a regular grid of control points (like pins in the plate)
  2. Randomly displacing these points (like pushing/pulling the pins)
  3. Computing a smooth interpolation (like the plate bending)
  4. Applying the resulting deformation to the image
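Steps 1 and 2 can be sketched in a few lines of NumPy, mirroring the get_params_dependent_on_data method in the source code below (the variable names here are illustrative); steps 3 and 4 are handled internally by fgeometric.compute_tps_weights and fgeometric.tps_transform.

Python
import numpy as np

num_control_points = 4
rng = np.random.default_rng(0)
scale = 0.3 / 10  # the implementation divides the sampled scale_range value by 10

# Step 1: regular grid of control points in [0, 1] x [0, 1] ("pins in the plate")
x = np.linspace(0, 1, num_control_points)
src_points = np.stack(np.meshgrid(x, x), axis=-1).reshape(-1, 2)  # shape (16, 2)

# Step 2: random displacement of the control points ("pushing/pulling the pins")
dst_points = src_points + rng.normal(0, scale, src_points.shape)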

Parameters:

Name Type Description scale_range tuple[float, float]

Range for random displacement of control points. Values should be in [0.0, 1.0]: - 0.0: No displacement (identity transform) - 0.1: Subtle warping - 0.2-0.4: Moderate deformation (recommended range) - 0.5+: Strong warping Default: (0.2, 0.4)

num_control_points int

Number of control points per side. Creates a grid of num_control_points x num_control_points points. - 2: Minimal deformation (affine-like) - 3-4: Moderate flexibility (recommended) - 5+: More local deformation control Must be >= 2. Default: 4

interpolation int

OpenCV interpolation flag. Used for image sampling. See also: cv2.INTER_* Default: cv2.INTER_LINEAR

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • The transformation preserves smoothness and continuity
  • Stronger scale values may create more extreme deformations
  • Higher number of control points allows more local deformations
  • The same deformation is applied consistently to all targets

Examples:

Python
>>> import albumentations as A\n>>> # Basic usage\n>>> transform = A.ThinPlateSpline()\n>>>\n>>> # Subtle deformation\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.1, 0.2),\n...     num_control_points=3\n... )\n>>>\n>>> # Strong warping with fine control\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.3, 0.5),\n...     num_control_points=5,\n... )\n

References

  • \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\" by F.L. Bookstein https://doi.org/10.1109/34.24792

  • Thin Plate Splines in Computer Vision: https://en.wikipedia.org/wiki/Thin_plate_spline

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline

See Also:
  • ElasticTransform: For a different type of non-rigid deformation
  • GridDistortion: For grid-based warping
  • OpticalDistortion: For lens-like distortions


Source code in albumentations/augmentations/geometric/transforms.py Python
class ThinPlateSpline(BaseDistortion):\n    r\"\"\"Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.\n\n    Imagine the image printed on a thin metal plate that can be bent and warped smoothly:\n    - Control points act like pins pushing or pulling the plate\n    - The plate resists sharp bending, creating smooth deformations\n    - The transformation maintains continuity (no tears or folds)\n    - Areas between control points are interpolated naturally\n\n    The transform works by:\n    1. Creating a regular grid of control points (like pins in the plate)\n    2. Randomly displacing these points (like pushing/pulling the pins)\n    3. Computing a smooth interpolation (like the plate bending)\n    4. Applying the resulting deformation to the image\n\n\n    Args:\n        scale_range (tuple[float, float]): Range for random displacement of control points.\n            Values should be in [0.0, 1.0]:\n            - 0.0: No displacement (identity transform)\n            - 0.1: Subtle warping\n            - 0.2-0.4: Moderate deformation (recommended range)\n            - 0.5+: Strong warping\n            Default: (0.2, 0.4)\n\n        num_control_points (int): Number of control points per side.\n            Creates a grid of num_control_points x num_control_points points.\n            - 2: Minimal deformation (affine-like)\n            - 3-4: Moderate flexibility (recommended)\n            - 5+: More local deformation control\n            Must be >= 2. Default: 4\n\n        interpolation (int): OpenCV interpolation flag. Used for image sampling.\n            See also: cv2.INTER_*\n            Default: cv2.INTER_LINEAR\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transformation preserves smoothness and continuity\n        - Stronger scale values may create more extreme deformations\n        - Higher number of control points allows more local deformations\n        - The same deformation is applied consistently to all targets\n\n    Example:\n        >>> import albumentations as A\n        >>> # Basic usage\n        >>> transform = A.ThinPlateSpline()\n        >>>\n        >>> # Subtle deformation\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.1, 0.2),\n        ...     num_control_points=3\n        ... )\n        >>>\n        >>> # Strong warping with fine control\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.3, 0.5),\n        ...     num_control_points=5,\n        ... )\n\n    References:\n        - \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\"\n          by F.L. 
Bookstein\n          https://doi.org/10.1109/34.24792\n\n        - Thin Plate Splines in Computer Vision:\n          https://en.wikipedia.org/wiki/Thin_plate_spline\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline\n\n    See Also:\n        - ElasticTransform: For different type of non-rigid deformation\n        - GridDistortion: For grid-based warping\n        - OpticalDistortion: For lens-like distortions\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        scale_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        num_control_points: int = Field(ge=2)\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.2, 0.4),\n        num_control_points: int = 4,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale_range = scale_range\n        self.num_control_points = num_control_points\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        # Create regular grid of control points\n        grid_size = self.num_control_points\n        x = np.linspace(0, 1, grid_size)\n        y = np.linspace(0, 1, grid_size)\n        src_points = np.stack(np.meshgrid(x, y), axis=-1).reshape(-1, 2)\n\n        # Add random displacement to destination points\n        scale = self.py_random.uniform(*self.scale_range) / 10\n        dst_points = src_points + self.random_generator.normal(\n            0,\n            scale,\n            src_points.shape,\n        )\n\n        # Compute TPS weights\n        weights, affine = fgeometric.compute_tps_weights(src_points, dst_points)\n\n        # Create grid of points\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        points = np.stack([x.flatten(), y.flatten()], axis=1).astype(np.float32)\n\n        # Transform points\n        transformed = fgeometric.tps_transform(\n            points / [width, height],\n            src_points,\n            weights,\n            affine,\n        )\n        transformed *= [width, height]\n\n        return {\n            \"map_x\": transformed[:, 0].reshape(height, width).astype(np.float32),\n            \"map_y\": transformed[:, 1].reshape(height, width).astype(np.float32),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale_range\",\n            \"num_control_points\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Transpose","title":"class Transpose [view source on GitHub]","text":"

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details:
  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max)
  3. For keypoints with coordinates (x, y): new_keypoint = (y, x)
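A tiny NumPy check of the details above (the bbox and keypoint values are arbitrary illustrations; Albumentations applies these updates internally when bbox_params or keypoint_params are configured):

Python
import numpy as np

image = np.zeros((100, 200, 3), dtype=np.uint8)
transposed = image.transpose(1, 0, 2)      # detail 1: O[i, j, k] = I[j, i, k]
assert transposed.shape == (200, 100, 3)

x_min, y_min, x_max, y_max = 0.1, 0.2, 0.5, 0.6
new_bbox = (y_min, x_min, y_max, x_max)    # detail 2: x and y coordinates swapped
x, y = 30, 70
new_keypoint = (y, x)                      # detail 3: x and y swapped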

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.Transpose(p=1.0)\n>>> result = transform(image=image)\n>>> transposed_image = result['image']\n>>> print(transposed_image)\n[[[ 1  2  3]\n  [ 7  8  9]]\n [[ 4  5  6]\n  [10 11 12]]]\n# The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class Transpose(DualTransform):\n    \"\"\"Transpose the input by swapping its rows and columns.\n\n    This transform flips the image over its main diagonal, effectively switching its width and height.\n    It's equivalent to a 90-degree rotation followed by a horizontal flip.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The dimensions of the output will be swapped compared to the input. For example,\n          an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).\n        - This transform is its own inverse. Applying it twice will return the original input.\n        - For multi-channel images (like RGB), the channels are preserved in their original order.\n        - Bounding boxes will have their coordinates adjusted to match the new image dimensions.\n        - Keypoints will have their x and y coordinates swapped.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (y_min, x_min, y_max, x_max)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (y, x)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.Transpose(p=1.0)\n        >>> result = transform(image=image)\n        >>> transposed_image = result['image']\n        >>> print(transposed_image)\n        [[[ 1  2  3]\n          [ 7  8  9]]\n         [[ 4  5  6]\n          [10 11 12]]]\n        # The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.transpose(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_transpose(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_transpose(keypoints)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.VerticalFlip","title":"class VerticalFlip [view source on GitHub]","text":"

Flip the input vertically around the x-axis.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details:
  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min)
  3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y)
  where H is the height of the image.
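The same details, checked with a small NumPy snippet (the bbox and keypoint values are arbitrary pixel-coordinate illustrations):

Python
import numpy as np

H = 100
image = np.arange(H * 50 * 3).reshape(H, 50, 3)
flipped = image[::-1]                            # detail 1: O[i, j, k] = I[H-1-i, j, k]
assert np.array_equal(flipped[0], image[H - 1])

x_min, y_min, x_max, y_max = 10, 20, 40, 60      # pixel coordinates
new_bbox = (x_min, H - y_max, x_max, H - y_min)  # detail 2
x, y = 30, 70
new_keypoint = (x, H - y)                        # detail 3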

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.VerticalFlip(p=1.0)\n>>> result = transform(image=image)\n>>> flipped_image = result['image']\n>>> print(flipped_image)\n[[[ 7  8  9]\n  [10 11 12]]\n [[ 1  2  3]\n  [ 4  5  6]]]\n# The original image is flipped vertically, with rows reversed\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class VerticalFlip(DualTransform):\n    \"\"\"Flip the input vertically around the x-axis.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform flips the image upside down. The top of the image becomes the bottom and vice versa.\n        - The dimensions of the image remain unchanged.\n        - For multi-channel images (like RGB), each channel is flipped independently.\n        - Bounding boxes are adjusted to match their new positions in the flipped image.\n        - Keypoints are moved to their new positions in the flipped image.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (x_min, H-y_max, x_max, H-y_min)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (x, H-y)\n        where H is the height of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.VerticalFlip(p=1.0)\n        >>> result = transform(image=image)\n        >>> flipped_image = result['image']\n        >>> print(flipped_image)\n        [[[ 7  8  9]\n          [10 11 12]]\n         [[ 1  2  3]\n          [ 4  5  6]]]\n        # The original image is flipped vertically, with rows reversed\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return vflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_vflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_vflip(keypoints, params[\"shape\"][0])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/mixing/","title":"Index","text":"
  • Mixing transforms (augmentations.mixing.transforms)
  • Mixing functional transforms (albumentations.augmentations.mixing.functional)
"},{"location":"api_reference/augmentations/mixing/functional/","title":"Mixing transforms (augmentations.mixing.functional)","text":""},{"location":"api_reference/augmentations/mixing/transforms/","title":"Mixing transforms (augmentations.mixing.transforms)","text":""},{"location":"api_reference/augmentations/mixing/transforms/#albumentations.augmentations.mixing.transforms.OverlayElements","title":"class OverlayElements (metadata_key='overlay_metadata', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply overlay elements such as images and masks onto an input image. This transformation can be used to add various objects (e.g., stickers, logos) to images with optional masks and bounding boxes for better placement control.

Parameters:

Name Type Description metadata_key str

Additional target key for metadata. Default overlay_metadata.

p float

Probability of applying the transformation. Default: 0.5.

Possible Metadata Fields:
  • image (np.ndarray): The overlay image to be applied. This is a required field.
  • bbox (list[int]): The bounding box specifying the region where the overlay should be applied. It should contain four floats: [x_min, y_min, x_max, y_max]. If label_id is provided, it should be appended as the fifth element in the bbox. The bbox should be in Albumentations format, which is the same as normalized Pascal VOC format: [x_min / width, y_min / height, x_max / width, y_max / height].
  • mask (np.ndarray): An optional mask that defines the non-rectangular region of the overlay image. If not provided, the entire overlay image is used.
  • mask_id (int): An optional identifier for the mask. If provided, the regions specified by the mask will be labeled with this identifier in the output mask.
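This entry has no Examples block, so the following is a minimal usage sketch of the metadata format described above. The sticker array, bbox values, and mask_id are made up for illustration, and it assumes the metadata is passed under the default overlay_metadata key when calling the composed transform.

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> sticker = np.full((20, 20, 3), 255, dtype=np.uint8)
>>> metadata = {
...     "image": sticker,              # required: the overlay image
...     "bbox": [0.1, 0.1, 0.3, 0.3],  # normalized [x_min, y_min, x_max, y_max]
...     "mask_id": 1,                  # label the overlay region in the output mask
... }
>>> transform = A.Compose([A.OverlayElements(p=1.0)])
>>> result = transform(image=image, mask=mask, overlay_metadata=metadata)
>>> out_image, out_mask = result["image"], result["mask"]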

Targets

image, mask

Image types: uint8, float32

Reference

https://github.com/danaaubakirova/doc-augmentation


Source code in albumentations/augmentations/mixing/transforms.py Python
class OverlayElements(DualTransform):\n    \"\"\"Apply overlay elements such as images and masks onto an input image. This transformation can be used to add\n    various objects (e.g., stickers, logos) to images with optional masks and bounding boxes for better placement\n    control.\n\n    Args:\n        metadata_key (str): Additional target key for metadata. Default `overlay_metadata`.\n        p (float): Probability of applying the transformation. Default: 0.5.\n\n    Possible Metadata Fields:\n        - image (np.ndarray): The overlay image to be applied. This is a required field.\n        - bbox (list[int]): The bounding box specifying the region where the overlay should be applied. It should\n                            contain four floats: [y_min, x_min, y_max, x_max]. If `label_id` is provided, it should\n                            be appended as the fifth element in the bbox. BBox should be in Albumentations format,\n                            that is the same as normalized Pascal VOC format\n                            [x_min / width, y_min / height, x_max / width, y_max / height]\n        - mask (np.ndarray): An optional mask that defines the non-rectangular region of the overlay image. If not\n                             provided, the entire overlay image is used.\n        - mask_id (int): An optional identifier for the mask. If provided, the regions specified by the mask will\n                         be labeled with this identifier in the output mask.\n\n    Targets:\n        image, mask\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://github.com/danaaubakirova/doc-augmentation\n\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    class InitSchema(BaseTransformInitSchema):\n        metadata_key: str\n\n    def __init__(\n        self,\n        metadata_key: str = \"overlay_metadata\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.metadata_key = metadata_key\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [self.metadata_key]\n\n    @staticmethod\n    def preprocess_metadata(\n        metadata: dict[str, Any],\n        img_shape: tuple[int, int],\n        random_state: random.Random,\n    ) -> dict[str, Any]:\n        overlay_image = metadata[\"image\"]\n        overlay_height, overlay_width = overlay_image.shape[:2]\n        image_height, image_width = img_shape[:2]\n\n        if \"bbox\" in metadata:\n            bbox = metadata[\"bbox\"]\n            bbox_np = np.array([bbox])\n            check_bboxes(bbox_np)\n            denormalized_bbox = denormalize_bboxes(bbox_np, img_shape[:2])[0]\n\n            x_min, y_min, x_max, y_max = (int(x) for x in denormalized_bbox[:4])\n\n            if \"mask\" in metadata:\n                mask = metadata[\"mask\"]\n                mask = cv2.resize(mask, (x_max - x_min, y_max - y_min), interpolation=cv2.INTER_NEAREST)\n            else:\n                mask = np.ones((y_max - y_min, x_max - x_min), dtype=np.uint8)\n\n            overlay_image = cv2.resize(overlay_image, (x_max - x_min, y_max - y_min), interpolation=cv2.INTER_AREA)\n            offset = (y_min, x_min)\n\n            if len(bbox) == LENGTH_RAW_BBOX and \"bbox_id\" in metadata:\n                bbox = [x_min, y_min, x_max, y_max, metadata[\"bbox_id\"]]\n            else:\n                bbox = (x_min, y_min, x_max, y_max, *bbox[4:])\n        else:\n            if image_height < 
overlay_height or image_width < overlay_width:\n                overlay_image = cv2.resize(overlay_image, (image_width, image_height), interpolation=cv2.INTER_AREA)\n                overlay_height, overlay_width = overlay_image.shape[:2]\n\n            mask = metadata[\"mask\"] if \"mask\" in metadata else np.ones_like(overlay_image, dtype=np.uint8)\n\n            max_x_offset = image_width - overlay_width\n            max_y_offset = image_height - overlay_height\n\n            offset_x = random_state.randint(0, max_x_offset)\n            offset_y = random_state.randint(0, max_y_offset)\n\n            offset = (offset_y, offset_x)\n\n            bbox = [\n                offset_x,\n                offset_y,\n                offset_x + overlay_width,\n                offset_y + overlay_height,\n            ]\n\n            if \"bbox_id\" in metadata:\n                bbox = [*bbox, metadata[\"bbox_id\"]]\n\n        result = {\n            \"overlay_image\": overlay_image,\n            \"overlay_mask\": mask,\n            \"offset\": offset,\n            \"bbox\": bbox,\n        }\n\n        if \"mask_id\" in metadata:\n            result[\"mask_id\"] = metadata[\"mask_id\"]\n\n        return result\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        metadata = data[self.metadata_key]\n        img_shape = params[\"shape\"]\n\n        if isinstance(metadata, list):\n            overlay_data = [self.preprocess_metadata(md, img_shape, self.py_random) for md in metadata]\n        else:\n            overlay_data = [self.preprocess_metadata(metadata, img_shape, self.py_random)]\n\n        return {\n            \"overlay_data\": overlay_data,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        overlay_data: list[dict[str, Any]],\n        **params: Any,\n    ) -> np.ndarray:\n        for data in overlay_data:\n            overlay_image = data[\"overlay_image\"]\n            overlay_mask = data[\"overlay_mask\"]\n            offset = data[\"offset\"]\n            img = fmixing.copy_and_paste_blend(img, overlay_image, overlay_mask, offset=offset)\n        return img\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        overlay_data: list[dict[str, Any]],\n        **params: Any,\n    ) -> np.ndarray:\n        for data in overlay_data:\n            if \"mask_id\" in data and data[\"mask_id\"] is not None:\n                overlay_mask = data[\"overlay_mask\"]\n                offset = data[\"offset\"]\n                mask_id = data[\"mask_id\"]\n\n                y_min, x_min = offset\n                y_max = y_min + overlay_mask.shape[0]\n                x_max = x_min + overlay_mask.shape[1]\n\n                mask_section = mask[y_min:y_max, x_min:x_max]\n                mask_section[overlay_mask > 0] = mask_id\n\n        return mask\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"metadata_key\",)\n
"},{"location":"api_reference/augmentations/transforms3d/","title":"Index","text":"
  • 3D (Volumetric) transforms (augmentations.transforms3d.transforms)
  • 3D (Volumetric) functional transforms (albumentations.augmentations.transforms3d.functional)
"},{"location":"api_reference/augmentations/transforms3d/functional/","title":"3D (Volumetric) functional transforms (augmentations.transforms3d.functional)","text":""},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.adjust_padding_by_position3d","title":"def adjust_padding_by_position3d (paddings, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position for 3D data.

Parameters:

Name Type Description paddings list[tuple[int, int]]

List of tuples containing padding pairs for each dimension: [(d_front, d_back), (h_top, h_bottom), (w_left, w_right)]

position Literal['center', 'random']

Position of the image after padding. Either 'center' or 'random'

py_random Random

Random number generator

Returns:

Type Description tuple[int, int, int, int, int, int]

Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)
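A small usage sketch based on the source code below: with position='center' the padding pairs pass through unchanged, while position='random' redistributes each dimension's padding at random.

Python
import random
from albumentations.augmentations.transforms3d.functional import adjust_padding_by_position3d

paddings = [(2, 2), (3, 3), (1, 1)]  # (front, back), (top, bottom), (left, right)

adjust_padding_by_position3d(paddings, "center", random.Random(0))
# -> (2, 2, 3, 3, 1, 1)

adjust_padding_by_position3d(paddings, "random", random.Random(0))
# -> a 6-tuple where each value is drawn between 0 and that dimension's total padding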

Source code in albumentations/augmentations/transforms3d/functional.py Python
def adjust_padding_by_position3d(\n    paddings: list[tuple[int, int]],  # [(front, back), (top, bottom), (left, right)]\n    position: Literal[\"center\", \"random\"],\n    py_random: random.Random,\n) -> tuple[int, int, int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position for 3D data.\n\n    Args:\n        paddings: List of tuples containing padding pairs for each dimension [(d_pad), (h_pad), (w_pad)]\n        position: Position of the image after padding. Either 'center' or 'random'\n        py_random: Random number generator\n\n    Returns:\n        tuple[int, int, int, int, int, int]: Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)\n    \"\"\"\n    if position == \"center\":\n        return (\n            paddings[0][0],  # d_front\n            paddings[0][1],  # d_back\n            paddings[1][0],  # h_top\n            paddings[1][1],  # h_bottom\n            paddings[2][0],  # w_left\n            paddings[2][1],  # w_right\n        )\n\n    # For random position, redistribute padding for each dimension\n    d_pad = sum(paddings[0])\n    h_pad = sum(paddings[1])\n    w_pad = sum(paddings[2])\n\n    return (\n        py_random.randint(0, d_pad),  # d_front\n        d_pad - py_random.randint(0, d_pad),  # d_back\n        py_random.randint(0, h_pad),  # h_top\n        h_pad - py_random.randint(0, h_pad),  # h_bottom\n        py_random.randint(0, w_pad),  # w_left\n        w_pad - py_random.randint(0, w_pad),  # w_right\n    )\n
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.crop3d","title":"def crop3d (volume, crop_coords) [view source on GitHub]","text":"

Crop 3D volume using coordinates.

Parameters:

Name Type Description volume ndarray

Input volume with shape (z, y, x) or (z, y, x, channels)

crop_coords tuple[int, int, int, int, int, int]

Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping

Returns:

Type Description ndarray

Cropped volume with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py Python
def crop3d(\n    volume: np.ndarray,\n    crop_coords: tuple[int, int, int, int, int, int],\n) -> np.ndarray:\n    \"\"\"Crop 3D volume using coordinates.\n\n    Args:\n        volume: Input volume with shape (z, y, x) or (z, y, x, channels)\n        crop_coords: Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping\n\n    Returns:\n        Cropped volume with same number of dimensions as input\n    \"\"\"\n    z_min, z_max, y_min, y_max, x_min, x_max = crop_coords\n\n    return volume[z_min:z_max, y_min:y_max, x_min:x_max]\n
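A minimal sketch of calling crop3d directly (the shape and crop coordinates are illustrative, not from the docs):

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import crop3d

volume = np.arange(4 * 6 * 8).reshape(4, 6, 8)            # (z, y, x)
cropped = crop3d(volume, crop_coords=(1, 3, 2, 5, 0, 4))  # (z_min, z_max, y_min, y_max, x_min, x_max)
print(cropped.shape)  # (2, 3, 4)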
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.cutout3d","title":"def cutout3d (volume, holes, fill_value) [view source on GitHub]","text":"

Cut out holes in 3D volume and fill them with a given value.

Source code in albumentations/augmentations/transforms3d/functional.py Python
def cutout3d(volume: np.ndarray, holes: np.ndarray, fill_value: ColorType) -> np.ndarray:\n    \"\"\"Cut out holes in 3D volume and fill them with a given value.\"\"\"\n    volume = volume.copy()\n    for z1, y1, x1, z2, y2, x2 in holes:\n        volume[z1:z2, y1:y2, x1:x2] = fill_value\n    return volume\n
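A minimal sketch, with a hole specification chosen purely for illustration:

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import cutout3d

volume = np.full((10, 20, 20), 255, dtype=np.uint8)
holes = np.array([[2, 4, 4, 5, 10, 10]])  # one hole: (z1, y1, x1, z2, y2, x2)
out = cutout3d(volume, holes, 0)

print(out[3, 5, 5], out[0, 0, 0])  # 0 255  (inside vs. outside the hole)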
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.pad_3d_with_params","title":"def pad_3d_with_params (volume, padding, value) [view source on GitHub]","text":"

Pad a 3D volume with the given padding parameters.

Parameters:

Name Type Description volume ndarray

Input volume with shape (depth, height, width) or (depth, height, width, channels)

padding tuple[int, int, int, int, int, int]

Padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)

value Union[float, collections.abc.Sequence[float]]

Padding value

Returns:

Type Description ndarray

Padded image with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py Python
def pad_3d_with_params(\n    volume: np.ndarray,\n    padding: tuple[int, int, int, int, int, int],  # (d_front, d_back, h_top, h_bottom, w_left, w_right)\n    value: ColorType,\n) -> np.ndarray:\n    \"\"\"Pad 3D image with given parameters.\n\n    Args:\n        volume: Input volume with shape (depth, height, width) or (depth, height, width, channels)\n        padding: Padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)\n        value: Padding value\n\n    Returns:\n        Padded image with same number of dimensions as input\n    \"\"\"\n    d_front, d_back, h_top, h_bottom, w_left, w_right = padding\n\n    # Skip if no padding is needed\n    if d_front == d_back == h_top == h_bottom == w_left == w_right == 0:\n        return volume\n\n    # Handle both 3D and 4D arrays\n    pad_width = [\n        (d_front, d_back),  # depth padding\n        (h_top, h_bottom),  # height padding\n        (w_left, w_right),  # width padding\n    ]\n\n    # Add channel padding if 4D array\n    if volume.ndim == NUM_VOLUME_DIMENSIONS:\n        pad_width.append((0, 0))  # no padding for channels\n\n    return np.pad(\n        volume,\n        pad_width=pad_width,\n        mode=\"constant\",\n        constant_values=value,\n    )\n
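A minimal sketch (shapes and padding values are illustrative):

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import pad_3d_with_params

volume = np.zeros((4, 6, 8), dtype=np.uint8)  # (depth, height, width)
padded = pad_3d_with_params(
    volume,
    padding=(1, 1, 2, 2, 3, 3),  # (d_front, d_back, h_top, h_bottom, w_left, w_right)
    value=0,
)
print(padded.shape)  # (6, 10, 14)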
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.transform_cube","title":"def transform_cube (cube, index) [view source on GitHub]","text":"

Transform cube by index (0-47)

Parameters:

Name Type Description cube ndarray

Input array with shape (D, H, W) or (D, H, W, C)

index int

Integer from 0 to 47 specifying which transformation to apply

Returns:

Type Description ndarray

Transformed cube with same shape as input

Source code in albumentations/augmentations/transforms3d/functional.py Python
def transform_cube(cube: np.ndarray, index: int) -> np.ndarray:\n    \"\"\"Transform cube by index (0-47)\n\n    Args:\n        cube: Input array with shape (D, H, W) or (D, H, W, C)\n        index: Integer from 0 to 47 specifying which transformation to apply\n    Returns:\n        Transformed cube with same shape as input\n    \"\"\"\n    if not (0 <= index < 48):\n        raise ValueError(\"Index must be between 0 and 47\")\n\n    # First determine if we need reflection (indices 24-47)\n    needs_reflection = index >= 24\n    working_cube = cube[:, :, ::-1].copy() if needs_reflection else cube.copy()\n    rotation_index = index % 24\n\n    # Map rotation_index (0-23) to specific rotations\n    if rotation_index < 4:\n        # First 4: rotate around axis 0\n        return np.rot90(working_cube, rotation_index, axes=(1, 2))\n\n    if rotation_index < 8:\n        # Next 4: flip 180\u00b0 about axis 1, then rotate around axis 0\n        temp = np.rot90(working_cube, 2, axes=(0, 2))\n        return np.rot90(temp, rotation_index - 4, axes=(1, 2))\n\n    if rotation_index < 16:\n        # Next 8: split between 90\u00b0 and 270\u00b0 about axis 1, then rotate around axis 2\n        if rotation_index < 12:\n            temp = np.rot90(working_cube, axes=(0, 2))\n            return np.rot90(temp, rotation_index - 8, axes=(0, 1))\n        temp = np.rot90(working_cube, -1, axes=(0, 2))\n        return np.rot90(temp, rotation_index - 12, axes=(0, 1))\n\n    # Final 8: split between rotations about axis 2, then rotate around axis 1\n    if rotation_index < 20:\n        temp = np.rot90(working_cube, axes=(0, 1))\n        return np.rot90(temp, rotation_index - 16, axes=(0, 2))\n    temp = np.rot90(working_cube, -1, axes=(0, 1))\n    return np.rot90(temp, rotation_index - 20, axes=(0, 2))\n
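A minimal sketch; the cubic input shape is illustrative. Indices 0-23 apply pure rotations, 24-47 add a reflection first:

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import transform_cube

cube = np.random.randint(0, 256, (8, 8, 8), dtype=np.uint8)

rotated = transform_cube(cube, index=5)     # one of the 24 pure rotations
reflected = transform_cube(cube, index=30)  # one of the 24 rotoreflections
print(rotated.shape, reflected.shape)       # (8, 8, 8) (8, 8, 8)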
"},{"location":"api_reference/augmentations/transforms3d/transforms/","title":"3D (Volumetric) transforms (augmentations.transforms3d.transforms)","text":""},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.BaseCropAndPad3D","title":"class BaseCropAndPad3D (pad_if_needed, fill, fill_mask, pad_position, p=1.0, always_apply=None) [view source on GitHub]","text":"

Base class for 3D transforms that need both cropping and padding.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class BaseCropAndPad3D(Transform3D):\n    \"\"\"Base class for 3D transforms that need both cropping and padding.\"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D)\n\n    class InitSchema(Transform3D.InitSchema):\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n        pad_position: Literal[\"center\", \"random\"]\n\n    def __init__(\n        self,\n        pad_if_needed: bool,\n        fill: ColorType,\n        fill_mask: ColorType,\n        pad_position: Literal[\"center\", \"random\"],\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.pad_if_needed = pad_if_needed\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.pad_position = pad_position\n\n    def _random_pad(self, pad: int) -> tuple[int, int]:\n        \"\"\"Helper function to calculate random padding for one dimension.\"\"\"\n        if pad > 0:\n            pad_start = self.py_random.randint(0, pad)\n            pad_end = pad - pad_start\n        else:\n            pad_start = pad_end = 0\n        return pad_start, pad_end\n\n    def _center_pad(self, pad: int) -> tuple[int, int]:\n        \"\"\"Helper function to calculate center padding for one dimension.\"\"\"\n        pad_start = pad // 2\n        pad_end = pad - pad_start\n        return pad_start, pad_end\n\n    def _get_pad_params(\n        self,\n        image_shape: tuple[int, int, int],\n        target_shape: tuple[int, int, int],\n    ) -> dict[str, Any] | None:\n        \"\"\"Calculate padding parameters if needed for 3D volumes.\"\"\"\n        if not self.pad_if_needed:\n            return None\n\n        z, h, w = image_shape\n        target_z, target_h, target_w = target_shape\n\n        # Calculate total padding needed for each dimension\n        z_pad = max(0, target_z - z)\n        h_pad = max(0, target_h - h)\n        w_pad = max(0, target_w - w)\n\n        if z_pad == 0 and h_pad == 0 and w_pad == 0:\n            return None\n\n        # For center padding, split equally\n        if self.pad_position == \"center\":\n            z_front, z_back = self._center_pad(z_pad)\n            h_top, h_bottom = self._center_pad(h_pad)\n            w_left, w_right = self._center_pad(w_pad)\n        # For random padding, randomly distribute the padding\n        else:  # random\n            z_front, z_back = self._random_pad(z_pad)\n            h_top, h_bottom = self._random_pad(h_pad)\n            w_left, w_right = self._random_pad(w_pad)\n\n        return {\n            \"pad_front\": z_front,\n            \"pad_back\": z_back,\n            \"pad_top\": h_top,\n            \"pad_bottom\": h_bottom,\n            \"pad_left\": w_left,\n            \"pad_right\": w_right,\n        }\n\n    def apply_to_volume(\n        self,\n        volume: np.ndarray,\n        crop_coords: tuple[int, int, int, int, int, int],\n        pad_params: dict[str, int] | None,\n        **params: Any,\n    ) -> np.ndarray:\n        # First crop\n        cropped = f3d.crop3d(volume, crop_coords)\n\n        # Then pad if needed\n        if pad_params is not None:\n            padding = (\n                pad_params[\"pad_front\"],\n                pad_params[\"pad_back\"],\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n            )\n            return f3d.pad_3d_with_params(\n                cropped,\n         
       padding=padding,\n                value=cast(ColorType, self.fill),\n            )\n\n        return cropped\n\n    def apply_to_mask3d(\n        self,\n        mask3d: np.ndarray,\n        crop_coords: tuple[int, int, int, int, int, int],\n        pad_params: dict[str, int] | None,\n        **params: Any,\n    ) -> np.ndarray:\n        # First crop\n        cropped = f3d.crop3d(mask3d, crop_coords)\n\n        # Then pad if needed\n        if pad_params is not None:\n            padding = (\n                pad_params[\"pad_front\"],\n                pad_params[\"pad_back\"],\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n            )\n            return f3d.pad_3d_with_params(\n                cropped,\n                padding=padding,\n                value=cast(ColorType, self.fill_mask),\n            )\n\n        return cropped\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.BasePad3D","title":"class BasePad3D (fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Base class for 3D padding transforms.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class BasePad3D(Transform3D):\n    \"\"\"Base class for 3D padding transforms.\"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D)\n\n    class InitSchema(Transform3D.InitSchema):\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def apply_to_volume(\n        self,\n        volume: np.ndarray,\n        padding: tuple[int, int, int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        if padding == (0, 0, 0, 0, 0, 0):\n            return volume\n        return f3d.pad_3d_with_params(\n            volume=volume,\n            padding=padding,\n            value=cast(ColorType, self.fill),\n        )\n\n    def apply_to_mask3d(\n        self,\n        mask3d: np.ndarray,\n        padding: tuple[int, int, int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        if padding == (0, 0, 0, 0, 0, 0):\n            return mask3d\n        return f3d.pad_3d_with_params(\n            volume=mask3d,\n            padding=padding,\n            value=cast(ColorType, self.fill_mask),\n        )\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CenterCrop3D","title":"class CenterCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop the center of 3D volume.

Parameters:

Name Type Description size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

If you want to perform cropping only in the XY plane while preserving all slices along the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CenterCrop3D(BaseCropAndPad3D):\n    \"\"\"Crop the center of 3D volume.\n\n    Args:\n        size (tuple[int, int, int]): Desired output size of the crop in format (depth, height, width)\n        pad_if_needed (bool): Whether to pad if the volume is smaller than desired crop size. Default: False\n        fill (ColorType): Padding value for image if pad_if_needed is True. Default: 0\n        fill_mask (ColorType): Padding value for mask if pad_if_needed is True. Default: 0\n        p (float): probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If you want to perform cropping only in the XY plane while preserving all slices along\n        the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop\n        to each slice independently, maintaining the full depth of the volume.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        size: Annotated[tuple[int, int, int], AfterValidator(check_range_bounds(1, None))]\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        size: tuple[int, int, int],\n        pad_if_needed: bool = False,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=\"center\",  # Center crop always uses center padding\n            p=p,\n        )\n        self.size = size\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        volume = data[\"volume\"]\n        z, h, w = volume.shape[:3]\n        target_z, target_h, target_w = self.size\n\n        # Get padding params if needed\n        pad_params = self._get_pad_params(\n            image_shape=(z, h, w),\n            target_shape=self.size,\n        )\n\n        # Update dimensions if padding is applied\n        if pad_params is not None:\n            z = z + pad_params[\"pad_front\"] + pad_params[\"pad_back\"]\n            h = h + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            w = w + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n        # Validate dimensions after padding\n        if z < target_z or h < target_h or w < target_w:\n            msg = (\n                f\"Crop size {self.size} is larger than padded image size ({z}, {h}, {w}). \"\n                f\"This should not happen - please report this as a bug.\"\n            )\n            raise ValueError(msg)\n\n        # For CenterCrop3D:\n        z_start = (z - target_z) // 2\n        h_start = (h - target_h) // 2\n        w_start = (w - target_w) // 2\n\n        crop_coords = (\n            z_start,\n            z_start + target_z,\n            h_start,\n            h_start + target_h,\n            w_start,\n            w_start + target_w,\n        )\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"pad_if_needed\", \"fill\", \"fill_mask\"\n
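A minimal usage sketch in the style of the other 3D examples on this page (shapes are illustrative; it assumes CenterCrop3D is exported at the top level like the other 3D transforms shown here):

Python
import numpy as np
import albumentations as A

volume = np.random.randint(0, 256, (20, 128, 128), dtype=np.uint8)  # (D, H, W)
mask3d = np.random.randint(0, 2, (20, 128, 128), dtype=np.uint8)    # (D, H, W)

transform = A.CenterCrop3D(size=(16, 96, 96), pad_if_needed=True, fill=0, fill_mask=0, p=1.0)
out = transform(volume=volume, mask3d=mask3d)
print(out["volume"].shape, out["mask3d"].shape)  # (16, 96, 96) (16, 96, 96)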
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CoarseDropout3D","title":"class CoarseDropout3D (num_holes_range=(1, 1), hole_depth_range=(0.1, 0.2), hole_height_range=(0.1, 0.2), hole_width_range=(0.1, 0.2), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

CoarseDropout3D randomly drops out cuboid regions from a 3D volume and, optionally, the corresponding regions in an associated 3D mask, to simulate occlusion and the varied object sizes found in real-world volumetric data.

Parameters:

Name Type Description num_holes_range tuple[int, int]

Range (min, max) for the number of cuboid regions to drop out. Default: (1, 1)

hole_depth_range tuple[float, float]

Range (min, max) for the depth of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)

hole_height_range tuple[float, float]

Range (min, max) for the height of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)

hole_width_range tuple[float, float]

Range (min, max) for the width of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)

fill ColorType

Value for the dropped voxels. Can be:
  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel
Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the 3D mask. If None, mask regions corresponding to volume dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

volume, mask3d

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges.
  • All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.
  • If you want to apply dropout only in the XY plane while preserving the full depth dimension, consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout to each slice independently, effectively creating cylindrical dropout regions that extend through the entire depth of the volume.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n>>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n>>> aug = A.CoarseDropout3D(\n...     num_holes_range=(3, 6),\n...     hole_depth_range=(0.1, 0.2),\n...     hole_height_range=(0.1, 0.2),\n...     hole_width_range=(0.1, 0.2),\n...     fill=0,\n...     p=1.0\n... )\n>>> transformed = aug(volume=volume, mask3d=mask3d)\n>>> transformed_volume, transformed_mask3d = transformed[\"volume\"], transformed[\"mask3d\"]\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CoarseDropout3D(Transform3D):\n    \"\"\"CoarseDropout3D randomly drops out cuboid regions from a 3D volume and optionally,\n    the corresponding regions in an associated 3D mask, to simulate occlusion and\n    varied object sizes found in real-world volumetric data.\n\n    Args:\n        num_holes_range (tuple[int, int]): Range (min, max) for the number of cuboid\n            regions to drop out. Default: (1, 1)\n        hole_depth_range (tuple[float, float]): Range (min, max) for the depth\n            of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)\n        hole_height_range (tuple[float, float]): Range (min, max) for the height\n            of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)\n        hole_width_range (tuple[float, float]): Range (min, max) for the width\n            of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)\n        fill (ColorType): Value for the dropped voxels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            Default: 0\n        fill_mask (ColorType | None): Fill value for dropout regions in the 3D mask.\n            If None, mask regions corresponding to volume dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual number and size of dropout regions are randomly chosen within the specified ranges.\n        - All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.\n        - If you want to apply dropout only in the XY plane while preserving the full depth dimension,\n          consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout\n          to each slice independently, effectively creating cylindrical dropout regions that extend\n          through the entire depth of the volume.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n        >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n        >>> aug = A.CoarseDropout3D(\n        ...     num_holes_range=(3, 6),\n        ...     hole_depth_range=(0.1, 0.2),\n        ...     hole_height_range=(0.1, 0.2),\n        ...     hole_width_range=(0.1, 0.2),\n        ...     fill=0,\n        ...     p=1.0\n        ... 
)\n        >>> transformed = aug(volume=volume, mask3d=mask3d)\n        >>> transformed_volume, transformed_mask3d = transformed[\"volume\"], transformed[\"mask3d\"]\n    \"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D)\n\n    class InitSchema(Transform3D.InitSchema):\n        num_holes_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_depth_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_height_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_width_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        fill: ColorType\n        fill_mask: ColorType | None\n\n        @staticmethod\n        def validate_range(range_value: tuple[float, float], range_name: str) -> None:\n            if not 0 <= range_value[0] <= range_value[1] <= 1:\n                raise ValueError(\n                    f\"All values in {range_name} should be in [0, 1] range and first value \"\n                    f\"should be less or equal than the second value. Got: {range_value}\",\n                )\n\n        @model_validator(mode=\"after\")\n        def check_ranges(self) -> Self:\n            self.validate_range(self.hole_depth_range, \"hole_depth_range\")\n            self.validate_range(self.hole_height_range, \"hole_height_range\")\n            self.validate_range(self.hole_width_range, \"hole_width_range\")\n            return self\n\n    def __init__(\n        self,\n        num_holes_range: tuple[int, int] = (1, 1),\n        hole_depth_range: tuple[float, float] = (0.1, 0.2),\n        hole_height_range: tuple[float, float] = (0.1, 0.2),\n        hole_width_range: tuple[float, float] = (0.1, 0.2),\n        fill: ColorType = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_holes_range = num_holes_range\n        self.hole_depth_range = hole_depth_range\n        self.hole_height_range = hole_height_range\n        self.hole_width_range = hole_width_range\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def calculate_hole_dimensions(\n        self,\n        volume_shape: tuple[int, int, int],\n        depth_range: tuple[float, float],\n        height_range: tuple[float, float],\n        width_range: tuple[float, float],\n        size: int,\n    ) -> tuple[np.ndarray, np.ndarray, np.ndarray]:\n        \"\"\"Calculate random hole dimensions based on the provided ranges.\"\"\"\n        depth, height, width = volume_shape[:3]\n\n        hole_depths = np.maximum(1, np.ceil(depth * self.random_generator.uniform(*depth_range, size=size))).astype(int)\n        hole_heights = np.maximum(1, np.ceil(height * self.random_generator.uniform(*height_range, size=size))).astype(\n            int,\n        )\n        hole_widths = np.maximum(1, np.ceil(width * self.random_generator.uniform(*width_range, size=size))).astype(int)\n\n        return hole_depths, hole_heights, hole_widths\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) 
-> dict[str, Any]:\n        volume_shape = data[\"volume\"].shape[:3]\n\n        num_holes = self.py_random.randint(*self.num_holes_range)\n\n        hole_depths, hole_heights, hole_widths = self.calculate_hole_dimensions(\n            volume_shape,\n            self.hole_depth_range,\n            self.hole_height_range,\n            self.hole_width_range,\n            size=num_holes,\n        )\n\n        depth, height, width = volume_shape[:3]\n\n        z_min = self.random_generator.integers(0, depth - hole_depths + 1, size=num_holes)\n        y_min = self.random_generator.integers(0, height - hole_heights + 1, size=num_holes)\n        x_min = self.random_generator.integers(0, width - hole_widths + 1, size=num_holes)\n        z_max = z_min + hole_depths\n        y_max = y_min + hole_heights\n        x_max = x_min + hole_widths\n\n        holes = np.stack([z_min, y_min, x_min, z_max, y_max, x_max], axis=-1)\n\n        return {\"holes\": holes}\n\n    def apply_to_volume(self, volume: np.ndarray, holes: np.ndarray, **params: Any) -> np.ndarray:\n        if holes.size == 0:\n            return volume\n\n        return f3d.cutout3d(volume, holes, cast(ColorType, self.fill))\n\n    def apply_to_mask(self, mask: np.ndarray, holes: np.ndarray, **params: Any) -> np.ndarray:\n        if self.fill_mask is None or holes.size == 0:\n            return mask\n\n        return f3d.cutout3d(mask, holes, cast(ColorType, self.fill_mask))\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"num_holes_range\",\n            \"hole_depth_range\",\n            \"hole_height_range\",\n            \"hole_width_range\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CubicSymmetry","title":"class CubicSymmetry (p=1.0, always_apply=None) [view source on GitHub]","text":"

Applies a random cubic symmetry transformation to a 3D volume.

This transform is a 3D extension of D4. While D4 handles the 8 symmetries of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube. Like D4, this transform does not create any interpolation artifacts as it only remaps voxels from one position to another without any interpolation.

The 48 transformations consist of:
  • 24 rotations (orientation-preserving): 4 rotations about the vertical axis for each of the 6 faces that can point up (6 x 4 = 24)
  • 24 rotoreflections (orientation-reversing): a reflection through a plane followed by any of the 24 rotations

For a cube, these transformations preserve:
  • All face centers (6)
  • All vertex positions (8)
  • All edge centers (12)

Works with 3D volumes and masks of shape (D, H, W) or (D, H, W, C).

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for data augmentation in 3D medical imaging, crystallography, and voxel-based 3D modeling where the object's orientation is arbitrary.
  • All transformations preserve the object's chirality (handedness) when using pure rotations (indices 0-23) and invert it when using rotoreflections (indices 24-47).

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n>>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n>>> transform = A.CubicSymmetry(p=1.0)\n>>> transformed = transform(volume=volume, mask3d=mask3d)\n>>> transformed_volume = transformed[\"volume\"]\n>>> transformed_mask3d = transformed[\"mask3d\"]\n

See Also: - D4: The 2D version that handles the 8 symmetries of a square

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CubicSymmetry(Transform3D):\n    \"\"\"Applies a random cubic symmetry transformation to a 3D volume.\n\n    This transform is a 3D extension of D4. While D4 handles the 8 symmetries\n    of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube.\n    Like D4, this transform does not create any interpolation artifacts as it only remaps voxels\n    from one position to another without any interpolation.\n\n    The 48 transformations consist of:\n    - 24 rotations (orientation-preserving):\n        * 4 rotations around each face diagonal (6 face diagonals x 4 rotations = 24)\n    - 24 rotoreflections (orientation-reversing):\n        * Reflection through a plane followed by any of the 24 rotations\n\n    For a cube, these transformations preserve:\n    - All face centers (6)\n    - All vertex positions (8)\n    - All edge centers (12)\n\n    works with 3D volumes and masks of the shape (D, H, W) or (D, H, W, C)\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for data augmentation in 3D medical imaging,\n          crystallography, and voxel-based 3D modeling where the object's orientation\n          is arbitrary.\n        - All transformations preserve the object's chirality (handedness) when using\n          pure rotations (indices 0-23) and invert it when using rotoreflections\n          (indices 24-47).\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n        >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n        >>> transform = A.CubicSymmetry(p=1.0)\n        >>> transformed = transform(volume=volume, mask3d=mask3d)\n        >>> transformed_volume = transformed[\"volume\"]\n        >>> transformed_mask3d = transformed[\"mask3d\"]\n\n    See Also:\n        - D4: The 2D version that handles the 8 symmetries of a square\n    \"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D)\n\n    def __init__(\n        self,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        # Randomly select one of 48 possible transformations\n        return {\"index\": self.py_random.randint(0, 47)}\n\n    def apply_to_volume(self, volume: np.ndarray, index: int, **params: Any) -> np.ndarray:\n        return f3d.transform_cube(volume, index)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.Pad3D","title":"class Pad3D (padding, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of a 3D volume by specified number of voxels.

Parameters:

Name Type Description padding int, tuple[int, int, int] or tuple[int, int, int, int, int, int]

Padding values. Can be:
  • int - pad all sides by this value
  • tuple[int, int, int] - symmetric padding (pad_z, pad_y, pad_x), where pad_z pads the depth/z-axis (front/back), pad_y the height/y-axis (top/bottom), and pad_x the width/x-axis (left/right)
  • tuple[int, int, int, int, int, int] - explicit padding per side in the order (front, top, left, back, bottom, right), where front/back pad the z-axis (depth), top/bottom the y-axis (height), and left/right the x-axis (width)

fill ColorType

Padding value for image

fill_mask ColorType

Padding value for mask

p float

probability of applying the transform. Default: 1.0.

Targets

volume, mask3d

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class Pad3D(BasePad3D):\n    \"\"\"Pad the sides of a 3D volume by specified number of voxels.\n\n    Args:\n        padding (int, tuple[int, int, int] or tuple[int, int, int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int, int] - symmetric padding (pad_z, pad_y, pad_x) where:\n                - pad_z: padding for depth/z-axis (front/back)\n                - pad_y: padding for height/y-axis (top/bottom)\n                - pad_x: padding for width/x-axis (left/right)\n            * tuple[int, int, int, int, int, int] - explicit padding per side in order:\n                (front, top, left, back, bottom, right) where:\n                - front/back: padding along z-axis (depth)\n                - top/bottom: padding along y-axis (height)\n                - left/right: padding along x-axis (width)\n        fill (ColorType): Padding value for image\n        fill_mask (ColorType): Padding value for mask\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),\n        with optional channel dimension as the last axis.\n    \"\"\"\n\n    class InitSchema(BasePad3D.InitSchema):\n        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int]\n\n        @field_validator(\"padding\")\n        @classmethod\n        def validate_padding(\n            cls,\n            v: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],\n        ) -> int | tuple[int, int, int] | tuple[int, int, int, int, int, int]:\n            if isinstance(v, int) and v < 0:\n                raise ValueError(\"Padding value must be non-negative\")\n            if isinstance(v, tuple) and not all(isinstance(i, int) and i >= 0 for i in v):\n                raise ValueError(\"Padding tuple must contain non-negative integers\")\n\n            return v\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        if isinstance(self.padding, int):\n            pad_d = pad_h = pad_w = self.padding\n            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)\n        elif len(self.padding) == NUM_DIMENSIONS:\n            pad_d, pad_h, pad_w = self.padding  # type: ignore[misc]\n            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)\n        else:\n            padding = self.padding  # type: ignore[assignment]\n\n        return {\"padding\": padding}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"padding\", \"fill\", \"fill_mask\"\n
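A minimal usage sketch with symmetric padding (shapes and padding values are illustrative; it assumes Pad3D is exported at the top level like the other 3D transforms on this page):

Python
import numpy as np
import albumentations as A

volume = np.random.randint(0, 256, (10, 64, 64), dtype=np.uint8)  # (D, H, W)

# Symmetric (pad_z, pad_y, pad_x) padding
transform = A.Pad3D(padding=(2, 4, 8), fill=0, fill_mask=0, p=1.0)
out = transform(volume=volume)
print(out["volume"].shape)  # (14, 72, 80)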
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.PadIfNeeded3D","title":"class PadIfNeeded3D (min_zyx=None, pad_divisor_zyx=None, position='center', fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions. If the pad_divisor_zyx is specified, the function additionally ensures that the volume dimensions are divisible by these values.

Parameters:

Name Type Description min_zyx tuple[int, int, int] | None

Minimum desired size as (depth, height, width). Ensures volume dimensions are at least these values. If not specified, pad_divisor_zyx must be provided.

pad_divisor_zyx tuple[int, int, int] | None

If set, pads each dimension to make it divisible by corresponding value in format (depth_div, height_div, width_div). If not specified, min_zyx must be provided.

position Literal[\"center\", \"random\"]

Position where the volume is to be placed after padding. Default is 'center'.

fill ColorType

Value to fill the border voxels for volume. Default: 0

fill_mask ColorType

Value to fill the border voxels for masks. Default: 0

p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class PadIfNeeded3D(BasePad3D):\n    \"\"\"Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions.\n    If the pad_divisor_zyx is specified, the function additionally ensures that the volume\n    dimensions are divisible by these values.\n\n    Args:\n        min_zyx (tuple[int, int, int] | None): Minimum desired size as (depth, height, width).\n            Ensures volume dimensions are at least these values.\n            If not specified, pad_divisor_zyx must be provided.\n        pad_divisor_zyx (tuple[int, int, int] | None): If set, pads each dimension to make it\n            divisible by corresponding value in format (depth_div, height_div, width_div).\n            If not specified, min_zyx must be provided.\n        position (Literal[\"center\", \"random\"]): Position where the volume is to be placed after padding.\n            Default is 'center'.\n        fill (ColorType): Value to fill the border voxels for volume. Default: 0\n        fill_mask (ColorType): Value to fill the border voxels for masks. Default: 0\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),\n        with optional channel dimension as the last axis.\n    \"\"\"\n\n    class InitSchema(BasePad3D.InitSchema):\n        min_zyx: Annotated[tuple[int, int, int] | None, AfterValidator(check_range_bounds(0, None))]\n        pad_divisor_zyx: Annotated[tuple[int, int, int] | None, AfterValidator(check_range_bounds(1, None))]\n        position: Literal[\"center\", \"random\"]\n\n        @model_validator(mode=\"after\")\n        def validate_params(self) -> Self:\n            if self.min_zyx is None and self.pad_divisor_zyx is None:\n                msg = \"At least one of min_zyx or pad_divisor_zyx must be set\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        min_zyx: tuple[int, int, int] | None = None,\n        pad_divisor_zyx: tuple[int, int, int] | None = None,\n        position: Literal[\"center\", \"random\"] = \"center\",\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.min_zyx = min_zyx\n        self.pad_divisor_zyx = pad_divisor_zyx\n        self.position = position\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        depth, height, width = data[\"volume\"].shape[:3]\n        sizes = (depth, height, width)\n\n        paddings = [\n            fgeometric.get_dimension_padding(\n                current_size=size,\n                min_size=self.min_zyx[i] if self.min_zyx else None,\n                divisor=self.pad_divisor_zyx[i] if self.pad_divisor_zyx else None,\n            )\n            for i, size in enumerate(sizes)\n        ]\n\n        padding = f3d.adjust_padding_by_position3d(\n            paddings=paddings,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\"padding\": padding}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_zyx\",\n            \"pad_divisor_zyx\",\n            \"position\",\n            
\"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.RandomCrop3D","title":"class RandomCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop random part of 3D volume.

Parameters:

Name Type Description size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d

Image types: uint8, float32

Note

If you want to perform random cropping only in the XY plane while preserving all slices along the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class RandomCrop3D(BaseCropAndPad3D):\n    \"\"\"Crop random part of 3D volume.\n\n    Args:\n        size (tuple[int, int, int]): Desired output size of the crop in format (depth, height, width)\n        pad_if_needed (bool): Whether to pad if the volume is smaller than desired crop size. Default: False\n        fill (ColorType): Padding value for image if pad_if_needed is True. Default: 0\n        fill_mask (ColorType): Padding value for mask if pad_if_needed is True. Default: 0\n        p (float): probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If you want to perform random cropping only in the XY plane while preserving all slices along\n        the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop\n        to each slice independently, maintaining the full depth of the volume.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        size: Annotated[tuple[int, int, int], AfterValidator(check_range_bounds(1, None))]\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        size: tuple[int, int, int],\n        pad_if_needed: bool = False,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=\"random\",  # Random crop uses random padding position\n            p=p,\n        )\n        self.size = size\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        volume = data[\"volume\"]\n        z, h, w = volume.shape[:3]\n        target_z, target_h, target_w = self.size\n\n        # Get padding params if needed\n        pad_params = self._get_pad_params(\n            image_shape=(z, h, w),\n            target_shape=self.size,\n        )\n\n        # Update dimensions if padding is applied\n        if pad_params is not None:\n            z = z + pad_params[\"pad_front\"] + pad_params[\"pad_back\"]\n            h = h + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            w = w + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n        # Calculate random crop coordinates\n        z_start = self.py_random.randint(0, max(0, z - target_z))\n        h_start = self.py_random.randint(0, max(0, h - target_h))\n        w_start = self.py_random.randint(0, max(0, w - target_w))\n\n        crop_coords = (\n            z_start,\n            z_start + target_z,\n            h_start,\n            h_start + target_h,\n            w_start,\n            w_start + target_w,\n        )\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"pad_if_needed\", \"fill\", \"fill_mask\"\n
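A minimal usage sketch (shapes are illustrative; it assumes RandomCrop3D is exported at the top level like the other 3D transforms on this page):

Python
import numpy as np
import albumentations as A

volume = np.random.randint(0, 256, (20, 128, 128), dtype=np.uint8)  # (D, H, W)
mask3d = np.random.randint(0, 2, (20, 128, 128), dtype=np.uint8)    # (D, H, W)

transform = A.RandomCrop3D(size=(8, 64, 64), pad_if_needed=True, p=1.0)
out = transform(volume=volume, mask3d=mask3d)
print(out["volume"].shape, out["mask3d"].shape)  # (8, 64, 64) (8, 64, 64)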
"},{"location":"api_reference/core/","title":"Index","text":"
  • Composition API (albumentations.core.composition)
  • Serialization API (albumentations.core.serialization)
  • Transforms Interface (albumentations.core.transforms_interface)
  • Helper functions for working with bounding boxes (albumentations.core.bbox_utils)
  • Helper functions for working with keypoints (albumentations.core.keypoints_utils)
"},{"location":"api_reference/core/bbox_utils/","title":"Helper functions for working with bounding boxes (augmentations.core.bbox_utils)","text":""},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.BboxParams","title":"class BboxParams (format, label_fields=None, min_area=0.0, min_visibility=0.0, min_width=0.0, min_height=0.0, check_each_transform=True, clip=False) [view source on GitHub]","text":"

Parameters of bounding boxes

Parameters:

Name Type Description format Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"]

format of bounding boxes.

  • The coco format: [x_min, y_min, width, height], e.g. [97, 12, 150, 200].
  • The pascal_voc format: [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].
  • The albumentations format: like pascal_voc, but normalized, i.e. [x_min, y_min, x_max, y_max], e.g. [0.2, 0.3, 0.4, 0.5].
  • The yolo format: [x, y, width, height], e.g. [0.1, 0.2, 0.3, 0.4]; x, y - normalized bbox center; width, height - normalized bbox width and height.

label_fields list

List of fields joined with boxes, e.g., labels.

min_area float

Minimum area of a bounding box in pixels or normalized units. Bounding boxes with an area less than this value will be removed. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain in the list. Bounding boxes with a visible area less than this fraction will be removed. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels or normalized units. Bounding boxes with a width less than this value will be removed. Default: 0.0.

min_height float

Minimum height of a bounding box in pixels or normalized units. Bounding boxes with a height less than this value will be removed. Default: 0.0.

check_each_transform bool

If True, bounding boxes will be checked after each dual transform. Default: True.

clip bool

If True, bounding boxes will be clipped to the image borders before applying any transform. Default: False.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/bbox_utils.py Python
class BboxParams(Params):\n    \"\"\"Parameters of bounding boxes\n\n    Args:\n        format Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"]: format of bounding boxes.\n\n            The `coco` format\n                `[x_min, y_min, width, height]`, e.g. [97, 12, 150, 200].\n            The `pascal_voc` format\n                `[x_min, y_min, x_max, y_max]`, e.g. [97, 12, 247, 212].\n            The `albumentations` format\n                is like `pascal_voc`, but normalized,\n                in other words: `[x_min, y_min, x_max, y_max]`, e.g. [0.2, 0.3, 0.4, 0.5].\n            The `yolo` format\n                `[x, y, width, height]`, e.g. [0.1, 0.2, 0.3, 0.4];\n                `x`, `y` - normalized bbox center; `width`, `height` - normalized bbox width and height.\n\n        label_fields (list): List of fields joined with boxes, e.g., labels.\n        min_area (float): Minimum area of a bounding box in pixels or normalized units.\n            Bounding boxes with an area less than this value will be removed. Default: 0.0.\n        min_visibility (float): Minimum fraction of area for a bounding box to remain in the list.\n            Bounding boxes with a visible area less than this fraction will be removed. Default: 0.0.\n        min_width (float): Minimum width of a bounding box in pixels or normalized units.\n            Bounding boxes with a width less than this value will be removed. Default: 0.0.\n        min_height (float): Minimum height of a bounding box in pixels or normalized units.\n            Bounding boxes with a height less than this value will be removed. Default: 0.0.\n        check_each_transform (bool): If True, bounding boxes will be checked after each dual transform. Default: True.\n        clip (bool): If True, bounding boxes will be clipped to the image borders before applying any transform.\n            Default: False.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        format: Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"],  # noqa: A002\n        label_fields: Sequence[Any] | None = None,\n        min_area: float = 0.0,\n        min_visibility: float = 0.0,\n        min_width: float = 0.0,\n        min_height: float = 0.0,\n        check_each_transform: bool = True,\n        clip: bool = False,\n    ):\n        super().__init__(format, label_fields)\n        self.min_area = min_area\n        self.min_visibility = min_visibility\n        self.min_width = min_width\n        self.min_height = min_height\n        self.check_each_transform = check_each_transform\n        self.clip = clip\n\n    def to_dict_private(self) -> dict[str, Any]:\n        data = super().to_dict_private()\n        data.update(\n            {\n                \"min_area\": self.min_area,\n                \"min_visibility\": self.min_visibility,\n                \"min_width\": self.min_width,\n                \"min_height\": self.min_height,\n                \"check_each_transform\": self.check_each_transform,\n                \"clip\": self.clip,\n            },\n        )\n        return data\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return \"BboxParams\"\n\n    def __repr__(self) -> str:\n        return (\n            f\"BboxParams(format={self.format}, label_fields={self.label_fields}, min_area={self.min_area},\"\n            f\" min_visibility={self.min_visibility}, min_width={self.min_width}, min_height={self.min_height},\"\n            
f\" check_each_transform={self.check_each_transform}, clip={self.clip})\"\n        )\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.bboxes_from_masks","title":"def bboxes_from_masks (masks) [view source on GitHub]","text":"

Create bounding boxes from binary masks (fast version)

Parameters:

Name Type Description masks np.ndarray

Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask.

Returns:

Type Description np.ndarray

An array of bounding boxes with shape (N, 4), where each row is (x_min, y_min, x_max, y_max).

Source code in albumentations/core/bbox_utils.py Python
def bboxes_from_masks(masks: np.ndarray) -> np.ndarray:\n    \"\"\"Create bounding boxes from binary masks (fast version)\n\n    Args:\n        masks (np.ndarray): Binary masks of shape (H, W) or (N, H, W) where N is the number of masks,\n                           and H, W are the height and width of each mask.\n\n    Returns:\n        np.ndarray: An array of bounding boxes with shape (N, 4), where each row is\n                   (x_min, y_min, x_max, y_max).\n    \"\"\"\n    # Handle single mask case by adding batch dimension\n    if len(masks.shape) == MONO_CHANNEL_DIMENSIONS:\n        masks = masks[np.newaxis, ...]\n\n    rows = np.any(masks, axis=2)\n    cols = np.any(masks, axis=1)\n\n    bboxes = np.zeros((masks.shape[0], 4), dtype=np.int32)\n\n    for i, (row, col) in enumerate(zip(rows, cols)):\n        if not np.any(row) or not np.any(col):\n            bboxes[i] = [-1, -1, -1, -1]\n        else:\n            y_min, y_max = np.where(row)[0][[0, -1]]\n            x_min, x_max = np.where(col)[0][[0, -1]]\n            bboxes[i] = [x_min, y_min, x_max + 1, y_max + 1]\n\n    return bboxes\n
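A minimal sketch (mask contents are illustrative); note that an empty mask is reported as [-1, -1, -1, -1]:

Python
import numpy as np

from albumentations.core.bbox_utils import bboxes_from_masks

masks = np.zeros((2, 10, 10), dtype=np.uint8)
masks[0, 2:5, 3:7] = 1  # one object; the second mask stays empty

print(bboxes_from_masks(masks))
# [[ 3  2  7  5]
#  [-1 -1 -1 -1]]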
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.calculate_bbox_areas_in_pixels","title":"def calculate_bbox_areas_in_pixels (bboxes, image_shape) [view source on GitHub]","text":"

Calculate areas for multiple bounding boxes.

This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored.

image_shape tuple[int, int]

A tuple containing the height and width of the image (height, width).

Returns:

Type Description np.ndarray

A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels. Returns an empty array if the input bboxes is empty.

Note

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])\n>>> image_shape = (100, 100)\n>>> areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)\n>>> print(areas)\n[1600. 3600.]\n
Source code in albumentations/core/bbox_utils.py Python
def calculate_bbox_areas_in_pixels(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:\n    \"\"\"Calculate areas for multiple bounding boxes.\n\n    This function computes the areas of bounding boxes given their normalized coordinates\n    and the dimensions of the image they belong to. The bounding boxes are expected to be\n    in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).\n\n    Args:\n        bboxes (np.ndarray): A numpy array of shape (N, 4+) where N is the number of bounding boxes.\n                             Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates.\n                             Additional columns beyond the first 4 are ignored.\n        image_shape (tuple[int, int]): A tuple containing the height and width of the image (height, width).\n\n    Returns:\n        np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.\n                    Returns an empty array if the input `bboxes` is empty.\n\n    Note:\n        - The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min).\n          Invalid bounding boxes may result in negative areas.\n        - The function preserves the input array and creates a copy for internal calculations.\n        - The returned areas are in pixel units, not normalized.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])\n        >>> image_shape = (100, 100)\n        >>> areas = calculate_bbox_areas(bboxes, image_shape)\n        >>> print(areas)\n        [1600. 3600.]\n    \"\"\"\n    if len(bboxes) == 0:\n        return np.array([], dtype=np.float32)\n\n    height, width = image_shape\n    bboxes_denorm = bboxes.copy()\n    bboxes_denorm[:, [0, 2]] *= width\n    bboxes_denorm[:, [1, 3]] *= height\n    return (bboxes_denorm[:, 2] - bboxes_denorm[:, 0]) * (bboxes_denorm[:, 3] - bboxes_denorm[:, 1])\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.check_bboxes","title":"def check_bboxes (bboxes) [view source on GitHub]","text":"

Check that bbox coordinates are in the range [0, 1] and that minimums are less than maximums.

Parameters:

Name Type Description bboxes np.ndarray

numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.

Exceptions:

Type Description ValueError

If any bbox is invalid.
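
A short sketch of the validation behaviour, with illustrative values.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import check_bboxes
>>> check_bboxes(np.array([[0.1, 0.2, 0.5, 0.6]]))  # valid: returns None, no exception
>>> check_bboxes(np.array([[0.5, 0.2, 0.1, 0.6]]))  # raises ValueError because x_max <= x_min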

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef check_bboxes(bboxes: np.ndarray) -> None:\n    \"\"\"Check if bboxes boundaries are in range 0, 1 and minimums are lesser than maximums.\n\n    Args:\n        bboxes: numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.\n\n    Raises:\n        ValueError: If any bbox is invalid.\n    \"\"\"\n    # Check if all values are in range [0, 1]\n    in_range = (bboxes[:, :4] >= 0) & (bboxes[:, :4] <= 1)\n    close_to_zero = np.isclose(bboxes[:, :4], 0)\n    close_to_one = np.isclose(bboxes[:, :4], 1)\n    valid_range = in_range | close_to_zero | close_to_one\n\n    if not np.all(valid_range):\n        invalid_idx = np.where(~np.all(valid_range, axis=1))[0][0]\n        invalid_bbox = bboxes[invalid_idx]\n        invalid_coord = [\"x_min\", \"y_min\", \"x_max\", \"y_max\"][np.where(~valid_range[invalid_idx])[0][0]]\n        invalid_value = invalid_bbox[np.where(~valid_range[invalid_idx])[0][0]]\n        raise ValueError(\n            f\"Expected {invalid_coord} for bbox {invalid_bbox} to be in the range [0.0, 1.0], got {invalid_value}.\",\n        )\n\n    # Check if x_max > x_min and y_max > y_min\n    valid_order = (bboxes[:, 2] > bboxes[:, 0]) & (bboxes[:, 3] > bboxes[:, 1])\n\n    if not np.all(valid_order):\n        invalid_idx = np.where(~valid_order)[0][0]\n        invalid_bbox = bboxes[invalid_idx]\n        if invalid_bbox[2] <= invalid_bbox[0]:\n            raise ValueError(f\"x_max is less than or equal to x_min for bbox {invalid_bbox}.\")\n\n        raise ValueError(f\"y_max is less than or equal to y_min for bbox {invalid_bbox}.\")\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.clip_bboxes","title":"def clip_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Clips the bounding box coordinates to ensure they fit within the boundaries of an image.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (num_boxes, 4+) in normalized format. The first 4 columns are [x_min, y_min, x_max, y_max].

image_shape Tuple[int, int]

Image shape (height, width).

Returns:

Type Description np.ndarray

The clipped bounding boxes, normalized to the image dimensions.
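
A minimal sketch with a box that sticks out of the image; values and the commented result are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import clip_bboxes
>>> bboxes = np.array([[-0.1, 0.2, 0.5, 1.3]])  # normalized box partly outside the image
>>> clip_bboxes(bboxes, (100, 200))             # -> [[0.0, 0.2, 0.5, 1.0]]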

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef clip_bboxes(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:\n    \"\"\"Clips the bounding box coordinates to ensure they fit within the boundaries of an image.\n\n    Parameters:\n        bboxes (np.ndarray): Array of bounding boxes with shape (num_boxes, 4+) in normalized format.\n                             The first 4 columns are [x_min, y_min, x_max, y_max].\n        image_shape (Tuple[int, int]): Image shape (height, width).\n\n    Returns:\n        np.ndarray: The clipped bounding boxes, normalized to the image dimensions.\n\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Denormalize bboxes\n    denorm_bboxes = denormalize_bboxes(bboxes, image_shape)\n\n    ## Note:\n    # It could be tempting to use cols - 1 and rows - 1 as the upper bounds for the clipping\n\n    # But this would cause the bounding box to be clipped to the image dimensions - 1 which is not what we want.\n    # Bounding box lives not in the middle of pixels but between them.\n\n    # Example: for image with height 100, width 100, the pixel values are in the range [0, 99]\n    # but if we want bounding box to be 1 pixel width and height and lie on the boundary of the image\n    # it will be described as [99, 99, 100, 100] => clip by image_size - 1 will lead to [99, 99, 99, 99]\n    # which is incorrect\n\n    # It could be also tempting to clip `x_min`` to `cols - 1`` and `y_min` to `rows - 1`, but this also leads\n    # to another error. If image fully lies outside of the visible area and min_area is set to 0, then\n    # the bounding box will be clipped to the image size - 1 and will be 1 pixel in size and fully visible,\n    # but it should be completely removed.\n\n    # Clip coordinates\n    denorm_bboxes[:, [0, 2]] = np.clip(denorm_bboxes[:, [0, 2]], 0, width, out=denorm_bboxes[:, [0, 2]])\n    denorm_bboxes[:, [1, 3]] = np.clip(denorm_bboxes[:, [1, 3]], 0, height, out=denorm_bboxes[:, [1, 3]])\n\n    # Normalize clipped bboxes\n    return normalize_bboxes(denorm_bboxes, image_shape)\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.convert_bboxes_from_albumentations","title":"def convert_bboxes_from_albumentations (bboxes, target_format, image_shape, check_validity=False) [view source on GitHub]","text":"

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

target_format Literal['coco', 'pascal_voc', 'yolo']

Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.

image_shape tuple[int, int]

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description np.ndarray

An array of bounding boxes in the target format with shape (num_bboxes, 4+).

Exceptions:

Type Description ValueError

If target_format is not 'coco', 'pascal_voc' or 'yolo'.
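
A minimal conversion sketch; input values and the commented results are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import convert_bboxes_from_albumentations
>>> bboxes = np.array([[0.1, 0.2, 0.5, 0.6]])  # albumentations format, normalized
>>> convert_bboxes_from_albumentations(bboxes, "coco", (100, 200))  # -> [[20., 20., 80., 40.]]
>>> convert_bboxes_from_albumentations(bboxes, "yolo", (100, 200))  # -> [[0.3, 0.4, 0.4, 0.4]]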

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef convert_bboxes_from_albumentations(\n    bboxes: np.ndarray,\n    target_format: Literal[\"coco\", \"pascal_voc\", \"yolo\"],\n    image_shape: tuple[int, int],\n    check_validity: bool = False,\n) -> np.ndarray:\n    \"\"\"Convert bounding boxes from the format used by albumentations to a specified format.\n\n    Args:\n        bboxes: A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+).\n                The first 4 columns are [x_min, y_min, x_max, y_max].\n        target_format: Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.\n        image_shape: Image shape (height, width).\n        check_validity: Check if all boxes are valid boxes.\n\n    Returns:\n        np.ndarray: An array of bounding boxes in the target format with shape (num_bboxes, 4+).\n\n    Raises:\n        ValueError: If `target_format` is not 'coco', 'pascal_voc' or 'yolo'.\n    \"\"\"\n    if target_format not in {\"coco\", \"pascal_voc\", \"yolo\"}:\n        raise ValueError(\n            f\"Unknown target_format {target_format}. Supported formats are: 'coco', 'pascal_voc' and 'yolo'\",\n        )\n\n    if check_validity:\n        check_bboxes(bboxes)\n\n    converted_bboxes = np.zeros_like(bboxes)\n    converted_bboxes[:, 4:] = bboxes[:, 4:]  # Preserve additional columns\n\n    denormalized_bboxes = denormalize_bboxes(bboxes[:, :4], image_shape) if target_format != \"yolo\" else bboxes[:, :4]\n\n    if target_format == \"coco\":\n        converted_bboxes[:, 0] = denormalized_bboxes[:, 0]  # x_min\n        converted_bboxes[:, 1] = denormalized_bboxes[:, 1]  # y_min\n        converted_bboxes[:, 2] = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]  # width\n        converted_bboxes[:, 3] = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]  # height\n    elif target_format == \"yolo\":\n        converted_bboxes[:, 0] = (denormalized_bboxes[:, 0] + denormalized_bboxes[:, 2]) / 2  # x_center\n        converted_bboxes[:, 1] = (denormalized_bboxes[:, 1] + denormalized_bboxes[:, 3]) / 2  # y_center\n        converted_bboxes[:, 2] = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]  # width\n        converted_bboxes[:, 3] = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]  # height\n    else:  # pascal_voc\n        converted_bboxes[:, :4] = denormalized_bboxes\n\n    return converted_bboxes\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.convert_bboxes_to_albumentations","title":"def convert_bboxes_to_albumentations (bboxes, source_format, image_shape, check_validity=False) [view source on GitHub]","text":"

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of (x_min, y_min, x_max, y_max) e.g. (0.15, 0.27, 0.67, 0.5).

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+).

source_format Literal['coco', 'pascal_voc', 'yolo']

Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.

image_shape tuple[int, int]

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description np.ndarray

An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

Exceptions:

Type Description ValueError

If source_format is not 'coco', 'pascal_voc', or 'yolo'.

ValueError

If the boxes are in YOLO format and any coordinates fall outside the range (0, 1].
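
A minimal sketch of the inverse conversion; values and the commented result are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import convert_bboxes_to_albumentations
>>> coco_bboxes = np.array([[20, 20, 80, 40]], dtype=np.float32)       # [x_min, y_min, width, height]
>>> convert_bboxes_to_albumentations(coco_bboxes, "coco", (100, 200))  # -> [[0.1, 0.2, 0.5, 0.6]]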

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef convert_bboxes_to_albumentations(\n    bboxes: np.ndarray,\n    source_format: Literal[\"coco\", \"pascal_voc\", \"yolo\"],\n    image_shape: tuple[int, int],\n    check_validity: bool = False,\n) -> np.ndarray:\n    \"\"\"Convert bounding boxes from a specified format to the format used by albumentations:\n    normalized coordinates of top-left and bottom-right corners of the bounding box in the form of\n    `(x_min, y_min, x_max, y_max)` e.g. `(0.15, 0.27, 0.67, 0.5)`.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n        source_format: Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.\n        image_shape: Image shape (height, width).\n        check_validity: Check if all boxes are valid boxes.\n\n    Returns:\n        np.ndarray: An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).\n\n    Raises:\n        ValueError: If `source_format` is not 'coco', 'pascal_voc', or 'yolo'.\n        ValueError: If in YOLO format, any coordinates are not in the range (0, 1].\n    \"\"\"\n    if source_format not in {\"coco\", \"pascal_voc\", \"yolo\"}:\n        raise ValueError(\n            f\"Unknown source_format {source_format}. Supported formats are: 'coco', 'pascal_voc' and 'yolo'\",\n        )\n\n    bboxes = bboxes.copy().astype(np.float32)\n    converted_bboxes = np.zeros_like(bboxes)\n    converted_bboxes[:, 4:] = bboxes[:, 4:]  # Preserve additional columns\n\n    if source_format == \"coco\":\n        converted_bboxes[:, 0] = bboxes[:, 0]  # x_min\n        converted_bboxes[:, 1] = bboxes[:, 1]  # y_min\n        converted_bboxes[:, 2] = bboxes[:, 0] + bboxes[:, 2]  # x_max\n        converted_bboxes[:, 3] = bboxes[:, 1] + bboxes[:, 3]  # y_max\n    elif source_format == \"yolo\":\n        if check_validity and np.any((bboxes[:, :4] <= 0) | (bboxes[:, :4] > 1)):\n            raise ValueError(f\"In YOLO format all coordinates must be float and in range (0, 1], got {bboxes}\")\n\n        w_half, h_half = bboxes[:, 2] / 2, bboxes[:, 3] / 2\n        converted_bboxes[:, 0] = bboxes[:, 0] - w_half  # x_min\n        converted_bboxes[:, 1] = bboxes[:, 1] - h_half  # y_min\n        converted_bboxes[:, 2] = bboxes[:, 0] + w_half  # x_max\n        converted_bboxes[:, 3] = bboxes[:, 1] + h_half  # y_max\n    else:  # pascal_voc\n        converted_bboxes[:, :4] = bboxes[:, :4]\n\n    if source_format != \"yolo\":\n        converted_bboxes[:, :4] = normalize_bboxes(converted_bboxes[:, :4], image_shape)\n\n    if check_validity:\n        check_bboxes(converted_bboxes)\n\n    return converted_bboxes\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.denormalize_bboxes","title":"def denormalize_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Denormalize array of bounding boxes.

Parameters:

Name Type Description bboxes np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

image_shape tuple[int, int]

Image shape (height, width).

Returns:

Type Description np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].
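
A minimal sketch; values and the commented result are illustrative (image shape is (height=100, width=200)).

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import denormalize_bboxes
>>> denormalize_bboxes(np.array([[0.1, 0.2, 0.5, 0.6]]), (100, 200))  # -> [[20., 20., 100., 60.]]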

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef denormalize_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Denormalize  array of bounding boxes.\n\n    Args:\n        bboxes: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n        image_shape: Image shape `(height, width)`.\n\n    Returns:\n        Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    denormalized = bboxes.copy().astype(float)\n    denormalized[:, [0, 2]] *= cols\n    denormalized[:, [1, 3]] *= rows\n    return denormalized\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.filter_bboxes","title":"def filter_bboxes (bboxes, image_shape, min_area=0.0, min_visibility=0.0, min_width=1.0, min_height=1.0) [view source on GitHub]","text":"

Remove bounding boxes whose visible fraction falls below min_visibility or whose area in pixels is under the min_area threshold. Also clips boxes to the final image size.

Parameters:

Name Type Description bboxes np.ndarray

numpy array of bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Image shape (height, width).

min_area float

Minimum area of a bounding box in pixels. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels. Default: 1.0.

min_height float

Minimum height of a bounding box in pixels. Default: 1.0.

Returns:

Type Description np.ndarray

numpy array of filtered bounding boxes.
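
A minimal sketch: the second box lies mostly outside the image, so it fails both the area and visibility thresholds; values are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import filter_bboxes
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.9, 0.9, 1.5, 1.5]])
>>> filter_bboxes(bboxes, (100, 100), min_area=200, min_visibility=0.5)  # -> [[0.1, 0.1, 0.5, 0.5]]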

Source code in albumentations/core/bbox_utils.py Python
def filter_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n    min_area: float = 0.0,\n    min_visibility: float = 0.0,\n    min_width: float = 1.0,\n    min_height: float = 1.0,\n) -> np.ndarray:\n    \"\"\"Remove bounding boxes that either lie outside of the visible area by more than min_visibility\n    or whose area in pixels is under the threshold set by `min_area`. Also crops boxes to final image size.\n\n    Args:\n        bboxes: numpy array of bounding boxes with shape (num_bboxes, 4+).\n                The first 4 columns are [x_min, y_min, x_max, y_max].\n        image_shape: Image shape (height, width).\n        min_area: Minimum area of a bounding box in pixels. Default: 0.0.\n        min_visibility: Minimum fraction of area for a bounding box to remain. Default: 0.0.\n        min_width: Minimum width of a bounding box in pixels. Default: 0.0.\n        min_height: Minimum height of a bounding box in pixels. Default: 0.0.\n\n    Returns:\n        numpy array of filtered bounding boxes.\n    \"\"\"\n    epsilon = 1e-7\n\n    if len(bboxes) == 0:\n        return np.array([], dtype=np.float32).reshape(0, 4)\n\n    # Calculate areas of bounding boxes before clipping in pixels\n    denormalized_box_areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)\n\n    # Clip bounding boxes in ratio\n    clipped_bboxes = clip_bboxes(bboxes, image_shape)\n\n    # Calculate areas of clipped bounding boxes in pixels\n    clipped_box_areas = calculate_bbox_areas_in_pixels(clipped_bboxes, image_shape)\n\n    # Calculate width and height of the clipped bounding boxes\n    denormalized_bboxes = denormalize_bboxes(clipped_bboxes[:, :4], image_shape)\n\n    clipped_widths = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]\n    clipped_heights = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]\n\n    # Create a mask for bboxes that meet all criteria\n    mask = (\n        (denormalized_box_areas >= epsilon)\n        & (clipped_box_areas >= min_area - epsilon)\n        & (clipped_box_areas / denormalized_box_areas >= min_visibility - epsilon)\n        & (clipped_widths >= min_width - epsilon)\n        & (clipped_heights >= min_height - epsilon)\n    )\n\n    # Apply the mask to get the filtered bboxes\n    filtered_bboxes = clipped_bboxes[mask]\n\n    return np.array([], dtype=np.float32).reshape(0, 4) if len(filtered_bboxes) == 0 else filtered_bboxes\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.masks_from_bboxes","title":"def masks_from_bboxes (bboxes, img_shape) [view source on GitHub]","text":"

Create binary masks from multiple bounding boxes

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4), where N is the number of boxes

img_shape tuple[int, int]

Image shape (height, width)

Returns:

Type Description masks

Array of binary masks with shape (N, height, width)
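
A minimal sketch with pixel coordinates; the commented shapes and values are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import masks_from_bboxes
>>> masks = masks_from_bboxes(np.array([[1, 1, 3, 3]]), (4, 4))
>>> masks.shape          # -> (1, 4, 4)
>>> masks[0, 1:3, 1:3]   # -> a 2x2 block of ones; every other pixel of the mask is zero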

Source code in albumentations/core/bbox_utils.py Python
def masks_from_bboxes(bboxes: np.ndarray, img_shape: tuple[int, int]) -> np.ndarray:\n    \"\"\"Create binary masks from multiple bounding boxes\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4), where N is the number of boxes\n        img_shape: Image shape (height, width)\n\n    Returns:\n        masks: Array of binary masks with shape (N, height, width)\n\n    \"\"\"\n    height, width = img_shape[:2]\n    masks = np.zeros((len(bboxes), height, width), dtype=np.uint8)\n    y, x = np.ogrid[:height, :width]\n\n    for i, (x_min, y_min, x_max, y_max) in enumerate(bboxes[:, :4].astype(int)):\n        masks[i] = (x_min <= x) & (x < x_max) & (y_min <= y) & (y < y_max)\n\n    return masks\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.normalize_bboxes","title":"def normalize_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Normalize array of bounding boxes.

Parameters:

Name Type Description bboxes np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

image_shape tuple[int, int]

Image shape (height, width).

Returns:

Type Description np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].
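
A minimal sketch, the inverse of denormalize_bboxes; values and the commented result are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import normalize_bboxes
>>> normalize_bboxes(np.array([[20, 20, 100, 60]]), (100, 200))  # -> [[0.1, 0.2, 0.5, 0.6]]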

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef normalize_bboxes(bboxes: np.ndarray, image_shape: tuple[int, int]) -> np.ndarray:\n    \"\"\"Normalize array of bounding boxes.\n\n    Args:\n        bboxes: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n        image_shape: Image shape `(height, width)`.\n\n    Returns:\n        Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n\n    \"\"\"\n    rows, cols = image_shape[:2]\n    normalized = bboxes.copy().astype(float)\n    normalized[:, [0, 2]] /= cols\n    normalized[:, [1, 3]] /= rows\n    return normalized\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.union_of_bboxes","title":"def union_of_bboxes (bboxes, erosion_rate) [view source on GitHub]","text":"

Calculate the union of bounding boxes. Boxes can be in albumentations or Pascal VOC format.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (num_bboxes, 4+).

erosion_rate float

How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume.

Returns:

Type Description np.ndarray | None

A bounding box (x_min, y_min, x_max, y_max) or None if no bboxes are given or if the bounding boxes become invalid after erosion.
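
A minimal sketch with two overlapping boxes; values and the commented results are illustrative.

Python
>>> import numpy as np
>>> from albumentations.core.bbox_utils import union_of_bboxes
>>> bboxes = np.array([[0.1, 0.1, 0.4, 0.4], [0.3, 0.3, 0.6, 0.6]])
>>> union_of_bboxes(bboxes, erosion_rate=0.0)  # -> [0.1, 0.1, 0.6, 0.6]
>>> union_of_bboxes(bboxes, erosion_rate=1.0)  # -> None (full erosion collapses the union)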

Source code in albumentations/core/bbox_utils.py Python
def union_of_bboxes(bboxes: np.ndarray, erosion_rate: float) -> np.ndarray | None:\n    \"\"\"Calculate union of bounding boxes. Boxes could be in albumentations or Pascal Voc format.\n\n    Args:\n        bboxes (np.ndarray): List of bounding boxes\n        erosion_rate (float): How much each bounding box can be shrunk, useful for erosive cropping.\n            Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume.\n\n    Returns:\n        np.ndarray | None: A bounding box `(x_min, y_min, x_max, y_max)` or None if no bboxes are given or if\n                    the bounding boxes become invalid after erosion.\n    \"\"\"\n    if not bboxes.size:\n        return None\n\n    if erosion_rate == 1:\n        return None\n\n    if bboxes.shape[0] == 1:\n        return bboxes[0][:4]\n\n    epsilon = 1e-6\n\n    x_min, y_min = np.min(bboxes[:, :2], axis=0)\n    x_max, y_max = np.max(bboxes[:, 2:4], axis=0)\n\n    width = x_max - x_min\n    height = y_max - y_min\n\n    erosion_x = width * erosion_rate * 0.5\n    erosion_y = height * erosion_rate * 0.5\n\n    x_min += erosion_x\n    y_min += erosion_y\n    x_max -= erosion_x\n    y_max -= erosion_y\n\n    if abs(x_max - x_min) < epsilon or abs(y_max - y_min) < epsilon:\n        return None\n\n    return np.array([x_min, y_min, x_max, y_max], dtype=np.float32)\n
"},{"location":"api_reference/core/composition/","title":"Composition API (core.composition)","text":""},{"location":"api_reference/core/composition/#albumentations.core.composition.BaseCompose","title":"class BaseCompose (transforms, p, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub]","text":"

Base class for composing multiple transforms together.

This class serves as a foundation for creating compositions of transforms in the Albumentations library. It provides basic functionality for managing a sequence of transforms and applying them to data.

Attributes:

Name Type Description transforms List[TransformType]

A list of transforms to be applied.

p float

Probability of applying the compose. Should be in the range [0, 1].

replay_mode bool

If True, the compose is in replay mode.

_additional_targets Dict[str, str]

Additional targets for transforms.

_available_keys Set[str]

Set of available keys for data.

processors Dict[str, Union[BboxProcessor, KeypointsProcessor]]

Processors for specific data types.

Parameters:

Name Type Description transforms TransformsSeqType

A sequence of transforms to compose.

p float

Probability of applying the compose.

Exceptions:

Type Description ValueError

If an invalid additional target is specified.

Note

  • Subclasses should implement the __call__ method to define how the composition is applied to data.
  • The class supports serialization and deserialization of transforms.
  • It provides methods for adding targets, setting deterministic behavior, and checking data validity post-transform.

Source code in albumentations/core/composition.py Python
class BaseCompose(Serializable):\n    \"\"\"Base class for composing multiple transforms together.\n\n    This class serves as a foundation for creating compositions of transforms\n    in the Albumentations library. It provides basic functionality for\n    managing a sequence of transforms and applying them to data.\n\n    Attributes:\n        transforms (List[TransformType]): A list of transforms to be applied.\n        p (float): Probability of applying the compose. Should be in the range [0, 1].\n        replay_mode (bool): If True, the compose is in replay mode.\n        _additional_targets (Dict[str, str]): Additional targets for transforms.\n        _available_keys (Set[str]): Set of available keys for data.\n        processors (Dict[str, Union[BboxProcessor, KeypointsProcessor]]): Processors for specific data types.\n\n    Args:\n        transforms (TransformsSeqType): A sequence of transforms to compose.\n        p (float): Probability of applying the compose.\n\n    Raises:\n        ValueError: If an invalid additional target is specified.\n\n    Note:\n        - Subclasses should implement the __call__ method to define how\n          the composition is applied to data.\n        - The class supports serialization and deserialization of transforms.\n        - It provides methods for adding targets, setting deterministic behavior,\n          and checking data validity post-transform.\n    \"\"\"\n\n    _transforms_dict: dict[int, BasicTransform] | None = None\n    check_each_transform: tuple[DataProcessor, ...] | None = None\n    main_compose: bool = True\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        p: float,\n        mask_interpolation: int | None = None,\n        seed: int | None = None,\n        save_applied_params: bool = False,\n    ):\n        if isinstance(transforms, (BaseCompose, BasicTransform)):\n            warnings.warn(\n                \"transforms is single transform, but a sequence is expected! 
Transform will be wrapped into list.\",\n                stacklevel=2,\n            )\n            transforms = [transforms]\n\n        self.transforms = transforms\n        self.p = p\n\n        self.replay_mode = False\n        self._additional_targets: dict[str, str] = {}\n        self._available_keys: set[str] = set()\n        self.processors: dict[str, BboxProcessor | KeypointsProcessor] = {}\n        self._set_keys()\n        self.set_mask_interpolation(mask_interpolation)\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n        self.set_random_seed(seed)\n        self.save_applied_params = save_applied_params\n\n    def _track_transform_params(self, transform: TransformType, data: dict[str, Any]) -> None:\n        \"\"\"Track transform parameters if tracking is enabled.\"\"\"\n        if \"applied_transforms\" in data and hasattr(transform, \"params\") and transform.params:\n            data[\"applied_transforms\"].append((transform.__class__.__name__, transform.params.copy()))\n\n    def set_random_state(\n        self,\n        random_generator: np.random.Generator,\n        py_random: random.Random,\n    ) -> None:\n        \"\"\"Set random state directly from generators.\n\n        Args:\n            random_generator: numpy random generator to use\n            py_random: python random generator to use\n        \"\"\"\n        self.random_generator = random_generator\n        self.py_random = py_random\n\n        # Propagate both random states to all transforms\n        for transform in self.transforms:\n            if isinstance(transform, (BasicTransform, BaseCompose)):\n                transform.set_random_state(random_generator, py_random)\n\n    def set_random_seed(self, seed: int | None) -> None:\n        \"\"\"Set random state from seed.\n\n        Args:\n            seed: Random seed to use\n        \"\"\"\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n\n        # Propagate seed to all transforms\n        for transform in self.transforms:\n            if isinstance(transform, (BasicTransform, BaseCompose)):\n                transform.set_random_seed(seed)\n\n    def set_mask_interpolation(self, mask_interpolation: int | None) -> None:\n        self.mask_interpolation = mask_interpolation\n        self._set_mask_interpolation_recursive(self.transforms)\n\n    def _set_mask_interpolation_recursive(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, BasicTransform):\n                if hasattr(transform, \"mask_interpolation\") and self.mask_interpolation is not None:\n                    transform.mask_interpolation = self.mask_interpolation\n            elif isinstance(transform, BaseCompose):\n                transform.set_mask_interpolation(self.mask_interpolation)\n\n    def __iter__(self) -> Iterator[TransformType]:\n        return iter(self.transforms)\n\n    def __len__(self) -> int:\n        return len(self.transforms)\n\n    def __call__(self, *args: Any, **data: Any) -> dict[str, Any]:\n        raise NotImplementedError\n\n    def __getitem__(self, item: int) -> TransformType:\n        return self.transforms[item]\n\n    def __repr__(self) -> str:\n        return self.indented_repr()\n\n    @property\n    def additional_targets(self) -> dict[str, str]:\n        return self._additional_targets\n\n    @property\n    def 
available_keys(self) -> set[str]:\n        return self._available_keys\n\n    def indented_repr(self, indent: int = REPR_INDENT_STEP) -> str:\n        args = {k: v for k, v in self.to_dict_private().items() if not (k.startswith(\"__\") or k == \"transforms\")}\n        repr_string = self.__class__.__name__ + \"([\"\n        for t in self.transforms:\n            repr_string += \"\\n\"\n            t_repr = t.indented_repr(indent + REPR_INDENT_STEP) if hasattr(t, \"indented_repr\") else repr(t)\n            repr_string += \" \" * indent + t_repr + \",\"\n        repr_string += \"\\n\" + \" \" * (indent - REPR_INDENT_STEP) + f\"], {format_args(args)})\"\n        return repr_string\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    def to_dict_private(self) -> dict[str, Any]:\n        return {\n            \"__class_fullname__\": self.get_class_fullname(),\n            \"p\": self.p,\n            \"transforms\": [t.to_dict_private() for t in self.transforms],\n        }\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        return {\n            \"__class_fullname__\": self.get_class_fullname(),\n            \"id\": id(self),\n            \"params\": None,\n            \"transforms\": [t.get_dict_with_id() for t in self.transforms],\n        }\n\n    def add_targets(self, additional_targets: dict[str, str] | None) -> None:\n        if additional_targets:\n            for k, v in additional_targets.items():\n                if k in self._additional_targets and v != self._additional_targets[k]:\n                    raise ValueError(\n                        f\"Trying to overwrite existed additional targets. 
\"\n                        f\"Key={k} Exists={self._additional_targets[k]} New value: {v}\",\n                    )\n            self._additional_targets.update(additional_targets)\n            for t in self.transforms:\n                t.add_targets(additional_targets)\n            for proc in self.processors.values():\n                proc.add_targets(additional_targets)\n        self._set_keys()\n\n    def _set_keys(self) -> None:\n        \"\"\"Set _available_keys\"\"\"\n        self._available_keys.update(self._additional_targets.keys())\n        for t in self.transforms:\n            self._available_keys.update(t.available_keys)\n            if hasattr(t, \"targets_as_params\"):\n                self._available_keys.update(t.targets_as_params)\n        if self.processors:\n            self._available_keys.update([\"labels\"])\n            for proc in self.processors.values():\n                if proc.default_data_name not in self._available_keys:  # if no transform to process this data\n                    warnings.warn(\n                        f\"Got processor for {proc.default_data_name}, but no transform to process it.\",\n                        stacklevel=2,\n                    )\n                self._available_keys.update(proc.data_fields)\n                if proc.params.label_fields:\n                    self._available_keys.update(proc.params.label_fields)\n\n    def set_deterministic(self, flag: bool, save_key: str = \"replay\") -> None:\n        for t in self.transforms:\n            t.set_deterministic(flag, save_key)\n\n    def check_data_post_transform(self, data: Any) -> dict[str, Any]:\n        if self.check_each_transform:\n            image_shape = get_shape(data[\"image\"])\n\n            for proc in self.check_each_transform:\n                for data_name in data:\n                    if data_name in proc.data_fields or (\n                        data_name in self._additional_targets\n                        and self._additional_targets[data_name] in proc.data_fields\n                    ):\n                        data[data_name] = proc.filter(data[data_name], image_shape)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.Compose","title":"class Compose (transforms, bbox_params=None, keypoint_params=None, additional_targets=None, p=1.0, is_check_shapes=True, strict=True, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub]","text":"

Compose multiple transforms together and apply them sequentially to input data.

This class allows you to chain multiple image augmentation transforms and apply them in a specified order. It also handles bounding box and keypoint transformations if the appropriate parameters are provided.

Parameters:

Name Type Description transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to apply.

bbox_params Union[dict, BboxParams, None]

Parameters for bounding box transforms. Can be a dict of params or a BboxParams object. Default is None.

keypoint_params Union[dict, KeypointParams, None]

Parameters for keypoint transforms. Can be a dict of params or a KeypointParams object. Default is None.

additional_targets Dict[str, str]

A dictionary mapping additional target names to their types. For example, {'image2': 'image'}. Default is None.

p float

Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.

is_check_shapes bool

If True, checks consistency of shapes for image/mask/masks on each call. Disable only if you are sure about your data consistency. Default is True.

strict bool

If True, raises an error on unknown input keys. If False, ignores them. Default is True.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

seed int

Random seed. Default is None.

save_applied_params bool

If True, saves the applied parameters of each transform. Default is False.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.RandomCrop(width=256, height=256),\n...     A.HorizontalFlip(p=0.5),\n...     A.RandomBrightnessContrast(p=0.2),\n... ])\n>>> transformed = transform(image=image)\n

Note

  • The class checks the validity of input data and shapes if is_check_args and is_check_shapes are True.
  • When bbox_params or keypoint_params are provided, it sets up the corresponding processors.
  • The transform can handle additional targets specified in the additional_targets dictionary.

Source code in albumentations/core/composition.py Python
class Compose(BaseCompose, HubMixin):\n    \"\"\"Compose multiple transforms together and apply them sequentially to input data.\n\n    This class allows you to chain multiple image augmentation transforms and apply them\n    in a specified order. It also handles bounding box and keypoint transformations if\n    the appropriate parameters are provided.\n\n    Args:\n        transforms (List[Union[BasicTransform, BaseCompose]]): A list of transforms to apply.\n        bbox_params (Union[dict, BboxParams, None]): Parameters for bounding box transforms.\n            Can be a dict of params or a BboxParams object. Default is None.\n        keypoint_params (Union[dict, KeypointParams, None]): Parameters for keypoint transforms.\n            Can be a dict of params or a KeypointParams object. Default is None.\n        additional_targets (Dict[str, str], optional): A dictionary mapping additional target names\n            to their types. For example, {'image2': 'image'}. Default is None.\n        p (float): Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.\n        is_check_shapes (bool): If True, checks consistency of shapes for image/mask/masks on each call.\n            Disable only if you are sure about your data consistency. Default is True.\n        strict (bool): If True, raises an error on unknown input keys. If False, ignores them. Default is True.\n        mask_interpolation (int, optional): Interpolation method for mask transforms. When defined,\n            it overrides the interpolation method specified in individual transforms. Default is None.\n        seed (int, optional): Random seed. Default is None.\n        save_applied_params (bool): If True, saves the applied parameters of each transform. Default is False.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.RandomCrop(width=256, height=256),\n        ...     A.HorizontalFlip(p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.2),\n        ... 
])\n        >>> transformed = transform(image=image)\n\n    Note:\n        - The class checks the validity of input data and shapes if is_check_args and is_check_shapes are True.\n        - When bbox_params or keypoint_params are provided, it sets up the corresponding processors.\n        - The transform can handle additional targets specified in the additional_targets dictionary.\n    \"\"\"\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        bbox_params: dict[str, Any] | BboxParams | None = None,\n        keypoint_params: dict[str, Any] | KeypointParams | None = None,\n        additional_targets: dict[str, str] | None = None,\n        p: float = 1.0,\n        is_check_shapes: bool = True,\n        strict: bool = True,\n        mask_interpolation: int | None = None,\n        seed: int | None = None,\n        save_applied_params: bool = False,\n    ):\n        super().__init__(\n            transforms=transforms,\n            p=p,\n            mask_interpolation=mask_interpolation,\n            seed=seed,\n            save_applied_params=save_applied_params,\n        )\n\n        if bbox_params:\n            if isinstance(bbox_params, dict):\n                b_params = BboxParams(**bbox_params)\n            elif isinstance(bbox_params, BboxParams):\n                b_params = bbox_params\n            else:\n                msg = \"unknown format of bbox_params, please use `dict` or `BboxParams`\"\n                raise ValueError(msg)\n            self.processors[\"bboxes\"] = BboxProcessor(b_params)\n\n        if keypoint_params:\n            if isinstance(keypoint_params, dict):\n                k_params = KeypointParams(**keypoint_params)\n            elif isinstance(keypoint_params, KeypointParams):\n                k_params = keypoint_params\n            else:\n                msg = \"unknown format of keypoint_params, please use `dict` or `KeypointParams`\"\n                raise ValueError(msg)\n            self.processors[\"keypoints\"] = KeypointsProcessor(k_params)\n\n        for proc in self.processors.values():\n            proc.ensure_transforms_valid(self.transforms)\n\n        self.add_targets(additional_targets)\n        if not self.transforms:  # if no transforms -> do nothing, all keys will be available\n            self._available_keys.update(AVAILABLE_KEYS)\n\n        self.is_check_args = True\n        self.strict = strict\n\n        self.is_check_shapes = is_check_shapes\n        self.check_each_transform = tuple(  # processors that checks after each transform\n            proc for proc in self.processors.values() if getattr(proc.params, \"check_each_transform\", False)\n        )\n        self._set_check_args_for_transforms(self.transforms)\n\n        self._set_processors_for_transforms(self.transforms)\n\n        self.save_applied_params = save_applied_params\n        self._images_was_list = False\n        self._masks_was_list = False\n\n    def _set_processors_for_transforms(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, BasicTransform):\n                if hasattr(transform, \"set_processors\"):\n                    transform.set_processors(self.processors)\n            elif isinstance(transform, BaseCompose):\n                self._set_processors_for_transforms(transform.transforms)\n\n    def _set_check_args_for_transforms(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, 
BaseCompose):\n                self._set_check_args_for_transforms(transform.transforms)\n                transform.check_each_transform = self.check_each_transform\n                transform.processors = self.processors\n            if isinstance(transform, Compose):\n                transform.disable_check_args_private()\n\n    def disable_check_args_private(self) -> None:\n        self.is_check_args = False\n        self.strict = False\n        self.main_compose = False\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if args:\n            msg = \"You have to pass data to augmentations as named arguments, for example: aug(image=image)\"\n            raise KeyError(msg)\n\n        if not isinstance(force_apply, (bool, int)):\n            msg = \"force_apply must have bool or int type\"\n            raise TypeError(msg)\n\n        # Initialize applied_transforms only in top-level Compose if requested\n        if self.save_applied_params and self.main_compose:\n            data[\"applied_transforms\"] = []\n\n        need_to_run = force_apply or self.py_random.random() < self.p\n        if not need_to_run:\n            return data\n\n        self.preprocess(data)\n\n        for t in self.transforms:\n            data = t(**data)\n            self._track_transform_params(t, data)\n            data = self.check_data_post_transform(data)\n\n        return self.postprocess(data)\n\n    def preprocess(self, data: Any) -> None:\n        \"\"\"Preprocess input data before applying transforms.\"\"\"\n        self._validate_data(data)\n        self._preprocess_processors(data)\n        self._preprocess_arrays(data)\n\n    def _validate_data(self, data: dict[str, Any]) -> None:\n        \"\"\"Validate input data keys and arguments.\"\"\"\n        if not self.strict:\n            return\n\n        for data_name in data:\n            if not self._is_valid_key(data_name):\n                raise ValueError(f\"Key {data_name} is not in available keys.\")\n\n        if self.is_check_args:\n            self._check_args(**data)\n\n    def _is_valid_key(self, key: str) -> bool:\n        \"\"\"Check if the key is valid for processing.\"\"\"\n        return key in self._available_keys or key in MASK_KEYS or key in IMAGE_KEYS or key == \"applied_transforms\"\n\n    def _preprocess_processors(self, data: dict[str, Any]) -> None:\n        \"\"\"Run preprocessors if this is the main compose.\"\"\"\n        if not self.main_compose:\n            return\n\n        for processor in self.processors.values():\n            processor.ensure_data_valid(data)\n        for processor in self.processors.values():\n            processor.preprocess(data)\n\n    def _preprocess_arrays(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert lists to numpy arrays for images and masks.\"\"\"\n        self._preprocess_images(data)\n        self._preprocess_masks(data)\n\n    def _preprocess_images(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert image lists to numpy arrays.\"\"\"\n        if \"images\" not in data:\n            return\n\n        if isinstance(data[\"images\"], (list, tuple)):\n            self._images_was_list = True\n            data[\"images\"] = np.stack(data[\"images\"])\n        else:\n            self._images_was_list = False\n\n    def _preprocess_masks(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert mask lists to numpy arrays.\"\"\"\n        if \"masks\" not in data:\n            return\n\n        if 
isinstance(data[\"masks\"], (list, tuple)):\n            self._masks_was_list = True\n            data[\"masks\"] = np.stack(data[\"masks\"])\n        else:\n            self._masks_was_list = False\n\n    def postprocess(self, data: dict[str, Any]) -> dict[str, Any]:\n        if self.main_compose:\n            for p in self.processors.values():\n                p.postprocess(data)\n\n            # Convert back to list if original input was a list\n            if \"images\" in data and self._images_was_list:\n                data[\"images\"] = list(data[\"images\"])\n\n            if \"masks\" in data and self._masks_was_list:\n                data[\"masks\"] = list(data[\"masks\"])\n\n        return data\n\n    def to_dict_private(self) -> dict[str, Any]:\n        dictionary = super().to_dict_private()\n        bbox_processor = self.processors.get(\"bboxes\")\n        keypoints_processor = self.processors.get(\"keypoints\")\n        dictionary.update(\n            {\n                \"bbox_params\": bbox_processor.params.to_dict_private() if bbox_processor else None,\n                \"keypoint_params\": (keypoints_processor.params.to_dict_private() if keypoints_processor else None),\n                \"additional_targets\": self.additional_targets,\n                \"is_check_shapes\": self.is_check_shapes,\n            },\n        )\n        return dictionary\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        dictionary = super().get_dict_with_id()\n        bbox_processor = self.processors.get(\"bboxes\")\n        keypoints_processor = self.processors.get(\"keypoints\")\n        dictionary.update(\n            {\n                \"bbox_params\": bbox_processor.params.to_dict_private() if bbox_processor else None,\n                \"keypoint_params\": (keypoints_processor.params.to_dict_private() if keypoints_processor else None),\n                \"additional_targets\": self.additional_targets,\n                \"params\": None,\n                \"is_check_shapes\": self.is_check_shapes,\n            },\n        )\n        return dictionary\n\n    @staticmethod\n    def _check_single_data(data_name: str, data: Any) -> tuple[int, int]:\n        if not isinstance(data, np.ndarray):\n            raise TypeError(f\"{data_name} must be numpy array type\")\n        return data.shape[:2]\n\n    @staticmethod\n    def _check_masks_data(data_name: str, data: Any) -> tuple[int, int]:\n        \"\"\"Check masks data format and return shape.\n\n        Args:\n            data_name: Name of the data field being checked\n            data: Input data in one of these formats:\n                - List of numpy arrays, each of shape (H, W) or (H, W, C)\n                - Numpy array of shape (N, H, W) or (N, H, W, C)\n\n        Returns:\n            tuple: (height, width) of the first mask\n\n        Raises:\n            TypeError: If data format is invalid\n        \"\"\"\n        if isinstance(data, np.ndarray):\n            if data.ndim not in [3, 4]:  # (N,H,W) or (N,H,W,C)\n                raise TypeError(f\"{data_name} as numpy array must be 3D or 4D\")\n            return data.shape[1:3]  # Return (H,W)\n\n        if isinstance(data, (list, tuple)):\n            if not data:\n                raise ValueError(f\"{data_name} cannot be empty\")\n            if not all(isinstance(m, np.ndarray) for m in data):\n                raise TypeError(f\"All elements in {data_name} must be numpy arrays\")\n            if any(m.ndim not in [2, 3] for m in data):\n                raise TypeError(f\"All 
masks in {data_name} must be 2D or 3D numpy arrays\")\n            return data[0].shape[:2]\n\n        raise TypeError(f\"{data_name} must be either a numpy array or a sequence of numpy arrays\")\n\n    @staticmethod\n    def _check_multi_data(data_name: str, data: Any) -> tuple[int, int]:\n        \"\"\"Check multi-image data format and return shape.\n\n        Args:\n            data_name: Name of the data field being checked\n            data: Input data in one of these formats:\n                - List-like of numpy arrays\n                - Numpy array of shape (N, H, W, C) or (N, H, W)\n\n        Returns:\n            tuple: (height, width) of the first image\n\n        Raises:\n            TypeError: If data format is invalid\n        \"\"\"\n        if isinstance(data, np.ndarray):\n            if data.ndim not in {3, 4}:  # (N,H,W) or (N,H,W,C)\n                raise TypeError(f\"{data_name} as numpy array must be 3D or 4D\")\n            return data.shape[1:3]  # Return (H,W)\n\n        if not isinstance(data, Sequence) or not isinstance(data[0], np.ndarray):\n            raise TypeError(f\"{data_name} must be either a numpy array or a list of numpy arrays\")\n        return data[0].shape[:2]\n\n    @staticmethod\n    def _check_bbox_keypoint_params(internal_data_name: str, processors: dict[str, Any]) -> None:\n        if internal_data_name in CHECK_BBOX_PARAM and processors.get(\"bboxes\") is None:\n            raise ValueError(\"bbox_params must be specified for bbox transformations\")\n        if internal_data_name in CHECK_KEYPOINTS_PARAM and processors.get(\"keypoints\") is None:\n            raise ValueError(\"keypoints_params must be specified for keypoint transformations\")\n\n    @staticmethod\n    def _check_shapes(shapes: list[tuple[int, ...]], is_check_shapes: bool) -> None:\n        if is_check_shapes and shapes and shapes.count(shapes[0]) != len(shapes):\n            raise ValueError(\n                \"Height and Width of image, mask or masks should be equal. 
You can disable shapes check \"\n                \"by setting a parameter is_check_shapes=False of Compose class (do it only if you are sure \"\n                \"about your data consistency).\",\n            )\n\n    def _check_args(self, **kwargs: Any) -> None:\n        shapes = []  # For H,W checks\n        volume_shapes = []  # For D,H,W checks\n\n        for data_name, data in kwargs.items():\n            internal_name = self._additional_targets.get(data_name, data_name)\n\n            # For CHECKED_SINGLE, we must validate even if None\n            if internal_name in CHECKED_SINGLE:\n                if not isinstance(data, np.ndarray):\n                    raise TypeError(f\"{data_name} must be numpy array type\")\n                shapes.append(data.shape[:2])\n                continue\n\n            # Skip empty data or non-array/list inputs for other types\n            if data is None:\n                continue\n            if not isinstance(data, (np.ndarray, list)):\n                continue\n\n            self._check_bbox_keypoint_params(internal_name, self.processors)\n\n            shape = self._get_data_shape(data_name, internal_name, data)\n            if shape is None:\n                continue\n\n            # Handle different shape types\n            if internal_name in CHECKED_VOLUME | CHECKED_MASK3D:\n                shapes.append(shape[1:3])  # H,W from (D,H,W)\n                volume_shapes.append(shape[:3])  # D,H,W\n            elif internal_name in {\"volumes\", \"masks3d\"}:\n                shapes.append(shape[2:4])  # H,W from (N,D,H,W)\n                volume_shapes.append(shape[1:4])  # D,H,W from (N,D,H,W)\n            else:\n                shapes.append(shape[:2])  # H,W\n\n        self._check_shape_consistency(shapes, volume_shapes)\n\n    def _get_data_shape(self, data_name: str, internal_name: str, data: Any) -> tuple[int, ...] 
| None:\n        \"\"\"Get shape of data based on its type.\"\"\"\n        if internal_name in CHECKED_SINGLE:\n            if not isinstance(data, np.ndarray):\n                raise TypeError(f\"{data_name} must be numpy array type\")\n            return data.shape\n\n        if internal_name in CHECKED_VOLUME:\n            return self._check_volume_data(data_name, data)\n\n        if internal_name in CHECKED_MASK3D:\n            return self._check_mask3d_data(data_name, data)\n\n        if internal_name in CHECKED_MULTI:\n            if internal_name == \"masks\":\n                return self._check_masks_data(data_name, data)\n            if internal_name in {\"volumes\", \"masks3d\"}:  # Group these together\n                if not isinstance(data, np.ndarray):\n                    raise TypeError(f\"{data_name} must be numpy array type\")\n                if data.ndim not in {4, 5}:  # (N,D,H,W) or (N,D,H,W,C)\n                    raise TypeError(f\"{data_name} must be 4D or 5D array\")\n                return data.shape  # Return full shape\n            return self._check_multi_data(data_name, data)\n\n        return None\n\n    def _check_shape_consistency(self, shapes: list[tuple[int, ...]], volume_shapes: list[tuple[int, ...]]) -> None:\n        \"\"\"Check consistency of shapes.\"\"\"\n        # Check H,W consistency\n        self._check_shapes(shapes, self.is_check_shapes)\n\n        # Check D,H,W consistency for volumes and 3D masks\n        if self.is_check_shapes and volume_shapes and volume_shapes.count(volume_shapes[0]) != len(volume_shapes):\n            raise ValueError(\n                \"Depth, Height and Width of volume, mask3d, volumes and masks3d should be equal. \"\n                \"You can disable shapes check by setting is_check_shapes=False.\",\n            )\n\n    @staticmethod\n    def _check_volume_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        if data.ndim not in {3, 4}:  # (D,H,W) or (D,H,W,C)\n            raise TypeError(f\"{data_name} must be 3D or 4D array\")\n        return data.shape[:3]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_volumes_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        if data.ndim not in {4, 5}:  # (N,D,H,W) or (N,D,H,W,C)\n            raise TypeError(f\"{data_name} must be 4D or 5D array\")\n        return data.shape[1:4]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_mask3d_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        \"\"\"Check single volumetric mask data format and return shape.\"\"\"\n        if data.ndim not in {3, 4}:  # (D,H,W) or (D,H,W,C)\n            raise TypeError(f\"{data_name} must be 3D or 4D array\")\n        return data.shape[:3]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_masks3d_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        \"\"\"Check multiple volumetric masks data format and return shape.\"\"\"\n        if data.ndim not in [4, 5]:  # (N,D,H,W) or (N,D,H,W,C)\n            raise TypeError(f\"{data_name} must be 4D or 5D array\")\n        return data.shape[1:4]  # Return (D,H,W)\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.OneOf","title":"class OneOf (transforms, p=0.5) [view source on GitHub]","text":"

Select one of the given transforms to apply. The selected transform will be called with force_apply=True. Transform probabilities are normalized to sum to 1, so they act as sampling weights.

Parameters:

Name Type Description transforms list

list of transformations to compose.

p float

probability of applying selected transform. Default: 0.5.
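
Example (a minimal usage sketch; the image and the child transforms are illustrative):

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> # Child probabilities (1 and 1) are normalized to the weights 0.5 and 0.5\n>>> transform = A.OneOf([\n...     A.HorizontalFlip(p=1),\n...     A.VerticalFlip(p=1),\n... ], p=0.9)\n>>> image = np.zeros((100, 100, 3), dtype=np.uint8)\n>>> result = transform(image=image)  # with 90% probability exactly one of the flips is applied\n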

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class OneOf(BaseCompose):\n    \"\"\"Select one of transforms to apply. Selected transform will be called with `force_apply=True`.\n    Transforms probabilities will be normalized to one 1, so in this case transforms probabilities works as weights.\n\n    Args:\n        transforms (list): list of transformations to compose.\n        p (float): probability of applying selected transform. Default: 0.5.\n\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, p: float = 0.5):\n        super().__init__(transforms=transforms, p=p)\n        transforms_ps = [t.p for t in self.transforms]\n        s = sum(transforms_ps)\n        self.transforms_ps = [t / s for t in transforms_ps]\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n            return data\n\n        if self.transforms_ps and (force_apply or self.py_random.random() < self.p):\n            idx: int = self.random_generator.choice(len(self.transforms), p=self.transforms_ps)\n            t = self.transforms[idx]\n            data = t(force_apply=True, **data)\n            self._track_transform_params(t, data)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.OneOrOther","title":"class OneOrOther (first=None, second=None, transforms=None, p=0.5) [view source on GitHub]","text":"

Select one transform or the other to apply. The selected transform will be called with force_apply=True.
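
Example (a minimal usage sketch; the child transforms are illustrative):

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> transform = A.OneOrOther(\n...     first=A.HorizontalFlip(p=1),\n...     second=A.VerticalFlip(p=1),\n...     p=0.5,\n... )\n>>> image = np.zeros((100, 100, 3), dtype=np.uint8)\n>>> result = transform(image=image)  # horizontal flip with probability 0.5, otherwise vertical flip\n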

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class OneOrOther(BaseCompose):\n    \"\"\"Select one or another transform to apply. Selected transform will be called with `force_apply=True`.\"\"\"\n\n    def __init__(\n        self,\n        first: TransformType | None = None,\n        second: TransformType | None = None,\n        transforms: TransformsSeqType | None = None,\n        p: float = 0.5,\n    ):\n        if transforms is None:\n            if first is None or second is None:\n                msg = \"You must set both first and second or set transforms argument.\"\n                raise ValueError(msg)\n            transforms = [first, second]\n        super().__init__(transforms, p)\n        if len(self.transforms) != NUM_ONEOF_TRANSFORMS:\n            warnings.warn(\"Length of transforms is not equal to 2.\", stacklevel=2)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n                self._track_transform_params(t, data)\n            return data\n\n        if self.py_random.random() < self.p:\n            return self.transforms[0](force_apply=True, **data)\n\n        return self.transforms[-1](force_apply=True, **data)\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.RandomOrder","title":"class RandomOrder (transforms, n=1, replace=False, p=1) [view source on GitHub]","text":"

Apply a random subset of transforms from the given list in a random order.

The RandomOrder class allows you to select a specified number of transforms from a list and apply them to the input data in a random order. This is useful for creating more diverse augmentation pipelines where the order of transformations can vary, potentially leading to different results.

Attributes:

Name Type Description transforms TransformsSeqType

A list of transformations to choose from.

n int

The number of transforms to apply. If n is greater than the number of available transforms and replace is False, n will be set to the number of available transforms.

replace bool

Whether to sample transforms with replacement. If True, the same transform can be selected multiple times. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.RandomOrder([\n...     A.HorizontalFlip(p=1),\n...     A.VerticalFlip(p=1),\n...     A.RandomBrightnessContrast(p=1),\n... ], n=2, replace=False, p=0.5)\n>>> # This will apply 2 out of the 3 transforms in a random order with 50% probability\n

Note

  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.
  • The random order of transforms will not be replayed in ReplayCompose.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class RandomOrder(SomeOf):\n    \"\"\"Apply a random subset of transforms from the given list in a random order.\n\n    The `RandomOrder` class allows you to select a specified number of transforms from a list and apply them\n    to the input data in a random order. This is useful for creating more diverse augmentation pipelines\n    where the order of transformations can vary, potentially leading to different results.\n\n    Attributes:\n        transforms (TransformsSeqType): A list of transformations to choose from.\n        n (int): The number of transforms to apply. If `n` is greater than the number of available transforms\n                 and `replace` is False, `n` will be set to the number of available transforms.\n        replace (bool): Whether to sample transforms with replacement. If True, the same transform can be\n                        selected multiple times. Default is False.\n        p (float): Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.RandomOrder([\n        ...     A.HorizontalFlip(p=1),\n        ...     A.VerticalFlip(p=1),\n        ...     A.RandomBrightnessContrast(p=1),\n        ... ], n=2, replace=False, p=0.5)\n        >>> # This will apply 2 out of the 3 transforms in a random order with 50% probability\n\n    Note:\n        - The probabilities of individual transforms are used as weights for sampling.\n        - When `replace` is True, the same transform can be selected multiple times.\n        - The random order of transforms will not be replayed in `ReplayCompose`.\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, n: int = 1, replace: bool = False, p: float = 1):\n        super().__init__(transforms=transforms, n=n, replace=replace, p=p)\n\n    def _get_idx(self) -> np.ndarray[np.int_]:\n        return self.random_generator.choice(\n            len(self.transforms),\n            size=self.n,\n            replace=self.replace,\n            p=self.transforms_ps,\n        )\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.SelectiveChannelTransform","title":"class SelectiveChannelTransform (transforms, channels=(0, 1, 2), p=1.0) [view source on GitHub]","text":"

A transformation class to apply specified transforms to selected channels of an image.

This class extends BaseCompose to allow selective application of transformations to specified image channels. It extracts the selected channels, applies the transformations, and then reinserts the transformed channels back into their original positions in the image.

Parameters:

Name Type Description transforms TransformsSeqType

A sequence of transformations (from Albumentations) to be applied to the specified channels.

channels Sequence[int]

A sequence of integers specifying the indices of the channels to which the transforms should be applied.

p float

Probability that the transform will be applied; the default is 1.0 (always apply).

Methods

__call__(*args, **kwargs): Applies the transforms to the image according to the specified channels. The input data should include an 'image' key with the image array.

Returns:

Type Description dict[str, Any]

The transformed data dictionary, which includes the transformed 'image' key.
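
Example (a minimal usage sketch; assumes the class is exposed as A.SelectiveChannelTransform and the input image has four channels, e.g. RGBA or RGB+NIR):

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> # Apply the photometric transform only to the first three channels, leaving the fourth untouched\n>>> transform = A.SelectiveChannelTransform(\n...     transforms=[A.RandomBrightnessContrast(p=1)],\n...     channels=[0, 1, 2],\n...     p=1.0,\n... )\n>>> image = np.random.randint(0, 256, (100, 100, 4), dtype=np.uint8)\n>>> result = transform(image=image)\n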

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class SelectiveChannelTransform(BaseCompose):\n    \"\"\"A transformation class to apply specified transforms to selected channels of an image.\n\n    This class extends BaseCompose to allow selective application of transformations to\n    specified image channels. It extracts the selected channels, applies the transformations,\n    and then reinserts the transformed channels back into their original positions in the image.\n\n    Parameters:\n        transforms (TransformsSeqType):\n            A sequence of transformations (from Albumentations) to be applied to the specified channels.\n        channels (Sequence[int]):\n            A sequence of integers specifying the indices of the channels to which the transforms should be applied.\n        p (float):\n            Probability that the transform will be applied; the default is 1.0 (always apply).\n\n    Methods:\n        __call__(*args, **kwargs):\n            Applies the transforms to the image according to the specified channels.\n            The input data should include 'image' key with the image array.\n\n    Returns:\n        dict[str, Any]: The transformed data dictionary, which includes the transformed 'image' key.\n    \"\"\"\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        channels: Sequence[int] = (0, 1, 2),\n        p: float = 1.0,\n    ) -> None:\n        super().__init__(transforms, p)\n        self.channels = channels\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if force_apply or self.py_random.random() < self.p:\n            image = data[\"image\"]\n\n            selected_channels = image[:, :, self.channels]\n            sub_image = np.ascontiguousarray(selected_channels)\n\n            for t in self.transforms:\n                sub_image = t(image=sub_image)[\"image\"]\n                self._track_transform_params(t, sub_image)\n\n            transformed_channels = cv2.split(sub_image)\n            output_img = image.copy()\n\n            for idx, channel in zip(self.channels, transformed_channels):\n                output_img[:, :, idx] = channel\n\n            data[\"image\"] = np.ascontiguousarray(output_img)\n\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.Sequential","title":"class Sequential (transforms, p=0.5) [view source on GitHub]","text":"

Sequentially applies all transforms to targets.

Note

This transform is not intended to be a replacement for Compose. Instead, it should be used inside Compose the same way OneOf or OneOrOther are used. For instance, you can combine OneOf with Sequential to create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly chosen sequence to the input data (see the Example section for an example definition of such a pipeline).

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n>>>    A.OneOf([\n>>>        A.Sequential([\n>>>            A.HorizontalFlip(p=0.5),\n>>>            A.ShiftScaleRotate(p=0.5),\n>>>        ]),\n>>>        A.Sequential([\n>>>            A.VerticalFlip(p=0.5),\n>>>            A.RandomBrightnessContrast(p=0.5),\n>>>        ]),\n>>>    ], p=1)\n>>> ])\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class Sequential(BaseCompose):\n    \"\"\"Sequentially applies all transforms to targets.\n\n    Note:\n        This transform is not intended to be a replacement for `Compose`. Instead, it should be used inside `Compose`\n        the same way `OneOf` or `OneOrOther` are used. For instance, you can combine `OneOf` with `Sequential` to\n        create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly\n        chose sequence to input data (see the `Example` section for an example definition of such pipeline).\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        >>>    A.OneOf([\n        >>>        A.Sequential([\n        >>>            A.HorizontalFlip(p=0.5),\n        >>>            A.ShiftScaleRotate(p=0.5),\n        >>>        ]),\n        >>>        A.Sequential([\n        >>>            A.VerticalFlip(p=0.5),\n        >>>            A.RandomBrightnessContrast(p=0.5),\n        >>>        ]),\n        >>>    ], p=1)\n        >>> ])\n\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, p: float = 0.5):\n        super().__init__(transforms, p)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode or force_apply or self.py_random.random() < self.p:\n            for t in self.transforms:\n                data = t(**data)\n                self._track_transform_params(t, data)\n                data = self.check_data_post_transform(data)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.SomeOf","title":"class SomeOf (transforms, n=1, replace=False, p=1) [view source on GitHub]","text":"

Apply a random subset of transforms from the given list.

This class selects a specified number of transforms from the provided list and applies them to the input data. The selection can be done with or without replacement, allowing for the same transform to be potentially applied multiple times.

Parameters:

Name Type Description transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to choose from.

n int

The number of transforms to apply. If greater than the number of transforms and replace=False, it will be set to the number of transforms.

replace bool

Whether to sample transforms with replacement. Default is False (matching the replace=False default in the signature).

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

Note

  • If n is greater than the number of transforms and replace is False, n will be set to the number of transforms with a warning.
  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.SomeOf([\n...     A.HorizontalFlip(p=1),\n...     A.VerticalFlip(p=1),\n...     A.RandomBrightnessContrast(p=1),\n... ], n=2, replace=False, p=0.5)\n>>> # This will apply 2 out of the 3 transforms with 50% probability\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class SomeOf(BaseCompose):\n    \"\"\"Apply a random subset of transforms from the given list.\n\n    This class selects a specified number of transforms from the provided list\n    and applies them to the input data. The selection can be done with or without\n    replacement, allowing for the same transform to be potentially applied multiple times.\n\n    Args:\n        transforms (List[Union[BasicTransform, BaseCompose]]): A list of transforms to choose from.\n        n (int): The number of transforms to apply. If greater than the number of\n                 transforms and replace=False, it will be set to the number of transforms.\n        replace (bool): Whether to sample transforms with replacement. Default is True.\n        p (float): Probability of applying the selected transforms. Should be in the range [0, 1].\n                   Default is 1.0.\n        mask_interpolation (int, optional): Interpolation method for mask transforms.\n                                            When defined, it overrides the interpolation method\n                                            specified in individual transforms. Default is None.\n\n    Note:\n        - If `n` is greater than the number of transforms and `replace` is False,\n          `n` will be set to the number of transforms with a warning.\n        - The probabilities of individual transforms are used as weights for sampling.\n        - When `replace` is True, the same transform can be selected multiple times.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.SomeOf([\n        ...     A.HorizontalFlip(p=1),\n        ...     A.VerticalFlip(p=1),\n        ...     A.RandomBrightnessContrast(p=1),\n        ... ], n=2, replace=False, p=0.5)\n        >>> # This will apply 2 out of the 3 transforms with 50% probability\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, n: int = 1, replace: bool = False, p: float = 1):\n        super().__init__(transforms, p)\n        self.n = n\n        if not replace and n > len(self.transforms):\n            self.n = len(self.transforms)\n            warnings.warn(\n                f\"`n` is greater than number of transforms. 
`n` will be set to {self.n}.\",\n                UserWarning,\n                stacklevel=2,\n            )\n        self.replace = replace\n        transforms_ps = [t.p for t in self.transforms]\n        s = sum(transforms_ps)\n        self.transforms_ps = [t / s for t in transforms_ps]\n\n    def __call__(self, *arg: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n                data = self.check_data_post_transform(data)\n            return data\n\n        if self.transforms_ps and (force_apply or self.py_random.random() < self.p):\n            for i in self._get_idx():\n                t = self.transforms[i]\n                data = t(force_apply=True, **data)\n                self._track_transform_params(t, data)\n                data = self.check_data_post_transform(data)\n        return data\n\n    def _get_idx(self) -> np.ndarray[np.int_]:\n        idx = self.random_generator.choice(\n            len(self.transforms),\n            size=self.n,\n            replace=self.replace,\n            p=self.transforms_ps,\n        )\n        idx.sort()\n        return idx\n\n    def to_dict_private(self) -> dict[str, Any]:\n        dictionary = super().to_dict_private()\n        dictionary.update({\"n\": self.n, \"replace\": self.replace})\n        return dictionary\n
"},{"location":"api_reference/core/keypoints_utils/","title":"Helper functions for working with keypoints (augmentations.core.keypoints_utils)","text":""},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.KeypointParams","title":"class KeypointParams (format, label_fields=None, remove_invisible=True, angle_in_degrees=True, check_each_transform=True) [view source on GitHub]","text":"

Parameters of keypoints

Parameters:

Name Type Description format str

format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa'.

x - X coordinate,

y - Y coordinate

s - Keypoint scale

a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)

label_fields list

list of fields that are joined with keypoints, e.g. labels. Should be the same type as keypoints.

remove_invisible bool

whether to remove keypoints that become invisible (fall outside the image) after the transform

angle_in_degrees bool

whether the angle is given in degrees (True) or radians (False) for the 'xya', 'xyas', 'xysa' keypoint formats

check_each_transform bool

if True, then keypoints will be checked after each dual transform. Default: True
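
Example (a minimal usage sketch; the pipeline and the label field name are illustrative):

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> transform = A.Compose(\n...     [A.HorizontalFlip(p=1)],\n...     keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels'], remove_invisible=True),\n... )\n>>> image = np.zeros((100, 100, 3), dtype=np.uint8)\n>>> result = transform(image=image, keypoints=[(10, 20), (30, 40)], class_labels=['eye', 'nose'])\n>>> result['keypoints']  # keypoints mirrored around the vertical axis\n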

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/keypoints_utils.py Python
class KeypointParams(Params):\n    \"\"\"Parameters of keypoints\n\n    Args:\n        format (str): format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa'.\n\n            x - X coordinate,\n\n            y - Y coordinate\n\n            s - Keypoint scale\n\n            a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)\n        label_fields (list): list of fields that are joined with keypoints, e.g labels.\n            Should be same type as keypoints.\n        remove_invisible (bool): to remove invisible points after transform or not\n        angle_in_degrees (bool): angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints\n        check_each_transform (bool): if `True`, then keypoints will be checked after each dual transform.\n            Default: `True`\n\n    \"\"\"\n\n    def __init__(\n        self,\n        format: str,  # noqa: A002\n        label_fields: Sequence[str] | None = None,\n        remove_invisible: bool = True,\n        angle_in_degrees: bool = True,\n        check_each_transform: bool = True,\n    ):\n        super().__init__(format, label_fields)\n        self.remove_invisible = remove_invisible\n        self.angle_in_degrees = angle_in_degrees\n        self.check_each_transform = check_each_transform\n\n    def to_dict_private(self) -> dict[str, Any]:\n        data = super().to_dict_private()\n        data.update(\n            {\n                \"remove_invisible\": self.remove_invisible,\n                \"angle_in_degrees\": self.angle_in_degrees,\n                \"check_each_transform\": self.check_each_transform,\n            },\n        )\n        return data\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return \"KeypointParams\"\n\n    def __repr__(self) -> str:\n        return (\n            f\"KeypointParams(format={self.format}, label_fields={self.label_fields},\"\n            f\" remove_invisible={self.remove_invisible}, angle_in_degrees={self.angle_in_degrees},\"\n            f\" check_each_transform={self.check_each_transform})\"\n        )\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.check_keypoints","title":"def check_keypoints (keypoints, image_shape) [view source on GitHub]","text":"

Check if keypoint coordinates are within valid ranges for the given image shape.

This function validates that:

1. All x-coordinates are within [0, width)
2. All y-coordinates are within [0, height)
3. If angles are present (i.e., keypoints have more than 2 columns), they are within the range [0, 2\u03c0)

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. Each row represents a keypoint with at least (x, y) coordinates. If present, the third column is assumed to be the angle.

image_shape Tuple[int, int]

The shape of the image (height, width).

Exceptions:

Type Description ValueError

If any keypoint coordinate is outside the valid range, or if any angle is invalid. The error message will detail which keypoints are invalid and why.

Note

  • The function assumes that keypoint coordinates are in absolute pixel values, not normalized.
  • Angles, if present, are assumed to be in radians.
  • The constant PAIR should be defined elsewhere in the module, typically as 2.
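
Example (a minimal sketch that imports the helper from the module shown below):

Python
>>> import numpy as np\n>>> from albumentations.core.keypoints_utils import check_keypoints\n>>> keypoints = np.array([[10.0, 20.0, 0.5, 1.0]])  # x, y, angle in radians, scale\n>>> check_keypoints(keypoints, image_shape=(100, 100))  # all values in range: no exception\n>>> # check_keypoints(np.array([[150.0, 20.0]]), image_shape=(100, 100)) would raise ValueError (x >= width)\n
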
Source code in albumentations/core/keypoints_utils.py Python
def check_keypoints(keypoints: np.ndarray, image_shape: tuple[int, int]) -> None:\n    \"\"\"Check if keypoint coordinates are within valid ranges for the given image shape.\n\n    This function validates that:\n    1. All x-coordinates are within [0, width)\n    2. All y-coordinates are within [0, height)\n    3. If angles are present (i.e., keypoints have more than 2 columns),\n       they are within the range [0, 2\u03c0)\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 2+), where N is the number of keypoints.\n                                Each row represents a keypoint with at least (x, y) coordinates.\n                                If present, the third column is assumed to be the angle.\n        image_shape (Tuple[int, int]): The shape of the image (height, width).\n\n    Raises:\n        ValueError: If any keypoint coordinate is outside the valid range, or if any angle is invalid.\n                    The error message will detail which keypoints are invalid and why.\n\n    Note:\n        - The function assumes that keypoint coordinates are in absolute pixel values, not normalized.\n        - Angles, if present, are assumed to be in radians.\n        - The constant PAIR should be defined elsewhere in the module, typically as 2.\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Check x and y coordinates\n    x, y = keypoints[:, 0], keypoints[:, 1]\n    if np.any((x < 0) | (x >= width)) or np.any((y < 0) | (y >= height)):\n        invalid_x = np.where((x < 0) | (x >= width))[0]\n        invalid_y = np.where((y < 0) | (y >= height))[0]\n\n        error_messages = []\n\n        error_messages = [\n            f\"Expected {'x' if idx in invalid_x else 'y'} for keypoint {keypoints[idx]} to be \"\n            f\"in the range [0.0, {width if idx in invalid_x else height}], \"\n            f\"got {x[idx] if idx in invalid_x else y[idx]}.\"\n            for idx in sorted(set(invalid_x) | set(invalid_y))\n        ]\n\n        raise ValueError(\"\\n\".join(error_messages))\n\n    # Check angles\n    if keypoints.shape[1] > PAIR:\n        angles = keypoints[:, 2]\n        invalid_angles = np.where((angles < 0) | (angles >= 2 * math.pi))[0]\n        if len(invalid_angles) > 0:\n            error_messages = [\n                f\"Keypoint angle must be in range [0, 2 * PI). Got: {angles[idx]} for keypoint {keypoints[idx]}\"\n                for idx in invalid_angles\n            ]\n            raise ValueError(\"\\n\".join(error_messages))\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.convert_keypoints_from_albumentations","title":"def convert_keypoints_from_albumentations (keypoints, target_format, image_shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]","text":"

Convert keypoints from Albumentations format to various other formats.

This function takes keypoints in the standard Albumentations format [x, y, angle, scale] and converts them to the specified target format.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints in Albumentations format with shape (N, 4+), where N is the number of keypoints. Each row represents a keypoint [x, y, angle, scale, ...].

target_format Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"]

The desired output format:

  • \"xy\": [x, y]
  • \"yx\": [y, x]
  • \"xya\": [x, y, angle]
  • \"xys\": [x, y, scale]
  • \"xyas\": [x, y, angle, scale]
  • \"xysa\": [x, y, scale, angle]

image_shape tuple[int, int]

The shape of the image (height, width).

check_validity bool

If True, check if the keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert output angles to degrees. If False, angles remain in radians. Defaults to True.

Returns:

Type Description np.ndarray

Array of keypoints in the specified target format with shape (N, 2+). Any additional columns from the input keypoints beyond the first 4 are preserved and appended after the converted columns.

Exceptions:

Type Description ValueError

If the target_format is not one of the supported formats.

Note

  • Input angles are assumed to be in the range [0, 2\u03c0) radians.
  • If the input keypoints have additional columns beyond the first 4, these columns are preserved in the output.
  • The constant NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS should be defined elsewhere in the module, typically as 4.
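
Example (a minimal sketch that imports the helper from the module shown below):

Python
>>> import numpy as np\n>>> from albumentations.core.keypoints_utils import convert_keypoints_from_albumentations\n>>> kps = np.array([[10.0, 20.0, np.pi / 2, 2.0]])  # Albumentations format: x, y, angle in radians, scale\n>>> out = convert_keypoints_from_albumentations(kps, target_format='xyas', image_shape=(100, 100))\n>>> # out rows are [x, y, angle in degrees, scale]; here the angle pi/2 becomes 90.0\n
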
Source code in albumentations/core/keypoints_utils.py Python
def convert_keypoints_from_albumentations(\n    keypoints: np.ndarray,\n    target_format: Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"],\n    image_shape: tuple[int, int],\n    check_validity: bool = False,\n    angle_in_degrees: bool = True,\n) -> np.ndarray:\n    \"\"\"Convert keypoints from Albumentations format to various other formats.\n\n    This function takes keypoints in the standard Albumentations format [x, y, angle, scale]\n    and converts them to the specified target format.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints in Albumentations format with shape (N, 4+),\n                                where N is the number of keypoints. Each row represents a keypoint\n                                [x, y, angle, scale, ...].\n        target_format (Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"]): The desired output format.\n            - \"xy\": [x, y]\n            - \"yx\": [y, x]\n            - \"xya\": [x, y, angle]\n            - \"xys\": [x, y, scale]\n            - \"xyas\": [x, y, angle, scale]\n            - \"xysa\": [x, y, scale, angle]\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        check_validity (bool, optional): If True, check if the keypoints are within the image boundaries.\n                                         Defaults to False.\n        angle_in_degrees (bool, optional): If True, convert output angles to degrees.\n                                           If False, angles remain in radians.\n                                           Defaults to True.\n\n    Returns:\n        np.ndarray: Array of keypoints in the specified target format with shape (N, 2+).\n                    Any additional columns from the input keypoints beyond the first 4\n                    are preserved and appended after the converted columns.\n\n    Raises:\n        ValueError: If the target_format is not one of the supported formats.\n\n    Note:\n        - Input angles are assumed to be in the range [0, 2\u03c0) radians.\n        - If the input keypoints have additional columns beyond the first 4,\n          these columns are preserved in the output.\n        - The constant NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS should be defined\n          elsewhere in the module, typically as 4.\n    \"\"\"\n    if target_format not in keypoint_formats:\n        raise ValueError(f\"Unknown target_format {target_format}. Supported formats are: {keypoint_formats}\")\n\n    x, y, angle, scale = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2], keypoints[:, 3]\n    angle = angle_to_2pi_range(angle)\n\n    if check_validity:\n        check_keypoints(np.column_stack((x, y, angle, scale)), image_shape)\n\n    if angle_in_degrees:\n        angle = np.degrees(angle)\n\n    format_to_columns = {\n        \"xy\": [x, y],\n        \"yx\": [y, x],\n        \"xya\": [x, y, angle],\n        \"xys\": [x, y, scale],\n        \"xyas\": [x, y, angle, scale],\n        \"xysa\": [x, y, scale, angle],\n    }\n\n    result = np.column_stack(format_to_columns[target_format])\n\n    # Add any additional columns from the original keypoints\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack((result, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]))\n\n    return result\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.convert_keypoints_to_albumentations","title":"def convert_keypoints_to_albumentations (keypoints, source_format, image_shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]","text":"

Convert keypoints from various formats to the Albumentations format.

This function takes keypoints in different formats and converts them to the standard Albumentations format: [x, y, angle, scale]. If the input format doesn't include angle or scale, these values are set to 0.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. The number of columns depends on the source_format.

source_format Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"]

The format of the input keypoints:

  • \"xy\": [x, y]
  • \"yx\": [y, x]
  • \"xya\": [x, y, angle]
  • \"xys\": [x, y, scale]
  • \"xyas\": [x, y, angle, scale]
  • \"xysa\": [x, y, scale, angle]

image_shape tuple[int, int]

The shape of the image (height, width).

check_validity bool

If True, check if the converted keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert input angles from degrees to radians. Defaults to True.

Returns:

Type Description np.ndarray

Array of keypoints in Albumentations format [x, y, angle, scale] with shape (N, 4+). Any additional columns from the input keypoints are preserved and appended after the first 4 columns.

Exceptions:

Type Description ValueError

If the source_format is not one of the supported formats.

Note

  • Angles are converted to the range [0, 2\u03c0) radians.
  • If the input keypoints have additional columns beyond what's specified in the source_format, these columns are preserved in the output.
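
Example (a minimal sketch that imports the helper from the module shown below):

Python
>>> import numpy as np\n>>> from albumentations.core.keypoints_utils import convert_keypoints_to_albumentations\n>>> kps = np.array([[10.0, 20.0, 90.0, 2.0]])  # source format 'xyas': x, y, angle in degrees, scale\n>>> out = convert_keypoints_to_albumentations(kps, source_format='xyas', image_shape=(100, 100))\n>>> # out is in Albumentations format [x, y, angle, scale] with the angle converted to radians (pi/2)\n
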
Source code in albumentations/core/keypoints_utils.py Python
def convert_keypoints_to_albumentations(\n    keypoints: np.ndarray,\n    source_format: Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"],\n    image_shape: tuple[int, int],\n    check_validity: bool = False,\n    angle_in_degrees: bool = True,\n) -> np.ndarray:\n    \"\"\"Convert keypoints from various formats to the Albumentations format.\n\n    This function takes keypoints in different formats and converts them to the standard\n    Albumentations format: [x, y, angle, scale]. If the input format doesn't include\n    angle or scale, these values are set to 0.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 2+), where N is the number of keypoints.\n                                The number of columns depends on the source_format.\n        source_format (Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\"]): The format of the input keypoints.\n            - \"xy\": [x, y]\n            - \"yx\": [y, x]\n            - \"xya\": [x, y, angle]\n            - \"xys\": [x, y, scale]\n            - \"xyas\": [x, y, angle, scale]\n            - \"xysa\": [x, y, scale, angle]\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        check_validity (bool, optional): If True, check if the converted keypoints are within the image boundaries.\n                                         Defaults to False.\n        angle_in_degrees (bool, optional): If True, convert input angles from degrees to radians.\n                                           Defaults to True.\n\n    Returns:\n        np.ndarray: Array of keypoints in Albumentations format [x, y, angle, scale] with shape (N, 4+).\n                    Any additional columns from the input keypoints are preserved and appended after the\n                    first 4 columns.\n\n    Raises:\n        ValueError: If the source_format is not one of the supported formats.\n\n    Note:\n        - Angles are converted to the range [0, 2\u03c0) radians.\n        - If the input keypoints have additional columns beyond what's specified in the source_format,\n          these columns are preserved in the output.\n    \"\"\"\n    if source_format not in keypoint_formats:\n        raise ValueError(f\"Unknown source_format {source_format}. Supported formats are: {keypoint_formats}\")\n\n    format_to_indices: dict[str, list[int | None]] = {\n        \"xy\": [0, 1, None, None],\n        \"yx\": [1, 0, None, None],\n        \"xya\": [0, 1, 2, None],\n        \"xys\": [0, 1, None, 2],\n        \"xyas\": [0, 1, 2, 3],\n        \"xysa\": [0, 1, 3, 2],\n    }\n\n    indices: list[int | None] = format_to_indices[source_format]\n\n    processed_keypoints = np.zeros((keypoints.shape[0], NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS), dtype=np.float32)\n\n    for i, idx in enumerate(indices):\n        if idx is not None:\n            processed_keypoints[:, i] = keypoints[:, idx]\n\n    if angle_in_degrees and indices[2] is not None:\n        processed_keypoints[:, 2] = np.radians(processed_keypoints[:, 2])\n\n    processed_keypoints[:, 2] = angle_to_2pi_range(processed_keypoints[:, 2])\n\n    if keypoints.shape[1] > len(source_format):\n        processed_keypoints = np.column_stack((processed_keypoints, keypoints[:, len(source_format) :]))\n\n    if check_validity:\n        check_keypoints(processed_keypoints, image_shape)\n\n    return processed_keypoints\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.filter_keypoints","title":"def filter_keypoints (keypoints, image_shape, remove_invisible) [view source on GitHub]","text":"

Filter keypoints to remove those outside the image boundaries.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint (x, y, ...).

image_shape tuple[int, int]

A tuple (height, width) representing the image dimensions.

remove_invisible bool

If True, remove keypoints outside the image boundaries.

Returns:

Type Description np.ndarray

A numpy array of filtered keypoints.
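
Example (a minimal sketch that imports the helper from the module shown below):

Python
>>> import numpy as np\n>>> from albumentations.core.keypoints_utils import filter_keypoints\n>>> kps = np.array([[10.0, 20.0], [150.0, 20.0]])  # the second keypoint lies outside a 100x100 image\n>>> visible = filter_keypoints(kps, image_shape=(100, 100), remove_invisible=True)\n>>> # visible contains only the first keypoint; with remove_invisible=False the array is returned unchanged\n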

Source code in albumentations/core/keypoints_utils.py Python
def filter_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    remove_invisible: bool,\n) -> np.ndarray:\n    \"\"\"Filter keypoints to remove those outside the image boundaries.\n\n    Args:\n        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.\n                   Each row represents a keypoint (x, y, ...).\n        image_shape: A tuple (height, width) representing the image dimensions.\n        remove_invisible: If True, remove keypoints outside the image boundaries.\n\n    Returns:\n        A numpy array of filtered keypoints.\n    \"\"\"\n    if not remove_invisible:\n        return keypoints\n\n    if not keypoints.size:\n        return keypoints\n\n    height, width = image_shape[:2]\n\n    # Create boolean mask for visible keypoints\n    x, y = keypoints[:, 0], keypoints[:, 1]\n    visible = (x >= 0) & (x < width) & (y >= 0) & (y < height)\n\n    # Apply the mask to filter keypoints\n    return keypoints[visible]\n
"},{"location":"api_reference/core/serialization/","title":"Serialization API (core.serialization)","text":""},{"location":"api_reference/core/serialization/#albumentations.core.serialization.Serializable","title":"class Serializable [view source on GitHub]","text":"

Source code in albumentations/core/serialization.py Python
class Serializable(metaclass=SerializableMeta):\n    @classmethod\n    @abstractmethod\n    def is_serializable(cls) -> bool:\n        raise NotImplementedError\n\n    @classmethod\n    @abstractmethod\n    def get_class_fullname(cls) -> str:\n        raise NotImplementedError\n\n    @abstractmethod\n    def to_dict_private(self) -> dict[str, Any]:\n        raise NotImplementedError\n\n    def to_dict(self, on_not_implemented_error: str = \"raise\") -> dict[str, Any]:\n        \"\"\"Take a transform pipeline and convert it to a serializable representation that uses only standard\n        python data types: dictionaries, lists, strings, integers, and floats.\n\n        Args:\n            self: A transform that should be serialized. If the transform doesn't implement the `to_dict`\n                method and `on_not_implemented_error` equals to 'raise' then `NotImplementedError` is raised.\n                If `on_not_implemented_error` equals to 'warn' then `NotImplementedError` will be ignored\n                but no transform parameters will be serialized.\n            on_not_implemented_error (str): `raise` or `warn`.\n\n        \"\"\"\n        if on_not_implemented_error not in {\"raise\", \"warn\"}:\n            msg = f\"Unknown on_not_implemented_error value: {on_not_implemented_error}. Supported values are: 'raise' \"\n            \"and 'warn'\"\n            raise ValueError(msg)\n        try:\n            transform_dict = self.to_dict_private()\n        except NotImplementedError:\n            if on_not_implemented_error == \"raise\":\n                raise\n\n            transform_dict = {}\n            warnings.warn(\n                f\"Got NotImplementedError while trying to serialize {self}. Object arguments are not preserved. \"\n                f\"Implement either '{self.__class__.__name__}.get_transform_init_args_names' \"\n                f\"or '{self.__class__.__name__}.get_transform_init_args' \"\n                \"method to make the transform serializable\",\n                stacklevel=2,\n            )\n        return {\"__version__\": __version__, \"transform\": transform_dict}\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.SerializableMeta","title":"class SerializableMeta [view source on GitHub]","text":"

A metaclass that registers classes in SERIALIZABLE_REGISTRY or NON_SERIALIZABLE_REGISTRY so they can be looked up later, by their full class names, when deserializing a transformation pipeline.

Source code in albumentations/core/serialization.py Python
class SerializableMeta(ABCMeta):\n    \"\"\"A metaclass that is used to register classes in `SERIALIZABLE_REGISTRY` or `NON_SERIALIZABLE_REGISTRY`\n    so they can be found later while deserializing transformation pipeline using classes full names.\n    \"\"\"\n\n    def __new__(cls, name: str, bases: tuple[type, ...], *args: Any, **kwargs: Any) -> SerializableMeta:\n        cls_obj = super().__new__(cls, name, bases, *args, **kwargs)\n        if name != \"Serializable\" and ABC not in bases:\n            if cls_obj.is_serializable():\n                SERIALIZABLE_REGISTRY[cls_obj.get_class_fullname()] = cls_obj\n            else:\n                NON_SERIALIZABLE_REGISTRY[cls_obj.get_class_fullname()] = cls_obj\n        return cls_obj\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def _to_dict(cls) -> dict[str, Any]:\n        return {}\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.from_dict","title":"def from_dict (transform_dict, nonserializable=None) [view source on GitHub]","text":"

Parameters:

Name Type Description transform_dict dict[str, Any]

A dictionary with a serialized transform pipeline.

nonserializable dict

A dictionary that contains non-serializable transforms. This dictionary is required when you are restoring a pipeline that contains non-serializable transforms. Keys in that dictionary should be named the same as the name arguments in the respective transforms from the serialized pipeline.
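
Example (a minimal round-trip sketch; assumes to_dict and from_dict are re-exported at the package level, as in recent Albumentations releases):

Python
>>> import albumentations as A\n>>> transform = A.Compose([A.HorizontalFlip(p=0.5)])\n>>> transform_dict = A.to_dict(transform)  # plain dict using only standard Python types\n>>> restored = A.from_dict(transform_dict)  # rebuilds an equivalent pipeline\n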

Source code in albumentations/core/serialization.py Python
def from_dict(\n    transform_dict: dict[str, Any],\n    nonserializable: dict[str, Any] | None = None,\n) -> Serializable | None:\n    \"\"\"Args:\n    transform_dict: A dictionary with serialized transform pipeline.\n    nonserializable (dict): A dictionary that contains non-serializable transforms.\n        This dictionary is required when you are restoring a pipeline that contains non-serializable transforms.\n        Keys in that dictionary should be named same as `name` arguments in respective transforms from\n        a serialized pipeline.\n\n    \"\"\"\n    register_additional_transforms()\n    transform = transform_dict[\"transform\"]\n    lmbd = instantiate_nonserializable(transform, nonserializable)\n    if lmbd:\n        return lmbd\n    name = transform[\"__class_fullname__\"]\n    args = {k: v for k, v in transform.items() if k != \"__class_fullname__\"}\n    cls = SERIALIZABLE_REGISTRY[shorten_class_name(name)]\n    if \"transforms\" in args:\n        args[\"transforms\"] = [from_dict({\"transform\": t}, nonserializable=nonserializable) for t in args[\"transforms\"]]\n    return cls(**args)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.get_shortest_class_fullname","title":"def get_shortest_class_fullname (cls) [view source on GitHub]","text":"

The function get_shortest_class_fullname takes a class object as input and returns its shortened full name.

Parameters:

Name Type Description cls type[Any]

The class whose shortened full name is returned. The original docstring describes it as a subclass of BasicCompose, although the signature accepts any class.

Returns:

Type Description str

The shortened version of the full class name.

Source code in albumentations/core/serialization.py Python
def get_shortest_class_fullname(cls: type[Any]) -> str:\n    \"\"\"The function `get_shortest_class_fullname` takes a class object as input and returns its shortened\n    full name.\n\n    :param cls: The parameter `cls` is of type `Type[BasicCompose]`, which means it expects a class that\n    is a subclass of `BasicCompose`\n    :type cls: Type[BasicCompose]\n    :return: a string, which is the shortened version of the full class name.\n    \"\"\"\n    class_fullname = f\"{cls.__module__}.{cls.__name__}\"\n    return shorten_class_name(class_fullname)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.load","title":"def load (filepath_or_buffer, data_format='json', nonserializable=None) [view source on GitHub]","text":"

Load a serialized pipeline from a file or file-like object and construct a transform pipeline.

Parameters:

Name Type Description filepath_or_buffer Union[str, Path, TextIO]

The file path or file-like object to read the serialized data from. If a string is provided, it is interpreted as a path to a file. If a file-like object is provided, the serialized data will be read from it directly.

data_format str

The format of the serialized data. Valid options are 'json' and 'yaml'. Defaults to 'json'.

nonserializable Optional[dict[str, Any]]

A dictionary that contains non-serializable transforms. This dictionary is required when restoring a pipeline that contains non-serializable transforms. Keys in the dictionary should be named the same as the name arguments in respective transforms from the serialized pipeline. Defaults to None.

Returns:

Type Description object

The deserialized transform pipeline.

Exceptions:

Type Description ValueError

If data_format is 'yaml' but PyYAML is not installed.
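
Example (a minimal sketch; assumes save/load are re-exported at the package level and the file path is illustrative):

Python
>>> import albumentations as A\n>>> transform = A.Compose([A.HorizontalFlip(p=0.5)])\n>>> A.save(transform, '/tmp/transform.json', data_format='json')\n>>> restored = A.load('/tmp/transform.json', data_format='json')\n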

Source code in albumentations/core/serialization.py Python
def load(\n    filepath_or_buffer: str | Path | TextIO,\n    data_format: str = \"json\",\n    nonserializable: dict[str, Any] | None = None,\n) -> object:\n    \"\"\"Load a serialized pipeline from a file or file-like object and construct a transform pipeline.\n\n    Args:\n        filepath_or_buffer (Union[str, Path, TextIO]): The file path or file-like object to read the serialized\n            data from.\n            If a string is provided, it is interpreted as a path to a file. If a file-like object is provided,\n            the serialized data will be read from it directly.\n        data_format (str): The format of the serialized data. Valid options are 'json' and 'yaml'.\n            Defaults to 'json'.\n        nonserializable (Optional[dict[str, Any]]): A dictionary that contains non-serializable transforms.\n            This dictionary is required when restoring a pipeline that contains non-serializable transforms.\n            Keys in the dictionary should be named the same as the `name` arguments in respective transforms\n            from the serialized pipeline. Defaults to None.\n\n    Returns:\n        object: The deserialized transform pipeline.\n\n    Raises:\n        ValueError: If `data_format` is 'yaml' but PyYAML is not installed.\n\n    \"\"\"\n    check_data_format(data_format)\n\n    if isinstance(filepath_or_buffer, (str, Path)):  # Assume it's a filepath\n        with open(filepath_or_buffer) as f:\n            if data_format == \"json\":\n                transform_dict = json.load(f)\n            else:\n                if not yaml_available:\n                    msg = \"You need to install PyYAML to load a pipeline in yaml format\"\n                    raise ValueError(msg)\n                transform_dict = yaml.safe_load(f)\n    elif data_format == \"json\":\n        transform_dict = json.load(filepath_or_buffer)\n    else:\n        if not yaml_available:\n            msg = \"You need to install PyYAML to load a pipeline in yaml format\"\n            raise ValueError(msg)\n        transform_dict = yaml.safe_load(filepath_or_buffer)\n\n    return from_dict(transform_dict, nonserializable=nonserializable)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.register_additional_transforms","title":"def register_additional_transforms () [view source on GitHub]","text":"

Register transforms that are not imported directly into the albumentations module by checking the availability of optional dependencies.

Source code in albumentations/core/serialization.py Python
def register_additional_transforms() -> None:\n    \"\"\"Register transforms that are not imported directly into the `albumentations` module by checking\n    the availability of optional dependencies.\n    \"\"\"\n    if importlib.util.find_spec(\"torch\") is not None:\n        try:\n            # Import `albumentations.pytorch` only if `torch` is installed.\n            import albumentations.pytorch\n\n            # Use a dummy operation to acknowledge the use of the imported module and avoid linting errors.\n            _ = albumentations.pytorch.ToTensorV2\n        except ImportError:\n            pass\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.save","title":"def save (transform, filepath_or_buffer, data_format='json', on_not_implemented_error='raise') [view source on GitHub]","text":"

Serialize a transform pipeline and save it to either a file specified by a path or a file-like object in either JSON or YAML format.

Parameters:

Name Type Description transform Serializable

The transform pipeline to serialize.

filepath_or_buffer Union[str, Path, TextIO]

The file path or file-like object to write the serialized data to. If a string is provided, it is interpreted as a path to a file. If a file-like object is provided, the serialized data will be written to it directly.

data_format str

The format to serialize the data in. Valid options are 'json' and 'yaml'. Defaults to 'json'.

on_not_implemented_error str

Determines the behavior if a transform does not implement the to_dict method. If set to 'raise', a NotImplementedError is raised. If set to 'warn', the exception is ignored, and no transform arguments are saved. Defaults to 'raise'.

Exceptions:

Type Description ValueError

If data_format is 'yaml' but PyYAML is not installed.
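
Example (a minimal sketch; assumes save is re-exported at the package level and the file path is illustrative):

Python
>>> import albumentations as A\n>>> transform = A.Compose([A.RandomBrightnessContrast(p=0.5)])\n>>> A.save(transform, '/tmp/transform.json', data_format='json')\n>>> # A.save(transform, '/tmp/transform.yaml', data_format='yaml') additionally requires PyYAML\n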

Source code in albumentations/core/serialization.py Python
def save(\n    transform: Serializable,\n    filepath_or_buffer: str | Path | TextIO,\n    data_format: str = \"json\",\n    on_not_implemented_error: str = \"raise\",\n) -> None:\n    \"\"\"Serialize a transform pipeline and save it to either a file specified by a path or a file-like object\n    in either JSON or YAML format.\n\n    Args:\n        transform (Serializable): The transform pipeline to serialize.\n        filepath_or_buffer (Union[str, Path, TextIO]): The file path or file-like object to write the serialized\n            data to.\n            If a string is provided, it is interpreted as a path to a file. If a file-like object is provided,\n            the serialized data will be written to it directly.\n        data_format (str): The format to serialize the data in. Valid options are 'json' and 'yaml'.\n            Defaults to 'json'.\n        on_not_implemented_error (str): Determines the behavior if a transform does not implement the `to_dict` method.\n            If set to 'raise', a `NotImplementedError` is raised. If set to 'warn', the exception is ignored, and\n            no transform arguments are saved. Defaults to 'raise'.\n\n    Raises:\n        ValueError: If `data_format` is 'yaml' but PyYAML is not installed.\n\n    \"\"\"\n    check_data_format(data_format)\n    transform_dict = transform.to_dict(on_not_implemented_error=on_not_implemented_error)\n    transform_dict = serialize_enum(transform_dict)\n\n    # Determine whether to write to a file or a file-like object\n    if isinstance(filepath_or_buffer, (str, Path)):  # It's a filepath\n        with open(filepath_or_buffer, \"w\") as f:\n            if data_format == \"yaml\":\n                if not yaml_available:\n                    msg = \"You need to install PyYAML to save a pipeline in YAML format\"\n                    raise ValueError(msg)\n                yaml.safe_dump(transform_dict, f, default_flow_style=False)\n            elif data_format == \"json\":\n                json.dump(transform_dict, f)\n    elif data_format == \"yaml\":\n        if not yaml_available:\n            msg = \"You need to install PyYAML to save a pipeline in YAML format\"\n            raise ValueError(msg)\n        yaml.safe_dump(transform_dict, filepath_or_buffer, default_flow_style=False)\n    elif data_format == \"json\":\n        json.dump(transform_dict, filepath_or_buffer, indent=2)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.serialize_enum","title":"def serialize_enum (obj) [view source on GitHub]","text":"

Recursively search for Enum objects and convert them to their value. Also handle any Mapping or Sequence types.

Source code in albumentations/core/serialization.py Python
def serialize_enum(obj: Any) -> Any:\n    \"\"\"Recursively search for Enum objects and convert them to their value.\n    Also handle any Mapping or Sequence types.\n    \"\"\"\n    if isinstance(obj, Mapping):\n        return {k: serialize_enum(v) for k, v in obj.items()}\n    if isinstance(obj, Sequence) and not isinstance(obj, str):  # exclude strings since they're also sequences\n        return [serialize_enum(v) for v in obj]\n    return obj.value if isinstance(obj, Enum) else obj\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.to_dict","title":"def to_dict (transform, on_not_implemented_error='raise') [view source on GitHub]","text":"

Take a transform pipeline and convert it to a serializable representation that uses only standard python data types: dictionaries, lists, strings, integers, and floats.

Parameters:

Name Type Description transform Serializable

A transform that should be serialized. If the transform doesn't implement the to_dict method and on_not_implemented_error is set to 'raise', then NotImplementedError is raised. If on_not_implemented_error is set to 'warn', the NotImplementedError is ignored, but no transform parameters are serialized.

on_not_implemented_error str

raise or warn.
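
Example (a minimal sketch; assumes to_dict is re-exported at the package level):

Python
>>> import albumentations as A\n>>> transform = A.Compose([A.HorizontalFlip(p=0.5)])\n>>> serialized = A.to_dict(transform)  # {'__version__': ..., 'transform': {...}}\n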

Source code in albumentations/core/serialization.py Python
def to_dict(transform: Serializable, on_not_implemented_error: str = \"raise\") -> dict[str, Any]:\n    \"\"\"Take a transform pipeline and convert it to a serializable representation that uses only standard\n    python data types: dictionaries, lists, strings, integers, and floats.\n\n    Args:\n        transform: A transform that should be serialized. If the transform doesn't implement the `to_dict`\n            method and `on_not_implemented_error` equals to 'raise' then `NotImplementedError` is raised.\n            If `on_not_implemented_error` equals to 'warn' then `NotImplementedError` will be ignored\n            but no transform parameters will be serialized.\n        on_not_implemented_error (str): `raise` or `warn`.\n\n    \"\"\"\n    return transform.to_dict(on_not_implemented_error)\n
"},{"location":"api_reference/core/transforms_interface/","title":"Transforms Interface (core.transforms_interface)","text":""},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.BaseTransformInitSchema","title":"class BaseTransformInitSchema ","text":"

Source code in albumentations/core/transforms_interface.py Python
class BaseTransformInitSchema(BaseModel):\n    model_config = ConfigDict(arbitrary_types_allowed=True)\n    always_apply: bool | None\n    p: ProbabilityType\n
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.BasicTransform","title":"class BasicTransform (p=0.5, always_apply=None) [view source on GitHub]","text":"

Source code in albumentations/core/transforms_interface.py Python
class BasicTransform(Serializable, metaclass=CombinedMeta):\n    _targets: tuple[Targets, ...] | Targets  # targets that this transform can work on\n    _available_keys: set[str]  # targets that this transform, as string, lower-cased\n    _key2func: dict[\n        str,\n        Callable[..., Any],\n    ]  # mapping for targets (plus additional targets) and methods for which they depend\n    call_backup = None\n    interpolation: int\n    fill: DropoutFillValue\n    fill_mask: ColorType | None\n    # replay mode params\n    deterministic: bool = False\n    save_key = \"replay\"\n    replay_mode = False\n    applied_in_replay = False\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(self, p: float = 0.5, always_apply: bool | None = None):\n        self.p = p\n        if always_apply is not None:\n            if always_apply:\n                warn(\n                    \"always_apply is deprecated. Use `p=1` if you want to always apply the transform.\"\n                    \" self.p will be set to 1.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n                self.p = 1.0\n            else:\n                warn(\n                    \"always_apply is deprecated.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n        self._additional_targets: dict[str, str] = {}\n        # replay mode params\n        self.params: dict[Any, Any] = {}\n        self._key2func = {}\n        self._set_keys()\n        self.processors: dict[str, BboxProcessor | KeypointsProcessor] = {}\n        self.seed: int | None = None\n        self.random_generator = np.random.default_rng(self.seed)\n        self.py_random = random.Random(self.seed)\n\n    def set_random_state(\n        self,\n        random_generator: np.random.Generator,\n        py_random: random.Random,\n    ) -> None:\n        \"\"\"Set random state directly from generators.\n\n        Args:\n            random_generator: numpy random generator to use\n            py_random: python random generator to use\n        \"\"\"\n        self.random_generator = random_generator\n        self.py_random = py_random\n\n    def set_random_seed(self, seed: int | None) -> None:\n        \"\"\"Set random state from seed.\n\n        Args:\n            seed: Random seed to use\n        \"\"\"\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        d = self.to_dict_private()\n        d[\"id\"] = id(self)\n        return d\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        \"\"\"Returns names of arguments that are used in __init__ method of the transform.\"\"\"\n        msg = (\n            f\"Class {self.get_class_fullname()} is not serializable because the `get_transform_init_args_names` \"\n            \"method is not implemented\"\n        )\n        raise NotImplementedError(msg)\n\n    def set_processors(self, processors: dict[str, BboxProcessor | KeypointsProcessor]) -> None:\n        self.processors = processors\n\n    def get_processor(self, key: str) -> BboxProcessor | KeypointsProcessor | None:\n        return self.processors.get(key)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **kwargs: Any) -> Any:\n        if args:\n            msg = \"You have to pass data to augmentations as named arguments, for example: aug(image=image)\"\n            
raise KeyError(msg)\n        if self.replay_mode:\n            if self.applied_in_replay:\n                return self.apply_with_params(self.params, **kwargs)\n            return kwargs\n\n        # Reset params at the start of each call\n        self.params = {}\n\n        if self.should_apply(force_apply=force_apply):\n            params = self.get_params()\n            params = self.update_params_shape(params=params, data=kwargs)\n\n            if self.targets_as_params:  # check if all required targets are in kwargs.\n                missing_keys = set(self.targets_as_params).difference(kwargs.keys())\n                if missing_keys and not (missing_keys == {\"image\"} and \"images\" in kwargs):\n                    msg = f\"{self.__class__.__name__} requires {self.targets_as_params} missing keys: {missing_keys}\"\n                    raise ValueError(msg)\n\n            params_dependent_on_data = self.get_params_dependent_on_data(params=params, data=kwargs)\n            params.update(params_dependent_on_data)\n\n            if self.targets_as_params:  # this block will be removed after removing `get_params_dependent_on_targets`\n                targets_as_params = {k: kwargs.get(k) for k in self.targets_as_params}\n                if missing_keys:  # here we expecting case when missing_keys == {\"image\"} and \"images\" in kwargs\n                    targets_as_params[\"image\"] = kwargs[\"images\"][0]\n                params_dependent_on_targets = self.get_params_dependent_on_targets(targets_as_params)\n                params.update(params_dependent_on_targets)\n\n            # Store the final params\n            self.params = params\n\n            if self.deterministic:\n                kwargs[self.save_key][id(self)] = deepcopy(params)\n            return self.apply_with_params(params, **kwargs)\n\n        return kwargs\n\n    def get_applied_params(self) -> dict[str, Any]:\n        \"\"\"Returns the parameters that were used in the last transform application.\n        Returns empty dict if transform was not applied.\n        \"\"\"\n        return self.params\n\n    def should_apply(self, force_apply: bool = False) -> bool:\n        if self.p <= 0.0:\n            return False\n        if self.p >= 1.0 or force_apply:\n            return True\n        return self.py_random.random() < self.p\n\n    def apply_with_params(self, params: dict[str, Any], *args: Any, **kwargs: Any) -> dict[str, Any]:\n        \"\"\"Apply transforms with parameters.\"\"\"\n        params = self.update_params(params, **kwargs)  # remove after move parameters like interpolation\n        res = {}\n        for key, arg in kwargs.items():\n            if key in self._key2func and arg is not None:\n                target_function = self._key2func[key]\n                res[key] = ensure_contiguous_output(\n                    target_function(ensure_contiguous_output(arg), **params),\n                )\n            else:\n                res[key] = arg\n        return res\n\n    def set_deterministic(self, flag: bool, save_key: str = \"replay\") -> BasicTransform:\n        \"\"\"Set transform to be deterministic.\"\"\"\n        if save_key == \"params\":\n            msg = \"params save_key is reserved\"\n            raise KeyError(msg)\n\n        self.deterministic = flag\n        if self.deterministic and self.targets_as_params:\n            warn(\n                self.get_class_fullname() + \" could work incorrectly in ReplayMode for other input data\"\n                \" because its' params depend on 
targets.\",\n                stacklevel=2,\n            )\n        self.save_key = save_key\n        return self\n\n    def __repr__(self) -> str:\n        state = self.get_base_init_args()\n        state.update(self.get_transform_init_args())\n        return f\"{self.__class__.__name__}({format_args(state)})\"\n\n    def apply(self, img: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform on image.\"\"\"\n        raise NotImplementedError\n\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform on images.\n\n        Args:\n            images: Input images as numpy array of shape:\n                - (num_images, height, width, channels)\n                - (num_images, height, width) for grayscale\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed images as numpy array in the same format as input\n        \"\"\"\n        # Handle batched numpy array input\n        transformed = np.stack([self.apply(image, **params) for image in images])\n        return np.require(transformed, requirements=[\"C_CONTIGUOUS\"])\n\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform slice by slice to a volume.\n\n        Args:\n            volume: Input volume of shape (depth, height, width) or (depth, height, width, channels)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed volume as numpy array in the same format as input\n        \"\"\"\n        return self.apply_to_images(volume, *args, **params)\n\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to multiple volumes.\"\"\"\n        return np.stack([self.apply_to_volume(vol, *args, **params) for vol in volumes])\n\n    def get_params(self) -> dict[str, Any]:\n        \"\"\"Returns parameters independent of input.\"\"\"\n        return {}\n\n    def update_params_shape(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Updates parameters with input shape.\"\"\"\n        # Extract shape from volume, volumes, image, or images\n        if \"volume\" in data:\n            shape = data[\"volume\"][0].shape  # Take first slice of volume\n        elif \"volumes\" in data:\n            shape = data[\"volumes\"][0][0].shape  # Take first slice of first volume\n        elif \"image\" in data:\n            shape = data[\"image\"].shape\n        else:\n            shape = data[\"images\"][0].shape\n\n        # For volumes/images, shape will be either (H, W) or (H, W, C)\n        params[\"shape\"] = shape\n        return params\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Returns parameters dependent on input.\"\"\"\n        return params\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        # mapping for targets and methods for which they depend\n        # for example:\n        # >>  {\"image\": self.apply}\n        # >>  {\"masks\": self.apply_to_masks}\n        raise NotImplementedError\n\n    def _set_keys(self) -> None:\n        \"\"\"Set _available_keys.\"\"\"\n        if not hasattr(self, \"_targets\"):\n            self._available_keys = set()\n 
       else:\n            self._available_keys = {\n                target.value.lower()\n                for target in (self._targets if isinstance(self._targets, tuple) else [self._targets])\n            }\n        self._available_keys.update(self.targets.keys())\n        self._key2func = {key: self.targets[key] for key in self._available_keys if key in self.targets}\n\n    @property\n    def available_keys(self) -> set[str]:\n        \"\"\"Returns set of available keys.\"\"\"\n        return self._available_keys\n\n    def update_params(self, params: dict[str, Any], **kwargs: Any) -> dict[str, Any]:\n        \"\"\"Update parameters with transform specific params.\n        This method is deprecated, use:\n        - `get_params` for transform specific params like interpolation and\n        - `update_params_shape` for data like shape.\n        \"\"\"\n        if hasattr(self, \"interpolation\"):\n            params[\"interpolation\"] = self.interpolation\n        if hasattr(self, \"fill\"):\n            params[\"fill\"] = self.fill\n        if hasattr(self, \"fill_mask\"):\n            params[\"fill_mask\"] = self.fill_mask\n\n        # Use update_params_shape to get shape consistently\n        return self.update_params_shape(params, kwargs)\n\n    def add_targets(self, additional_targets: dict[str, str]) -> None:\n        \"\"\"Add targets to transform them the same way as one of existing targets.\n        ex: {'target_image': 'image'}\n        ex: {'obj1_mask': 'mask', 'obj2_mask': 'mask'}\n        by the way you must have at least one object with key 'image'\n\n        Args:\n            additional_targets (dict): keys - new target name, values - old target name. ex: {'image2': 'image'}\n\n        \"\"\"\n        for k, v in additional_targets.items():\n            if k in self._additional_targets and v != self._additional_targets[k]:\n                raise ValueError(\n                    f\"Trying to overwrite existed additional targets. 
\"\n                    f\"Key={k} Exists={self._additional_targets[k]} New value: {v}\",\n                )\n            if v in self._available_keys:\n                self._additional_targets[k] = v\n                self._key2func[k] = self.targets[v]\n                self._available_keys.add(k)\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        \"\"\"Targets used to get params dependent on targets.\n        This is used to check input has all required targets.\n        \"\"\"\n        return []\n\n    def get_params_dependent_on_targets(self, params: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"This method is deprecated.\n        Use `get_params_dependent_on_data` instead.\n        Returns parameters dependent on targets.\n        Dependent target is defined in `self.targets_as_params`\n        \"\"\"\n        return {}\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    def get_base_init_args(self) -> dict[str, Any]:\n        \"\"\"Returns base init args - p\"\"\"\n        return {\"p\": self.p}\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        \"\"\"Exclude seed from init args during serialization\"\"\"\n        args = {k: getattr(self, k) for k in self.get_transform_init_args_names()}\n        args.pop(\"seed\", None)  # Remove seed from args\n        return args\n\n    def to_dict_private(self) -> dict[str, Any]:\n        state = {\"__class_fullname__\": self.get_class_fullname()}\n        state.update(self.get_base_init_args())\n        state.update(self.get_transform_init_args())\n        return state\n
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.DualTransform","title":"class DualTransform [view source on GitHub]","text":"

A base class for transformations that should be applied both to an image and its corresponding properties such as masks, bounding boxes, and keypoints. This class ensures that when a transform is applied to an image, all associated entities are transformed accordingly to maintain consistency between the image and its annotations.

Methods

apply(img: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to the image.

img: Input image of shape (H, W, C) or (H, W) for grayscale.\n**params: Additional parameters specific to the transform.\n\nReturns Transformed image of the same shape as input.\n

apply_to_images(images: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple images.

images: Input images of shape (N, H, W, C) or (N, H, W) for grayscale.\n**params: Additional parameters specific to the transform.\n\nReturns Transformed images in the same format as input.\n

apply_to_mask(mask: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a mask.

mask: Input mask of shape (H, W), (H, W, C) for multi-channel masks\n**params: Additional parameters specific to the transform.\n\nReturns Transformed mask in the same format as input.\n

apply_to_masks(masks: np.ndarray, **params: Any) -> np.ndarray | list[np.ndarray]: Apply the transform to multiple masks.

masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n**params: Additional parameters specific to the transform.\nReturns Transformed masks in the same format as input.\n

apply_to_keypoints(keypoints: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to keypoints.

!!! keypoints \"Array of shape (N, 2+) where N is the number of keypoints.\"\n    **params: Additional parameters specific to the transform.\nReturns Transformed keypoints array of shape (N, 2+).\n

apply_to_bboxes(bboxes: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to bounding boxes.

!!! bboxes \"Array of shape (N, 4+) where N is the number of bounding boxes,\"\n        and each row is in the format [x_min, y_min, x_max, y_max].\n**params: Additional parameters specific to the transform.\n\nReturns Transformed bounding boxes array of shape (N, 4+).\n

apply_to_volume(volume: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a volume.

volume: Input volume of shape (D, H, W) or (D, H, W, C).\n**params: Additional parameters specific to the transform.\n\nReturns Transformed volume of the same shape as input.\n

apply_to_volumes(volumes: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple volumes.

volumes: Input volumes of shape (N, D, H, W) or (N, D, H, W, C).\n**params: Additional parameters specific to the transform.\n\nReturns Transformed volumes in the same format as input.\n

apply_to_mask3d(mask: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a 3D mask.

mask: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n**params: Additional parameters specific to the transform.\n\nReturns Transformed 3D mask in the same format as input.\n

apply_to_masks3d(masks: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple 3D masks.

masks: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n**params: Additional parameters specific to the transform.\n\nReturns Transformed 3D masks in the same format as input.\n

Note

  • All apply_* methods should maintain the input shape and format of the data.
  • When applying transforms to masks, ensure that discrete values (e.g., class labels) are preserved.
  • For keypoints and bounding boxes, the transformation should maintain their relative positions with respect to the transformed image.
  • The difference between apply_to_mask and apply_to_masks is mainly in how they handle 3D arrays: apply_to_mask treats a 3D array as a multi-channel mask, while apply_to_masks treats it as multiple single-channel masks.

Source code in albumentations/core/transforms_interface.py Python
class DualTransform(BasicTransform):\n    \"\"\"A base class for transformations that should be applied both to an image and its corresponding properties\n    such as masks, bounding boxes, and keypoints. This class ensures that when a transform is applied to an image,\n    all associated entities are transformed accordingly to maintain consistency between the image and its annotations.\n\n    Methods:\n        apply(img: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to the image.\n\n            img: Input image of shape (H, W, C) or (H, W) for grayscale.\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed image of the same shape as input.\n\n        apply_to_images(images: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple images.\n\n            images: Input images of shape (N, H, W, C) or (N, H, W) for grayscale.\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed images in the same format as input.\n\n        apply_to_mask(mask: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a mask.\n\n            mask: Input mask of shape (H, W), (H, W, C) for multi-channel masks\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed mask in the same format as input.\n\n        apply_to_masks(masks: np.ndarray, **params: Any) -> np.ndarray | list[np.ndarray]:\n            Apply the transform to multiple masks.\n\n            masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n            **params: Additional parameters specific to the transform.\n            Returns Transformed masks in the same format as input.\n\n        apply_to_keypoints(keypoints: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to keypoints.\n\n            keypoints: Array of shape (N, 2+) where N is the number of keypoints.\n                **params: Additional parameters specific to the transform.\n            Returns Transformed keypoints array of shape (N, 2+).\n\n        apply_to_bboxes(bboxes: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to bounding boxes.\n\n            bboxes: Array of shape (N, 4+) where N is the number of bounding boxes,\n                    and each row is in the format [x_min, y_min, x_max, y_max].\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed bounding boxes array of shape (N, 4+).\n\n        apply_to_volume(volume: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a volume.\n\n            volume: Input volume of shape (D, H, W) or (D, H, W, C).\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed volume of the same shape as input.\n\n        apply_to_volumes(volumes: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple volumes.\n\n            volumes: Input volumes of shape (N, D, H, W) or (N, D, H, W, C).\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed volumes in the same format as input.\n\n        apply_to_mask3d(mask: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a 3D mask.\n\n            mask: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n            **params: Additional parameters specific to 
the transform.\n\n            Returns Transformed 3D mask in the same format as input.\n\n        apply_to_masks3d(masks: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple 3D masks.\n\n            masks: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed 3D masks in the same format as input.\n\n    Note:\n        - All `apply_*` methods should maintain the input shape and format of the data.\n        - When applying transforms to masks, ensure that discrete values (e.g., class labels) are preserved.\n        - For keypoints and bounding boxes, the transformation should maintain their relative positions\n            with respect to the transformed image.\n        - The difference between `apply_to_mask` and `apply_to_masks` is mainly in how they handle 3D arrays:\n            `apply_to_mask` treats a 3D array as a multi-channel mask, while `apply_to_masks` treats it as\n            multiple single-channel masks.\n\n    \"\"\"\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"mask\": self.apply_to_mask,\n            \"masks\": self.apply_to_masks,\n            \"mask3d\": self.apply_to_mask3d,\n            \"masks3d\": self.apply_to_masks3d,\n            \"bboxes\": self.apply_to_bboxes,\n            \"keypoints\": self.apply_to_keypoints,\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n        }\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        msg = f\"Method apply_to_keypoints is not implemented in class {self.__class__.__name__}\"\n        raise NotImplementedError(msg)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        raise NotImplementedError(f\"BBoxes not implemented for {self.__class__.__name__}\")\n\n    def apply_to_mask(self, mask: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(mask, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_masks(self, masks: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to multiple masks.\n\n        Args:\n            masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Array of transformed masks with same shape as input\n        \"\"\"\n        return self.apply(masks, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D mask.\n\n        Args:\n            mask3d: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed 3D mask in the same format as input\n        \"\"\"\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, 
**params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D masks.\n\n        Args:\n            masks3d: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed 3D masks in the same format as input\n        \"\"\"\n        return self.apply_to_mask(masks3d, *args, **params)\n
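For example, any built-in DualTransform keeps the image and its annotations consistent, and additional targets can be routed through the same target functions (a sketch, using HorizontalFlip):

Python
import numpy as np
import albumentations as A

image = np.zeros((64, 64, 3), dtype=np.uint8)
mask = np.zeros((64, 64), dtype=np.uint8)

pipeline = A.Compose(
    [A.HorizontalFlip(p=1.0)],
    additional_targets={"image2": "image"},  # process a second image exactly like "image"
)

out = pipeline(image=image, mask=mask, image2=image.copy())
# out["image"], out["mask"], and out["image2"] are all flipped consistently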
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.ImageOnlyTransform","title":"class ImageOnlyTransform [view source on GitHub]","text":"

Transform applied to image only.

Source code in albumentations/core/transforms_interface.py Python
class ImageOnlyTransform(BasicTransform):\n    \"\"\"Transform applied to image only.\"\"\"\n\n    _targets = (Targets.IMAGE, Targets.VOLUME)\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n        }\n
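A hedged sketch of a custom image-only transform (the class name and behavior are invented for illustration):

Python
import numpy as np

from albumentations.core.transforms_interface import ImageOnlyTransform


class InvertColors(ImageOnlyTransform):  # hypothetical transform, not part of the library
    """Invert pixel values of uint8 images; masks, bboxes, and keypoints are left untouched."""

    def __init__(self, p: float = 0.5):
        super().__init__(p=p)

    def apply(self, img: np.ndarray, **params) -> np.ndarray:
        return 255 - img

    def get_transform_init_args_names(self) -> tuple[str, ...]:
        return ()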
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.NoOp","title":"class NoOp [view source on GitHub]","text":"

Identity transform (does nothing).

Targets

image, mask, bboxes, keypoints, volume, mask3d

Source code in albumentations/core/transforms_interface.py Python
class NoOp(DualTransform):\n    \"\"\"Identity transform (does nothing).\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return keypoints\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return bboxes\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return img\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return mask\n\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> np.ndarray:\n        return volume\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> np.ndarray:\n        return mask3d\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
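NoOp is handy as an explicit "do nothing" branch, for example inside OneOf (a sketch):

Python
import albumentations as A

# Roughly half of the time flip the image, otherwise explicitly do nothing.
pipeline = A.Compose([
    A.OneOf([A.HorizontalFlip(p=1.0), A.NoOp()], p=1.0),
])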
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.Transform3D","title":"class Transform3D [view source on GitHub]","text":"

Base class for all 3D transforms.

Transform3D inherits from DualTransform because 3D transforms can be applied to both volumes and masks, similar to how 2D DualTransforms work with images and masks.

Targets

volume: 3D numpy array of shape (D, H, W) or (D, H, W, C) volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C) mask: 3D numpy array of shape (D, H, W) masks: Batch of 3D arrays of shape (N, D, H, W)

Source code in albumentations/core/transforms_interface.py Python
class Transform3D(DualTransform):\n    \"\"\"Base class for all 3D transforms.\n\n    Transform3D inherits from DualTransform because 3D transforms can be applied to both\n    volumes and masks, similar to how 2D DualTransforms work with images and masks.\n\n    Targets:\n        volume: 3D numpy array of shape (D, H, W) or (D, H, W, C)\n        volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C)\n        mask: 3D numpy array of shape (D, H, W)\n        masks: Batch of 3D arrays of shape (N, D, H, W)\n    \"\"\"\n\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D volume.\"\"\"\n        raise NotImplementedError\n\n    @batch_transform(\"spatial\", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D volumes.\"\"\"\n        return self.apply_to_volume(volumes, *args, **params)\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D mask.\"\"\"\n        return self.apply_to_volume(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D masks.\"\"\"\n        return self.apply_to_mask3d(masks3d, *args, **params)\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        \"\"\"Define valid targets for 3D transforms.\"\"\"\n        return {\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n            \"mask3d\": self.apply_to_mask3d,\n            \"masks3d\": self.apply_to_masks3d,\n        }\n
"},{"location":"api_reference/pytorch/","title":"Index","text":"
  • Transforms (albumentations.pytorch.transforms)
"},{"location":"api_reference/pytorch/transforms/","title":"Transforms (pytorch.transforms)","text":""},{"location":"api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensor3D","title":"class ToTensor3D (p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert 3D volumes and masks to PyTorch tensors.

This transform is designed for 3D medical imaging data. It converts numpy arrays to PyTorch tensors and ensures consistent channel positioning.

For all inputs (volumes and masks):

  • Input: (D, H, W, C) or (D, H, W), i.e. depth, height, width, [channels]
  • Output: (C, D, H, W), the channels-first format expected by PyTorch; for single-channel input, a C=1 dimension is added

Note

This transform always moves channels to the first position, as this is the standard PyTorch format. For masks that need to stay in DHWC format, use a different transform or handle the transposition after this transform.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 1.0

Source code in albumentations/pytorch/transforms.py Python
class ToTensor3D(BasicTransform):\n    \"\"\"Convert 3D volumes and masks to PyTorch tensors.\n\n    This transform is designed for 3D medical imaging data. It converts numpy arrays\n    to PyTorch tensors and ensures consistent channel positioning.\n\n    For all inputs (volumes and masks):\n        - Input:  (D, H, W, C) or (D, H, W) - depth, height, width, [channels]\n        - Output: (C, D, H, W) - channels first format for PyTorch\n                 For single-channel input, adds C=1 dimension\n\n    Note:\n        This transform always moves channels to first position as this is\n        the standard PyTorch format. For masks that need to stay in DHWC format,\n        use a different transform or handle the transposition after this transform.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    def __init__(self, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p, always_apply=always_apply)\n\n    @property\n    def targets(self) -> dict[str, Any]:\n        return {\n            \"volume\": self.apply_to_volume,\n            \"mask3d\": self.apply_to_mask3d,\n        }\n\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert 3D volume to channels-first tensor.\"\"\"\n        if volume.ndim == NUM_VOLUME_DIMENSIONS:  # D,H,W,C\n            return torch.from_numpy(volume.transpose(3, 0, 1, 2))\n        if volume.ndim == NUM_VOLUME_DIMENSIONS - 1:  # D,H,W\n            return torch.from_numpy(volume[np.newaxis, ...])\n        raise ValueError(f\"Expected 3D or 4D array (D,H,W) or (D,H,W,C), got {volume.ndim}D array\")\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert 3D mask to channels-first tensor.\"\"\"\n        return self.apply_to_volume(mask3d, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
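A small usage sketch (shapes follow the description above):

Python
import numpy as np

from albumentations.pytorch.transforms import ToTensor3D

volume = np.random.rand(16, 64, 64, 3).astype(np.float32)  # (D, H, W, C)
mask3d = np.zeros((16, 64, 64), dtype=np.uint8)             # (D, H, W)

t = ToTensor3D(p=1.0)
out = t(volume=volume, mask3d=mask3d)
# out["volume"].shape == (3, 16, 64, 64)  -> channels moved to the front
# out["mask3d"].shape == (1, 16, 64, 64)  -> C=1 dimension added for single-channel input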
"},{"location":"api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensorV2","title":"class ToTensorV2 (transpose_mask=False, p=1.0, always_apply=None) [view source on GitHub]","text":"

Converts images/masks to PyTorch Tensors, inheriting from BasicTransform. For images:

  • If the input is in HWC format, it is converted to the PyTorch CHW format
  • If the input is in HW format, it is converted to the PyTorch 1HW format (a channel dimension is added)

Attributes:

Name Type Description transpose_mask bool

If True, transposes 3D input mask dimensions from [height, width, num_channels] to [num_channels, height, width].

p float

Probability of applying the transform. Default: 1.0.

Source code in albumentations/pytorch/transforms.py Python
class ToTensorV2(BasicTransform):\n    \"\"\"Converts images/masks to PyTorch Tensors, inheriting from BasicTransform.\n    For images:\n        - If input is in `HWC` format, converts to PyTorch `CHW` format\n        - If input is in `HW` format, converts to PyTorch `1HW` format (adds channel dimension)\n\n    Attributes:\n        transpose_mask (bool): If True, transposes 3D input mask dimensions from `[height, width, num_channels]` to\n            `[num_channels, height, width]`.\n        p (float): Probability of applying the transform. Default: 1.0.\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    def __init__(self, transpose_mask: bool = False, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p, always_apply=always_apply)\n        self.transpose_mask = transpose_mask\n\n    @property\n    def targets(self) -> dict[str, Any]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"mask\": self.apply_to_mask,\n            \"masks\": self.apply_to_masks,\n        }\n\n    def apply(self, img: np.ndarray, **params: Any) -> torch.Tensor:\n        if img.ndim not in {MONO_CHANNEL_DIMENSIONS, NUM_MULTI_CHANNEL_DIMENSIONS}:\n            msg = \"Albumentations only supports images in HW or HWC format\"\n            raise ValueError(msg)\n\n        if img.ndim == MONO_CHANNEL_DIMENSIONS:\n            img = np.expand_dims(img, 2)\n\n        return torch.from_numpy(img.transpose(2, 0, 1))\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> torch.Tensor:\n        if self.transpose_mask and mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n            mask = mask.transpose(2, 0, 1)\n        return torch.from_numpy(mask)\n\n    @overload\n    def apply_to_masks(self, masks: list[np.ndarray], **params: Any) -> list[torch.Tensor]: ...\n\n    @overload\n    def apply_to_masks(self, masks: np.ndarray, **params: Any) -> torch.Tensor: ...\n\n    def apply_to_masks(self, masks: np.ndarray | list[np.ndarray], **params: Any) -> torch.Tensor | list[torch.Tensor]:\n        \"\"\"Convert numpy array or list of numpy array masks to torch tensor(s).\n\n        Args:\n            masks: Numpy array of shape (N, H, W) or (N, H, W, C),\n                or a list of numpy arrays with shape (H, W) or (H, W, C).\n            params: Additional parameters.\n\n        Returns:\n            If transpose_mask is True and input is (N, H, W, C), returns tensor of shape (N, C, H, W).\n            If transpose_mask is True and input is (H, W, C), returns a list of tensors with shape (C, H, W).\n            Otherwise, returns tensors with the same shape as input.\n        \"\"\"\n        if isinstance(masks, list):\n            return [self.apply_to_mask(mask, **params) for mask in masks]\n\n        if self.transpose_mask and masks.ndim == NUM_VOLUME_DIMENSIONS:  # (N, H, W, C)\n            masks = np.transpose(masks, (0, 3, 1, 2))  # -> (N, C, H, W)\n        return torch.from_numpy(masks)\n\n    def apply_to_images(self, images: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert batch of images from (N, H, W, C) to (N, C, H, W).\"\"\"\n        if images.ndim != NUM_VOLUME_DIMENSIONS:  # N,H,W,C\n            raise ValueError(f\"Expected 4D array (N,H,W,C), got {images.ndim}D array\")\n        return torch.from_numpy(images.transpose(0, 3, 1, 2))  # -> (N,C,H,W)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"transpose_mask\",)\n
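ToTensorV2 is typically placed at the very end of a pipeline, after normalization (a sketch):

Python
import numpy as np
import albumentations as A
from albumentations.pytorch import ToTensorV2

image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
mask = np.zeros((224, 224, 2), dtype=np.uint8)  # multi-channel mask in HWC layout

pipeline = A.Compose([
    A.Normalize(),                    # scale and normalize the image first
    ToTensorV2(transpose_mask=True),  # HWC -> CHW for both the image and the mask
])

out = pipeline(image=image, mask=mask)
# out["image"].shape == torch.Size([3, 224, 224])
# out["mask"].shape == torch.Size([2, 224, 224])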
"},{"location":"autoalbument/","title":"AutoAlbument Overview","text":"

AutoAlbument is an AutoML tool that learns image augmentation policies from data using the Faster AutoAugment algorithm. It relieves the user from manually selecting augmentations and tuning their parameters. AutoAlbument provides a complete ready-to-use configuration for an augmentation pipeline.

AutoAlbument supports image classification and semantic segmentation tasks. The library requires Python 3.6 or higher.

The source code and issue tracker are available at https://github.com/albumentations-team/autoalbument

Table of contents:

  • AutoAlbument introduction and core concepts
  • Installation
  • Benchmarks and a comparison with baseline augmentation strategies
  • How to use AutoAlbument
  • How to use an AutoAlbument Docker image
  • How to use a custom classification or semantic segmentation model
  • Metrics and their meaning
  • Tuning parameters
  • Examples
  • Search algorithms
  • FAQ
"},{"location":"autoalbument/benchmarks/","title":"Benchmarks and a comparison with baseline augmentation strategies","text":"

Here is a comparison between a baseline augmentation strategy and an augmentation policy discovered by AutoAlbument for different classification and semantic segmentation tasks. You can read more about these benchmarks in the autoalbument-benchmarks repository.

"},{"location":"autoalbument/benchmarks/#classification","title":"Classification","text":"Dataset Baseline Top-1 Accuracy AutoAlbument Top-1 Accuracy CIFAR10 91.79 96.02 SVHN 98.31 98.48 ImageNet 73.27 75.17"},{"location":"autoalbument/benchmarks/#semantic-segmentation","title":"Semantic segmentation","text":"Dataset Baseline mIOU AutoAlbument mIOU Pascal VOC 73.34 75.55 Cityscapes 79.47 79.92"},{"location":"autoalbument/custom_model/","title":"How to use a custom classification or semantic segmentation model","text":"

By default, AutoAlbument uses pytorch-image-models for classification and segmentation_models.pytorch for semantic segmentation. You can use any model from these packages by providing an appropriate model name.

However, you can also use a custom model with AutoAlbument. To do so, you need to define a Discriminator model. This Discriminator model should have two outputs.

  • The first output should provide a prediction for a classification or semantic segmentation task. For classification, it should output a tensor with a shape [batch_size, num_classes] with logits. For semantic segmentation, it should output a tensor with the shape [batch_size, num_classes, height, width] with logits.

  • The second (auxiliary) output should return a tensor with the shape [batch_size] that contains logits for the Discriminator's predictions (whether the Discriminator thinks that an image was augmented or not).

To create such a model, you need to subclass the autoalbument.faster_autoaugment.models.BaseDiscriminator class and implement the forward method. This method should take a batch of images, that is, a tensor with the shape [batch_size, num_channels, height, width]. It should return a tuple that contains tensors from the two outputs described above.
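Below is a hedged sketch of what such a model could look like for classification. The backbone, layer sizes, and class name are invented for illustration, and it assumes BaseDiscriminator can be initialized without extra arguments; only the contract described above (a forward method that returns the task logits and the per-image discriminator logits) comes from the documentation.

Python
from typing import Tuple

import torch
from torch import nn

from autoalbument.faster_autoaugment.models import BaseDiscriminator


class MyClassificationModel(BaseDiscriminator):  # hypothetical custom model
    def __init__(self, num_classes: int = 10):
        super().__init__()  # assumption: BaseDiscriminator takes no required arguments
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(32, num_classes)  # main task output: [batch_size, num_classes]
        self.discriminator = nn.Linear(32, 1)         # auxiliary output: was the image augmented?

    def forward(self, x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        features = self.backbone(x)                       # [batch_size, 32]
        logits = self.classifier(features)                # [batch_size, num_classes]
        d_logits = self.discriminator(features).view(-1)  # [batch_size]
        return logits, d_logits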

As an example, take a look at how default classification and semantic segmentation models are defined in AutoAlbument - https://github.com/albumentations-team/autoalbument/blob/master/autoalbument/faster_autoaugment/models.py or explore an example of a custom model for the CIFAR10 dataset.

Next, you need to specify this custom model in config.yaml, an AutoAlbument config file. AutoAlbument uses the instantiate function from Hydra to instantiate an object. You need to set the _target_ config variable in the classification_model or semantic_segmentation_model section, depending on the task. In this config variable, you need to provide a path to a class with the model. This path should be importable from PYTHONPATH so that Hydra can use it correctly. The simplest way is to define your model in a file such as model.py and place this file in the same directory as dataset.py and search.yaml, because this directory is automatically added to PYTHONPATH. Then you can define _target_ as, for example, _target_: model.MyClassificationModel.

Take a look at the CIFAR10 example config that uses a custom model defined in model.py as a starting point for defining a custom model.

"},{"location":"autoalbument/docker/","title":"How to use an AutoAlbument Docker image","text":"

You can run AutoAlbument from a Docker image. The ghcr.io/albumentations-team/autoalbument:latest Docker image contains the latest release version of AutoAlbument.

You can also use an image that contains a specific version of AutoAlbument. In that case, you need to use the AutoAlbument version as a tag for a Docker image, e.g., the ghcr.io/albumentations-team/autoalbument:0.3.0 image contains AutoAlbument 0.3.0.

The latest AutoAlbument image is based on the pytorch/pytorch:1.7.0-cuda11.0-cudnn8-runtime image.

When you run a Docker container with AutoAlbument, you need to mount a config directory (a directory containing dataset.py and search.yaml files) and other required directories, such as a directory that contains training data.

Here is an example command that runs a Docker container that will search for CIFAR10 augmentation policies.

docker run -it --rm --gpus all --ipc=host -v ~/projects/autoalbument/examples/cifar10:/config -v ~/data:/home/autoalbument/data -u $(id -u ${USER}):$(id -g ${USER}) ghcr.io/albumentations-team/autoalbument:latest

Let's take a look at the arguments:

  • -it. Tells Docker to run an interactive process. Read more in the Docker documentation.
  • --rm. Automatically clean up a container when it exits. Read more in the Docker documentation.
  • --gpus all. Specify GPUs to use. Read more in the Docker documentation.
  • --ipc=host. Increase shared memory size for PyTorch DataLoader. Read more in the PyTorch documentation.
  • -v ~/projects/autoalbument/examples/cifar10:/config. Mounts the ~/projects/autoalbument/examples/cifar10 directory from the host into the /config directory in the container. This example assumes that you have the AutoAlbument repository in the ~/projects/autoalbument/ directory. Generally speaking, you need to mount a directory containing dataset.py and search.yaml into the /config directory in a container.
  • -v ~/data:/home/autoalbument/data. Mounts the host directory ~/data that contains the CIFAR10 dataset into the /home/autoalbument/data directory. You can mount a host directory with a dataset into any container directory, but you need to specify the config parameters accordingly. In this example, we mount the directory into /home/autoalbument/data because we set this directory (~/data/cifar10) in the config as a root directory for the dataset. Note that Docker doesn't support tilde expansion for the HOME directory, so we spell out the home directory explicitly as /home/autoalbument, because autoalbument is the default user inside the container.
  • -u $(id -u ${USER}):$(id -g ${USER}). This tells Docker to use the host's user ID to run code inside the container. We need it because AutoAlbument will produce artifacts in the config directory (such as augmentation configs and logs), and the host user (and not root, for example) should own those files so that you can access them afterward.
  • ghcr.io/albumentations-team/autoalbument:latest is the Docker image's name. latest is a tag for the latest stable release. Alternatively, you can use a tag that specifies an AutoAlbument version, e.g., ghcr.io/albumentations-team/autoalbument:0.3.0.
"},{"location":"autoalbument/faq/","title":"FAQ","text":""},{"location":"autoalbument/faq/#search-takes-a-lot-of-time-how-can-i-speed-it-up","title":"Search takes a lot of time. How can I speed it up?","text":"

Instead of a full training dataset, you can use a reduced version to search for augmentation policies. For example, the authors of Faster AutoAugment used 6000 images from the 120 selected classes to find augmentation policies for ImageNet (while the full dataset for ILSVRC contains 1.2 million images and 1000 classes).

"},{"location":"autoalbument/how_to_use/","title":"How to use AutoAlbument","text":"
  1. You need to create a configuration file with AutoAlbument parameters and a Python file that implements a custom PyTorch Dataset for your data. Next, you need to pass those files to AutoAlbument.
  2. AutoAlbument will use a Generative Adversarial Network to discover augmentation policies and then create a file containing those policies.
  3. Finally, you can use Albumentations to load augmentation policies from the file and utilize them in your computer vision pipeline.
"},{"location":"autoalbument/how_to_use/#step-1-create-a-configuration-file-and-a-custom-pytorch-dataset-for-your-data","title":"Step 1. Create a configuration file and a custom PyTorch Dataset for your data","text":""},{"location":"autoalbument/how_to_use/#a-create-a-directory-with-configuration-files","title":"a. Create a directory with configuration files","text":"

Run autoalbument-create --config-dir </path/to/directory> --task <deep_learning_task> --num-classes <num_classes>, e.g. autoalbument-create --config-dir ~/experiments/autoalbument-search-cifar10 --task classification --num-classes 10.

  • A value for the --config-dir option should contain a path to the directory. AutoAlbument will create this directory and put two files into it: dataset.py and search.yaml (more on them later).
  • A value for the --task option should contain the name of a deep learning task. Supported values are classification and semantic_segmentation.
  • A value for the --num-classes option should contain the number of distinct classes in the classification or segmentation dataset.

By default, AutoAlbument creates a search.yaml file that contains only the most important configuration parameters. To explore all available parameters, you can generate a config file that contains them all by providing the --generate-full-config argument, e.g. autoalbument-create --config-dir ~/experiments/autoalbument-search-cifar10 --task classification --num-classes 10 --generate-full-config

"},{"location":"autoalbument/how_to_use/#b-add-implementation-for-__len__-and-__getitem__-methods-in-datasetpy","title":"b. Add implementation for __len__ and __getitem__ methods in dataset.py","text":"

The dataset.py file created at step 1 by autoalbument-create contains stubs for implementing a PyTorch dataset (you can read more about creating custom PyTorch datasets here). You need to add implementation for __len__ and __getitem__ methods (and optionally add the initialization logic if required).

A dataset for a classification task should return an image and a class label. A dataset for a segmentation task should return an image and an associated mask.

"},{"location":"autoalbument/how_to_use/#c-optional-adjust-search-parameters-in-searchyaml","title":"c. [Optional] Adjust search parameters in search.yaml","text":"

You may want to change the parameters that AutoAlbument will use to search for augmentation policies. To do this, you need to edit the search.yaml file created by autoalbument-create at step 1. Each configuration parameter contains a comment that describes the meaning of the setting. Please refer to the \"Tuning the search parameters\" section that includes a description of the most critical parameters.

search.yaml is a Hydra config file. You can use all Hydra features inside it.

"},{"location":"autoalbument/how_to_use/#step-2-use-autoalbument-to-search-for-augmentation-policies","title":"Step 2. Use AutoAlbument to search for augmentation policies.","text":"

To search for augmentation policies, run autoalbument-search --config-dir </path/to/directory>, e.g. autoalbument-search --config-dir ~/experiments/autoalbument-search-cifar10. The value of --config-dir should be the same value that was passed to autoalbument-create at step 1.

autoalbument-search will create a directory with output files (by default the path of the directory will be <config_dir>/outputs/<current_date>/<current_time>, but you can customize it in search.yaml). The policy subdirectory will contain JSON files with the policies found at each epoch of the search phase.

autoalbument-search is a command wrapped with the @hydra.main decorator from Hydra. You can use all Hydra features when calling this command.

AutoAlbument uses PyTorch to search for augmentation policies. You can speed up the search by using a CUDA-capable GPU.

"},{"location":"autoalbument/how_to_use/#step-3-use-albumentations-to-load-augmentation-policies-and-utilize-them-in-your-training-pipeline","title":"Step 3. Use Albumentations to load augmentation policies and utilize them in your training pipeline.","text":"

AutoAlbument produces a JSON file that contains a configuration for an augmentation pipeline. You can load that JSON file with Albumentations:

Text Only
import albumentations as A\ntransform = A.load(\"/path/to/policy.json\")\n

Then you can use the created augmentation pipeline to augment the input data.

For example, to augment an image for a classification task:

Text Only
transformed = transform(image=image)\ntransformed_image = transformed[\"image\"]\n

To augment an image and a mask for a semantic segmentation task: Text Only

transformed = transform(image=image, mask=mask)\ntransformed_image = transformed[\"image\"]\ntransformed_mask = transformed[\"mask\"]\n

"},{"location":"autoalbument/how_to_use/#additional-resources","title":"Additional resources","text":"
  • You can read more about the most important configuration parameters for AutoAlbument in Tuning the search parameters.

  • To see examples of configuration files and custom PyTorch Datasets, please refer to Examples

  • You can read more about using Albumentations for augmentation in those articles Image augmentation for classification, Mask augmentation for segmentation.

  • Refer to this section of the documentation to get examples of how to use Albumentations with PyTorch and TensorFlow 2.

"},{"location":"autoalbument/installation/","title":"Installation","text":"

AutoAlbument requires Python 3.6 or higher.

"},{"location":"autoalbument/installation/#pypi","title":"PyPI","text":"

To install the latest stable version from PyPI:

pip install -U autoalbument

"},{"location":"autoalbument/installation/#github","title":"GitHub","text":"

To install the latest version from GitHub:

pip install -U git+https://github.com/albumentations-team/autoalbument

"},{"location":"autoalbument/introduction/","title":"AutoAlbument introduction and core concepts","text":""},{"location":"autoalbument/introduction/#what-is-autoalbument","title":"What is AutoAlbument","text":"

AutoAlbument is a tool that automatically searches for the best augmentation policies for your data.

Under the hood, it uses the Faster AutoAugment algorithm. Briefly, the idea is to use a GAN-like architecture in which the Generator applies augmentations to input images, and the Discriminator must determine whether an image was augmented. This process helps find augmentation policies that produce images similar to the original ones.

"},{"location":"autoalbument/introduction/#how-to-use-autoalbument","title":"How to use AutoAlbument","text":"

To use AutoAlbument, you need to define two things: a PyTorch Dataset for your data and configuration parameters for AutoAlbument. You can read the detailed instruction in the How to use AutoAlbument article.

Internally, AutoAlbument uses PyTorch Lightning for training a GAN and Hydra for handling configuration parameters.

Here are a few details about how AutoAlbument uses these tools.

"},{"location":"autoalbument/introduction/#hydra","title":"Hydra","text":"

The main internal configuration file is located at autoalbument/cli/conf/config.yaml

Here is its content:

Text Only
defaults:\n - _version\n - task\n - policy_model: default\n - classification_model: default\n - semantic_segmentation_model: default\n - data: default\n - searcher: default\n - trainer: default\n - optim: default\n - callbacks: default\n - logger: default\n - hydra: default\n - seed\n - search\n

Basically, it includes a bunch of config files with default values. Those config files are split into sets of closely related parameters such as model parameters or optimizer parameters. All default config files are located in their respective directories inside autoalbument/cli/conf

The main config file also includes the search.yaml file, which you will use for overriding default parameters for your specific dataset and task (you can read more about creating the search.yaml file with autoalbument-create in How to use AutoAlbument)

To allow great flexibility, AutoAlbument relies heavily on the instantiate function from Hydra. This function allows you to define a path to a Python class in a YAML config (using the _target_ parameter) along with arguments to that class, and Hydra will create an instance of this class with the provided arguments.

As a practical example, if a config contains a definition like this:

Text Only
_target_: autoalbument.faster_autoaugment.models.ClassificationModel\nnum_classes: 10\narchitecture: resnet18\npretrained: False\n

AutoAlbument will translate it approximately to the following call:

Text Only
from autoalbument.faster_autoaugment.models import ClassificationModel\n\nmodel = ClassificationModel(num_classes=10, architecture='resnet18', pretrained=False)\n

By relying on this feature, AutoAlbument lets you customize its behavior without changing the library's internal code.

"},{"location":"autoalbument/introduction/#pytorch-lightning","title":"PyTorch Lightning","text":"

AutoAlbument relies on PyTorch Lightning to train a GAN. In AutoAlbument configs, you can configure PyTorch Lightning by passing the appropriate arguments to Trainer through the trainer config or defining a list of Callbacks through the callbacks config.

"},{"location":"autoalbument/metrics/","title":"Metrics and their meaning","text":"

During the search phase, AutoAlbument outputs four metrics: loss, d_loss, a_loss, and Average Parameter Change (at the end of an epoch).

"},{"location":"autoalbument/metrics/#a_loss","title":"a_loss","text":"

a_loss is a loss for the policy network (or Generator in terms of GAN), which applies augmentations to input images.

"},{"location":"autoalbument/metrics/#d_loss","title":"d_loss","text":"

d_loss is a loss for the Discriminator, the network that tries to guess whether the input image is an augmented or non-augmented one.

"},{"location":"autoalbument/metrics/#loss","title":"loss","text":"

loss is a task-specific loss (CrossEntropyLoss for classification, BCEWithLogitsLoss for semantic segmentation) that acts as a regularizer and prevents the policy network from applying augmentations that would make an object of class A look like an object of class B.

"},{"location":"autoalbument/metrics/#average-parameter-change","title":"Average Parameter Change","text":"

Average Parameter Change is the difference between the magnitudes of augmentation parameters multiplied by their probabilities at the end of an epoch and the same quantities at the beginning of the epoch. The metric is calculated using the following formula:
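
The original page renders the formula as an image; a plausible LaTeX reconstruction from the definitions below, in which the averaging over the N augmentations of the policy is an assumption, is:

$$\text{Average Parameter Change} = \frac{1}{N}\sum_{i=1}^{N}\left|\,m'_i\,p'_i - m_i\,p_i\,\right|$$

where: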

  • m' and m are magnitude values for the i-th augmentation at the end and the beginning of the epoch, respectively.
  • p' and p are probability values for the i-th augmentation at the end and the beginning of the epoch, respectively.

The intuition behind this metric is that at the beginning, augmentation parameters are initialized at random, so they are not optimal and tend to change heavily at each epoch. After some time, these parameters should begin to converge and change less at each epoch.

"},{"location":"autoalbument/metrics/#examples-for-metric-values","title":"Examples for metric values","text":"

Below are TensorBoard logs for AutoAlbument on different datasets. The search was performed using AutoAlbument configs from the examples directory.

  • CIFAR10
  • SVHN
  • ImageNet
  • Pascal VOC
  • Cityscapes

As you can see, in all these charts, loss decreases slightly at each epoch, while a_loss and d_loss may either decrease or increase. Average Parameter Change is usually large during the first epochs and then starts to decrease. As a rule of thumb, to decide whether you should stop the AutoAlbument search and use the resulting policy, check that Average Parameter Change has stopped decreasing and started to oscillate, wait for a few more epochs, and use the policy found at that epoch.

In autoalbument-benchmarks, we use AutoAlbument policies produced by the last epoch on these charts.

"},{"location":"autoalbument/search_algorithms/","title":"Search algorithms","text":"

AutoAlbument uses the following algorithms to search for augmentation policies.

"},{"location":"autoalbument/search_algorithms/#faster-autoaugment","title":"Faster AutoAugment","text":"

\"Faster AutoAugment: Learning Augmentation Strategies using Backpropagation\" by Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, and Hideki Nakayama. Paper | Original implementation

"},{"location":"autoalbument/tuning_parameters/","title":"Tuning the search parameters","text":"

The search.yaml file contains parameters for the search of augmentation policies. Here is an example search.yaml for image classification on the CIFAR-10 dataset, and here is an example search.yaml for semantic segmentation on the Pascal VOC dataset.

"},{"location":"autoalbument/tuning_parameters/#task-specific-model","title":"Task-specific model","text":"

A task-specific model is a model that classifies images for a classification task or outputs masks for a semantic segmentation task. Settings for a task-specific model are defined by either classification_model or semantic_segmentation_model depending on a selected task. Ideally, you should select the same model (the same architecture and the same pretrained weights) that you will use in an actual task. AutoAlbument uses models from PyTorch Image Models and Segmentation models packages for classification and semantic segmentation respectively.

"},{"location":"autoalbument/tuning_parameters/#base-pytorch-parameters","title":"Base PyTorch parameters.","text":"

You may want to adjust the following parameters for a PyTorch pipeline:

  • data.dataloader parameters such as batch_size and num_workers
  • Number of epochs to search for best augmentation policies in optim.epochs.
  • Learning rate for optimizers in optim.main.lr and optim.policy.lr.
"},{"location":"autoalbument/tuning_parameters/#parameters-for-the-augmentations-search","title":"Parameters for the augmentations search.","text":"

Those parameters are defined in policy_model. You may want to tune the following ones:

  • num_sub_policies - the number of distinct augmentation sub-policies. A random sub-policy is selected in each iteration, and that sub-policy is applied to the input data. A larger number of sub-policies produces a more diverse set of augmentations. On the other hand, the more sub-policies you have, the more time and data you need to tune them correctly.

  • num_chunks controls the balance between speed and diversity of augmentations during the search phase. Each batch is split into num_chunks chunks, and a random sub-policy is applied to each chunk separately. A larger value of num_chunks helps to learn augmentation policies better but also increases the search time. The authors of Faster AutoAugment chose values of num_chunks so that each chunk consisted of 8 to 16 images.

  • operation_count - the number of augmentation operations that will be applied to each input data instance. For example, operation_count: 1 means that only one operation will be applied to an input image/mask, and operation_count: 4 means that four sequential operations will be applied to each input image/mask. A larger number of operations produces a more diverse set of augmentations but also increases the search time.

"},{"location":"autoalbument/tuning_parameters/#preprocessing-transforms","title":"Preprocessing transforms","text":"

If images have different sizes or you want to train a model on image patches, you could define preprocessing transforms (such as Resizing, Cropping, and Padding) in data.preprocessing. Those transforms will always be applied to all input data. Found augmentation policies will also contain those preprocessing transforms.

Note that it is crucial for the Policy Model (the model that searches for augmentation parameters) to receive images of the same size that will be used during the training of the actual model. For some augmentations, parameters depend on the input data's height and width (for example, hole sizes for the Cutout augmentation).

"},{"location":"autoalbument/examples/cifar10/","title":"Image classification on the CIFAR10 dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/cifar10

"},{"location":"autoalbument/examples/cifar10/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/cifar10/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/cifar10/#modelpy","title":"model.py","text":"Python"},{"location":"autoalbument/examples/cityscapes/","title":"Semantic segmentation on Cityscapes dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/cityscapes

"},{"location":"autoalbument/examples/cityscapes/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/cityscapes/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/imagenet/","title":"Image classification on the ImageNet dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/imagenet

"},{"location":"autoalbument/examples/imagenet/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/imagenet/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/list/","title":"List of examples","text":"
  • Image classification on the CIFAR10 dataset.
  • Image classification on the SVHN dataset.
  • Image classification on the ImageNet dataset.
  • Semantic segmentation on the Pascal VOC dataset.
  • Semantic segmentation on the Cityscapes dataset.

To run the search with an example config:

Bash
autoalbument-search --config-dir </path/to/directory_with_dataset.py_and_search.yaml>\n
"},{"location":"autoalbument/examples/pascal_voc/","title":"Semantic segmentation on the Pascal VOC dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/pascal_voc

"},{"location":"autoalbument/examples/pascal_voc/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/pascal_voc/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/svhn/","title":"Image classification on the SVHN dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/svhn

"},{"location":"autoalbument/examples/svhn/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/svhn/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/svhn/#modelpy","title":"model.py","text":"Python"},{"location":"contributing/coding_guidelines/","title":"Coding Guidelines","text":"

This document outlines the coding standards and best practices for contributing to Albumentations.

"},{"location":"contributing/coding_guidelines/#important-note-about-guidelines","title":"Important Note About Guidelines","text":"

These guidelines represent our current best practices, developed through experience maintaining and expanding the Albumentations codebase. While some existing code may not strictly follow these standards (due to historical reasons), we are gradually refactoring the codebase to align with these guidelines.

For new contributions:

  • All new code must follow these guidelines
  • All modifications to existing code should move it closer to these standards
  • Pull requests that introduce patterns we're trying to move away from will not be accepted

For existing code:

  • You may encounter patterns that don't match these guidelines (e.g., transforms with \"Random\" prefix or Union types for parameters)
  • These are considered technical debt that we're working to address
  • When modifying existing code, take the opportunity to align it with current standards where possible
"},{"location":"contributing/coding_guidelines/#code-style-and-formatting","title":"Code Style and Formatting","text":""},{"location":"contributing/coding_guidelines/#pre-commit-hooks","title":"Pre-commit Hooks","text":"

We use pre-commit hooks to maintain consistent code quality. These hooks automatically check and format your code before each commit.

  • Install pre-commit if you haven't already:
Bash
pip install pre-commit\npre-commit install\n
  • The hooks will run automatically on git commit. To run manually:
Bash
pre-commit run --files $(find albumentations -type f)\n
"},{"location":"contributing/coding_guidelines/#python-version-and-type-hints","title":"Python Version and Type Hints","text":"
  • Use Python 3.9+ features and syntax
  • Always include type hints using Python 3.10+ typing syntax:
Python
# Correct\ndef transform(self, value: float, range: tuple[float, float]) -> float:\n\n# Incorrect - don't use capital-case types\ndef transform(self, value: float, range: Tuple[float, float]) -> Float:\n
  • Use | instead of Union, including for optional types:
Python
# Correct\ndef process(value: int | float | None) -> str:\n\n# Incorrect\ndef process(value: Optional[Union[int, float]]) -> str:\n
"},{"location":"contributing/coding_guidelines/#naming-conventions","title":"Naming Conventions","text":""},{"location":"contributing/coding_guidelines/#transform-names","title":"Transform Names","text":"
  • Avoid adding \"Random\" prefix to new transforms
Python
# Correct\nclass Brightness(ImageOnlyTransform):\n\n# Incorrect (historical pattern)\nclass RandomBrightness(ImageOnlyTransform):\n
"},{"location":"contributing/coding_guidelines/#parameter-naming","title":"Parameter Naming","text":"
  • Use _range suffix for interval parameters:
Python
# Correct\nbrightness_range: tuple[float, float]\nshadow_intensity_range: tuple[float, float]\n\n# Incorrect\nbrightness_limit: tuple[float, float]\nshadow_intensity: tuple[float, float]\n
"},{"location":"contributing/coding_guidelines/#standard-parameter-names","title":"Standard Parameter Names","text":"

For transforms that handle gaps or boundaries, use these consistent names (a short illustrative signature follows the list):

  • border_mode: Specifies how to handle gaps, not mode or pad_mode
  • fill: Defines how to fill holes (pixel value or method), not fill_value, cval, fill_color, pad_value, pad_cval, value, color
  • fill_mask: Same as fill but for mask filling, not fill_mask_value, fill_mask_color, fill_mask_cval
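
A hypothetical constructor signature that follows these names (illustrative only, not an actual Albumentations transform):

Python
import cv2\n\n\nclass ExampleTransform:  # hypothetical class, shown only to illustrate the naming above\n    def __init__(\n        self,\n        border_mode: int = cv2.BORDER_CONSTANT,  # not \"mode\" or \"pad_mode\"\n        fill: float = 0,  # not \"fill_value\", \"cval\", \"pad_value\", \"value\", ...\n        fill_mask: float = 0,  # not \"fill_mask_value\", \"fill_mask_cval\", ...\n        p: float = 0.5,\n    ):\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.p = p\n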
"},{"location":"contributing/coding_guidelines/#parameter-types-and-ranges","title":"Parameter Types and Ranges","text":""},{"location":"contributing/coding_guidelines/#parameter-definitions","title":"Parameter Definitions","text":"
  • Prefer range parameters over fixed values:
Python
# Correct\ndef __init__(self, brightness_range: tuple[float, float] = (-0.2, 0.2)):\n\n# Avoid\ndef __init__(self, brightness: float = 0.2):\n
"},{"location":"contributing/coding_guidelines/#avoid-union-types-for-parameters","title":"Avoid Union Types for Parameters","text":"
  • Don't use Union[float, tuple[float, float]] for parameters
  • Instead, always use ranges where sampling is needed:
Python
# Correct\nscale_range: tuple[float, float] = (0.5, 1.5)\n\n# Avoid\nscale: float | tuple[float, float] = 1.0\n
  • For fixed values, use same value for both range ends:
Python
brightness_range = (0.1, 0.1)  # Fixed brightness of 0.1\n
"},{"location":"contributing/coding_guidelines/#transform-design-principles","title":"Transform Design Principles","text":""},{"location":"contributing/coding_guidelines/#relative-parameters","title":"Relative Parameters","text":"
  • Prefer parameters that are relative to image dimensions rather than fixed pixel values:
Python
# Correct - relative to image size\ndef __init__(self, crop_size_range: tuple[float, float] = (0.1, 0.3)):\n    # crop_size will be fraction of min(height, width)\n\n# Avoid - fixed pixel values\ndef __init__(self, crop_size_range: tuple[int, int] = (32, 96)):\n    # crop_size will be fixed regardless of image size\n
"},{"location":"contributing/coding_guidelines/#data-type-consistency","title":"Data Type Consistency","text":"
  • Ensure transforms produce consistent results regardless of input data type
  • Use provided decorators to handle type conversions:
  • @uint8_io: For transforms that work with uint8 images
  • @float32_io: For transforms that work with float32 images

The decorators will:

  • Pass through images that are already in the target type without conversion
  • Convert other types as needed and convert back after processing
Python
@uint8_io  # If input is uint8 => use as is; if float32 => convert to uint8, process, convert back\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img is guaranteed to be uint8\n    # if input was float32 => result will be converted back to float32\n    # if input was uint8 => result will stay uint8\n    return cv2.blur(img, (3, 3))\n\n@float32_io  # If input is float32 => use as is; if uint8 => convert to float32, process, convert back\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img is guaranteed to be float32 in range [0, 1]\n    # if input was uint8 => result will be converted back to uint8\n    # if input was float32 => result will stay float32\n    return img * 0.5\n\n# Avoid - manual type conversion\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    if img.dtype != np.uint8:\n        img = (img * 255).clip(0, 255).astype(np.uint8)\n    result = cv2.blur(img, (3, 3))\n    if img.dtype != np.uint8:\n        result = result.astype(np.float32) / 255\n    return result\n
"},{"location":"contributing/coding_guidelines/#channel-flexibility","title":"Channel Flexibility","text":"
  • Support arbitrary number of channels unless specifically constrained:

Python
# Correct - works with any number of channels\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img shape is (H, W, C), works for any C\n    return img * self.factor\n\n# Also correct - explicitly requires RGB\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    if img.shape[-1] != 3:\n        raise ValueError(\"Transform requires RGB image\")\n    return rgb_to_hsv(img)  # RGB-specific processing\n

"},{"location":"contributing/coding_guidelines/#random-number-generation","title":"Random Number Generation","text":""},{"location":"contributing/coding_guidelines/#using-random-generators","title":"Using Random Generators","text":"
  • Use class-level random generators instead of direct numpy or random calls:
Python
# Correct\nvalue = self.random_generator.uniform(0, 1, size=image.shape)\nchoice = self.py_random.choice(options)\n\n# Incorrect\nvalue = np.random.uniform(0, 1, size=image.shape)\nchoice = random.choice(options)\n
  • Prefer Python's standard library random over numpy.random:
Python
# Correct - using standard library random (faster)\nvalue = self.py_random.uniform(0, 1)\nchoice = self.py_random.choice(options)\n\n# Use numpy.random only when needed\nvalue = self.random_generator.randint(0, 255, size=image.shape)\n
"},{"location":"contributing/coding_guidelines/#parameter-sampling","title":"Parameter Sampling","text":"
  • Handle all probability calculations in get_params or get_params_dependent_on_data
  • Don't perform random operations in apply_xxx or __init__ methods:
Python
def get_params(self):\n    return {\n        \"brightness\": self.random_generator.uniform(\n            self.brightness_range[0],\n            self.brightness_range[1]\n        )\n    }\n
"},{"location":"contributing/coding_guidelines/#transform-development","title":"Transform Development","text":""},{"location":"contributing/coding_guidelines/#method-definitions","title":"Method Definitions","text":"
  • Don't use default arguments in apply_xxx methods:
Python
# Correct\ndef apply_to_mask(self, mask: np.ndarray, fill_mask: int) -> np.ndarray:\n\n# Incorrect\ndef apply_to_mask(self, mask: np.ndarray, fill_mask: int = 0) -> np.ndarray:\n
"},{"location":"contributing/coding_guidelines/#parameter-generation","title":"Parameter Generation","text":""},{"location":"contributing/coding_guidelines/#using-get_params_dependent_on_data","title":"Using get_params_dependent_on_data","text":"

This method provides access to image shape and target data for parameter generation:

Python
def get_params_dependent_on_data(\n    self,\n    params: dict[str, Any],\n    data: dict[str, Any]\n) -> dict[str, Any]:\n    # Access image shape - always available\n    height, width = params[\"shape\"][:2]\n\n    # Access targets if they were passed to transform\n    image = data.get(\"image\")  # Original image\n    mask = data.get(\"mask\")    # Segmentation mask\n    bboxes = data.get(\"bboxes\")  # Bounding boxes\n    keypoints = data.get(\"keypoints\")  # Keypoint coordinates\n\n    # Example: Calculate parameters based on image size\n    crop_size = min(height, width) // 2\n    center_x = width // 2\n    center_y = height // 2\n\n    return {\n        \"crop_size\": crop_size,\n        \"center\": (center_x, center_y)\n    }\n

The method receives:

  • params: Dictionary containing image metadata, where params[\"shape\"] is always available
  • data: Dictionary containing all targets passed to the transform

Use this method when you need to:

  • Calculate parameters based on image dimensions
  • Access target data for parameter generation
  • Ensure transform parameters are appropriate for the input data
"},{"location":"contributing/coding_guidelines/#parameter-validation-with-initschema","title":"Parameter Validation with InitSchema","text":"

Each transform must include an InitSchema class that inherits from BaseTransformInitSchema. This class is responsible for:

  • Validating input parameters before __init__ execution
  • Converting parameter types if needed
  • Ensuring consistent parameter handling
Python
# Correct - full parameter validation\nclass RandomGravel(ImageOnlyTransform):\n    class InitSchema(BaseTransformInitSchema):\n        slant_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]\n        brightness_coefficient: float = Field(gt=0, le=1)\n\n    def __init__(self, slant_range: tuple[float, float], brightness_coefficient: float, p: float = 0.5):\n        super().__init__(p=p)\n        self.slant_range = slant_range\n        self.brightness_coefficient = brightness_coefficient\n
Python
# Incorrect - missing InitSchema\nclass RandomGravel(ImageOnlyTransform):\n    def __init__(self, slant_range: tuple[float, float], brightness_coefficient: float, p: float = 0.5):\n        super().__init__(p=p)\n        self.slant_range = slant_range\n        self.brightness_coefficient = brightness_coefficient\n
"},{"location":"contributing/coding_guidelines/#coordinate-systems","title":"Coordinate Systems","text":""},{"location":"contributing/coding_guidelines/#image-center-calculations","title":"Image Center Calculations","text":"

The center point calculation differs slightly between targets:

  • For images, masks, and keypoints:
Python
# Correct - using helper function\nfrom albumentations.augmentations.geometric.functional import center\ncenter_x, center_y = center(image_shape)  # Returns ((width-1)/2, (height-1)/2)\n\n# Incorrect - manual calculation might miss the -1\ncenter_x = width / 2  # Wrong!\ncenter_y = height / 2  # Wrong!\n
  • For bounding boxes:
Python
# Correct - using helper function\nfrom albumentations.augmentations.geometric.functional import center_bbox\ncenter_x, center_y = center_bbox(image_shape)  # Returns (width/2, height/2)\n\n# Incorrect - using wrong center calculation\ncenter_x, center_y = center(image_shape)  # Wrong for bboxes!\n

This small difference is crucial for pixel-perfect accuracy. Always use the appropriate helper functions:

  • center() for image, mask, and keypoint transformations
  • center_bbox() for bounding box transformations
"},{"location":"contributing/coding_guidelines/#serialization-compatibility","title":"Serialization Compatibility","text":"
  • Ensure transforms work with both tuples and lists for range parameters
  • Test serialization/deserialization with JSON and YAML formats (see the sketch below)
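
A minimal round-trip sketch for these points, assuming the public A.to_dict / A.from_dict and A.save / A.load helpers (check the serialization docs for the exact signatures in your version):

Python
import albumentations as A\n\n# Ranges given as a list and as a tuple should serialize identically.\npipeline = A.Compose([\n    A.RandomBrightnessContrast(brightness_limit=[-0.2, 0.2], contrast_limit=(-0.2, 0.2), p=0.5),\n])\n\n# Dict round-trip (JSON-friendly structure).\nrestored = A.from_dict(A.to_dict(pipeline))\n\n# File round-trip; data_format=\"yaml\" is assumed to be supported as well.\nA.save(pipeline, \"/tmp/pipeline.json\")\nreloaded = A.load(\"/tmp/pipeline.json\")\n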
"},{"location":"contributing/coding_guidelines/#documentation","title":"Documentation","text":""},{"location":"contributing/coding_guidelines/#docstrings","title":"Docstrings","text":"
  • Use Google-style docstrings
  • Include type information, parameter descriptions, and examples:
Python
def transform(self, image: np.ndarray) -> np.ndarray:\n    \"\"\"Apply brightness transformation to the image.\n\n    Args:\n        image: Input image in RGB format.\n\n    Returns:\n        Transformed image.\n\n    Examples:\n        >>> transform = Brightness(brightness_range=(-0.2, 0.2))\n        >>> transformed = transform(image=image)\n    \"\"\"\n
"},{"location":"contributing/coding_guidelines/#comments","title":"Comments","text":"
  • Add comments for complex logic
  • Explain why, not what (the code shows what)
  • Keep comments up to date with code changes
"},{"location":"contributing/coding_guidelines/#updating-transform-documentation","title":"Updating Transform Documentation","text":"

When adding a new transform or modifying the targets of an existing one, you must update the transforms documentation in the README:

  1. Generate the updated documentation by running:
Bash
python -m tools.make_transforms_docs make\n
  2. This will output a formatted list of all transforms and their supported targets

  3. Update the relevant section in README.md with the new information

  4. Ensure the documentation accurately reflects which targets (image, mask, bboxes, keypoints, etc.) are supported by each transform

This helps maintain accurate and up-to-date documentation about transform capabilities.

"},{"location":"contributing/coding_guidelines/#testing","title":"Testing","text":""},{"location":"contributing/coding_guidelines/#test-coverage","title":"Test Coverage","text":"
  • Write tests for all new functionality
  • Include edge cases and error conditions
  • Ensure reproducibility with fixed random seeds (a sketch follows this list)
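
A sketch of a reproducibility check; it assumes your Albumentations version supports the seed argument of Compose (older versions seeded the global random / numpy state instead):

Python
import numpy as np\nimport albumentations as A\n\n\ndef test_pipeline_is_reproducible_with_fixed_seed():\n    image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)\n    # Assumption: Compose accepts a seed argument in the targeted Albumentations version.\n    first = A.Compose([A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.5)], seed=137)\n    second = A.Compose([A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.5)], seed=137)\n    np.testing.assert_array_equal(first(image=image)[\"image\"], second(image=image)[\"image\"])\n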
"},{"location":"contributing/coding_guidelines/#test-organization","title":"Test Organization","text":"
  • Place tests in the appropriate module under tests/
  • Follow existing test patterns and naming conventions
  • Use pytest fixtures when appropriate
"},{"location":"contributing/coding_guidelines/#code-review-guidelines","title":"Code Review Guidelines","text":"

Before submitting your PR:

  1. Run all tests
  2. Run pre-commit hooks
  3. Check type hints
  4. Update documentation if needed
  5. Ensure code follows these guidelines
"},{"location":"contributing/coding_guidelines/#getting-help","title":"Getting Help","text":"

If you have questions about these guidelines:

  1. Join our Discord community
  2. Open a GitHub issue
  3. Ask in your pull request
"},{"location":"contributing/environment_setup/","title":"Setting Up Your Development Environment","text":"

This guide will help you set up your development environment for contributing to Albumentations.

"},{"location":"contributing/environment_setup/#prerequisites","title":"Prerequisites","text":"
  • Python 3.9 or higher
  • Git
  • A GitHub account
"},{"location":"contributing/environment_setup/#step-by-step-setup","title":"Step-by-Step Setup","text":""},{"location":"contributing/environment_setup/#1-fork-and-clone-the-repository","title":"1. Fork and Clone the Repository","text":"
  1. Fork the Albumentations repository on GitHub
  2. Clone your fork locally:
Bash
git clone https://github.com/YOUR_USERNAME/albumentations.git\ncd albumentations\n
"},{"location":"contributing/environment_setup/#2-create-a-virtual-environment","title":"2. Create a Virtual Environment","text":"

Choose the appropriate commands for your operating system:

"},{"location":"contributing/environment_setup/#linux-macos","title":"Linux / macOS","text":"Bash
python3 -m venv env\nsource env/bin/activate\n
"},{"location":"contributing/environment_setup/#windows-cmdexe","title":"Windows (cmd.exe)","text":"Bash
python -m venv env\nenv\\Scripts\\activate.bat\n
"},{"location":"contributing/environment_setup/#windows-powershell","title":"Windows (PowerShell)","text":"Bash
python -m venv env\nenv\\Scripts\\activate.ps1\n
"},{"location":"contributing/environment_setup/#3-install-dependencies","title":"3. Install Dependencies","text":"
  1. Install the project in editable mode:
Bash
pip install -e .\n
  2. Install development dependencies:
Bash
pip install -r requirements-dev.txt\n
"},{"location":"contributing/environment_setup/#4-set-up-pre-commit-hooks","title":"4. Set Up Pre-commit Hooks","text":"

Pre-commit hooks help maintain code quality by automatically checking your changes before each commit.

  1. Install pre-commit:
Bash
pip install pre-commit\n
  2. Set up the hooks:
Bash
pre-commit install\n
  3. (Optional) Run hooks manually on all files:
Bash
pre-commit run --files $(find albumentations -type f)\n
"},{"location":"contributing/environment_setup/#verifying-your-setup","title":"Verifying Your Setup","text":""},{"location":"contributing/environment_setup/#run-tests","title":"Run Tests","text":"

Ensure everything is set up correctly by running the test suite:

Bash
pytest\n
"},{"location":"contributing/environment_setup/#common-issues-and-solutions","title":"Common Issues and Solutions","text":""},{"location":"contributing/environment_setup/#permission-errors","title":"Permission Errors","text":"
  • Linux/macOS: If you encounter permission errors, try using sudo for system-wide installations or consider using the --user flag with pip
  • Windows: Run your terminal as administrator if you encounter permission issues
"},{"location":"contributing/environment_setup/#virtual-environment-not-activating","title":"Virtual Environment Not Activating","text":"
  • Ensure you're in the correct directory
  • Check that Python is properly installed and in your system PATH
  • Try creating the virtual environment with the full Python path
"},{"location":"contributing/environment_setup/#import-errors-after-installation","title":"Import Errors After Installation","text":"
  • Verify that you're using the correct virtual environment
  • Confirm that all dependencies were installed successfully
  • Try reinstalling the package in editable mode
"},{"location":"contributing/environment_setup/#next-steps","title":"Next Steps","text":"

After setting up your environment:

  1. Create a new branch for your work
  2. Make your changes
  3. Run tests and pre-commit hooks
  4. Submit a pull request

For more detailed information about contributing, please refer to Coding Guidelines

"},{"location":"contributing/environment_setup/#getting-help","title":"Getting Help","text":"

If you encounter any issues with the setup:

  1. Check our Discord community
  2. Open an issue on GitHub
  3. Review existing issues for similar problems and solutions
"},{"location":"examples/","title":"List of examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Using Albumentations for a semantic segmentation task
  • Using Albumentations to augment keypoints
  • Applying the same augmentation with the same parameters to multiple images, masks, bounding boxes, or keypoints
  • Weather augmentations in Albumentations
  • Example of applying XYMasking transform
  • Example of applying ChromaticAberration transform
  • Example of applying Morphological transform
  • Example of applying D4 transform
  • Example of applying RandomGridShuffle transform
  • Example of applying OverlayElements transform
  • Example of applying TextImage transform
  • Migrating from torchvision to Albumentations
  • Debugging an augmentation pipeline with ReplayCompose
  • How to save and load parameters of an augmentation pipeline
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
  • How to save and load transforms to HuggingFace Hub.
"},{"location":"examples/#examples-of-how-to-use-albumentations-with-different-deep-learning-frameworks","title":"Examples of how to use Albumentations with different deep learning frameworks","text":"
  • PyTorch
  • PyTorch and Albumentations for image classification
  • PyTorch and Albumentations for semantic segmentation
  • TensorFlow 2
  • Using Albumentations with Tensorflow
"},{"location":"external_resources/blog_posts_podcasts_talks/","title":"Blog posts, podcasts, talks, and videos about Albumentations","text":""},{"location":"external_resources/blog_posts_podcasts_talks/#blog-posts","title":"Blog posts","text":"
  • Custom Image Augmentation with Keras. Solving CIFAR-10 with Albumentations and TPU on Google Colab.
  • Road detection using segmentation models and albumentations libraries on Keras.
  • Image Data Augmentation for TensorFlow 2, Keras and PyTorch with Albumentations in Python
  • Explore image augmentations using a convenient tool
  • Image Augmentation using PyTorch and Albumentations
  • Employing the albumentation library in PyTorch workflows. Bonus: Helper for selecting appropriate values!
  • Overview of Albumentations: Open-source library for advanced image augmentations
"},{"location":"external_resources/blog_posts_podcasts_talks/#podcasts-talks-and-videos","title":"Podcasts, talks, and videos","text":"
  • PyConBY 2020: Eugene Khvedchenya - Albumentations: Fast and Flexible image augmentations
  • Albumentations Framework: a fast image augmentations library | Interview with Dr. Vladimir Iglovikov
  • Image Data Augmentation for TensorFlow 2, Keras and PyTorch with Albumentations in Python
  • Bengali.AI competition - Ch 5. Image augmentations using albumentations
  • Albumentations Tutorial for Data Augmentation
"},{"location":"external_resources/books/","title":"Books that mention Albumentations","text":"
  • Deep Learning For Dummies. John Paul Mueller, Luca Massaron. May 2019.
  • Data Science Programming All-in-One For Dummies. John Paul Mueller, Luca Massaron. January 2020.
  • PyTorch Computer Vision Cookbook. Michael Avendi. March 2020.
  • Approaching (Almost) Any Machine Learning Problem. Abhishek Thakur. June 2020.
"},{"location":"external_resources/online_courses/","title":"Online classes that cover Albumentations","text":""},{"location":"external_resources/online_courses/#udemy","title":"Udemy","text":"
  • Modern Computer Vision & Deep Learning with Python & PyTorch
  • Deep Learning for Image Segmentation with Python & Pytorch
  • Deep Learning Masterclass with TensorFlow 2 Over 20 Projects
  • Master Deep Learning for Computer Vision in TensorFlow
  • Deep Learning : Image Classification with Tensorflow in 2024
  • Deep learning with PyTorch | Medical Imaging Competitions
  • Veri Art\u0131r\u0131m\u0131: Albumentations ile Projelerle Veri Art\u0131r\u0131m\u0131
  • Mastering Advanced Representation Learning (CV)
"},{"location":"external_resources/online_courses/#coursera","title":"Coursera","text":"
  • Deep Learning with PyTorch : Image Segmentation
  • Facial Keypoint Detection with PyTorch
  • Deep Learning with PyTorch : Object Localization
  • Aerial Image Segmentation with PyTorch
"},{"location":"getting_started/augmentation_mapping/","title":"Transform Library Comparison Guide","text":"

This guide helps you find equivalent transforms between Albumentations and other popular libraries (torchvision and Kornia).
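
As a quick, hedged illustration of what such a mapping looks like in practice (parameter defaults and exact behavior differ between the libraries, so treat this as a sketch rather than an exact equivalence):

Python
import numpy as np\nfrom PIL import Image\nimport albumentations as A\nfrom albumentations.pytorch import ToTensorV2\nfrom torchvision import transforms as T\n\nmean, std = (0.485, 0.456, 0.406), (0.229, 0.224, 0.225)\n\n# torchvision: PIL image (or tensor) in, torch tensor out\ntv_pipeline = T.Compose([\n    T.Resize((224, 224)),\n    T.RandomHorizontalFlip(p=0.5),\n    T.ToTensor(),\n    T.Normalize(mean, std),\n])\n\n# Albumentations: numpy array (H, W, C) in, torch tensor out via ToTensorV2\nalb_pipeline = A.Compose([\n    A.Resize(224, 224),\n    A.HorizontalFlip(p=0.5),\n    A.Normalize(mean=mean, std=std),\n    ToTensorV2(),\n])\n\nimage = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)\ntv_out = tv_pipeline(Image.fromarray(image))  # torchvision call style\nalb_out = alb_pipeline(image=image)[\"image\"]  # Albumentations call style (keyword in, dict out)\n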

"},{"location":"getting_started/augmentation_mapping/#key-differences","title":"Key Differences","text":""},{"location":"getting_started/augmentation_mapping/#compared-to-torchvision","title":"Compared to TorchVision","text":"
  • Albumentations operates on numpy arrays (TorchVision uses PyTorch tensors)
  • More parameters for fine-tuning transformations
  • Built-in support for mask augmentation
  • Better handling of bounding boxes and keypoints
"},{"location":"getting_started/augmentation_mapping/#compared-to-kornia","title":"Compared to Kornia","text":"
  • CPU-based numpy operations (Kornia uses GPU tensors)
  • More comprehensive support for detection/segmentation
  • Generally better CPU performance
  • Simpler API for common tasks
"},{"location":"getting_started/augmentation_mapping/#common-transform-mappings","title":"Common Transform Mappings","text":""},{"location":"getting_started/augmentation_mapping/#basic-geometric-transforms","title":"Basic Geometric Transforms","text":"TorchVision Transform Albumentations Equivalent Notes Resize Resize / LongestMaxSize - TorchVision's Resize combines two Albumentations behaviors:\u00a0\u00a01. When given (h,w): equivalent to Albumentations Resize\u00a0\u00a02. When given single int + max_size: similar to LongestMaxSize- Albumentations allows separate interpolation method for masks- TorchVision has antialias parameter, Albumentations doesn't ScaleJitter OneOf + multiple Resize - Can be approximated in Albumentations using OneOf container with multiple Resize transforms- Example: transforms = A.OneOf([ A.Resize(height=int(target_h * scale), width=int(target_w * scale)) for scale in np.linspace(0.1, 2.0, num=20) ])- Not exactly the same as continuous random scaling, but provides similar functionality RandomShortestSize OneOf + SmallestMaxSize - Can be approximated in Albumentations using: transforms = A.OneOf([ A.SmallestMaxSize(max_size=size, max_height=max_size, max_width=max_size) for size in [480, 512, 544, 576, 608] ])- Randomly selects size for shortest side while maintaining aspect ratio- Optional max_size parameter limits longest side- TorchVision has antialias parameter, Albumentations doesn't RandomResize OneOf + Resize - TorchVision: randomly selects single size S between min_size and max_size, sets both width and height to S- No direct equivalent in Albumentations (RandomScale preserves aspect ratio)- Can be approximated using: transforms = A.OneOf([ A.Resize(size, size) for size in range(min_size, max_size + 1, step) ]) RandomCrop RandomCrop - Both perform random cropping with similar core functionality- Key differences:\u00a0\u00a01. TorchVision accepts single int for square crop, Albumentations requires both height and width\u00a0\u00a02. Padding options differ:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports padding parameter for pre-padding\u00a0\u00a0\u00a0\u00a0- Albumentations: offers pad_position parameter ('center', 'top_left', etc.)\u00a0\u00a03. Fill value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports dict mapping for different types\u00a0\u00a0\u00a0\u00a0- Albumentations: separate fill and fill_mask parameters\u00a0\u00a04. Padding modes:\u00a0\u00a0\u00a0\u00a0- TorchVision: 'constant', 'edge', 'reflect', 'symmetric'\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV border modes RandomResizedCrop RandomResizedCrop - Nearly identical functionality and parameters- Key differences:\u00a0\u00a01. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple\u00a0\u00a02. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333))\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Separate mask_interpolation parameter\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomIoUCrop RandomSizedBBoxSafeCrop - Both ensure safe cropping with respect to bounding boxes- Key differences:\u00a0\u00a01. TorchVision:\u00a0\u00a0\u00a0\u00a0- Implements exact SSD paper approach\u00a0\u00a0\u00a0\u00a0- Uses IoU-based sampling strategy\u00a0\u00a0\u00a0\u00a0- Requires explicit sanitization of boxes after crop\u00a0\u00a02. 
Albumentations:\u00a0\u00a0\u00a0\u00a0- Simpler approach ensuring bbox safety\u00a0\u00a0\u00a0\u00a0- Directly specifies target size\u00a0\u00a0\u00a0\u00a0- Automatically handles bbox cleanup- For exact SSD-style cropping, might need custom implementation in Albumentations CenterCrop CenterCrop - Both crop the center part of the input- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: accepts single int for square crop or (height, width) tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: requires separate height and width parameters\u00a0\u00a02. Padding behavior:\u00a0\u00a0\u00a0\u00a0- TorchVision: always pads with 0 if image is smaller\u00a0\u00a0\u00a0\u00a0- Albumentations: optional padding with pad_if_needed\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Configurable padding mode and position\u00a0\u00a0\u00a0\u00a0- Separate fill values for image and mask\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomHorizontalFlip HorizontalFlip - Identical functionality- Both have default probability p=0.5- Only naming difference: TorchVision includes \"Random\" in name RandomVerticalFlip VerticalFlip - Identical functionality- Both have default probability p=0.5- Only naming difference: TorchVision includes \"Random\" in name Pad Pad - Similar core padding functionality- Both support:\u00a0\u00a0- Single int for all sides\u00a0\u00a0- (pad_x, pad_y) for symmetric padding\u00a0\u00a0- (left, top, right, bottom) for per-side padding- Key differences:\u00a0\u00a01. Padding modes:\u00a0\u00a0\u00a0\u00a0- TorchVision: 'constant', 'edge', 'reflect', 'symmetric'\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV border modes\u00a0\u00a02. Fill value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports dict mapping for different types\u00a0\u00a0\u00a0\u00a0- Albumentations: separate fill and fill_mask parameters\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomZoomOut RandomScale + PadIfNeeded - No direct equivalent in Albumentations- Can be approximated by combining: A.Compose([ A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), # scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0) A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, value=fill) ])- Key differences:\u00a0\u00a01. TorchVision implements specific SSD paper approach\u00a0\u00a02. Albumentations requires composition of two transforms RandomRotation Rotate - Similar core rotation functionality but with different parameters- Key differences:\u00a0\u00a01. Angle specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: degrees parameter (-degrees, +degrees) or (min, max)\u00a0\u00a0\u00a0\u00a0- Albumentations: limit parameter (-limit, +limit) or (min, max)\u00a0\u00a02. Output size control:\u00a0\u00a0\u00a0\u00a0- TorchVision: expand=True/False\u00a0\u00a0\u00a0\u00a0- Albumentations: crop_border=True/False\u00a0\u00a03. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- Bbox rotation methods ('largest_box' or 'ellipse')\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a04. Center specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports custom center point\u00a0\u00a0\u00a0\u00a0- Albumentations: always uses image center RandomAffine Affine - Both support core affine operations (translation, rotation, scale, shear)- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single parameters for each transform\u00a0\u00a0\u00a0\u00a0- Albumentations: more flexible with dict options for x/y axes\u00a0\u00a02. Scale handling:\u00a0\u00a0\u00a0\u00a0- Albumentations adds keep_ratio and balanced_scale\u00a0\u00a0\u00a0\u00a0- Albumentations supports independent x/y scaling\u00a0\u00a03. Translation:\u00a0\u00a0\u00a0\u00a0- TorchVision: fraction only\u00a0\u00a0\u00a0\u00a0- Albumentations: both percent and pixels\u00a0\u00a04. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- fit_output to adjust image plane\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Bbox rotation methods\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomPerspective Perspective - Both apply random perspective transformations- Key differences:\u00a0\u00a01. Distortion control:\u00a0\u00a0\u00a0\u00a0- TorchVision: single distortion_scale (0 to 1)\u00a0\u00a0\u00a0\u00a0- Albumentations: scale tuple for corner movement range\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- Albumentations adds keep_size and fit_output options\u00a0\u00a0\u00a0\u00a0- Can control whether to maintain original size\u00a0\u00a03. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Better control over output size and fitting ElasticTransform ElasticTransform - Similar core functionality: both apply elastic deformations to images- Key differences:\u00a0\u00a01. Parameters have opposite meanings:\u00a0\u00a0\u00a0\u00a0- TorchVision: alpha (displacement), sigma (smoothness)\u00a0\u00a0\u00a0\u00a0- Albumentations: alpha (smoothness), sigma (displacement)\u00a0\u00a02. Default values reflect this difference:\u00a0\u00a0\u00a0\u00a0- TorchVision: alpha=50.0, sigma=5.0\u00a0\u00a0\u00a0\u00a0- Albumentations: alpha=1.0, sigma=50.0- Note on implementation:\u00a0\u00a0- Albumentations follows Simard et al. 2003 paper more closely:\u00a0\u00a0\u00a0\u00a0- \u03c3 should be ~0.05 * image_size\u00a0\u00a0\u00a0\u00a0- \u03b1 should be proportional to \u03c3- Additional Albumentations features:\u00a0\u00a0- approximate mode\u00a0\u00a0- same_dxdy option\u00a0\u00a0- Choice of noise distribution\u00a0\u00a0- Separate mask interpolation ColorJitter ColorJitter - Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue- Key similarities:\u00a0\u00a01. Same parameter names and meanings\u00a0\u00a02. Same value ranges (e.g., hue should be in [-0.5, 0.5])\u00a0\u00a03. Random order of transformations- Key differences:\u00a0\u00a01. Default values:\u00a0\u00a0\u00a0\u00a0- TorchVision: all None by default\u00a0\u00a0\u00a0\u00a0- Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation\u00a0\u00a02. Implementation:\u00a0\u00a0\u00a0\u00a0- TorchVision: uses Pillow\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV (may produce slightly different results)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Explicit probability parameter p\u00a0\u00a0\u00a0\u00a0- Value saturation instead of uint8 overflow RandomChannelPermutation ChannelShuffle - Both randomly permute image channels- Key similarities:\u00a0\u00a01. Same core functionality\u00a0\u00a02. Work on multi-channel images (typically RGB)- Key differences:\u00a0\u00a01. Naming convention only\u00a0\u00a02. 
Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomPhotometricDistort RandomOrder + ColorJitter + ChannelShuffle - TorchVision's transform is from SSD paper, combines:\u00a0\u00a01. Color jittering (brightness, contrast, saturation, hue)\u00a0\u00a02. Random channel permutation- Can be replicated in Albumentations using: A.RandomOrder([ A.ColorJitter(brightness=(0.875, 1.125), contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=(-0.05, 0.05), p=0.5), A.ChannelShuffle(p=0.5) ]) Grayscale ToGray - Similar core functionality: convert RGB to grayscale- Key differences:\u00a0\u00a01. Output channels:\u00a0\u00a0\u00a0\u00a0- TorchVision: only 1 or 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: supports any number of output channels\u00a0\u00a02. Conversion methods:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method (weighted RGB)\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods via method parameter:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 weighted_average (default, same as TorchVision)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 from_lab, desaturation, average, max, pca\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- More flexible channel handling RGB ToRGB - Similar core functionality: convert to RGB format- Key differences:\u00a0\u00a01. Input handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: accepts 1 or 3 channel inputs\u00a0\u00a0\u00a0\u00a0- Albumentations: only accepts single-channel inputs\u00a0\u00a02. Output channels:\u00a0\u00a0\u00a0\u00a0- TorchVision: always 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable via num_output_channels\u00a0\u00a03. Behavior:\u00a0\u00a0\u00a0\u00a0- TorchVision: converts to RGB if not already RGB\u00a0\u00a0\u00a0\u00a0- Albumentations: strictly grayscale to RGB conversion\u00a0\u00a04. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomGrayscale ToGray - Similar core functionality: convert to grayscale with probability- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- TorchVision: p=0.1\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: always preserves input channels\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable output channels\u00a0\u00a03. Conversion methods:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods with different channel support:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 weighted_average, from_lab: 3-channel only\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 desaturation, average, max, pca: any number of channels\u00a0\u00a04. Channel requirements:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with 1 or 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: depends on method chosen GaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random kernel size- Key similarities:\u00a0\u00a01. Both support random kernel sizes\u00a0\u00a02. Both support random sigma values- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: kernel_size (exact size), sigma (range)\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (size range), sigma_limit (range)\u00a0\u00a02. Kernel size constraints:\u00a0\u00a0\u00a0\u00a0- TorchVision: must specify exact size\u00a0\u00a0\u00a0\u00a0- Albumentations: can specify range (3, 7) or auto-compute\u00a0\u00a03. 
Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- Auto-computation of kernel size from sigma GaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images- Key similarities:\u00a0\u00a01. Both support mean and standard deviation parameters- Key differences:\u00a0\u00a01. Parameter ranges:\u00a0\u00a0\u00a0\u00a0- TorchVision: fixed values for mean and sigma\u00a0\u00a0\u00a0\u00a0- Albumentations: ranges for both (std_range, mean_range)\u00a0\u00a02. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: expects float [0,1], has clip option\u00a0\u00a0\u00a0\u00a0- Albumentations: auto-scales based on dtype\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Per-channel noise option\u00a0\u00a0\u00a0\u00a0- Noise scale factor for performance\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomInvert InvertImg - Similar core functionality: invert image colors- Key similarities:\u00a0\u00a01. Both invert pixel values\u00a0\u00a02. Both have default probability of 0.5- Key differences:\u00a0\u00a01. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with [0,1] float tensors\u00a0\u00a0\u00a0\u00a0- Albumentations: auto-handles uint8 (255) and float32 (1.0) RandomPosterize Posterize - Similar core functionality: reduce color bits- Key similarities:\u00a0\u00a01. Both posterize images with probability p=0.5- Key differences:\u00a0\u00a01. Bits specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single fixed value [0-8]\u00a0\u00a0\u00a0\u00a0- Albumentations: flexible options with [1-7] (recommended):\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Single value for all channels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Range (min_bits, max_bits)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Per-channel values [r,g,b]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Per-channel ranges [(r_min,r_max), ...]\u00a0\u00a02. Practical range:\u00a0\u00a0\u00a0\u00a0- TorchVision: includes 0 (black) and 8 (unchanged)\u00a0\u00a0\u00a0\u00a0- Albumentations: recommended [1-7] for actual posterization RandomSolarize Solarize - Similar core functionality: invert pixels above threshold- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both invert values above threshold- Key differences:\u00a0\u00a01. Threshold specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single fixed threshold value\u00a0\u00a0\u00a0\u00a0- Albumentations: range via threshold_range\u00a0\u00a02. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with raw threshold values\u00a0\u00a0\u00a0\u00a0- Albumentations: uses normalized [0,1] range:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 uint8: multiplied by 255\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 float32: multiplied by 1.0 RandomAdjustSharpness Sharpen - Similar core functionality: adjust image sharpness- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single sharpness_factor\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 0: blurred\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 1: original\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 2: doubled sharpness\u00a0\u00a0\u00a0\u00a0- Albumentations: more controls:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 alpha: effect visibility [0,1]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 lightness: contrast control\u00a0\u00a02. 
Method options:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method\u00a0\u00a0\u00a0\u00a0- Albumentations: two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 'kernel': Laplacian operator\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 'gaussian': blur interpolation RandomAutocontrast AutoContrast Same core functionality with identical parameters (p=0.5) RandomEqualize Equalize - Similar core functionality: histogram equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Choice of algorithm (cv/pil methods)\u00a0\u00a0\u00a0\u00a0- Per-channel or luminance-based equalization\u00a0\u00a0\u00a0\u00a0- Optional masking support Normalize Normalize - Similar core functionality: normalize image values- Key similarities:\u00a0\u00a01. Both support mean/std normalization- Key differences:\u00a0\u00a01. Normalization options:\u00a0\u00a0\u00a0\u00a0- TorchVision: only (input - mean) / std\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 standard (same as TorchVision)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 image (global stats)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 image_per_channel\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 min_max\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 min_max_per_channel\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- max_pixel_value parameter\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomErasing Erasing - Similar core functionality: randomly erase image regions- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Same default scale=(0.02, 0.33)\u00a0\u00a03. Same default ratio=(0.3, 3.3)- Key differences:\u00a0\u00a01. Fill value options:\u00a0\u00a0\u00a0\u00a0- TorchVision: number/tuple or 'random'\u00a0\u00a0\u00a0\u00a0- Albumentations: additional options:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 random_uniform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 inpaint_telea\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 inpaint_ns\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Mask fill value option\u00a0\u00a0\u00a0\u00a0- Support for masks, bboxes, keypoints JPEG ImageCompression - Similar core functionality: apply JPEG compression- Key similarities:\u00a0\u00a01. Both use quality range 1-100\u00a0\u00a02. Both support quality ranges- Key differences:\u00a0\u00a01. Compression types:\u00a0\u00a0\u00a0\u00a0- TorchVision: JPEG only\u00a0\u00a0\u00a0\u00a0- Albumentations: JPEG and WebP\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- Default quality range (99, 100)"},{"location":"getting_started/augmentation_mapping/#kornia-to-albumentations","title":"Kornia to Albumentations","text":"Kornia Albumentations Notes ColorJitter ColorJitter - Similar core functionality: randomly adjust brightness, contrast, saturation, and hue- Key similarities:\u00a0\u00a01. Both support same parameters (brightness, contrast, saturation, hue)\u00a0\u00a02. Both allow float or tuple ranges for parameters- Key differences:\u00a0\u00a01. Default values:\u00a0\u00a0\u00a0\u00a0- Albumentations: (0.8, 1.2) for brightness/contrast/saturation\u00a0\u00a0\u00a0\u00a0- Kornia: 0.0 for all parameters\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a03. 
Note: Kornia recommends using ColorJiggle instead as it follows color theory better RandomAutoContrast AutoContrast - Similar core functionality: enhance image contrast automatically- Key similarities:\u00a0\u00a01. Both stretch intensity range to use full range\u00a0\u00a02. Both preserve relative intensities- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter to control value clipping RandomBoxBlur Blur - Similar core functionality: apply box/average blur to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both apply box/average blur filter- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit parameter for range (e.g., (3, 7))\u00a0\u00a0\u00a0\u00a0- Kornia: fixed kernel_size tuple (default (3, 3))\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- border_type parameter ('reflect', 'replicate', 'circular')\u00a0\u00a0\u00a0\u00a0- normalized parameter for L1 norm control RandomBrightness RandomBrightnessContrast - Different scope:\u00a0\u00a0- Kornia: brightness only\u00a0\u00a0- Albumentations: combines brightness and contrast- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: brightness tuple (default: (1.0, 1.0))\u00a0\u00a0\u00a0\u00a0- Albumentations: brightness_limit (default: (-0.2, 0.2))\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- brightness_by_max parameter for adjustment method\u00a0\u00a0\u00a0\u00a0- ensure_safe_range to prevent overflow/underflow\u00a0\u00a0\u00a0\u00a0- Combined contrast control\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter RandomChannelDropout ChannelDropout - Similar core functionality: randomly drop image channels- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both allow specifying fill value for dropped channels- Key differences:\u00a0\u00a01. Channel drop specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed num_drop_channels (default: 1)\u00a0\u00a0\u00a0\u00a0- Albumentations: flexible channel_drop_range tuple (default: (1, 1))\u00a0\u00a02. Error handling:\u00a0\u00a0\u00a0\u00a0- Albumentations: explicit checks for single-channel images and invalid ranges\u00a0\u00a0\u00a0\u00a0- Kornia: simpler parameter validation RandomChannelShuffle ChannelShuffle - Identical core functionality: randomly shuffle image channels RandomClahe CLAHE - Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both allow configuring grid size and clip limit- Key differences:\u00a0\u00a01. Parameter defaults:\u00a0\u00a0\u00a0\u00a0- Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8)\u00a0\u00a0\u00a0\u00a0- Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8)\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- slow_and_differentiable parameter for implementation choice RandomContrast RandomBrightnessContrast - Different scope:\u00a0\u00a0- Kornia: contrast only\u00a0\u00a0- Albumentations: combines brightness and contrast- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: contrast tuple (default: (1.0, 1.0))\u00a0\u00a0\u00a0\u00a0- Albumentations: contrast_limit (default: (-0.2, 0.2))\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- ensure_safe_range to prevent overflow/underflow\u00a0\u00a0\u00a0\u00a0- Combined brightness control\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter RandomEqualize Equalize - Similar core functionality: apply histogram equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- mode parameter to choose between 'cv' and 'pil' methods\u00a0\u00a0\u00a0\u00a0- by_channels parameter for per-channel or luminance-based equalization\u00a0\u00a0\u00a0\u00a0- mask parameter to selectively apply equalization\u00a0\u00a0\u00a0\u00a0- mask_params for dynamic mask generation RandomGamma RandomGamma - Similar core functionality: apply random gamma correction- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples\u00a0\u00a0\u00a0\u00a0- Albumentations: single gamma_limit (80, 120) as percentage range\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- eps parameter to prevent numerical errors RandomGaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random parameters- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support kernel size and sigma parameters- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: requires explicit kernel_size and sigma range\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0)\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- border_type parameter for padding mode\u00a0\u00a0\u00a0\u00a0- separable parameter for 1D convolution optimization RandomGaussianIllumination Illumination - Similar core functionality: apply illumination effects- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support controlling effect intensity and position- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: Gaussian illumination patterns only\u00a0\u00a0\u00a0\u00a0- Albumentations: Multiple modes (linear, corner, gaussian)\u00a0\u00a02. Parameter ranges:\u00a0\u00a0\u00a0\u00a0- Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- mode parameter for different effect types\u00a0\u00a0\u00a0\u00a0- effect_type for brighten/darken control\u00a0\u00a0\u00a0\u00a0- angle_range for linear gradients\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- sign parameter for effect direction RandomGaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed mean (default: 0.0) and std (default: 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- per_channel parameter for independent channel noise\u00a0\u00a0\u00a0\u00a0- noise_scale_factor for performance optimization\u00a0\u00a0\u00a0\u00a0- Automatic value scaling based on image dtype RandomGrayscale ToGray - Similar core functionality: convert images to grayscale- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=0.1\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a02. Conversion options:\u00a0\u00a0\u00a0\u00a0- Kornia: customizable rgb_weights for channel mixing\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca)\u00a0\u00a03. Output control:\u00a0\u00a0\u00a0\u00a0- Kornia: always 3-channel output\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable num_output_channels RandomHue ColorJitter (hue parameter) - Similar core functionality: adjust image hue- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: hue-only transform\u00a0\u00a0\u00a0\u00a0- Albumentations: part of ColorJitter with brightness, contrast, and saturation\u00a0\u00a02. Default values:\u00a0\u00a0\u00a0\u00a0- Kornia: hue=(0.0, 0.0), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: hue=(-0.5, 0.5), p=0.5 RandomInvert InvertImg - Similar core functionality: invert image values- Key differences:\u00a0\u00a01. Maximum value handling:\u00a0\u00a0\u00a0\u00a0- Kornia: configurable via max_val parameter (default: 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32) RandomJPEG ImageCompression - Similar core functionality: apply image compression- Key differences:\u00a0\u00a01. Compression options:\u00a0\u00a0\u00a0\u00a0- Kornia: JPEG only\u00a0\u00a0\u00a0\u00a0- Albumentations: supports both JPEG and WebP\u00a0\u00a02. Quality specification:\u00a0\u00a0\u00a0\u00a0- Kornia: jpeg_quality (default: 50.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: quality_range (default: (99, 100))\u00a0\u00a03. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5 RandomLinearCornerIllumination Illumination (corner mode) - Similar core functionality: apply corner illumination effects- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: corner illumination only\u00a0\u00a0\u00a0\u00a0- Albumentations: part of general Illumination transform with multiple modes\u00a0\u00a02. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Multiple illumination modes (linear, corner, gaussian)\u00a0\u00a0\u00a0\u00a0- More control over effect parameters RandomLinearIllumination Illumination (linear mode) - Similar core functionality: apply linear illumination effects- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: linear illumination only\u00a0\u00a0\u00a0\u00a0- Albumentations: part of general Illumination transform with multiple modes\u00a0\u00a02. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Multiple illumination modes (linear, corner, gaussian)\u00a0\u00a0\u00a0\u00a0- Explicit angle control for gradient direction RandomMedianBlur MedianBlur - Similar core functionality: apply median blur filter- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed kernel_size tuple (default: (3, 3))\u00a0\u00a0\u00a0\u00a0- Albumentations: range via blur_limit (default: (3, 7))\u00a0\u00a02. Kernel constraints:\u00a0\u00a0\u00a0\u00a0- Albumentations: enforces odd kernel sizes RandomMotionBlur MotionBlur - Similar core functionality: apply directional motion blur- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support angle and direction control- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: kernel_size as int or tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (default: (3, 7))\u00a0\u00a02. Angle control:\u00a0\u00a0\u00a0\u00a0- Kornia: angle parameter with symmetric range (-angle, angle)\u00a0\u00a0\u00a0\u00a0- Albumentations: angle_range (default: (0, 360))\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- allow_shifted parameter for kernel position control RandomPlanckianJitter PlanckianJitter - Similar core functionality: apply physics-based color temperature variations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support 'blackbody' and 'cied' modes- Key differences:\u00a0\u00a01. Temperature control:\u00a0\u00a0\u00a0\u00a0- Kornia: select_from parameter for discrete jitter selection\u00a0\u00a0\u00a0\u00a0- Albumentations: temperature_limit for continuous range\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- sampling_method parameter ('uniform' or 'gaussian')\u00a0\u00a0\u00a0\u00a0- More detailed control over temperature ranges\u00a0\u00a0\u00a0\u00a0- Better documentation of physics-based effects RandomPlasmaBrightness PlasmaBrightnessContrast - Similar core functionality: apply fractal-based brightness adjustments- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Combined brightness and contrast adjustment\u00a0\u00a0\u00a0\u00a0- plasma_size parameter for pattern detail control\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation and documentation RandomPlasmaContrast PlasmaBrightnessContrast - Similar core functionality: apply fractal-based contrast adjustments- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7) only\u00a0\u00a0\u00a0\u00a0- Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256)\u00a0\u00a02. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: contrast-only adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: combined brightness and contrast adjustment\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation\u00a0\u00a0\u00a0\u00a0- Pattern size control via plasma_size RandomPlasmaShadow PlasmaShadow - Similar core functionality: apply fractal-based shadow effects- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Pattern size control via plasma_size\u00a0\u00a0\u00a0\u00a0- More intuitive intensity range (0 to 1)\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation and documentation RandomPosterize Posterize - Similar core functionality: reduce color bits in image- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both operate on color bit reduction- Key differences:\u00a0\u00a01. Bit specification:\u00a0\u00a0\u00a0\u00a0- Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single int for all channels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Tuple for random range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* List for per-channel specification\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* List of tuples for per-channel ranges\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- More flexible channel-wise control\u00a0\u00a0\u00a0\u00a0- More detailed documentation and mathematical background RandomRain RandomRain - Similar core functionality: add rain effects to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Rain parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5)\u00a0\u00a0\u00a0\u00a0- Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- drop_color customization\u00a0\u00a0\u00a0\u00a0- blur_value for atmospheric effect\u00a0\u00a0\u00a0\u00a0- brightness_coefficient for lighting adjustment\u00a0\u00a0\u00a0\u00a0- rain_type presets (drizzle, heavy, torrential)\u00a0\u00a03. Approach:\u00a0\u00a0\u00a0\u00a0- Kornia: Direct drop placement\u00a0\u00a0\u00a0\u00a0- Albumentations: More realistic simulation with slant, blur, and brightness effects RandomRGBShift AdditiveNoise - Similar core functionality: add noise/shifts to image channels- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both can affect individual channels- Key differences:\u00a0\u00a01. 
Approach:\u00a0\u00a0\u00a0\u00a0- Kornia: Simple RGB channel shifts with individual limits\u00a0\u00a0\u00a0\u00a0- Albumentations: More sophisticated noise generation with multiple distributions\u00a0\u00a02. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: r_shift_limit, g_shift_limit, b_shift_limit (all default: 0.5)\u00a0\u00a0\u00a0\u00a0- Albumentations: Flexible noise configuration with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple noise types (uniform, gaussian, laplace, beta)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Different spatial modes (constant, per_pixel, shared)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Customizable distribution parameters\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Performance optimization options\u00a0\u00a0\u00a0\u00a0- More detailed control over noise distribution\u00a0\u00a0\u00a0\u00a0- Spatial application modes RandomSaltAndPepperNoise SaltAndPepper - Similar core functionality: apply salt and pepper noise to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use same default parameters:\u00a0\u00a0\u00a0\u00a0- amount (0.01, 0.06)\u00a0\u00a0\u00a0\u00a0- salt_vs_pepper (0.4, 0.6)- Key differences:\u00a0\u00a01. Parameter flexibility:\u00a0\u00a0\u00a0\u00a0- Kornia: Supports single float or tuple for parameters\u00a0\u00a0\u00a0\u00a0- Albumentations: Requires tuples for ranges\u00a0\u00a02. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Detailed mathematical formulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Clear examples for different noise levels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Implementation notes and edge cases\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* References to academic sources RandomSaturation ColorJitter - Different scope and functionality:- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: Saturation-only adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: Combined brightness, contrast, saturation, and hue adjustment\u00a0\u00a02. Default parameters:\u00a0\u00a0\u00a0\u00a0- Kornia: saturation (1.0, 1.0), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: saturation (0.8, 1.2), p=0.5\u00a0\u00a03. Implementation:\u00a0\u00a0\u00a0\u00a0- Kornia: Aligns with PIL/TorchVision implementation\u00a0\u00a0\u00a0\u00a0- Albumentations: Uses OpenCV with noted differences in HSV conversion\u00a0\u00a04. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Brightness adjustment\u00a0\u00a0\u00a0\u00a0- Contrast adjustment\u00a0\u00a0\u00a0\u00a0- Hue adjustment\u00a0\u00a0\u00a0\u00a0- Random order of transformations RandomSharpness Sharpen - Similar core functionality: sharpen images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single sharpness parameter (default: 0.5)\u00a0\u00a0\u00a0\u00a0- Albumentations: More detailed control with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* alpha (0.2, 0.5) for effect visibility\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* lightness (0.5, 1.0) for contrast\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* method choice ('kernel' or 'gaussian')\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* kernel_size and sigma for gaussian method\u00a0\u00a02. 
Implementation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: Single approach\u00a0\u00a0\u00a0\u00a0- Albumentations: Two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Kernel-based using Laplacian operator\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Gaussian interpolation\u00a0\u00a03. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides detailed mathematical formulation and references RandomSnow RandomSnow - Similar core functionality: add snow effects to images- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5\u00a0\u00a02. Implementation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: Single approach\u00a0\u00a0\u00a0\u00a0- Albumentations: Two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"bleach\": Simple pixel value thresholding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"texture\": Advanced snow texture simulation\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Detailed snow simulation with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* HSV color space manipulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Gaussian noise for texture\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Depth effect simulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Sparkle effects\u00a0\u00a04. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides detailed mathematical formulation and implementation notes RandomSolarize Solarize - Similar core functionality: invert pixel values above threshold- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Two parameters:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* thresholds (default: 0.1) for threshold range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* additions (default: 0.1) for value adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: Single parameter:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* threshold_range (default: (0.5, 0.5))\u00a0\u00a02. Threshold handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Generates from (0.5 - x, 0.5 + x) for float input\u00a0\u00a0\u00a0\u00a0- Albumentations: Direct range specification, scaled by image type max value\u00a0\u00a03. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Detailed examples for both uint8 and float32 images\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Clear mathematical formulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Image type-specific behavior explanation CenterCrop CenterCrop - Similar core functionality: crop center of image- Key similarities:\u00a0\u00a01. Both have default probability p=1.0- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size parameter (int or tuple)\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate height and width parameters\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners for interpolation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* resample mode selection\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* cropping_mode ('slice' or 'resample')\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* pad_if_needed for handling small images\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* border_mode for padding method\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* fill and fill_mask for padding values\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* pad_position options\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Supports images, masks, bboxes, and keypoints PadTo PadIfNeeded - Can achieve same core functionality- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Can pad to exact size:\u00a0\u00a0\u00a0\u00a0- Kornia: size=(height, width)\u00a0\u00a0\u00a0\u00a0- Albumentations: min_height=height, min_width=width- Key differences:\u00a0\u00a01. Parameter naming:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate dimension parameters\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simple pad_mode selection\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single pad_value\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible position options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate fill and fill_mask\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Optional divisibility padding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple target support\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomAffine Affine - Similar core functionality: apply affine transformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support rotation, translation, scaling, and shear- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* degrees for rotation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* translate as fraction\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale as tuple\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear in degrees\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More flexible parameter formats\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Supports both percent and pixel translation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Dictionary format for independent axis control\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* fit_output for automatic size adjustment\u00a0\u00a0\u00a0\u00a0* keep_ratio for aspect ratio preservation\u00a0\u00a0\u00a0\u00a0* rotate_method options\u00a0\u00a0\u00a0\u00a0* balanced_scale for even scale distribution\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomCrop RandomCrop - Similar core functionality: randomly crop image patches- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Both support padding if needed- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size tuple (height, width)\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate height and width parameters\u00a0\u00a02. Padding options:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible padding sizes (int, tuple[2], tuple[4])\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple padding modes (constant, reflect, replicate)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single fill value\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler padding interface\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate fill values for image and mask\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible pad positioning\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomElasticTransform ElasticTransform - Similar core functionality: apply elastic deformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Gaussian smoothing for displacement fields\u00a0\u00a03. Both support independent control of x/y deformations:\u00a0\u00a0\u00a0\u00a0- Kornia: via separate values in sigma/alpha tuples\u00a0\u00a0\u00a0\u00a0- Albumentations: via same_dxdy parameter- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* kernel_size tuple (63, 63)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* sigma tuple (32.0, 32.0)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* alpha tuple (1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single sigma (default: 50.0)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single alpha (default: 1.0)\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Control over padding mode\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* approximate mode for faster processing\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Choice of noise distribution\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomErasing Erasing - Similar core functionality: randomly erase rectangular regions- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Same default parameters:\u00a0\u00a0\u00a0\u00a0* scale (0.02, 0.33)\u00a0\u00a0\u00a0\u00a0* ratio (0.3, 3.3)- Key differences:\u00a0\u00a01. Fill value options:\u00a0\u00a0\u00a0\u00a0- Kornia: Simple numeric value (default: 0.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: Rich fill options:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Numeric values\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"random\" per pixel\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"random_uniform\" per region\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"inpaint_telea\" method\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"inpaint_ns\" method\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate mask_fill value\u00a0\u00a0\u00a0\u00a0* Support for masks, bboxes, keypoints\u00a0\u00a0\u00a0\u00a0* Inpainting options for more natural-looking results\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomFisheye OpticalDistortion - Similar core functionality: apply optical/fisheye distortion- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support fisheye distortion- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate center_x, center_y for distortion center\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* gamma for distortion strength\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single distort_limit parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* mode selection ('camera' or 'fisheye')\u00a0\u00a02. 
Distortion models:\u00a0\u00a0\u00a0\u00a0- Kornia: Fisheye only\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Camera matrix model\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fisheye model\u00a0\u00a03. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate interpolation methods for image and mask\u00a0\u00a0\u00a0\u00a0* Support for masks, bboxes, keypoints\u00a0\u00a04. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomHorizontalFlip HorizontalFlip - Similar core functionality: flip image horizontally- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Simple operation with same visual result- Key differences:\u00a0\u00a01. Batch handling:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Additional p_batch parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* same_on_batch option\u00a0\u00a0\u00a0\u00a0- Albumentations: No batch-specific parameters\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomPerspective Perspective - Similar core functionality: apply perspective transformation- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both transform image by moving corners\u00a0\u00a03. Both support different interpolation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: via resample (BILINEAR, NEAREST)\u00a0\u00a0\u00a0\u00a0- Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.)- Key differences:\u00a0\u00a01. Distortion control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* distortion_scale (0 to 1, default: 0.5)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* sampling_method ('basic' or 'area_preserving')\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale tuple for corner movement range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* fit_output option for image capture\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keepdim for batch form\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keep_size for output dimensions\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Border mode and fill options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomResizedCrop RandomResizedCrop - Similar core functionality: crop random patches and resize- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Same default parameters:\u00a0\u00a0\u00a0\u00a0* scale (0.08, 1.0)\u00a0\u00a0\u00a0\u00a0* ratio (~0.75, ~1.33)\u00a0\u00a03. Both support different interpolation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: via resample\u00a0\u00a0\u00a0\u00a0- Albumentations: via interpolation- Key differences:\u00a0\u00a01. 
Implementation options:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* cropping_mode ('slice' or 'resample')\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keepdim for batch form\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fallback to center crop after 10 attempts\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomRotation90 RandomRotate90 - Similar core functionality: rotate image by 90 degrees- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both rotate in 90-degree increments- Key differences:\u00a0\u00a01. Rotation control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* times parameter to specify range of rotations\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* resample and align_corners for interpolation\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler implementation (0-3 rotations)\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomRotation Rotate - Similar core functionality: rotate image by random angle- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support different interpolation methods- Key differences:\u00a0\u00a01. Angle specification:\u00a0\u00a0\u00a0\u00a0- Kornia: degrees parameter (if single value, range is (-degrees, +degrees))\u00a0\u00a0\u00a0\u00a0- Albumentations: limit parameter (default: (-90, 90))\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners for interpolation\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Border mode options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fill values for padding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* rotate_method for bboxes\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* crop_border option\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomShear Affine (shear parameter) - Similar core functionality: apply shear transformation- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support different interpolation methods\u00a0\u00a03. Both support independent x/y shear control- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Dedicated shear transform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear parameter supports float, tuple(2), or tuple(4)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simple padding modes (zeros, border, reflection)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Part of general Affine transform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear supports number, tuple, or dict format\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More border modes and fill options\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a0\u00a0\u00a0* fit_output option\u00a0\u00a0\u00a0\u00a0* Combined with other affine transforms\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomThinPlateSpline ThinPlateSpline - Similar core functionality: apply smooth, non-rigid deformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use thin plate spline algorithm\u00a0\u00a03. Both support interpolation options- Key differences:\u00a0\u00a01. Deformation control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single scale parameter (default: 0.2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fixed control point grid\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale_range tuple for range of deformation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Configurable num_control_points\u00a0\u00a02. Implementation details:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Binary mode choice (bilinear/nearest)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* OpenCV interpolation flags\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More granular control over grid\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomVerticalFlip VerticalFlip - Similar core functionality: flip image vertically- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Simple operation with same visual result- Key differences:\u00a0\u00a01. Implementation:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Additional p_batch parameter\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler implementation\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints"},{"location":"getting_started/augmentation_mapping/#key-differences_1","title":"Key Differences","text":""},{"location":"getting_started/augmentation_mapping/#compared-to-torchvision_1","title":"Compared to TorchVision","text":"
  • Albumentations operates on numpy arrays instead of PyTorch tensors
  • Albumentations typically provides more parameters for fine-tuning transformations
  • Most Albumentations transforms support both image and mask augmentation
  • Better support for bounding box and keypoint augmentation (see the sketch after this list)
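The last point is easiest to see in code. Below is a minimal sketch of a single Albumentations pipeline augmenting an image together with its mask and bounding boxes; the random image, zero mask, and label are placeholders, not data from this guide.

Python
import albumentations as A
import numpy as np

# A single pipeline that augments the image, its mask, and its bounding boxes together.
transform = A.Compose(
    [A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.2)],
    bbox_params=A.BboxParams(format='pascal_voc', label_fields=['class_labels']),
)

# Placeholder data standing in for a real image, mask, and annotations.
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
mask = np.zeros((480, 640), dtype=np.uint8)
bboxes = [[98, 345, 420, 462]]
class_labels = ['cat']

augmented = transform(image=image, mask=mask, bboxes=bboxes, class_labels=class_labels)
augmented_image, augmented_mask = augmented['image'], augmented['mask']
augmented_bboxes = augmented['bboxes']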
"},{"location":"getting_started/augmentation_mapping/#compared-to-kornia_1","title":"Compared to Kornia","text":"
  • Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays
  • Albumentations provides more comprehensive support for object detection and segmentation tasks
  • Albumentations typically offers better performance for CPU-based augmentations
"},{"location":"getting_started/augmentation_mapping/#performance-comparison","title":"Performance Comparison","text":"

According to benchmarking results, Albumentations generally offers superior CPU performance compared to TorchVision and Kornia for most transforms. Here are some key highlights.

Common Transforms Performance (images/second, higher is better)

| Transform | Albumentations | TorchVision | Kornia | Notes |
|---|---|---|---|---|
| HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia |
| VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia |
| RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia |
| Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both |
| ColorJitter | 628 | 46 | 55 | Albumentations is ~13x faster than both |
"},{"location":"getting_started/augmentation_mapping/#key-performance-insights","title":"Key Performance Insights:","text":"
  • Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
  • Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
  • Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU
"},{"location":"getting_started/augmentation_mapping/#when-to-choose-each-library","title":"When to Choose Each Library:","text":"
  • Albumentations: Best choice for CPU-based preprocessing pipelines and when maximum performance is needed
  • Kornia: Consider when doing augmentation on GPU with existing PyTorch tensors
  • TorchVision: Good choice when deeply integrated into PyTorch ecosystem and GPU performance isn't critical

Note: Benchmarks performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.

"},{"location":"getting_started/augmentation_mapping/#code-examples","title":"Code Examples","text":""},{"location":"getting_started/augmentation_mapping/#torchvision-to-albumentations","title":"TorchVision to Albumentations","text":"Python
# TorchVision\ntransforms = T.Compose([\n    T.RandomHorizontalFlip(p=0.5),\n    T.RandomRotation(10),\n    T.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n\n# Albumentations equivalent\ntransforms = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.Rotate(limit=10),\n    A.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n
"},{"location":"getting_started/augmentation_mapping/#kornia-to-albumentations_1","title":"Kornia to Albumentations","text":"Python
# Kornia\ntransforms = K.AugmentationSequential(\n    K.RandomHorizontalFlip(p=0.5),\n    K.RandomRotation(degrees=10),\n    K.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n)\n\n# Albumentations equivalent\ntransforms = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.Rotate(limit=10),\n    A.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n
"},{"location":"getting_started/augmentation_mapping/#additional-resources","title":"Additional Resources","text":"
  • TorchVision Transforms Documentation
  • Kornia Augmentation Documentation
  • Albumentations Documentation
"},{"location":"getting_started/bounding_boxes_augmentation/","title":"Bounding boxes augmentation for object detection","text":""},{"location":"getting_started/bounding_boxes_augmentation/#different-annotations-formats","title":"Different annotations formats","text":"

Bounding boxes are rectangles that mark objects on an image. There are multiple formats of bounding box annotations. Each format uses its own representation of bounding box coordinates. Albumentations supports four formats: pascal_voc, albumentations, coco, and yolo.

Let's take a look at each of those formats and how they represent coordinates of bounding boxes.

As an example, we will use an image from the dataset named Common Objects in Context. It contains one bounding box that marks a cat. The image width is 640 pixels, and its height is 480 pixels. The width of the bounding box is 322 pixels, and its height is 117 pixels.

The bounding box has the following (x, y) coordinates of its corners: top-left is (x_min, y_min) or (98px, 345px), top-right is (x_max, y_min) or (420px, 345px), bottom-left is (x_min, y_max) or (98px, 462px), bottom-right is (x_max, y_max) or (420px, 462px). As you see, the coordinates of the bounding box's corners are calculated with respect to the top-left corner of the image, which has (x, y) coordinates (0, 0).

An example image with a bounding box from the COCO dataset

"},{"location":"getting_started/bounding_boxes_augmentation/#pascal_voc","title":"pascal_voc","text":"

pascal_voc is a format used by the Pascal VOC dataset. Coordinates of a bounding box are encoded with four values in pixels: [x_min, y_min, x_max, y_max]. x_min and y_min are the coordinates of the top-left corner of the bounding box. x_max and y_max are the coordinates of the bottom-right corner of the bounding box.

Coordinates of the example bounding box in this format are [98, 345, 420, 462].

"},{"location":"getting_started/bounding_boxes_augmentation/#albumentations","title":"albumentations","text":"

albumentations is similar to pascal_voc, because it also uses four values [x_min, y_min, x_max, y_max] to represent a bounding box. But unlike pascal_voc, albumentations uses normalized values. To normalize the values, we divide the x-coordinates (in pixels) by the width of the image and the y-coordinates by its height.

Coordinates of the example bounding box in this format are [98 / 640, 345 / 480, 420 / 640, 462 / 480] which are [0.153125, 0.71875, 0.65625, 0.9625].

Albumentations uses this format internally to work with bounding boxes and augment them.

"},{"location":"getting_started/bounding_boxes_augmentation/#coco","title":"coco","text":"

coco is a format used by the Common Objects in Context (COCO) dataset.

In coco, a bounding box is defined by four values in pixels [x_min, y_min, width, height]. They are coordinates of the top-left corner along with the width and height of the bounding box.

Coordinates of the example bounding box in this format are [98, 345, 322, 117].

"},{"location":"getting_started/bounding_boxes_augmentation/#yolo","title":"yolo","text":"

In yolo, a bounding box is represented by four values [x_center, y_center, width, height]. x_center and y_center are the normalized coordinates of the center of the bounding box. To normalize the coordinates, we take the pixel values of x and y, which mark the center of the bounding box on the x- and y-axis, and divide the value of x by the width of the image and the value of y by the height of the image. width and height represent the width and the height of the bounding box. They are normalized as well.

Coordinates of the example bounding box in this format are [((420 + 98) / 2) / 640, ((462 + 345) / 2) / 480, 322 / 640, 117 / 480] which are [0.4046875, 0.840625, 0.503125, 0.24375].

How different formats represent coordinates of a bounding box
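To make the relationships between these formats concrete, here is a small sketch that reproduces the numbers above with plain Python. The helper functions are illustrative only and are not part of the Albumentations API.

Python
# Illustrative conversions of the example cat bounding box between formats.
# These helpers only mirror the definitions given above.

def pascal_voc_to_albumentations(box, width, height):
    x_min, y_min, x_max, y_max = box
    return [x_min / width, y_min / height, x_max / width, y_max / height]

def pascal_voc_to_coco(box):
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]

def pascal_voc_to_yolo(box, width, height):
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / width
    y_center = (y_min + y_max) / 2 / height
    return [x_center, y_center, (x_max - x_min) / width, (y_max - y_min) / height]

box = [98, 345, 420, 462]  # pascal_voc
print(pascal_voc_to_albumentations(box, 640, 480))  # [0.153125, 0.71875, 0.65625, 0.9625]
print(pascal_voc_to_coco(box))                       # [98, 345, 322, 117]
print(pascal_voc_to_yolo(box, 640, 480))             # [0.4046875, 0.840625, 0.503125, 0.24375]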

"},{"location":"getting_started/bounding_boxes_augmentation/#bounding-boxes-augmentation","title":"Bounding boxes augmentation","text":"

Just like with image and mask augmentation, the process of augmenting bounding boxes consists of four steps.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and bounding boxes from the disk.
  4. You pass an image and bounding boxes to the augmentation pipeline and receive augmented images and boxes.

Note

Some transforms in Albumentations don't support bounding boxes. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment bounding boxes.

"},{"location":"getting_started/bounding_boxes_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"Python
import albumentations as A\nimport cv2\n
"},{"location":"getting_started/bounding_boxes_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

Here is an example of a minimal declaration of an augmentation pipeline that works with bounding boxes.

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco'))\n

Note that unlike image and mask augmentation, Compose now has an additional parameter, bbox_params. You need to pass an instance of A.BboxParams to that argument. A.BboxParams specifies settings for working with bounding boxes. format sets the format for bounding box coordinates.

It can be either pascal_voc, albumentations, coco, or yolo. This value is required because Albumentations needs to know the source format of the bounding box coordinates to apply augmentations correctly.

Besides format, A.BboxParams supports a few more settings.

Here is an example of Compose that shows all available settings with A.BboxParams:

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco', min_area=1024, min_visibility=0.1, label_fields=['class_labels']))\n
"},{"location":"getting_started/bounding_boxes_augmentation/#min_area-and-min_visibility","title":"min_area and min_visibility","text":"

The min_area and min_visibility parameters control what Albumentations should do with augmented bounding boxes if their size changes after augmentation. The size of bounding boxes can change if you apply spatial augmentations, for example, when you crop a part of an image or when you resize an image.

min_area is a value in pixels. If the area of a bounding box after augmentation becomes smaller than min_area, Albumentations will drop that box. So the returned list of augmented bounding boxes won't contain that bounding box.

min_visibility is a value between 0 and 1. If the ratio of the bounding box area after augmentation to the area of the bounding box before augmentation becomes smaller than min_visibility, Albumentations will drop that box. So if the augmentation process cuts out most of the bounding box, that box won't be present in the returned list of augmented bounding boxes.

Here is an example image that contains two bounding boxes. Bounding boxes coordinates are declared using the coco format.

An example image with two bounding boxes

First, we apply the CenterCrop augmentation without declaring parameters min_area and min_visibility. The augmented image contains two bounding boxes.

An example image with two bounding boxes after applying augmentation

Next, we apply the same CenterCrop augmentation, but now we also use the min_area parameter. Now, the augmented image contains only one bounding box, because the other bounding box's area after augmentation became smaller than min_area, so Albumentations dropped that bounding box.

An example image with one bounding box after applying augmentation with 'min_area'

Finally, we apply the CenterCrop augmentation with min_visibility. After that augmentation, the resulting image doesn't contain any bounding boxes, because the visibility of every bounding box after augmentation is below the threshold set by min_visibility.

An example image with zero bounding boxes after applying augmentation with 'min_visibility'
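A minimal sketch of pipelines like the ones used in this walkthrough is shown below. The crop size and the min_area and min_visibility values are illustrative assumptions; the guide does not state the exact values used to produce the figures.

Python
import albumentations as A

# Without filtering: every transformed box that survives the crop is kept.
transform_plain = A.Compose(
    [A.CenterCrop(height=280, width=280)],
    bbox_params=A.BboxParams(format='coco', label_fields=['class_labels']),
)

# With min_area: boxes whose area shrinks below 4500 px^2 are dropped.
transform_min_area = A.Compose(
    [A.CenterCrop(height=280, width=280)],
    bbox_params=A.BboxParams(format='coco', min_area=4500, label_fields=['class_labels']),
)

# With min_visibility: boxes that keep less than 30% of their original area are dropped.
transform_min_visibility = A.Compose(
    [A.CenterCrop(height=280, width=280)],
    bbox_params=A.BboxParams(format='coco', min_visibility=0.3, label_fields=['class_labels']),
)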

"},{"location":"getting_started/bounding_boxes_augmentation/#class-labels-for-bounding-boxes","title":"Class labels for bounding boxes","text":"

Besides coordinates, each bounding box should have an associated class label that tells which object lies inside the bounding box. There are two ways to pass a label for a bounding box.

Let's say you have an example image with three objects: dog, cat, and sports ball. Bounding boxes coordinates in the coco format for those objects are [23, 74, 295, 388], [377, 294, 252, 161], and [333, 421, 49, 49].

An example image with 3 bounding boxes from the COCO dataset

"},{"location":"getting_started/bounding_boxes_augmentation/#1-you-can-pass-labels-along-with-bounding-boxes-coordinates-by-adding-them-as-additional-values-to-the-list-of-coordinates","title":"1. You can pass labels along with bounding boxes coordinates by adding them as additional values to the list of coordinates.","text":"

For the image above, bounding boxes with class labels will become [23, 74, 295, 388, 'dog'], [377, 294, 252, 161, 'cat'], and [333, 421, 49, 49, 'sports ball'].

Class labels can be of any type: integer, string, or any other Python data type. For example, with integer values as class labels, the bounding boxes will look like the following: [23, 74, 295, 388, 18], [377, 294, 252, 161, 17], and [333, 421, 49, 49, 37].

Also, you can use multiple class values for each bounding box, for example [23, 74, 295, 388, 'dog', 'animal'], [377, 294, 252, 161, 'cat', 'animal'], and [333, 421, 49, 49, 'sports ball', 'item'].

"},{"location":"getting_started/bounding_boxes_augmentation/#2you-can-pass-labels-for-bounding-boxes-as-a-separate-list-the-preferred-way","title":"2.You can pass labels for bounding boxes as a separate list (the preferred way).","text":"

For example, if you have three bounding boxes like [23, 74, 295, 388], [377, 294, 252, 161], and [333, 421, 49, 49], you can create a separate list with values like ['dog', 'cat', 'sports ball'] or [18, 17, 37] that contains class labels for those bounding boxes. Next, you pass that list with class labels as a separate argument to the transform function. Albumentations needs to know the names of all those lists with class labels to join them with the augmented bounding boxes correctly. Then, if a bounding box is dropped after augmentation because it is no longer visible, Albumentations will drop the class label for that box as well. Use the label_fields parameter to set names for all arguments in transform that will contain label descriptions for bounding boxes (more on that in Step 4).

"},{"location":"getting_started/bounding_boxes_augmentation/#step-3-read-images-and-bounding-boxes-from-the-disk","title":"Step 3. Read images and bounding boxes from the disk.","text":"

Read an image from the disk.

Python
image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n

Bounding boxes can be stored on the disk in different serialization formats: JSON, XML, YAML, CSV, etc. So the code to read bounding boxes depends on the actual format of data on the disk.

After you read the data from the disk, you need to prepare bounding boxes for Albumentations.

Albumentations expects that bounding boxes will be represented as a list of lists. Each list contains information about a single bounding box. A bounding box definition should have at least four elements that represent the coordinates of that bounding box. The actual meaning of those four values depends on the format of bounding boxes (either pascal_voc, albumentations, coco, or yolo). Besides the four coordinates, each definition of a bounding box may contain one or more extra values. You can use those extra values to store additional information about the bounding box, such as a class label of the object inside the box. During augmentation, Albumentations will not process those extra values. The library will return them as is, along with the updated coordinates of the augmented bounding box.
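For example, if your annotations come from a COCO-style JSON file, preparing them might look like the following sketch. The annotations structure and its keys are assumptions about your own data layout, not an Albumentations requirement.

Python
# Hypothetical annotations, e.g. parsed from a COCO-style JSON file on disk.
annotations = [
    {"bbox": [23, 74, 295, 388], "category": "dog"},
    {"bbox": [377, 294, 252, 161], "category": "cat"},
    {"bbox": [333, 421, 49, 49], "category": "sports ball"},
]

# Albumentations expects a list of lists with at least four coordinates per box.
bboxes = [ann["bbox"] for ann in annotations]
class_labels = [ann["category"] for ann in annotations]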

"},{"location":"getting_started/bounding_boxes_augmentation/#step-4-pass-an-image-and-bounding-boxes-to-the-augmentation-pipeline-and-receive-augmented-images-and-boxes","title":"Step 4. Pass an image and bounding boxes to the augmentation pipeline and receive augmented images and boxes.","text":"

As discussed in Step 2, there are two ways of passing class labels along with bounding boxes coordinates:

"},{"location":"getting_started/bounding_boxes_augmentation/#1-pass-class-labels-along-with-coordinates","title":"1. Pass class labels along with coordinates","text":"

So, if you have coordinates of three bounding boxes that look like this:

Python
bboxes = [\n    [23, 74, 295, 388],\n    [377, 294, 252, 161],\n    [333, 421, 49, 49],\n]\n

you can add a class label for each bounding box as an additional element of the list along with the four coordinates. The list with bounding boxes and their coordinates will then look like the following:

Python
bboxes = [\n    [23, 74, 295, 388, 'dog'],\n    [377, 294, 252, 161, 'cat'],\n    [333, 421, 49, 49, 'sports ball'],\n]\n

or with multiple labels for each bounding box: Python

bboxes = [\n    [23, 74, 295, 388, 'dog', 'animal'],\n    [377, 294, 252, 161, 'cat', 'animal'],\n    [333, 421, 49, 49, 'sports ball', 'item'],\n]\n

You can use any data type for declaring class labels. It can be a string, an integer, or any other Python data type.

Next, you pass an image and bounding boxes for it to the transform function and receive the augmented image and bounding boxes.

Python
transformed = transform(image=image, bboxes=bboxes)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\n

Example input and output data for bounding boxes augmentation

"},{"location":"getting_started/bounding_boxes_augmentation/#2-pass-class-labels-in-a-separate-argument-to-transform-the-preferred-way","title":"2. Pass class labels in a separate argument to transform (the preferred way).","text":"

Let's say you have coordinates of three bounding boxes: Python

bboxes = [\n    [23, 74, 295, 388],\n    [377, 294, 252, 161],\n    [333, 421, 49, 49],\n]\n

You can create a separate list that contains class labels for those bounding boxes:

Python
class_labels = ['cat', 'dog', 'parrot']\n

Then you pass both bounding boxes and class labels to transform. Note that to pass class labels, you need to use the name of the argument that you declared in label_fields when creating an instance of Compose in step 2. In our case, we set the name of the argument to class_labels.

Python
transformed = transform(image=image, bboxes=bboxes, class_labels=class_labels)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\ntransformed_class_labels = transformed['class_labels']\n

Example input and output data for bounding boxes augmentation with a separate argument for class labels

Note that label_fields expects a list, so you can set multiple fields that contain labels for your bounding boxes. So if you declare Compose like

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco', label_fields=['class_labels', 'class_categories']))\n

you can use those multiple arguments to pass info about class labels, like

Python
class_labels = ['cat', 'dog', 'parrot']\nclass_categories = ['animal', 'animal', 'item']\n\ntransformed = transform(image=image, bboxes=bboxes, class_labels=class_labels, class_categories=class_categories)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\ntransformed_class_labels = transformed['class_labels']\ntransformed_class_categories = transformed['class_categories']\n
"},{"location":"getting_started/bounding_boxes_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
"},{"location":"getting_started/image_augmentation/","title":"Image augmentation for classification","text":"

We can divide the process of image augmentation into four steps:

  1. Import albumentations and a library to read images from the disk (e.g., OpenCV).
  2. Define an augmentation pipeline.
  3. Read images from the disk.
  4. Pass images to the augmentation pipeline and receive augmented images.
"},{"location":"getting_started/image_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"
  • Import Albumentations
Python
import albumentations as A\n
  • Import a library to read images from the disk. In this example, we will use OpenCV. It is an open-source computer vision library that supports many image formats. Albumentations has OpenCV as a dependency, so you already have OpenCV installed.
Python
import cv2\n
"},{"location":"getting_started/image_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

To define an augmentation pipeline, you need to create an instance of the Compose class. As an argument to the Compose class, you need to pass a list of augmentations you want to apply. A call to Compose will return a transform function that will perform image augmentation.

Let's look at an example:

Python
transform = A.Compose([\n    A.RandomCrop(width=256, height=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n

In the example, Compose receives a list with three augmentations: A.RandomCrop, A.HorizontalFlip, and A.RandomBrightnessContrast. You can find the full list of all available augmentations in the GitHub repository and in the API Docs. A demo playground that demonstrates how augmentations will transform the input image is available at https://explore.albumentations.ai.

To create an augmentation, you create an instance of the required augmentation class and pass augmentation parameters to it. A.RandomCrop receives two parameters, height and width. A.RandomCrop(width=256, height=256) means that A.RandomCrop will take an input image, extract a random patch with size 256 by 256 pixels from it and then pass the result to the next augmentation in the pipeline (in this case to A.HorizontalFlip).

A.HorizontalFlip in this example has one parameter named p. p is a special parameter that is supported by almost all augmentations. It controls the probability of applying the augmentation. p=0.5 means that with a probability of 50%, the transform will flip the image horizontally, and with a probability of 50%, the transform won't modify the input image.

A.RandomBrightnessContrast in the example also has one parameter, p. With a probability of 20%, this augmentation will change the brightness and contrast of the image received from A.HorizontalFlip. And with a probability of 80%, it will keep the received image unchanged.

A visualized version of the augmentation pipeline. You pass an image to it, the image goes through all transformations, and then you receive an augmented image from the pipeline.

"},{"location":"getting_started/image_augmentation/#step-3-read-images-from-the-disk","title":"Step 3. Read images from the disk.","text":"

To pass an image to the augmentation pipeline, you need to read it from the disk. The pipeline expects to receive an image in the form of a NumPy array. If it is a color image, it should have three channels in the following order: Red, Green, Blue (so a regular RGB image).

To read images from the disk, you can use OpenCV - a popular library for image processing. It supports many input formats and is installed along with Albumentations, since Albumentations uses it under the hood for many augmentations.

To import OpenCV

Python
import cv2\n

To read an image with OpenCV

Python

image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
Note the usage of cv2.cvtColor. For historical reasons, OpenCV reads an image in BGR format (so color channels of the image have the following order: Blue, Green, Red). Albumentations uses the most common and popular RGB image format. So when using OpenCV, we need to convert the image format to RGB explicitly.

Besides OpenCV, you can use other image processing libraries.

"},{"location":"getting_started/image_augmentation/#pillow","title":"Pillow","text":"

Pillow is a popular Python image processing library.

  • Install Pillow
Bash
    pip install pillow\n
  • Import Pillow and NumPy (we need NumPy to convert a Pillow image to a NumPy array. NumPy is already installed along with Albumentations).
Python
from PIL import Image\nimport numpy as np\n
  • Read an image with Pillow and convert it to a NumPy array. Python
    pillow_image = Image.open(\"image.jpg\")\nimage = np.array(pillow_image)\n
"},{"location":"getting_started/image_augmentation/#step-4-pass-images-to-the-augmentation-pipeline-and-receive-augmented-images","title":"Step 4. Pass images to the augmentation pipeline and receive augmented images.","text":"

To pass an image to the augmentation pipeline you need to call the transform function created by a call to A.Compose at Step 2. In the image argument to that function, you need to pass an image that you want to augment.

Python
transformed = transform(image=image)\n

transform will return a dictionary with a single key image. Value at that key will contain an augmented image.

Python
transformed_image = transformed[\"image\"]\n

To augment the next image, you need to call transform again and pass a new image as the image argument:

Python
another_transformed_image = transform(image=another_image)[\"image\"]\n

Each augmentation will change the input image with the probability set by the parameter p. Also, many augmentations have parameters that control the magnitude of changes that will be applied to an image. For example, A.RandomBrightnessContrast has two parameters: brightness_limit that controls the magnitude of adjusting brightness and contrast_limit that controls the magnitude of adjusting contrast. The bigger the value, the more the augmentation will change an image. During augmentation, a magnitude of the transformation is sampled from a uniform distribution limited by brightness_limit and contrast_limit. That means that if you make multiple calls to transform with the same input image, you will get a different output image each time.

Python
transform = A.Compose([\n    A.RandomBrightnessContrast(brightness_limit=1, contrast_limit=1, p=1.0),\n])\ntransformed_image_1 = transform(image=image)['image']\ntransformed_image_2 = transform(image=image)['image']\ntransformed_image_3 = transform(image=image)['image']\n

"},{"location":"getting_started/image_augmentation/#examples","title":"Examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Working with non-8-bit images
  • Weather augmentations in Albumentations
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
"},{"location":"getting_started/installation/","title":"Installation","text":"

Albumentations requires Python 3.8 or higher.

"},{"location":"getting_started/installation/#install-the-latest-stable-version-from-pypi","title":"Install the latest stable version from PyPI","text":"Bash
pip install -U albumentations\n
"},{"location":"getting_started/installation/#install-the-latest-version-from-the-master-branch-on-github","title":"Install the latest version from the master branch on GitHub","text":"Bash
pip install -U git+https://github.com/albumentations-team/albumentations\n
"},{"location":"getting_started/installation/#note-on-opencv-dependencies","title":"Note on OpenCV dependencies","text":"

By default, pip downloads a wheel distribution of Albumentations. This distribution has opencv-python-headless as its dependency.

If you already have some OpenCV distribution (such as opencv-python-headless, opencv-python, opencv-contrib-python or opencv-contrib-python-headless) installed in your Python environment, you can force Albumentations to use it by providing the --no-binary qudida,albumentations argument to pip, e.g.

Bash
pip install -U albumentations --no-binary qudida,albumentations\n

pip will use the following logic to determine the required OpenCV distribution:

  1. If your Python environment already contains opencv-python, opencv-contrib-python, opencv-contrib-python-headless or opencv-python-headless, pip will use it.
  2. If your Python environment doesn't contain any OpenCV distribution from step 1, pip will download opencv-python-headless.
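
If you are not sure which OpenCV distribution is already installed, one quick way to check is to list the installed packages (on Windows, replace grep with findstr):

Bash
pip list | grep -i opencv\n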
"},{"location":"getting_started/installation/#install-the-latest-stable-version-from-conda-forge","title":"Install the latest stable version from conda-forge","text":"

If you are using Anaconda or Miniconda you can install Albumentations from conda-forge:

Bash
conda install -c conda-forge albumentations\n
"},{"location":"getting_started/keypoints_augmentation/","title":"Keypoints augmentation","text":"

Computer vision tasks such as human pose estimation, face detection, and emotion recognition usually work with keypoints on the image.

In the case of pose estimation, keypoints mark human joints such as shoulder, elbow, wrist, knee, etc.

Keypoints annotations along with visualized edges between keypoints. Images are from the COCO dataset.

In the case of face detection, keypoints mark important areas of the face such as eyes, nose, corners of the mouth, etc.

Facial keypoints. Source: the \"Facial Keypoints Detection\" competition on Kaggle.

To define a keypoint, you usually need two values, x and y coordinates of the keypoint. Coordinates of the keypoint are calculated with respect to the top-left corner of the image which has (x, y) coordinates (0, 0). Often keypoints have associated labels such as right_elbow, left_wrist, etc.

An example image with five keypoints from the COCO dataset

Some classical computer vision algorithms, such as SIFT, may use four values to describe a keypoint. In addition to the x and y coordinates, there are keypoint scale and keypoint angle. Albumentations supports those values as well.

A keypoint may also have associated scale and angle values

Keypoint angles are counter-clockwise. For example, in the following image, the angle value is 65\u00b0. You can read more about the angle of rotation in the Wikipedia article.

"},{"location":"getting_started/keypoints_augmentation/#supported-formats-for-keypoints-coordinates","title":"Supported formats for keypoints' coordinates.","text":"
  • xy. A keypoint is defined by x and y coordinates in pixels.

  • yx. A keypoint is defined by y and x coordinates in pixels.

  • xya. A keypoint is defined by x and y coordinates in pixels and the angle.

  • xys. A keypoint is defined by x and y coordinates in pixels, and the scale.

  • xyas. A keypoint is defined by x and y coordinates in pixels, the angle, and the scale.

  • xysa. A keypoint is defined by x and y coordinates in pixels, the scale, and the angle.
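
For example, the same keypoint located at pixel coordinates (264, 203) could be written as follows in each format (the angle and scale values here are purely illustrative):

Python
keypoint_xy = (264, 203)\nkeypoint_yx = (203, 264)\nkeypoint_xya = (264, 203, 65)         # angle (in degrees by default)\nkeypoint_xys = (264, 203, 12.5)       # scale\nkeypoint_xyas = (264, 203, 65, 12.5)  # angle, then scale\nkeypoint_xysa = (264, 203, 12.5, 65)  # scale, then angle\n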

"},{"location":"getting_started/keypoints_augmentation/#augmenting-keypoints","title":"Augmenting keypoints","text":"

The process of augmenting keypoints looks very similar to the bounding boxes augmentation. It consists of 4 steps.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and keypoints from the disk.
  4. You pass an image and keypoints to the augmentation pipeline and receive augmented images and keypoints.

Note

Some transforms in Albumentations don't support keypoints. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment keypoints.

"},{"location":"getting_started/keypoints_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"Python
import albumentations as A\nimport cv2\n
"},{"location":"getting_started/keypoints_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

Here is an example of a minimal declaration of an augmentation pipeline that works with keypoints.

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy'))\n

Note that just like with bounding boxes, Compose has an additional parameter that defines the format for keypoints' coordinates. In the case of keypoints, it is called keypoint_params. Here we pass an instance of A.KeypointParams that says that xy coordinates format should be used.

Besides format, A.KeypointParams supports a few more settings.

Here is an example of Compose that shows all available settings with A.KeypointParams

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels'], remove_invisible=True, angle_in_degrees=True))\n
"},{"location":"getting_started/keypoints_augmentation/#label_fields","title":"label_fields","text":"

In some computer vision tasks, keypoints have not only coordinates but associated labels as well. For example, in pose estimation, each keypoint has a label such as elbow, knee or wrist. You need to pass those labels in a separate argument (or arguments, because you can use multiple fields) to the transform function that will augment keypoints. label_fields defines names of those fields. Step 4 describes how you need to use the transform function.

"},{"location":"getting_started/keypoints_augmentation/#remove_invisible","title":"remove_invisible","text":"

After the augmentation, some keypoints may become invisible because they will be located outside of the augmented image's visible area. For example, if you crop a part of the image, all the keypoints outside of the cropped area will become invisible. If remove_invisible is set to True, Albumentations won't return invisible keypoints. remove_invisible is set to True by default, so if you don't pass that argument, Albumentations won't return invisible keypoints.

"},{"location":"getting_started/keypoints_augmentation/#angle_in_degrees","title":"angle_in_degrees","text":"

If angle_in_degrees is set to True (this is the default value), then Albumentations expects that the angle value in formats xya, xyas, and xysa is defined in degrees. If angle_in_degrees is set to False, Albumentations expects that the angle value is specified in radians.

This setting doesn't affect xy and yx formats, because those formats don't use angles.
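
For example, if your keypoint angles are stored in radians, a pipeline declaration could look like this (a minimal sketch using the xya format and assuming the image has already been loaded):

Python
import math\n\ntransform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n], keypoint_params=A.KeypointParams(format='xya', angle_in_degrees=False))\n\nkeypoints = [(264, 203, math.pi / 4)]  # angle in radians\ntransformed = transform(image=image, keypoints=keypoints)\n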

"},{"location":"getting_started/keypoints_augmentation/#3-read-images-and-keypoints-from-the-disk","title":"3. Read images and keypoints from the disk.","text":"

Read an image from the disk.

Python

image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
Keypoints can be stored on the disk in different serialization formats: JSON, XML, YAML, CSV, etc. So the code to read keypoints depends on the actual format of data on the disk.

After you read the data from the disk, you need to prepare keypoints for Albumentations.

Albumentations expects that keypoints will be represented as a list of lists. Each list contains information about a single keypoint. A definition of a keypoint should have two to four elements depending on the selected format of keypoints. The first two elements are the x and y coordinates of a keypoint in pixels (or y and x coordinates in the yx format). The third and fourth elements may be the angle and the scale of the keypoint if you select a format that uses those values.

"},{"location":"getting_started/keypoints_augmentation/#step-4-pass-an-image-and-keypoints-to-the-augmentation-pipeline-and-receive-augmented-images-and-boxes","title":"Step 4. Pass an image and keypoints to the augmentation pipeline and receive augmented images and boxes.","text":"

Let's say you have an example image with five keypoints.

A list with those five keypoints' coordinates in the xy format will look like the following:

Python
keypoints = [\n    (264, 203),\n    (86, 88),\n    (254, 160),\n    (193, 103),\n    (65, 341),\n]\n

Then you pass those keypoints to the transform function along with the image and receive the augmented versions of image and keypoints.

Python
transformed = transform(image=image, keypoints=keypoints)\ntransformed_image = transformed['image']\ntransformed_keypoints = transformed['keypoints']\n

The augmented image with augmented keypoints

If you set remove_invisible to False in keypoint_params, then Albumentations will return all keypoints, even if they lie outside the visible area. In the example image below, you can see that the keypoint for the right hip is located outside the image, but Albumentations still returned it. The area outside the image is highlighted in yellow.

When remove_invisible is set to False Albumentations will return all keypoints, even those located outside the image

If keypoints have associated class labels, you need to create a list that contains those labels:

Python
class_labels = [\n    'left_elbow',\n    'right_elbow',\n    'left_wrist',\n    'right_wrist',\n    'right_hip',\n]\n

Also, you need to declare the name of the argument to transform that will contain those labels. For declaration, you need to use the label_fields parameter of A.KeypointParams.

For example, we could use the class_labels name for the argument with labels.

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels']))\n

Next, you pass both keypoints' coordinates and class labels to transform.

Python
transformed = transform(image=image, keypoints=keypoints, class_labels=class_labels)\ntransformed_image = transformed['image']\ntransformed_keypoints = transformed['keypoints']\ntransformed_class_labels = transformed['class_labels']\n

Note that label_fields expects a list, so you can set multiple fields that contain labels for your keypoints. So if you declare Compose like

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels', 'class_sides']))\n

you can use those multiple arguments to pass info about class labels, like

Python
class_labels = [\n    'left_elbow',\n    'right_elbow',\n    'left_wrist',\n    'right_wrist',\n    'right_hip',\n]\n\nclass_sides = ['left', 'right', 'left', 'right', 'right']\n\ntransformed = transform(image=image, keypoints=keypoints, class_labels=class_labels, class_sides=class_sides)\ntransformed_class_sides = transformed['class_sides']\ntransformed_class_labels = transformed['class_labels']\ntransformed_keypoints = transformed['keypoints']\ntransformed_image = transformed['image']\n

Example input and output data for keypoints augmentation with two separate arguments for class labels

Note

Some augmentations may affect class labels and make them incorrect. For example, the HorizontalFlip augmentation mirrors the input image. When you apply that augmentation to keypoints that mark the side of body parts (left or right), those keypoints will point to the wrong side (since left on the mirrored image becomes right). So when you are creating an augmentation pipeline, look carefully at which augmentations could be applied to the input data.

HorizontalFlip may make keypoints' labels incorrect

"},{"location":"getting_started/keypoints_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations to augment keypoints
"},{"location":"getting_started/mask_augmentation/","title":"Mask augmentation for segmentation","text":"

For instance and semantic segmentation tasks, you need to augment both the input image and one or more output masks.

Albumentations ensures that the input image and the output mask will receive the same set of augmentations with the same parameters.

The process of augmenting images and masks looks very similar to the regular image-only augmentation.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and masks from the disk.
  4. You pass an image and one or more masks to the augmentation pipeline and receive augmented images and masks.
"},{"location":"getting_started/mask_augmentation/#steps-1-and-2-import-the-required-libraries-and-define-an-augmentation-pipeline","title":"Steps 1 and 2. Import the required libraries and define an augmentation pipeline.","text":"

Image augmentation for classification described Steps 1 and 2 in great detail. These are the same steps for the simultaneous augmentation of images and masks.

Python
import albumentations as A\nimport cv2\n\ntransform = A.Compose([\n    A.RandomCrop(width=256, height=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n
"},{"location":"getting_started/mask_augmentation/#step-3-read-images-and-masks-from-the-disk","title":"Step 3. Read images and masks from the disk.","text":"
  • Reading an image
Python
image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
  • For semantic segmentation, you usually read one mask per image. Albumentations expects the mask to be a NumPy array. The height and width of the mask should have the same values as the height and width of the image.
Python
mask = cv2.imread(\"/path/to/mask.png\")\n
  • For instance segmentation, you sometimes need to read multiple masks per image. Then you create a list that contains all the masks.
Python
mask_1 = cv2.imread(\"/path/to/mask_1.png\")\nmask_2 = cv2.imread(\"/path/to/mask_2.png\")\nmask_3 = cv2.imread(\"/path/to/mask_3.png\")\nmasks = [mask_1, mask_2, mask_3]\n

Some datasets use other formats to store masks. For example, they can use Run-Length Encoding or Polygon coordinates. In that case, you need to convert a mask to a NumPy array before augmenting it with Albumentations. Often dataset authors provide special libraries and tools to simplify the conversion.
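
For example, if your masks are stored as COCO-style run-length encodings, the conversion could look like the sketch below (assuming the pycocotools package and an annotation dictionary named ann whose 'segmentation' field is an RLE dict):

Python
from pycocotools import mask as mask_utils\n\n# decode the RLE into a NumPy array of shape (height, width)\nmask = mask_utils.decode(ann['segmentation'])\n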

"},{"location":"getting_started/mask_augmentation/#step-4-pass-image-and-masks-to-the-augmentation-pipeline-and-receive-augmented-images-and-masks","title":"Step 4. Pass image and masks to the augmentation pipeline and receive augmented images and masks.","text":"

If the image has one associated mask, you need to call transform with two arguments: image and mask. In image you should pass the input image, in mask you should pass the output mask. transform will return a dictionary with two keys: image will contain the augmented image, and mask will contain the augmented mask.

Python
transformed = transform(image=image, mask=mask)\ntransformed_image = transformed['image']\ntransformed_mask = transformed['mask']\n

An image and a mask before and after augmentation. Inria Aerial Image Labeling dataset contains aerial photos as well as their segmentation masks. Each pixel of the mask is marked as 1 if the pixel belongs to the class building and 0 otherwise.

If the image has multiple associated masks, you should use the masks argument instead of mask. In masks you should pass a list of masks.

Python
transformed = transform(image=image, masks=masks)\ntransformed_image = transformed['image']\ntransformed_masks = transformed['masks']\n
"},{"location":"getting_started/mask_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations for a semantic segmentation task
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
"},{"location":"getting_started/setting_probabilities/","title":"Setting probabilities for transforms in an augmentation pipeline","text":"

Each augmentation in Albumentations has a parameter named p that sets the probability of applying that augmentation to input data.

The following augmentations have the default value of p set to 1 (which means that by default they will be applied to each instance of input data): Compose, ReplayCompose, CenterCrop, Crop, CropNonEmptyMaskIfExists, FromFloat, IAACropAndPad, Lambda, LongestMaxSize, Normalize, PadIfNeeded, RandomCrop, RandomCropNearBBox, RandomResizedCrop, RandomSizedBBoxSafeCrop, RandomSizedCrop, Resize, SmallestMaxSize, ToFloat.

All other augmentations have the default value of p set to 0.5, which means that by default, they will be applied to 50% of instances of input data.

Let's take a look at the example:

Python
import albumentations as A\nimport cv2\n\np1 = 0.95\np2 = 0.85\np3 = 0.75\n\n\ntransform = A.Compose([\n    A.RandomRotate90(p=p2),\n    A.OneOf([\n        A.IAAAdditiveGaussianNoise(p=0.9),\n        A.GaussNoise(p=0.6),\n    ], p=p3)\n], p=p1)\n\nimage = cv2.imread('some/image.jpg')\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\ntransformed = transform(image=image)\ntransformed_image = transformed['image']\n

We declare an augmentation pipeline. In this pipeline, we use three placeholder values to set probabilities: p1, p2, and p3. Let's take a closer look at them.

"},{"location":"getting_started/setting_probabilities/#p1","title":"p1","text":"

p1 sets the probability that the augmentation pipeline will apply augmentations at all.

If p1 is set to 0, then augmentations inside Compose will never be applied to the input image, so the augmentation pipeline will always return the input image unchanged.

If p1 is set to 1, then all augmentations inside Compose will have a chance to be applied. The example above contains two augmentations inside Compose: RandomRotate90 and the OneOf block with two child augmentations (more on their probabilities later). Any value of p1 between 0 and 1 means that augmentations inside Compose could be applied with the probability between 0 and 100%.

If p1 equals 1, or p1 is less than 1 but the random generator decides to apply the augmentations inside Compose, the probabilities p2 and p3 come into play.

"},{"location":"getting_started/setting_probabilities/#p2","title":"p2","text":"

Each augmentation inside Compose has a probability of being applied. p2 sets the probability of applying RandomRotate90. In the example above, p2 equals 0.85, so RandomRotate90 has an 85% chance to be applied to the input image.

"},{"location":"getting_started/setting_probabilities/#p3","title":"p3","text":"

p3 sets the probability of applying the OneOf block. If the random generator decided to apply RandomRotate90 at the previous step, then OneOf will receive data augmented by it. If the random generator decided not to apply RandomRotate90 then OneOf will receive the input data (that was passed to Compose) since RandomRotate90 is skipped.

The OneOf block applies one of the augmentations inside it. That means that if the random generator chooses to apply OneOf, then one child augmentation from it will be applied to the input data.

To decide which augmentation within the OneOf block is used, Albumentations uses the following rule:

The OneOf block normalizes the probabilities of all augmentations inside it, so their probabilities sum up to 1. Next, OneOf chooses one of the augmentations inside it with a chance defined by its normalized probability and applies it to the input data. In the example above, IAAAdditiveGaussianNoise has probability 0.9 and GaussNoise has probability 0.6. After normalization, they become 0.6 and 0.4, which means that OneOf will use IAAAdditiveGaussianNoise with probability 0.6 and GaussNoise otherwise.

"},{"location":"getting_started/setting_probabilities/#example-calculations","title":"Example calculations","text":"

Thus, each augmentation in the example above will be applied with the probability:

  • RandomRotate90: p1 * p2
  • IAAAdditiveGaussianNoise: p1 * p3 * (0.9 / (0.9 + 0.6))
  • GaussNoise: p1 * p3 * (0.6 / (0.9 + 0.6))
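
Plugging in the values from the example above (p1=0.95, p2=0.85, p3=0.75) gives the following numbers (a small sketch of the arithmetic):

Python
p1, p2, p3 = 0.95, 0.85, 0.75\n\np_random_rotate90 = p1 * p2                                 # 0.8075\np_additive_gaussian_noise = p1 * p3 * (0.9 / (0.9 + 0.6))   # 0.4275\np_gauss_noise = p1 * p3 * (0.6 / (0.9 + 0.6))               # 0.285\n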
"},{"location":"getting_started/simultaneous_augmentation/","title":"Simultaneous augmentation of multiple targets: masks, bounding boxes, keypoints","text":"

Albumentations can apply the same set of transformations to the input images and all the targets that are passed to transform: masks, bounding boxes, and keypoints.

Please refer to articles Image augmentation for classification, Mask augmentation for segmentation, Bounding boxes augmentation for object detection, and Keypoints augmentation for the detailed description of each data type.

Note

Some transforms in Albumentations don't support bounding boxes or keypoints. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment bounding boxes and keypoints.

Below is an example of how you can simultaneously augment the input image, mask, bounding boxes with their labels, and keypoints with their labels. Note that the only required argument to transform is image; all other arguments are optional, and you can combine them in any way.

"},{"location":"getting_started/simultaneous_augmentation/#step-1-define-compose-with-parameters-that-specify-formats-for-bounding-boxes-and-keypoints","title":"Step 1. Define Compose with parameters that specify formats for bounding boxes and keypoints.","text":"Python
transform = A.Compose(\n  [A.RandomCrop(width=330, height=330), A.RandomBrightnessContrast(p=0.2)],\n  bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"bbox_classes\"]),\n  keypoint_params=A.KeypointParams(format=\"xy\", label_fields=[\"keypoints_classes\"]),\n)\n
"},{"location":"getting_started/simultaneous_augmentation/#step-2-load-all-required-data-from-the-disk","title":"Step 2. Load all required data from the disk","text":"

Please refer to articles Image augmentation for classification, Mask augmentation for segmentation, Bounding boxes augmentation for object detection, and Keypoints augmentation for more information about loading the input data.

For example, here is an image from the COCO dataset that has one associated mask, one bounding box with the class label person, and five keypoints that define body parts.

An example image with mask, bounding boxes and keypoints

"},{"location":"getting_started/simultaneous_augmentation/#step-3-pass-all-targets-to-transform-and-receive-their-augmented-versions","title":"Step 3. Pass all targets to transform and receive their augmented versions","text":"Python
transformed = transform(\n  image=img,\n  mask=mask,\n  bboxes=bboxes,\n  bbox_classes=bbox_classes,\n  keypoints=keypoints,\n  keypoints_classes=keypoints_classes,\n)\ntransformed_image = transformed[\"image\"]\ntransformed_mask = transformed[\"mask\"]\ntransformed_bboxes = transformed[\"bboxes\"]\ntransformed_bbox_classes = transformed[\"bbox_classes\"]\ntransformed_keypoints = transformed[\"keypoints\"]\ntransformed_keypoints_classes = transformed[\"keypoints_classes\"]\n

The augmented version of the image and its targets

"},{"location":"getting_started/simultaneous_augmentation/#examples","title":"Examples","text":"
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
"},{"location":"getting_started/transforms_and_targets/","title":"A list of transforms and their supported targets","text":"

We can split all transforms into two groups: pixel-level transforms and spatial-level transforms. Pixel-level transforms will change just an input image and will leave any additional targets such as masks, bounding boxes, and keypoints unchanged. Spatial-level transforms will simultaneously change both an input image and additional targets such as masks, bounding boxes, and keypoints. For additional information, please refer to this section of \"Why you need a dedicated library for image augmentation\".
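
As a quick sanity check of this distinction, a pixel-level transform applied together with a mask leaves the mask untouched (a minimal sketch):

Python
import albumentations as A\nimport numpy as np\n\nimage = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\nmask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n\npixel_level = A.Compose([A.RandomBrightnessContrast(p=1.0)])\nout = pixel_level(image=image, mask=mask)\n\nassert np.array_equal(out['mask'], mask)  # the mask is returned unchanged\n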

"},{"location":"getting_started/transforms_and_targets/#pixel-level-transforms","title":"Pixel-level transforms","text":"

Here is a list of all available pixel-level transforms. You can apply a pixel-level transform to any target, and under the hood, the transform will change only the input image and return any other input targets such as masks, bounding boxes, or keypoints unchanged.

  • AdditiveNoise
  • AdvancedBlur
  • AutoContrast
  • Blur
  • CLAHE
  • ChannelDropout
  • ChannelShuffle
  • ChromaticAberration
  • ColorJitter
  • Defocus
  • Downscale
  • Emboss
  • Equalize
  • FDA
  • FancyPCA
  • FromFloat
  • GaussNoise
  • GaussianBlur
  • GlassBlur
  • HistogramMatching
  • HueSaturationValue
  • ISONoise
  • Illumination
  • ImageCompression
  • InvertImg
  • MedianBlur
  • MotionBlur
  • MultiplicativeNoise
  • Normalize
  • PixelDistributionAdaptation
  • PlanckianJitter
  • PlasmaBrightnessContrast
  • PlasmaShadow
  • Posterize
  • RGBShift
  • RandomBrightnessContrast
  • RandomFog
  • RandomGamma
  • RandomGravel
  • RandomRain
  • RandomShadow
  • RandomSnow
  • RandomSunFlare
  • RandomToneCurve
  • RingingOvershoot
  • SaltAndPepper
  • Sharpen
  • ShotNoise
  • Solarize
  • Spatter
  • Superpixels
  • TemplateTransform
  • TextImage
  • ToFloat
  • ToGray
  • ToRGB
  • ToSepia
  • UnsharpMask
  • ZoomBlur
"},{"location":"getting_started/transforms_and_targets/#spatial-level-transforms","title":"Spatial-level transforms","text":"

Here is a table with spatial-level transforms and targets they support. If you try to apply a spatial-level transform to an unsupported target, Albumentations will raise an error.

Transform Image Mask BBoxes Keypoints Volume Mask3D Affine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 AtLeastOneBBoxRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 BBoxSafeRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CenterCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CoarseDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Crop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropAndPad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropNonEmptyMaskIfExists \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 D4 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ElasticTransform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Erasing \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 FrequencyMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridElasticDeform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 HorizontalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Lambda \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 LongestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 MaskDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Morphological \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 NoOp \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OpticalDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OverlayElements \u2713 \u2713 Pad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PadIfNeeded \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Perspective \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PiecewiseAffine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PixelDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropFromBorders \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropNearBBox \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomGridShuffle \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomResizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomRotate90 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomScale \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedBBoxSafeCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Resize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Rotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SafeRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ShiftScaleRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SmallestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ThinPlateSpline \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeReverse \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Transpose \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 VerticalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 XYMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713"},{"location":"getting_started/video_augmentation/","title":"Working with Video Data in Albumentations","text":""},{"location":"getting_started/video_augmentation/#overview","title":"Overview","text":"

While Albumentations is primarily known for image augmentation, it can effectively process video data by treating it as a sequence of frames. When you pass a video as a numpy array, Albumentations will apply the same transform with identical parameters to each frame, ensuring temporal consistency.

"},{"location":"getting_started/video_augmentation/#data-format","title":"Data Format","text":""},{"location":"getting_started/video_augmentation/#video-frames","title":"Video Frames","text":"

Albumentations accepts video data as numpy arrays in the following formats:

  • (N, H, W) - Grayscale video (N frames)
  • (N, H, W, C) - Color video (N frames)

Where:

  • N = Number of frames
  • H = Height
  • W = Width
  • C = Channels (e.g., 3 for RGB)

"},{"location":"getting_started/video_augmentation/#video-masks","title":"Video Masks","text":"

For video segmentation tasks, masks should match the frame dimensions:

  • (N, H, W) - Binary or single-class masks
  • (N, H, W, C) - Multi-class masks

"},{"location":"getting_started/video_augmentation/#basic-usage","title":"Basic Usage","text":"Python
import albumentations as A\nimport numpy as np\n
"},{"location":"getting_started/video_augmentation/#create-transform-pipeline","title":"Create transform pipeline","text":"Python
transform = A.Compose([\n    A.RandomCrop(height=224, width=224),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], seed=42)\n
"},{"location":"getting_started/video_augmentation/#example-video-data","title":"Example video data","text":"Python
video = np.random.rand(32, 256, 256, 3) # 32 RGB frames\nmasks = np.zeros((32, 256, 256)) # 32 binary masks\n
"},{"location":"getting_started/video_augmentation/#apply-transform","title":"Apply transform","text":"Python
augmented_video = transform(images=video, masks=masks)['images']\n
"},{"location":"getting_started/video_augmentation/#apply-transforms-same-parameters-for-all-frames","title":"Apply transforms - same parameters for all frames","text":"Python
transformed = transform(images=video, masks=masks)\ntransformed_video = transformed['images']\ntransformed_masks = transformed['masks']\n
"},{"location":"getting_started/video_augmentation/#key-features","title":"Key Features","text":"
  1. Temporal Consistency: The same transform with identical parameters is applied to all frames, preserving temporal consistency.

  2. Memory Efficiency: Frames are processed as a batch, avoiding repeated parameter generation.

  3. Compatible with All Transforms: Works with any Albumentations transform that supports the image target.
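
To illustrate the temporal consistency point above, a deterministic flip applied through the images target changes every frame in exactly the same way (a minimal sketch, assuming a recent Albumentations version with the images target):

Python
import albumentations as A\nimport numpy as np\n\nvideo = np.random.randint(0, 256, (8, 64, 64, 3), dtype=np.uint8)\n\nflip = A.Compose([A.HorizontalFlip(p=1.0)])\nflipped = flip(images=video)['images']\n\n# every frame is flipped with the same parameters\nassert all(np.array_equal(flipped[i], video[i][:, ::-1]) for i in range(len(video)))\n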

"},{"location":"getting_started/video_augmentation/#example-pipeline-for-video-processing","title":"Example Pipeline for Video Processing","text":"Python
def create_video_pipeline(\n    crop_size=(224, 224),\n    p_spatial=0.5,\n    p_color=0.3\n    ):\n    return A.Compose([\n        # Spatial transforms - same crop/flip for all frames\n        A.RandomCrop(\n            height=crop_size[0],\n            width=crop_size[1],\n            p=1.0\n        ),\n        A.HorizontalFlip(p=p_spatial),\n        # Color transforms - same adjustment for all frames\n        A.ColorJitter(\n            brightness=0.2,\n            contrast=0.2,\n            saturation=0.2,\n            hue=0.1,\n            p=p_color\n        ),\n        # Noise/blur - same pattern for all frames\n        A.GaussianBlur(p=0.3),\n    ])\n
"},{"location":"getting_started/video_augmentation/#best-practices","title":"Best Practices","text":"
  1. Performance Optimization: Place cropping operations first to reduce computation. Consider the frame rate and whether all frames need processing.
"},{"location":"getting_started/video_augmentation/#next-steps","title":"Next Steps","text":"
  • Learn about Volumetric Data (3D) for volumetric data
"},{"location":"getting_started/volumetric_augmentation/","title":"Introduction to 3D Medical Image Augmentation","text":""},{"location":"getting_started/volumetric_augmentation/#overview","title":"Overview","text":"

While primarily used for medical imaging (CT scans, MRI), Albumentations' 3D transforms can be applied to various volumetric data types:

"},{"location":"getting_started/volumetric_augmentation/#medical-imaging","title":"Medical Imaging","text":"
  • CT and MRI scans
  • Ultrasound volumes
  • PET scans
  • Multi-modal medical imaging
"},{"location":"getting_started/volumetric_augmentation/#scientific-data","title":"Scientific Data","text":"
  • Microscopy z-stacks
  • Cryo-EM volumes
  • Geological seismic data
  • Weather radar volumes
"},{"location":"getting_started/volumetric_augmentation/#industrial-applications","title":"Industrial Applications","text":"
  • 3D NDT (Non-Destructive Testing) scans
  • Industrial CT for quality control
  • Material analysis volumes
  • 3D ultrasonic testing data
"},{"location":"getting_started/volumetric_augmentation/#computer-vision","title":"Computer Vision","text":"
  • Depth camera sequences
  • LiDAR point cloud voxelizations
  • Multi-view stereo reconstructions
"},{"location":"getting_started/volumetric_augmentation/#data-format","title":"Data Format","text":""},{"location":"getting_started/volumetric_augmentation/#volumes","title":"Volumes","text":"

Albumentations expects 3D volumes as numpy arrays in the following formats:

  • (D, H, W) - Single-channel volumes (e.g., CT scans)
  • (D, H, W, C) - Multi-channel volumes (e.g., multi-modal MRI)

Where:

  • D = Depth (number of slices)
  • H = Height
  • W = Width
  • C = Channels (optional)

"},{"location":"getting_started/volumetric_augmentation/#3d-masks","title":"3D Masks","text":"

Segmentation masks should match the volume dimensions:

  • (D, H, W) - Binary or single-class masks
  • (D, H, W, C) - Multi-class masks

"},{"location":"getting_started/volumetric_augmentation/#basic-usage","title":"Basic Usage","text":"Python
import albumentations as A\nimport numpy as np\n
"},{"location":"getting_started/volumetric_augmentation/#create-a-basic-3d-augmentation-pipeline","title":"Create a basic 3D augmentation pipeline","text":"Python
transform = A.Compose([\n    # Crop volume to a fixed size for memory efficiency\n    A.RandomCrop3D(size=(64, 128, 128), p=1.0),    \n    # Randomly remove cubic regions to simulate occlusions\n    A.CoarseDropout3D(\n        num_holes_range=(2, 6),\n        hole_depth_range=(0.1, 0.3),\n        hole_height_range=(0.1, 0.3),\n        hole_width_range=(0.1, 0.3),\n        p=0.5\n    ),    \n])\n
"},{"location":"getting_started/volumetric_augmentation/#apply-to-volume-and-mask","title":"Apply to volume and mask","text":"Python
volume = np.random.rand(96, 256, 256) # Your 3D medical volume\nmask = np.zeros((96, 256, 256)) # Your 3D segmentation mask\ntransformed = transform(volume=volume, mask3d=mask)\ntransformed_volume = transformed['volume']\ntransformed_mask = transformed['mask3d']\n
"},{"location":"getting_started/volumetric_augmentation/#available-3d-transforms","title":"Available 3D Transforms","text":"

Here are some examples of available 3D transforms:

  • CenterCrop3D - Crop the center part of a 3D volume
  • RandomCrop3D - Randomly crop a part of a 3D volume
  • Pad3D - Pad a 3D volume
  • PadIfNeeded3D - Pad if volume size is less than desired size
  • CoarseDropout3D - Random dropout of 3D cubic regions
  • CubicSymmetry - Apply random cubic symmetry transformations

For a complete and up-to-date list of all available 3D transforms, please see our API Reference.

"},{"location":"getting_started/volumetric_augmentation/#combining-2d-and-3d-transforms","title":"Combining 2D and 3D Transforms","text":"

You can combine 2D and 3D transforms in the same pipeline. 2D transforms will be applied slice-by-slice in the XY plane:

Python
transform = A.Compose([\n    # 3D transforms\n    A.RandomCrop3D(size=(64, 128, 128)),\n    # 2D transforms (applied to each XY slice)\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n\ntransformed = transform(volume=volume, mask3d=mask)\ntransformed_volume = transformed['volume']\ntransformed_mask = transformed['mask3d']\n
"},{"location":"getting_started/volumetric_augmentation/#best-practices","title":"Best Practices","text":"
  1. Memory Management: 3D volumes can be large. Consider using smaller crop sizes or processing in patches.
  2. Performance: Place cropping operations at the beginning of your pipeline. For example, a 256x256x256 volume cropped to 64x64x64 will process subsequent transforms ~64x faster.
"},{"location":"getting_started/volumetric_augmentation/#efficient-pipeline-cropping-first","title":"Efficient pipeline - cropping first","text":"Python
efficient_transform = A.Compose([\n    A.RandomCrop3D(size=(64, 64, 64)),  # Do this first!\n    A.CoarseDropout3D(...),\n    A.RandomBrightnessContrast(...),\n])\n
"},{"location":"getting_started/volumetric_augmentation/#less-efficient-pipeline-processing-full-volume-unnecessarily","title":"Less efficient pipeline - processing full volume unnecessarily","text":"Python
inefficient_transform = A.Compose([\n    A.CoarseDropout3D(...),  # Processing full volume\n    A.RandomBrightnessContrast(...),  # Processing full volume\n    A.RandomCrop3D(size=(64, 64, 64)),  # Cropping at the end\n])\n
  1. Avoid Interpolation Artifacts: For highest quality augmentation, prefer transforms that only rearrange existing voxels without interpolation:

a) Available Artifact-Free Transforms:

  • HorizontalFlip, VerticalFlip - Mirror images across X or Y axes
  • RandomRotate90 - Rotate by 90 degrees in XY plane
  • D4 - All possible combinations of flips and 90-degree rotations in XY plane (8 variants)
  • CubicSymmetry - 3D extension of D4, includes all 48 possible cube symmetries

These transforms maintain perfect image quality because they only move existing voxels to new positions without creating new values through interpolation.

b) Benefits of Artifact-Free Transforms:

  • Preserve original voxel values exactly
  • Maintain spatial relationships between tissues
  • No blurring or information loss
  • Faster computation (no interpolation needed)
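
For instance, you could verify that such a transform only rearranges voxels without changing their values (a minimal sketch using CubicSymmetry on random data):

Python
import albumentations as A\nimport numpy as np\n\nvolume = np.random.rand(32, 64, 64).astype(np.float32)\n\ntransform = A.Compose([A.CubicSymmetry(p=1.0)])\ntransformed_volume = transform(volume=volume)['volume']\n\n# the multiset of voxel values is preserved exactly\nassert np.array_equal(np.sort(volume.ravel()), np.sort(transformed_volume.ravel()))\n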
"},{"location":"getting_started/volumetric_augmentation/#example-pipeline","title":"Example Pipeline","text":"

Here's a complete example of a medical image augmentation pipeline:

Python
import albumentations as A\nimport numpy as np\n\ndef create_3d_pipeline(\n    crop_size=(64, 128, 128),\n    p_spatial=0.5,\n    p_intensity=0.3\n    ):\n    return A.Compose([\n        # Spatial transforms\n        A.RandomCrop3D(\n            size=crop_size,\n            p=1.0\n        ),\n        A.CubicSymmetry(p=p_spatial),\n        # Intensity transforms\n        A.CoarseDropout3D(\n            num_holes_range=(2, 5),\n            hole_depth_range=(0.1, 0.2),\n            hole_height_range=(0.1, 0.2),\n            hole_width_range=(0.1, 0.2),\n            p=p_intensity\n        ),\n    ])\n
"},{"location":"getting_started/volumetric_augmentation/#usage","title":"Usage","text":"Python
transform = create_3d_pipeline()\nvolume = np.random.rand(96, 256, 256)\nmask = np.zeros((96, 256, 256))\ntransformed = transform(volume=volume, mask3d=mask)\n
"},{"location":"getting_started/volumetric_augmentation/#next-steps","title":"Next Steps","text":"
  • Learn about Video Augmentation for sequential data
"},{"location":"integrations/","title":"Integrations","text":"

Here are some examples of how to use Albumentations with different deep learning frameworks and tools:

  • HuggingFace
  • FiftyOne
  • Roboflow
"},{"location":"integrations/fiftyone/","title":"FiftyOne integration","text":""},{"location":"integrations/fiftyone/#introduction","title":"Introduction","text":"

FiftyOne is an open-source visualization and analysis tool for machine learning datasets, particularly useful in computer vision projects. It facilitates detailed dataset examination and the fine-tuning of model performance.

Albumentations can be used in FiftyOne via the FiftyOne Albumentations plugin.

With the FiftyOne Albumentations plugin, you can transform any and all labels of type Detections, Keypoints, Segmentation, and Heatmap, or just the images themselves.

Info

This tutorial is almost entirely based on the FiftyOne Documentation and serves as an overview of the functionality of the FiftyOne Albumentations plugin.

For more up-to-date information, check the original source.

This integration guide will focus on the setup process and the functionality of the plugin.

For a tutorial on how to curate your augmentations, check out the Data Augmentation Tutorial in the FiftyOne Documentation.

"},{"location":"integrations/fiftyone/#overview","title":"Overview","text":"

Albumentations supports 80+ transforms, spanning pixel-level transforms, geometric transforms, and more.

As of April 29, 2024 FiftyOne supports:

  • AdvancedBlur
  • GridDropout
  • MaskDropout
  • PiecewiseAffine
  • RandomGravel
  • RandomGridShuffle
  • RandomShadow
  • RandomSunFlare
  • Rotate
"},{"location":"integrations/fiftyone/#functionality","title":"Functionality","text":"

The FiftyOne Albumentations plugin provides the following functionality:

  • Apply Albumentations transformations to your dataset, your current view, or selected samples
  • Visualize the effects of these transformations directly within the FiftyOne App
  • View samples generated by the last applied transformation
  • Save augmented samples to the dataset
  • Get info about the last applied transformation
  • Save transformation pipelines to the dataset for reproducibility
"},{"location":"integrations/fiftyone/#setup","title":"Setup","text":"

Make sure you have FiftyOne and Albumentations installed:

Bash
pip install -U fiftyone albumentations\n

Next, install the FiftyOne Albumentations plugin:

Bash
fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin\n

Note

If you have the FiftyOne Plugin Utils plugin installed, you can also install the Albumentations plugin via the install_plugin operator, selecting the Albumentations plugin from the community dropdown menu.

You will also need to load (and download if necessary) a dataset to apply the augmentations to. For this guide, we'll use the quickstart dataset:

Python
import fiftyone as fo\nimport fiftyone.zoo as foz\n\n## only take 5 samples for quick demonstration\ndataset = foz.load_zoo_dataset(\"quickstart\", max_samples=5)\n\n# only keep the ground truth labels\ndataset.select_fields(\"ground_truth\").keep_fields()\n\nsession = fo.launch_app(dataset)\n

Note

The quickstart dataset only contains Detections labels. If you want to test Albumentations transformations on other label types, here are some quick examples to get you started, using FiftyOne's Hugging Face Transformers and Ultralytics integrations: Bash

pip install -U transformers ultralytics\n
Python
import fiftyone as fo\nimport fiftyone.zoo as foz\n\nfrom ultralytics import YOLO\n\n# Keypoints\nmodel = YOLO(\"yolov8l-pose.pt\")\ndataset.apply_model(model, label_field=\"keypoints\")\n\n# Instance Segmentation\nmodel = YOLO(\"yolov8l-seg.pt\")\ndataset.apply_model(model, label_field=\"instances\")\n\n# Semantic Segmentation\nmodel = foz.load_zoo_model(\n    \"segmentation-transformer-torch\",\n    name_or_path=\"Intel/dpt-large-ade\",\n)\ndataset.apply_model(model, label_field=\"mask\")\n\n# Heatmap\nmodel = foz.load_zoo_model(\n    \"depth-estimation-transformer-torch\",\n    name_or_path=\"LiheYoung/depth-anything-small-hf\",\n)\ndataset.apply_model(model, label_field=\"depth_map\")\n

"},{"location":"integrations/fiftyone/#apply-transformations","title":"Apply transformations","text":"

To apply Albumentations transformations to your dataset, you can use the augment_with_albumentations operator. Press the backtick key to open the operator modal, and select the augment_with_albumentations operator from the dropdown menu.

You can then configure the transformations to apply:

  • Number of augmentations per sample: The number of augmented samples to generate for each input sample. The default is 1, which is sufficient for deterministic transformations, but for probabilistic transformations, you may want to generate multiple samples to see the range of possible outputs.
  • Number of transforms: The number of transformations to compose into the pipeline to be applied to each sample. The default is 1, but you can set this as high as you'd like \u2014 the more transformations, the more complex the augmentations will be. You will be able to configure each transform separately.
  • Target view: The view to which the transformations will be applied. The default is dataset, but you can also apply the transformations to the current view or to currently selected samples within the app.
  • Execution mode: If you set delegated=False, the operation will be executed immediately. If you set delegated=True, the operation will be queued as a job, which you can then run in the background from your terminal with:
Bash
fiftyone delegated launch\n

For each transformation, you can select either a \"primitive\" transformation from the Albumentations library, or a \"saved\" transformation pipeline that you have previously saved to the dataset. These saved pipelines can consist of one or more transformations.

When you apply a primitive transformation, you can configure the parameters of the transformation directly within the app. The available parameters, their default values, types, and docstrings are all integrated directly from the Albumentations library.

When you apply a saved pipeline, there will not be any parameters to configure.

"},{"location":"integrations/fiftyone/#visualize-transformations","title":"Visualize transformations","text":"

Once you've applied the transformations, you can visualize the effects of the transformations directly within the FiftyOne App. All augmented samples will be added to the dataset, and will be tagged as augmented so that you can easily filter for just augmented or non-augmented samples in the app.

You can also filter for augmented samples programmatically with the match_tags() method:

Python
# get just the augmented samples\naugmented_view = dataset.match_tags(\"augmented\")\n\n# get just the non-augmented samples\nnon_augmented_view = dataset.match_tags(\"augmented\", bool=False)\n

However, matching on these tags will return all samples that have been generated by an augmentation, not just the samples that were generated by the last applied transformation \u2014 as you will see shortly, we can save augmentations to the dataset. To get just the samples generated by the last applied transformation, you can use the view_last_albumentations_run operator:

Note

For all samples added to the dataset by the FiftyOne Albumentations plugin, there will be a field \"transform\", which contains the information not just about the pipeline that was applied, but also about the specific parameters that were used for this application of the pipeline. For example, if you had a HorizontalFlip transformation with an application probability of p=0.5, the contents of the \"transform\" field tell you whether or not this transformation was applied to the sample!
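
For example, a minimal sketch of inspecting this field programmatically (assuming the augmented samples are still present in the dataset):
Python
# grab one augmented sample and print the contents of its \"transform\" field\nsample = dataset.match_tags(\"augmented\").first()\nprint(sample[\"transform\"])\n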

"},{"location":"integrations/fiftyone/#save-augmentations","title":"Save augmentations","text":"

By default, all augmentations are temporary, as the FiftyOne Albumentations plugin is primarily designed for rapid prototyping and experimentation. This means that when you generate a new batch of augmented samples, the previous batch of augmented samples is removed from the dataset, and its image files are deleted from disk.

However, if you want to save the augmented samples to the dataset, you can use the save_albumentations_augmentations operator, which will save the augmented samples to the dataset while keeping the augmented tag on the samples.

"},{"location":"integrations/fiftyone/#get-last-transformation-info","title":"Get last transformation info","text":"

When you apply a transformation pipeline to samples in your dataset using the FiftyOne Albumentations plugin, this information is captured and stored using FiftyOne's custom runs. This means that you can easily access the information about the last applied transformation.

In the FiftyOne App, you can use the get_last_albumentations_run_info operator to display a formatted summary of the relevant information:

Note

You can also access this information programmatically by getting info about the custom run that the information is stored in. For the Albumentations plugin, this info is stored via the key '_last_albumentations_run':

Python
last_run_info = dataset.get_run_info(\"_last_albumentations_run\")\nprint(last_run_info)\n
"},{"location":"integrations/fiftyone/#save-transformations","title":"Save transformations","text":"

If you are satisfied with the transformation pipeline you have created, you can save the entire composition of transformations to the dataset, hyperparameters and all. This means that after your rapid prototyping phase, you can easily move to a more reproducible workflow, and you can share your transformations or port them to other datasets.

To save a transformation pipeline, you can use the save_albumentations_transform operator:

After doing so, you will be able to view the information about this saved transformation pipeline using the get_albumentations_run_info operator:

Additionally, you will have access to this saved transformation pipeline under the \"saved\" tab for each transformation in the augment_with_albumentations operator modal.

"},{"location":"integrations/huggingface/","title":"HuggingFace","text":"
  • Image classification
  • Object Detection
"},{"location":"integrations/huggingface/image_classification_albumentations/","title":"Fine-tuning for Image Classification with \ud83e\udd17 Transformers","text":"

This notebook shows how to fine-tune any pretrained Vision model for Image Classification on a custom dataset. The idea is to add a randomly initialized classification head on top of a pre-trained encoder and fine-tune the whole model end-to-end on a labeled dataset.

"},{"location":"integrations/huggingface/image_classification_albumentations/#imagefolder-feature","title":"ImageFolder feature","text":"

This notebook leverages the ImageFolder feature to easily run the notebook on a custom dataset (namely, EuroSAT in this tutorial). You can either load a Dataset from local folders or from local/remote files, like zip or tar.

"},{"location":"integrations/huggingface/image_classification_albumentations/#any-model","title":"Any model","text":"

This notebook is built to run on any image classification dataset with any vision model checkpoint from the Model Hub, as long as that model has a version with an Image Classification head, such as:

  • ViT
  • Swin Transformer
  • ConvNeXT
  • in short, any model supported by AutoModelForImageClassification.
"},{"location":"integrations/huggingface/image_classification_albumentations/#albumentations","title":"Albumentations","text":"

In this notebook, we are going to leverage the Albumentations library for data augmentation. Note that we have other versions of this notebook available as well with other libraries including:

  • Torchvision's Transforms
  • Kornia
  • imgaug.

Depending on the model and the GPU you are using, you might need to adjust the batch size to avoid out-of-memory errors. Set the two parameters below (the model checkpoint and the batch size), and the rest of the notebook should run smoothly.

In this notebook, we'll fine-tune from the https://huggingface.co/facebook/convnext-tiny-224 checkpoint, but note that there are many, many more available on the hub.

Python
model_checkpoint = \"facebook/convnext-tiny-224\" # pre-trained model from which to fine-tune\nbatch_size = 32 # batch size for training and evaluation\n

Before we start, let's install the datasets, transformers and albumentations libraries.

Python
!pip install -q datasets transformers\n
Python
!pip install -q albumentations\n

If you're opening this notebook locally, make sure your environment has the latest versions of those libraries installed.
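
For example, a local environment can be brought up to date with a cell like this (package names only; pin versions as needed for your setup):
Python
!pip install -q -U datasets transformers albumentations\n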

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First, you have to store your authentication token from the Hugging Face website (sign up here if you haven't already!), then execute the following cell and input your token:

Python
from huggingface_hub import notebook_login\n\nnotebook_login()\n
Login successful\nYour token has been saved to /root/.huggingface/token\n\u001b[1m\u001b[31mAuthenticated through git-credential store but this isn't the helper defined on your machine.\nYou might have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default\n\ngit config --global credential.helper store\u001b[0m\n

Then you need to install Git-LFS to upload your model checkpoints:

Python
%%capture\n!sudo apt -qq install git-lfs\n!git config --global credential.helper store\n

We also quickly upload some telemetry - this tells us which examples and software versions are getting used so we know where to prioritize our maintenance efforts. We don't collect (or care about) any personally identifiable information, but if you'd prefer not to be counted, feel free to skip this step or delete this cell entirely.

Python
from transformers.utils import send_example_telemetry\n\nsend_example_telemetry(\"image_classification_albumentations_notebook\", framework=\"pytorch\")\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#fine-tuning-a-model-on-an-image-classification-task","title":"Fine-tuning a model on an image classification task","text":"

In this notebook, we will see how to fine-tune one of the \ud83e\udd17 Transformers vision models on an Image Classification dataset.

Given an image, the goal is to predict an appropriate class for it, like \"tiger\". The screenshot below is taken from a ViT fine-tuned on ImageNet-1k - try out the inference widget!

"},{"location":"integrations/huggingface/image_classification_albumentations/#loading-the-dataset","title":"Loading the dataset","text":"

We will use the \ud83e\udd17 Datasets library's ImageFolder feature to download our custom dataset into a DatasetDict.

In this case, the EuroSAT dataset is hosted remotely, so we provide the data_files argument. Alternatively, if you have local folders with images, you can load them using the data_dir argument.

Python
from datasets import load_dataset \n\n# load a custom dataset from local/remote files using the ImageFolder feature\n\n# option 1: local/remote files (supporting the following formats: tar, gzip, zip, xz, rar, zstd)\ndataset = load_dataset(\"imagefolder\", data_files=\"https://madm.dfki.de/files/sentinel/EuroSAT.zip\")\n\n# note that you can also provide several splits:\n# dataset = load_dataset(\"imagefolder\", data_files={\"train\": [\"path/to/file1\", \"path/to/file2\"], \"test\": [\"path/to/file3\", \"path/to/file4\"]})\n\n# note that you can push your dataset to the hub very easily (and reload afterwards using load_dataset)!\n# dataset.push_to_hub(\"nielsr/eurosat\")\n# dataset.push_to_hub(\"nielsr/eurosat\", private=True)\n\n# option 2: local folder\n# dataset = load_dataset(\"imagefolder\", data_dir=\"path_to_folder\")\n\n# option 3: just load any existing dataset from the hub ...\n# dataset = load_dataset(\"cifar10\")\n
Using custom data configuration default-0537267e6f812d56\n\n\nDownloading and preparing dataset image_folder/default to /root/.cache/huggingface/datasets/image_folder/default-0537267e6f812d56/0.0.0/ee92df8e96c6907f3c851a987be3fd03d4b93b247e727b69a8e23ac94392a091...\n\n\n\nDownloading data files: 0it [00:00, ?it/s]\n\n\n\nDownloading data files:   0%|          | 0/1 [00:00<?, ?it/s]\n\n\n\nDownloading data:   0%|          | 0.00/94.3M [00:00<?, ?B/s]\n\n\n\nExtracting data files:   0%|          | 0/1 [00:00<?, ?it/s]\n\n\n\nGenerating train split: 0 examples [00:00, ? examples/s]\n\n\nDataset image_folder downloaded and prepared to /root/.cache/huggingface/datasets/image_folder/default-0537267e6f812d56/0.0.0/ee92df8e96c6907f3c851a987be3fd03d4b93b247e727b69a8e23ac94392a091. Subsequent calls will reuse this data.\n\n\n\n  0%|          | 0/1 [00:00<?, ?it/s]\n

Let us also load the Accuracy metric, which we'll use to evaluate our model both during and after training.

Python
from datasets import load_metric\n\nmetric = load_metric(\"accuracy\")\n
Downloading builder script:   0%|          | 0.00/1.41k [00:00<?, ?B/s]\n

The dataset object itself is a DatasetDict, which contains one key per split (in this case, only \"train\" for a training split).

Python
dataset\n
DatasetDict({\n    train: Dataset({\n        features: ['image', 'label'],\n        num_rows: 27000\n    })\n})\n

To access an actual element, you need to select a split first, then give an index:

Python
example = dataset[\"train\"][10]\nexample\n
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=64x64 at 0x7FD62DA6B2D0>,\n 'label': 2}\n

Each example consists of an image and a corresponding label. We can also verify this by checking the features of the dataset:

Python
dataset[\"train\"].features\n
{'image': Image(decode=True, id=None),\n 'label': ClassLabel(num_classes=10, names=['AnnualCrop', 'Forest', 'HerbaceousVegetation', 'Highway', 'Industrial', 'Pasture', 'PermanentCrop', 'Residential', 'River', 'SeaLake'], id=None)}\n

The cool thing is that we can directly view the image (as the 'image' field is an Image feature), as follows:

Python
example['image']\n

Let's make it a little bigger as the images in the EuroSAT dataset are of low resolution (64x64 pixels):

Python
example['image'].resize((200, 200))\n

Let's check the corresponding label:

Python
example['label']\n
2\n

As you can see, the label field is not an actual string label. By default the ClassLabel fields are encoded into integers for convenience:

Python
dataset[\"train\"].features[\"label\"]\n
ClassLabel(num_classes=10, names=['AnnualCrop', 'Forest', 'HerbaceousVegetation', 'Highway', 'Industrial', 'Pasture', 'PermanentCrop', 'Residential', 'River', 'SeaLake'], id=None)\n

Let's create an id2label dictionary to decode them back to strings and see what they are. The inverse label2id will be useful too, when we load the model later.

Python
labels = dataset[\"train\"].features[\"label\"].names\nlabel2id, id2label = dict(), dict()\nfor i, label in enumerate(labels):\n    label2id[label] = i\n    id2label[i] = label\n\nid2label[2]\n
'HerbaceousVegetation'\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#preprocessing-the-data","title":"Preprocessing the data","text":"

Before we can feed these images to our model, we need to preprocess them.

Preprocessing images typically comes down to (1) resizing them to a particular size and (2) normalizing the color channels (R, G, B) using a mean and standard deviation. These are referred to as image transformations.

In addition, one typically performs what is called data augmentation during training (like random cropping and flipping) to make the model more robust and achieve higher accuracy. Data augmentation is also a great technique to increase the size of the training data.

We will use Albumentations for the image transformations/data augmentation in this tutorial, but note that one can use any other package (like torchvision's transforms, imgaug, Kornia, etc.).
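
As a rough illustration of what the normalization step below does: A.Normalize rescales pixel values to [0, 1] and then standardizes each channel with the given mean and standard deviation. The sketch uses the ImageNet statistics purely as an example; the model's image processor provides the actual values.
Python
import albumentations as A\nimport numpy as np\n\n# ImageNet channel statistics, used here only for illustration\nmean = np.array([0.485, 0.456, 0.406])\nstd = np.array([0.229, 0.224, 0.225])\n\nimage = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)\n\n# for uint8 inputs, A.Normalize is equivalent to (image / 255 - mean) / std\nnormalized = A.Normalize(mean=mean.tolist(), std=std.tolist())(image=image)[\"image\"]\nmanual = (image / 255.0 - mean) / std\nprint(np.allclose(normalized, manual, atol=1e-5))  # True\n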

To make sure we (1) resize to the appropriate size and (2) use the appropriate image mean and standard deviation for the model architecture we are going to use, we instantiate what is called an image processor with the AutoImageProcessor.from_pretrained method.

This image processor is a minimal preprocessor that can be used to prepare images for inference.

Python
from transformers import AutoImageProcessor\n\nimage_processor = AutoImageProcessor.from_pretrained(model_checkpoint)\nimage_processor\n
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.\n\n\n\n\n\nConvNextImageProcessor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_rescale\": true,\n  \"do_resize\": true,\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_processor_type\": \"ConvNextImageProcessor\",\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"rescale_factor\": 0.00392156862745098,\n  \"size\": {\n    \"shortest_edge\": 224\n  }\n}\n

The Datasets library makes it very easy to process data. We can write custom functions, which can then be applied to an entire dataset (using either .map() or .set_transform()).

Here we define 2 separate functions, one for training (which includes data augmentation) and one for validation (which only includes resizing and normalizing).

Python
import cv2\nimport albumentations as A\nimport numpy as np\n\nif \"height\" in image_processor.size:\n    size = (image_processor.size[\"height\"], image_processor.size[\"width\"])\n    crop_size = size\n    max_size = None\nelif \"shortest_edge\" in image_processor.size:\n    size = image_processor.size[\"shortest_edge\"]\n    crop_size = (size, size)\n    max_size = image_processor.size.get(\"longest_edge\")\n\ntrain_transforms = A.Compose([\n    A.Resize(height=size, width=size),\n    A.RandomRotate90(),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n    A.Normalize(),\n])\n\nval_transforms = A.Compose([\n    A.Resize(height=size, width=size),\n    A.Normalize(),\n])\n\ndef preprocess_train(examples):\n    examples[\"pixel_values\"] = [\n        train_transforms(image=np.array(image))[\"image\"] for image in examples[\"image\"]\n    ]\n\n    return examples\n\ndef preprocess_val(examples):\n    examples[\"pixel_values\"] = [\n        val_transforms(image=np.array(image))[\"image\"] for image in examples[\"image\"]\n    ]\n\n    return examples\n

Next, we can preprocess our dataset by applying these functions. We will use the set_transform functionality, which applies the functions above on the fly (meaning they are only applied when the images are loaded into RAM).

Python
# split up training into training + validation\nsplits = dataset[\"train\"].train_test_split(test_size=0.1)\ntrain_ds = splits['train']\nval_ds = splits['test']\n
Python
train_ds.set_transform(preprocess_train)\nval_ds.set_transform(preprocess_val)\n
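
As an aside, a rough sketch of the eager alternative with .map is shown below. It materializes the preprocessed arrays on disk instead of computing them lazily, which is why the set_transform approach is used in this tutorial.
Python
# eager alternative to set_transform: precompute and store pixel_values\ntrain_ds_mapped = splits[\"train\"].map(preprocess_train, batched=True)\n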

Let's check the first example:

Python
train_ds[0]\n
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=64x64 at 0x7FD610178490>,\n 'label': 5,\n 'pixel_values': array([[[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052]],\n\n        [[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052]],\n\n        [[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.3986642 , -0.93277305, -0.4101089 ],\n         [-1.3986642 , -0.93277305, -0.4101089 ],\n         [-1.3986642 , -0.93277305, -0.4101089 ]],\n\n        ...,\n\n        [[-1.5014129 , -0.582633  , -0.35782132],\n         [-1.5014129 , -0.582633  , -0.35782132],\n         [-1.5014129 , -0.582633  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]],\n\n        [[-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]],\n\n        [[-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]]], dtype=float32)}\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#training-the-model","title":"Training the model","text":"

Now that our data is ready, we can download the pretrained model and fine-tune it. For classification we use the AutoModelForImageClassification class. Like with the image processor, the from_pretrained method will download and cache the model for us. As the label ids and the number of labels are dataset dependent, we pass label2id and id2label alongside the model_checkpoint here.

NOTE: in case you're planning to fine-tune an already fine-tuned checkpoint, like facebook/convnext-tiny-224 (which has already been fine-tuned on ImageNet-1k), then you need to provide the additional argument ignore_mismatched_sizes=True to the from_pretrained method. This will make sure the output head is thrown away and replaced by a new, randomly initialized classification head that includes a custom number of output neurons.

Python
from transformers import AutoModelForImageClassification, TrainingArguments, Trainer\n\nnum_labels = len(id2label)\nmodel = AutoModelForImageClassification.from_pretrained(\n    model_checkpoint, \n    label2id=label2id,\n    id2label=id2label,\n    ignore_mismatched_sizes = True, # provide this in case you'd like to fine-tune an already fine-tuned checkpoint\n)\n
Downloading:   0%|          | 0.00/68.0k [00:00<?, ?B/s]\n\n\n\nDownloading:   0%|          | 0.00/109M [00:00<?, ?B/s]\n\n\nSome weights of ConvNextForImageClassification were not initialized from the model checkpoint at facebook/convnext-tiny-224 and are newly initialized because the shapes did not match:\n- classifier.weight: found shape torch.Size([1000, 768]) in the checkpoint and torch.Size([10, 768]) in the model instantiated\n- classifier.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated\nYou should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n

The warning is telling us that we are throwing away the weights of the original classification head and randomly initializing a new one (the weights and bias of the classifier layer). This is expected, because we are adding a new head for which we don't have pretrained weights; the library warns us that we should fine-tune this model before using it for inference, which is exactly what we are going to do.

To instantiate a Trainer, we will need to define the training configuration and the evaluation metric. The most important is the TrainingArguments, which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model.

Most of the training arguments are pretty self-explanatory, but one that is quite important here is remove_unused_columns=False. When this argument is True (the default), the Trainer drops any dataset columns not used by the model's call function, which usually makes it easier to unpack inputs into the model. In our case, however, we need the otherwise unused 'image' column in order to create 'pixel_values', so we keep it by setting the argument to False.

Python
model_name = model_checkpoint.split(\"/\")[-1]\n\nargs = TrainingArguments(\n    f\"{model_name}-finetuned-eurosat-albumentations\",\n    remove_unused_columns=False,\n    evaluation_strategy = \"epoch\",\n    save_strategy = \"epoch\",\n    learning_rate=5e-5,\n    per_device_train_batch_size=batch_size,\n    gradient_accumulation_steps=4,\n    per_device_eval_batch_size=batch_size,\n    num_train_epochs=3,\n    warmup_ratio=0.1,\n    logging_steps=10,\n    load_best_model_at_end=True,\n    metric_for_best_model=\"accuracy\",\n    push_to_hub=True,\n)\n

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the batch_size defined at the top of the notebook, and customize the number of epochs for training. Since the best model might not be the one at the end of training, we ask the Trainer to load the best model it saved (according to metric_for_best_model) at the end of training.

The last argument, push_to_hub, allows the Trainer to push the model to the Hub regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally with a name that is different from the name of the repository, or if you want to push your model under an organization rather than your namespace, use the hub_model_id argument to set the repo name (it needs to be the full name, including your namespace or organization: for instance \"nielsr/vit-finetuned-cifar10\" or \"huggingface/vit-finetuned-cifar10\").

Next, we need to define a function for how to compute the metrics from the predictions, which will just use the metric we loaded earlier. The only preprocessing we have to do is to take the argmax of our predicted logits:

Python
import numpy as np\n\n# the compute_metrics function takes a Named Tuple as input:\n# predictions, which are the logits of the model as Numpy arrays,\n# and label_ids, which are the ground-truth labels as Numpy arrays.\ndef compute_metrics(eval_pred):\n    \"\"\"Computes accuracy on a batch of predictions\"\"\"\n    predictions = np.argmax(eval_pred.predictions, axis=1)\n    return metric.compute(predictions=predictions, references=eval_pred.label_ids)\n

We also define a collate_fn, which will be used to batch examples together. Each batch consists of 2 keys, namely pixel_values and labels.

Python
import torch\n\ndef collate_fn(examples):\n    images = []\n    labels = []\n    for example in examples:\n        image = np.moveaxis(example[\"pixel_values\"], source=2, destination=0)\n        images.append(torch.from_numpy(image))\n        labels.append(example[\"label\"])\n\n    pixel_values = torch.stack(images)\n    labels = torch.tensor(labels)\n    return {\"pixel_values\": pixel_values, \"labels\": labels}\n

Then we just need to pass all of this along with our datasets to the Trainer:

Python
trainer = Trainer(\n    model,\n    args,\n    train_dataset=train_ds,\n    eval_dataset=val_ds,\n    tokenizer=image_processor,\n    compute_metrics=compute_metrics,\n    data_collator=collate_fn,\n)\n
/content/convnext-tiny-224-finetuned-eurosat-albumentations is already a clone of https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations. Make sure you pull the latest changes with `repo.git_pull()`.\n

You might wonder why we pass along the image_processor as a tokenizer when we already preprocessed our data. This is only to make sure the image processor configuration file (stored as JSON) will also be uploaded to the repo on the hub.

Now we can fine-tune our model by calling the train method:

Python
trainer.train()\n
/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n  FutureWarning,\n***** Running training *****\n  Num examples = 24300\n  Num Epochs = 3\n  Instantaneous batch size per device = 32\n  Total train batch size (w. parallel, distributed & accumulation) = 128\n  Gradient Accumulation steps = 4\n  Total optimization steps = 570\n\n\n\n\n<div>\n\n  <progress value='570' max='570' style='width:300px; height:20px; vertical-align: middle;'></progress>\n  [570/570 15:59, Epoch 3/3]\n</div>\n<table border=\"1\" class=\"dataframe\">\n
Epoch | Training Loss | Validation Loss | Accuracy
1     | 0.141000      | 0.149633        | 0.954444
2     | 0.073600      | 0.095782        | 0.971852
3     | 0.056800      | 0.072716        | 0.974815

***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n\n\nTraining completed. Do not forget to share your model on huggingface.co/models =)\n\n\nLoading best model from convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570 (score: 0.9748148148148148).\n\n\n\n\n\nTrainOutput(global_step=570, training_loss=0.34729809766275843, metrics={'train_runtime': 961.6293, 'train_samples_per_second': 75.809, 'train_steps_per_second': 0.593, 'total_flos': 1.8322098956292096e+18, 'train_loss': 0.34729809766275843, 'epoch': 3.0})\n

We can check with the evaluate method that our Trainer did reload the best model properly (if it was not the last one):

Python
metrics = trainer.evaluate()\nprint(metrics)\n
***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\n
[85/85 00:12]
{'eval_loss': 0.0727163776755333, 'eval_accuracy': 0.9748148148148148, 'eval_runtime': 13.0419, 'eval_samples_per_second': 207.026, 'eval_steps_per_second': 6.517, 'epoch': 3.0}\n

You can now upload the result of the training to the Hub, just execute this instruction (note that the Trainer will automatically create a model card for you, as well as adding Tensorboard metrics - see the \"Training metrics\" tab!):

Python
trainer.push_to_hub()\n
Saving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n\n\n\nUpload file runs/Apr12_12-03-24_1ad162e1ead9/events.out.tfevents.1649765159.1ad162e1ead9.73.4:  24%|##4       \u2026\n\n\n\nUpload file runs/Apr12_12-03-24_1ad162e1ead9/events.out.tfevents.1649767032.1ad162e1ead9.73.6: 100%|##########\u2026\n\n\nTo https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\n   c500b3f..2143b42  main -> main\n\nTo https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\n   2143b42..71339cf  main -> main\n\n\n\n\n\n\n'https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/commit/2143b423b5cacdde6daebd3ee2b5971ecab463f6'\n

You can now share this model with all your friends, family, favorite pets: they can all load it with the identifier \"your-username/the-name-you-picked\" so for instance:

Python
from transformers import AutoModelForImageClassification, AutoImageProcessor\n\nimage_processor = AutoImageProcessor.from_pretrained(\"nielsr/my-awesome-model\")\nmodel = AutoModelForImageClassification.from_pretrained(\"nielsr/my-awesome-model\")\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#inference","title":"Inference","text":"

Let's say you have a new image, on which you'd like to make a prediction. Let's load a satellite image of a highway (that's not part of the EuroSAT dataset), and see how the model does.

Python
from PIL import Image\nimport requests\n\nurl = 'https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/highway.jpg'\nimage = Image.open(requests.get(url, stream=True).raw)\nimage\n

We'll load the image processor and model from the hub (here, we use the Auto Classes, which will make sure the appropriate classes will be loaded automatically based on the config.json and preprocessor_config.json files of the repo on the hub):

Python
from transformers import AutoModelForImageClassification, AutoImageProcessor\n\nrepo_name = \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\"\n\nimage_processor = AutoImageProcessor.from_pretrained(repo_name)\nmodel = AutoModelForImageClassification.from_pretrained(repo_name)\n
https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmp04g0zg5n\n\n\n\nDownloading:   0%|          | 0.00/266 [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json in cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\ncreating metadata file for /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nloading feature extractor configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json from cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nFeature extractor ConvNextFeatureExtractor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_resize\": true,\n  \"feature_extractor_type\": \"ConvNextFeatureExtractor\",\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"size\": 224\n}\n\nhttps://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmpbf9y4q39\n\n\n\nDownloading:   0%|          | 0.00/1.03k [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json in cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\ncreating metadata file for /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nloading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    
\"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nhttps://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmpzr_9yxjo\n\n\n\nDownloading:   0%|          | 0.00/106M [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin in cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\ncreating metadata file for /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nloading weights file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nAll model checkpoint weights were used when initializing ConvNextForImageClassification.\n\nAll the weights of ConvNextForImageClassification were initialized from the model checkpoint at nielsr/convnext-tiny-224-finetuned-eurosat-albumentations.\nIf your task is similar to the task the model of the checkpoint was trained on, you can already use ConvNextForImageClassification for predictions without further training.\n
Python
# prepare image for the model\nencoding = image_processor(image.convert(\"RGB\"), return_tensors=\"pt\")\nprint(encoding.pixel_values.shape)\n
torch.Size([1, 3, 224, 224])\n
Python
import torch\n\n# forward pass\nwith torch.no_grad():\n    outputs = model(**encoding)\n    logits = outputs.logits\n
Python
predicted_class_idx = logits.argmax(-1).item()\nprint(\"Predicted class:\", model.config.id2label[predicted_class_idx])\n
Predicted class: Highway\n

Looks like our model got it correct!

"},{"location":"integrations/huggingface/image_classification_albumentations/#pipeline-api","title":"Pipeline API","text":"

An alternative way to quickly perform inference with any model on the hub is by leveraging the Pipeline API, which abstracts away all the steps we did manually above for us. It will perform the preprocessing, forward pass and postprocessing all in a single object.

Let's showcase this for our trained model:

Python
from transformers import pipeline\n\npipe = pipeline(\"image-classification\", \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\")\n
loading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    \"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nloading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    \"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nloading weights file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nAll model checkpoint weights were used when 
initializing ConvNextForImageClassification.\n\nAll the weights of ConvNextForImageClassification were initialized from the model checkpoint at nielsr/convnext-tiny-224-finetuned-eurosat-albumentations.\nIf your task is similar to the task the model of the checkpoint was trained on, you can already use ConvNextForImageClassification for predictions without further training.\nloading feature extractor configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json from cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nFeature extractor ConvNextFeatureExtractor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_resize\": true,\n  \"feature_extractor_type\": \"ConvNextFeatureExtractor\",\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"size\": 224\n}\n
Python
pipe(image)\n
[{'label': 'Highway', 'score': 0.5163754224777222},\n {'label': 'River', 'score': 0.11824000626802444},\n {'label': 'AnnualCrop', 'score': 0.05467210337519646},\n {'label': 'PermanentCrop', 'score': 0.05066365748643875},\n {'label': 'Industrial', 'score': 0.049283623695373535}]\n

As we can see, the pipeline not only shows the class label with the highest probability, but also returns the top 5 labels with their corresponding scores. Note that the pipeline also works with a local model and image processor:

Python
pipe = pipeline(\"image-classification\", \n                model=model,\n                feature_extractor=image_processor)\n
Python
pipe(image)\n
[{'label': 'Highway', 'score': 0.5163754224777222},\n {'label': 'River', 'score': 0.11824000626802444},\n {'label': 'AnnualCrop', 'score': 0.05467210337519646},\n {'label': 'PermanentCrop', 'score': 0.05066365748643875},\n {'label': 'Industrial', 'score': 0.049283623695373535}]\n
"},{"location":"integrations/huggingface/object_detection/","title":"Object Detection","text":""},{"location":"integrations/huggingface/object_detection/#object-detection","title":"Object detection","text":"

Object detection is the computer vision task of detecting instances (such as humans, buildings, or cars) in an image. Object detection models receive an image as input and output coordinates of the bounding boxes and associated labels of the detected objects. An image can contain multiple objects, each with its own bounding box and a label (e.g. it can have a car and a building), and each object can be present in different parts of an image (e.g. the image can have several cars). This task is commonly used in autonomous driving for detecting things like pedestrians, road signs, and traffic lights. Other applications include counting objects in images, image search, and more.

In this guide, you will learn how to:

  1. Finetune DETR, a model that combines a convolutional backbone with an encoder-decoder Transformer, on the CPPE-5 dataset.
  2. Use your finetuned model for inference.

To see all architectures and checkpoints compatible with this task, we recommend checking the task page.

Before you begin, make sure you have all the necessary libraries installed:

Bash
pip install -q datasets transformers accelerate timm\npip install -q -U albumentations>=1.4.5 torchmetrics pycocotools\n

You'll use \ud83e\udd17 Datasets to load a dataset from the Hugging Face Hub, \ud83e\udd17 Transformers to train your model, and albumentations to augment the data.

We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the Hub. When prompted, enter your token to log in:

Python
>>> from huggingface_hub import notebook_login\n\n>>> notebook_login()\n

To get started, we'll define global constants, namely the model name and image size. For this tutorial, we'll use the conditional DETR model due to its faster convergence. Feel free to select any object detection model available in the transformers library.

Python
>>> MODEL_NAME = \"microsoft/conditional-detr-resnet-50\"  # or \"facebook/detr-resnet-50\"\n>>> IMAGE_SIZE = 480\n
"},{"location":"integrations/huggingface/object_detection/#load-the-cppe-5-dataset","title":"Load the CPPE-5 dataset","text":"

The CPPE-5 dataset contains images with annotations identifying medical personal protective equipment (PPE) in the context of the COVID-19 pandemic.

Start by loading the dataset and creating a validation split from train:

Python
>>> from datasets import load_dataset\n\n>>> cppe5 = load_dataset(\"cppe-5\")\n\n>>> if \"validation\" not in cppe5:\n...     split = cppe5[\"train\"].train_test_split(0.15, seed=1337)\n...     cppe5[\"train\"] = split[\"train\"]\n...     cppe5[\"validation\"] = split[\"test\"]\n\n>>> cppe5\nDatasetDict({\n    train: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 850\n    })\n    test: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 29\n    })\n    validation: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 150\n    })\n})\n

You'll see that this dataset has 1000 images combined in the train and validation splits (850 and 150, respectively) and a test set with 29 images.

To get familiar with the data, explore what the examples look like.

Python
>>> cppe5[\"train\"][0]\n{\n  'image_id': 366,\n  'image': <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=500x290>,\n  'width': 500,\n  'height': 500,\n  'objects': {\n    'id': [1932, 1933, 1934],\n    'area': [27063, 34200, 32431],\n    'bbox': [[29.0, 11.0, 97.0, 279.0],\n      [201.0, 1.0, 120.0, 285.0],\n      [382.0, 0.0, 113.0, 287.0]],\n    'category': [0, 0, 0]\n  }\n}\n

The examples in the dataset have the following fields:

  • image_id: the example image id
  • image: a PIL.Image.Image object containing the image
  • width: width of the image
  • height: height of the image
  • objects: a dictionary containing bounding box metadata for the objects in the image:
    • id: the annotation id
    • area: the area of the bounding box
    • bbox: the object's bounding box (in the COCO format)
    • category: the object's category, with possible values including Coverall (0), Face_Shield (1), Gloves (2), Goggles (3) and Mask (4)

You may notice that the bbox field follows the COCO format, which is the format that the DETR model expects. However, the grouping of the fields inside objects differs from the annotation format DETR requires. You will need to apply some preprocessing transformations before using this data for training.

To get an even better understanding of the data, visualize an example in the dataset.

Python
>>> import numpy as np\n>>> import os\n>>> from PIL import Image, ImageDraw\n\n>>> image = cppe5[\"train\"][2][\"image\"]\n>>> annotations = cppe5[\"train\"][2][\"objects\"]\n>>> width, height = image.size\n>>> draw = ImageDraw.Draw(image)\n\n>>> categories = cppe5[\"train\"].features[\"objects\"].feature[\"category\"].names\n\n>>> id2label = {index: x for index, x in enumerate(categories, start=0)}\n>>> label2id = {v: k for k, v in id2label.items()}\n\n>>> for i in range(len(annotations[\"id\"])):\n...     box = annotations[\"bbox\"][i]\n...     class_idx = annotations[\"category\"][i]\n...     x, y, w, h = tuple(box)\n...     # Check if coordinates are normalized or not\n...     if max(box) > 1.0:\n...         # Coordinates are un-normalized: convert (x, y, w, h) to corner coordinates\n...         x1, y1 = int(x), int(y)\n...         x2, y2 = int(x + w), int(y + h)\n...     else:\n...         # Coordinates are normalized: re-scale them to absolute pixel values\n...         x1 = int(x * width)\n...         y1 = int(y * height)\n...         x2 = int((x + w) * width)\n...         y2 = int((y + h) * height)\n...     # Draw the computed corner coordinates instead of the raw (x, y, w, h) values\n...     draw.rectangle((x1, y1, x2, y2), outline=\"red\", width=1)\n...     draw.text((x1, y1), id2label[class_idx], fill=\"white\")\n\n>>> image\n

To visualize the bounding boxes with associated labels, you can get the labels from the dataset's metadata, specifically the category field. You'll also want to create dictionaries that map a label id to a label class (id2label) and the other way around (label2id). You can use them later when setting up the model. Including these maps will make your model reusable by others if you share it on the Hugging Face Hub. Please note that the part of the code above that draws the bounding boxes assumes the boxes are in COCO format (x_min, y_min, width, height); it has to be adjusted for other formats such as (x_min, y_min, x_max, y_max), as sketched below.
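
For example, here is a minimal sketch (an illustration added here, not part of the original code) of the same drawing loop if the boxes were already given as absolute (x_min, y_min, x_max, y_max) corners:

Python
>>> for i in range(len(annotations[\"id\"])):\n...     x_min, y_min, x_max, y_max = annotations[\"bbox\"][i]  # hypothetical pascal_voc-style boxes\n...     draw.rectangle((x_min, y_min, x_max, y_max), outline=\"red\", width=1)\n...     draw.text((x_min, y_min), id2label[annotations[\"category\"][i]], fill=\"white\")\n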

As a final step of getting familiar with the data, explore it for potential issues. One common problem with datasets for object detection is bounding boxes that \"stretch\" beyond the edge of the image. Such \"runaway\" bounding boxes can raise errors during training and should be addressed. There are a few examples with this issue in this dataset. To keep things simple in this guide, we will set clip=True for BboxParams in transformations below.
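
As an optional check (a sketch added for illustration, not part of the original guide), you could scan the raw training split for such runaway boxes before deciding how to handle them; this assumes the COCO-style bbox field and the PIL image shown in the example above:

Python
>>> def find_runaway_boxes(dataset):\n...     \"\"\"Collect (sample_index, bbox) pairs whose boxes extend past the image borders.\"\"\"\n...     bad = []\n...     for idx, example in enumerate(dataset):\n...         width, height = example[\"image\"].size\n...         for bbox in example[\"objects\"][\"bbox\"]:\n...             x, y, w, h = bbox\n...             if x < 0 or y < 0 or x + w > width or y + h > height:\n...                 bad.append((idx, bbox))\n...     return bad\n\n>>> len(find_runaway_boxes(cppe5[\"train\"]))\n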

"},{"location":"integrations/huggingface/object_detection/#preprocess-the-data","title":"Preprocess the data","text":"

To finetune a model, you must preprocess the data you plan to use to match precisely the approach used for the pre-trained model. [AutoImageProcessor] takes care of processing image data to create pixel_values, pixel_mask, and labels that a DETR model can train with. The image processor has some attributes that you won't have to worry about:

  • image_mean = [0.485, 0.456, 0.406]
  • image_std = [0.229, 0.224, 0.225]

These are the mean and standard deviation used to normalize images during the model pre-training. These values are crucial to replicate when doing inference or finetuning a pre-trained image model.

Instantiate the image processor from the same checkpoint as the model you want to finetune.

Python
>>> from transformers import AutoImageProcessor\n\n>>> MAX_SIZE = IMAGE_SIZE\n\n>>> image_processor = AutoImageProcessor.from_pretrained(\n...     MODEL_NAME,\n...     do_resize=True,\n...     size={\"max_height\": MAX_SIZE, \"max_width\": MAX_SIZE},\n...     do_pad=True,\n...     pad_size={\"height\": MAX_SIZE, \"width\": MAX_SIZE},\n... )\n
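
As a quick optional sanity check (not in the original guide), you can print the normalization and resize settings back from the instantiated processor; this assumes the processor exposes image_mean, image_std, and size attributes, which DETR-style image processors typically do:

Python
>>> # These should match the image_mean / image_std values listed above and the size passed in.\n>>> print(image_processor.image_mean, image_processor.image_std)\n>>> print(image_processor.size)\n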

Before passing the images to the image_processor, apply two preprocessing transformations to the dataset:

  • Augmenting images
  • Reformatting annotations to meet DETR expectations

First, to make sure the model does not overfit on the training data, you can apply image augmentation with any data augmentation library. Here we use Albumentations. This library ensures that transformations affect the image and update the bounding boxes accordingly. The \ud83e\udd17 Datasets library documentation has a detailed guide on how to augment images for object detection, and it uses the exact same dataset as an example. Apply some geometric and color transformations to the image. For additional augmentation options, explore the Albumentations Demo Space.

Python
>>> import albumentations as A\n\n>>> train_augment_and_transform = A.Compose(\n...     [\n...         A.Perspective(p=0.1),\n...         A.HorizontalFlip(p=0.5),\n...         A.RandomBrightnessContrast(p=0.5),\n...         A.HueSaturationValue(p=0.1),\n...     ],\n...     bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"category\"], clip=True, min_area=25),\n... )\n\n>>> validation_transform = A.Compose(\n...     [A.NoOp()],\n...     bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"category\"], clip=True),\n... )\n

The image_processor expects the annotations to be in the following format: {'image_id': int, 'annotations': List[Dict]}, where each dictionary is a COCO object annotation. Let's add a function to reformat annotations for a single example:

Python
>>> def format_image_annotations_as_coco(image_id, categories, areas, bboxes):\n...     \"\"\"Format one set of image annotations to the COCO format\n\n...     Args:\n...         image_id (str): image id. e.g. \"0001\"\n...         categories (List[int]): list of categories/class labels corresponding to provided bounding boxes\n...         areas (List[float]): list of corresponding areas to provided bounding boxes\n...         bboxes (List[Tuple[float]]): list of bounding boxes provided in COCO format\n...             ([center_x, center_y, width, height] in absolute coordinates)\n\n...     Returns:\n...         dict: {\n...             \"image_id\": image id,\n...             \"annotations\": list of formatted annotations\n...         }\n...     \"\"\"\n...     annotations = []\n...     for category, area, bbox in zip(categories, areas, bboxes):\n...         formatted_annotation = {\n...             \"image_id\": image_id,\n...             \"category_id\": category,\n...             \"iscrowd\": 0,\n...             \"area\": area,\n...             \"bbox\": list(bbox),\n...         }\n...         annotations.append(formatted_annotation)\n\n...     return {\n...         \"image_id\": image_id,\n...         \"annotations\": annotations,\n...     }\n

Now you can combine the image and annotation transformations to use on a batch of examples:

Python
>>> def augment_and_transform_batch(examples, transform, image_processor, return_pixel_mask=False):\n...     \"\"\"Apply augmentations and format annotations in COCO format for object detection task\"\"\"\n\n...     images = []\n...     annotations = []\n...     for image_id, image, objects in zip(examples[\"image_id\"], examples[\"image\"], examples[\"objects\"]):\n...         image = np.array(image.convert(\"RGB\"))\n\n...         # apply augmentations\n...         output = transform(image=image, bboxes=objects[\"bbox\"], category=objects[\"category\"])\n...         images.append(output[\"image\"])\n\n...         # format annotations in COCO format\n...         formatted_annotations = format_image_annotations_as_coco(\n...             image_id, output[\"category\"], objects[\"area\"], output[\"bboxes\"]\n...         )\n...         annotations.append(formatted_annotations)\n\n...     # Apply the image processor transformations: resizing, rescaling, normalization\n...     result = image_processor(images=images, annotations=annotations, return_tensors=\"pt\")\n\n...     if not return_pixel_mask:\n...         result.pop(\"pixel_mask\", None)\n\n...     return result\n

Apply this preprocessing function to the entire dataset using \ud83e\udd17 Datasets [~datasets.Dataset.with_transform] method. This method applies transformations on the fly when you load an element of the dataset.

At this point, you can check what an example from the dataset looks like after the transformations. You should see a tensor with pixel_values and a labels dictionary (a pixel_mask tensor is also present if you pass return_pixel_mask=True to the batch transform).

Python
>>> from functools import partial\n\n>>> # Make transform functions for batch and apply for dataset splits\n>>> train_transform_batch = partial(\n...     augment_and_transform_batch, transform=train_augment_and_transform, image_processor=image_processor\n... )\n>>> validation_transform_batch = partial(\n...     augment_and_transform_batch, transform=validation_transform, image_processor=image_processor\n... )\n\n>>> cppe5[\"train\"] = cppe5[\"train\"].with_transform(train_transform_batch)\n>>> cppe5[\"validation\"] = cppe5[\"validation\"].with_transform(validation_transform_batch)\n>>> cppe5[\"test\"] = cppe5[\"test\"].with_transform(validation_transform_batch)\n\n>>> cppe5[\"train\"][15]\n{'pixel_values': tensor([[[ 1.9235,  1.9407,  1.9749,  ..., -0.7822, -0.7479, -0.6965],\n          [ 1.9578,  1.9749,  1.9920,  ..., -0.7993, -0.7650, -0.7308],\n          [ 2.0092,  2.0092,  2.0263,  ..., -0.8507, -0.8164, -0.7822],\n          ...,\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741],\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741],\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741]],\n\n          [[ 1.6232,  1.6408,  1.6583,  ...,  0.8704,  1.0105,  1.1331],\n          [ 1.6408,  1.6583,  1.6758,  ...,  0.8529,  0.9930,  1.0980],\n          [ 1.6933,  1.6933,  1.7108,  ...,  0.8179,  0.9580,  1.0630],\n          ...,\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052],\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052],\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052]],\n\n          [[ 1.8905,  1.9080,  1.9428,  ..., -0.1487, -0.0964, -0.0615],\n          [ 1.9254,  1.9428,  1.9603,  ..., -0.1661, -0.1138, -0.0790],\n          [ 1.9777,  1.9777,  1.9951,  ..., -0.2010, -0.1138, -0.0790],\n          ...,\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265],\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265],\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265]]]),\n  'labels': {'image_id': tensor([688]), 'class_labels': tensor([3, 4, 2, 0, 0]), 'boxes': tensor([[0.4700, 0.1933, 0.1467, 0.0767],\n          [0.4858, 0.2600, 0.1150, 0.1000],\n          [0.4042, 0.4517, 0.1217, 0.1300],\n          [0.4242, 0.3217, 0.3617, 0.5567],\n          [0.6617, 0.4033, 0.5400, 0.4533]]), 'area': tensor([ 4048.,  4140.,  5694., 72478., 88128.]), 'iscrowd': tensor([0, 0, 0, 0, 0]), 'orig_size': tensor([480, 480])}}\n

You have successfully augmented the individual images and prepared their annotations. However, preprocessing isn't complete yet. In the final step, create a custom collate_fn to batch images together. Pad images (which are now pixel_values) to the largest image in a batch, and create a corresponding pixel_mask to indicate which pixels are real (1) and which are padding (0).

Python
>>> import torch\n\n>>> def collate_fn(batch):\n...     data = {}\n...     data[\"pixel_values\"] = torch.stack([x[\"pixel_values\"] for x in batch])\n...     data[\"labels\"] = [x[\"labels\"] for x in batch]\n...     if \"pixel_mask\" in batch[0]:\n...         data[\"pixel_mask\"] = torch.stack([x[\"pixel_mask\"] for x in batch])\n...     return data\n
"},{"location":"integrations/huggingface/object_detection/#preparing-function-to-compute-map","title":"Preparing function to compute mAP","text":"

Object detection models are commonly evaluated with a set of COCO-style metrics. We are going to use torchmetrics to compute mAP (mean average precision) and mAR (mean average recall) metrics and will wrap them in a compute_metrics function so they can be used in [Trainer] for evaluation.
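
To make the expected input format concrete, here is a small self-contained sketch (with made-up boxes, not taken from CPPE-5) of how torchmetrics' MeanAveragePrecision consumes predictions and targets:

Python
>>> import torch\n>>> from torchmetrics.detection.mean_ap import MeanAveragePrecision\n\n>>> # Predictions: one dict per image with \"boxes\", \"scores\" and \"labels\".\n>>> preds = [{\"boxes\": torch.tensor([[10.0, 10.0, 50.0, 60.0]]),\n...           \"scores\": torch.tensor([0.9]),\n...           \"labels\": torch.tensor([0])}]\n>>> # Targets: one dict per image with \"boxes\" and \"labels\".\n>>> target = [{\"boxes\": torch.tensor([[12.0, 8.0, 52.0, 62.0]]),\n...            \"labels\": torch.tensor([0])}]\n\n>>> metric = MeanAveragePrecision(box_format=\"xyxy\")\n>>> metric.update(preds, target)\n>>> print(metric.compute()[\"map\"])\n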

The intermediate format of boxes used for training is YOLO (normalized), but we will compute metrics for boxes in Pascal VOC (absolute) format in order to correctly handle box areas. Let's define a function that converts bounding boxes to the Pascal VOC format:

Python
>>> from transformers.image_transforms import center_to_corners_format\n\n>>> def convert_bbox_yolo_to_pascal(boxes, image_size):\n...     \"\"\"\n...     Convert bounding boxes from YOLO format (x_center, y_center, width, height) in range [0, 1]\n...     to Pascal VOC format (x_min, y_min, x_max, y_max) in absolute coordinates.\n\n...     Args:\n...         boxes (torch.Tensor): Bounding boxes in YOLO format\n...         image_size (Tuple[int, int]): Image size in format (height, width)\n\n...     Returns:\n...         torch.Tensor: Bounding boxes in Pascal VOC format (x_min, y_min, x_max, y_max)\n...     \"\"\"\n...     # convert center to corners format\n...     boxes = center_to_corners_format(boxes)\n\n...     # convert to absolute coordinates\n...     height, width = image_size\n...     boxes = boxes * torch.tensor([[width, height, width, height]])\n\n...     return boxes\n

Then, in the compute_metrics function, we collect the predicted and target bounding boxes, scores, and labels from the evaluation loop results and pass them to the scoring function.

Python
>>> import numpy as np\n>>> from dataclasses import dataclass\n>>> from torchmetrics.detection.mean_ap import MeanAveragePrecision\n\n\n>>> @dataclass\n>>> class ModelOutput:\n...     logits: torch.Tensor\n...     pred_boxes: torch.Tensor\n\n\n>>> @torch.no_grad()\n>>> def compute_metrics(evaluation_results, image_processor, threshold=0.0, id2label=None):\n...     \"\"\"\n...     Compute mean average mAP, mAR and their variants for the object detection task.\n\n...     Args:\n...         evaluation_results (EvalPrediction): Predictions and targets from evaluation.\n...         threshold (float, optional): Threshold to filter predicted boxes by confidence. Defaults to 0.0.\n...         id2label (Optional[dict], optional): Mapping from class id to class name. Defaults to None.\n\n...     Returns:\n...         Mapping[str, float]: Metrics in a form of dictionary {<metric_name>: <metric_value>}\n...     \"\"\"\n\n...     predictions, targets = evaluation_results.predictions, evaluation_results.label_ids\n\n...     # For metric computation we need to provide:\n...     #  - targets in a form of list of dictionaries with keys \"boxes\", \"labels\"\n...     #  - predictions in a form of list of dictionaries with keys \"boxes\", \"scores\", \"labels\"\n\n...     image_sizes = []\n...     post_processed_targets = []\n...     post_processed_predictions = []\n\n...     # Collect targets in the required format for metric computation\n...     for batch in targets:\n...         # collect image sizes, we will need them for predictions post processing\n...         batch_image_sizes = torch.tensor(np.array([x[\"orig_size\"] for x in batch]))\n...         image_sizes.append(batch_image_sizes)\n...         # collect targets in the required format for metric computation\n...         # boxes were converted to YOLO format needed for model training\n...         # here we will convert them to Pascal VOC format (x_min, y_min, x_max, y_max)\n...         for image_target in batch:\n...             boxes = torch.tensor(image_target[\"boxes\"])\n...             boxes = convert_bbox_yolo_to_pascal(boxes, image_target[\"orig_size\"])\n...             labels = torch.tensor(image_target[\"class_labels\"])\n...             post_processed_targets.append({\"boxes\": boxes, \"labels\": labels})\n\n...     # Collect predictions in the required format for metric computation,\n...     # model produce boxes in YOLO format, then image_processor convert them to Pascal VOC format\n...     for batch, target_sizes in zip(predictions, image_sizes):\n...         batch_logits, batch_boxes = batch[1], batch[2]\n...         output = ModelOutput(logits=torch.tensor(batch_logits), pred_boxes=torch.tensor(batch_boxes))\n...         post_processed_output = image_processor.post_process_object_detection(\n...             output, threshold=threshold, target_sizes=target_sizes\n...         )\n...         post_processed_predictions.extend(post_processed_output)\n\n...     # Compute metrics\n...     metric = MeanAveragePrecision(box_format=\"xyxy\", class_metrics=True)\n...     metric.update(post_processed_predictions, post_processed_targets)\n...     metrics = metric.compute()\n\n...     # Replace list of per class metrics with separate metric for each class\n...     classes = metrics.pop(\"classes\")\n...     map_per_class = metrics.pop(\"map_per_class\")\n...     mar_100_per_class = metrics.pop(\"mar_100_per_class\")\n...     for class_id, class_map, class_mar in zip(classes, map_per_class, mar_100_per_class):\n...         
class_name = id2label[class_id.item()] if id2label is not None else class_id.item()\n...         metrics[f\"map_{class_name}\"] = class_map\n...         metrics[f\"mar_100_{class_name}\"] = class_mar\n\n...     metrics = {k: round(v.item(), 4) for k, v in metrics.items()}\n\n...     return metrics\n\n\n>>> eval_compute_metrics_fn = partial(\n...     compute_metrics, image_processor=image_processor, id2label=id2label, threshold=0.0\n... )\n
"},{"location":"integrations/huggingface/object_detection/#training-the-detection-model","title":"Training the detection model","text":"

You have done most of the heavy lifting in the previous sections, so now you are ready to train your model! The images in this dataset are still quite large, even after resizing. This means that finetuning this model will require at least one GPU.

Training involves the following steps:

  1. Load the model with [AutoModelForObjectDetection] using the same checkpoint as in the preprocessing.
  2. Define your training hyperparameters in [TrainingArguments].
  3. Pass the training arguments to [Trainer] along with the model, dataset, image processor, and data collator.
  4. Call [~Trainer.train] to finetune your model.

When loading the model from the same checkpoint that you used for the preprocessing, remember to pass the label2id and id2label maps that you created earlier from the dataset's metadata. Additionally, we specify ignore_mismatched_sizes=True to replace the existing classification head with a new one.

Python
>>> from transformers import AutoModelForObjectDetection\n\n>>> model = AutoModelForObjectDetection.from_pretrained(\n...     MODEL_NAME,\n...     id2label=id2label,\n...     label2id=label2id,\n...     ignore_mismatched_sizes=True,\n... )\n

In [TrainingArguments], use output_dir to specify where to save your model, then configure hyperparameters as you see fit. With num_train_epochs=30, training takes about 35 minutes on a Google Colab T4 GPU; increase the number of epochs to get better results.

Important notes:

  • Do not remove unused columns because this will drop the image column. Without the image column, you can't create pixel_values. For this reason, set remove_unused_columns to False.
  • Set eval_do_concat_batches=False to get proper evaluation results. Images have different numbers of target boxes; if batches were concatenated, we would not be able to determine which boxes belong to which image.

If you wish to share your model by pushing to the Hub, set push_to_hub to True (you must be signed in to Hugging Face to upload your model).

Python
>>> from transformers import TrainingArguments\n\n>>> training_args = TrainingArguments(\n...     output_dir=\"detr_finetuned_cppe5\",\n...     num_train_epochs=30,\n...     fp16=False,\n...     per_device_train_batch_size=8,\n...     dataloader_num_workers=4,\n...     learning_rate=5e-5,\n...     lr_scheduler_type=\"cosine\",\n...     weight_decay=1e-4,\n...     max_grad_norm=0.01,\n...     metric_for_best_model=\"eval_map\",\n...     greater_is_better=True,\n...     load_best_model_at_end=True,\n...     eval_strategy=\"epoch\",\n...     save_strategy=\"epoch\",\n...     save_total_limit=2,\n...     remove_unused_columns=False,\n...     eval_do_concat_batches=False,\n...     push_to_hub=True,\n... )\n

Finally, bring everything together, and call [~transformers.Trainer.train]:

Python
>>> from transformers import Trainer\n\n>>> trainer = Trainer(\n...     model=model,\n...     args=training_args,\n...     train_dataset=cppe5[\"train\"],\n...     eval_dataset=cppe5[\"validation\"],\n...     processing_class=image_processor,\n...     data_collator=collate_fn,\n...     compute_metrics=eval_compute_metrics_fn,\n... )\n\n>>> trainer.train()\n
[3210/3210 26:07, Epoch 30/30] Epoch Training Loss Validation Loss Map Map 50 Map 75 Map Small Map Medium Map Large Mar 1 Mar 10 Mar 100 Mar Small Mar Medium Mar Large Map Coverall Mar 100 Coverall Map Face Shield Mar 100 Face Shield Map Gloves Mar 100 Gloves Map Goggles Mar 100 Goggles Map Mask Mar 100 Mask 1 No log 2.629903 0.008900 0.023200 0.006500 0.001300 0.002800 0.020500 0.021500 0.070400 0.101400 0.007600 0.106200 0.096100 0.036700 0.232000 0.000300 0.019000 0.003900 0.125400 0.000100 0.003100 0.003500 0.127600 2 No log 3.479864 0.014800 0.034600 0.010800 0.008600 0.011700 0.012500 0.041100 0.098700 0.130000 0.056000 0.062200 0.111900 0.053500 0.447300 0.010600 0.100000 0.000200 0.022800 0.000100 0.015400 0.009700 0.064400 3 No log 2.107622 0.041700 0.094000 0.034300 0.024100 0.026400 0.047400 0.091500 0.182800 0.225800 0.087200 0.199400 0.210600 0.150900 0.571200 0.017300 0.101300 0.007300 0.180400 0.002100 0.026200 0.031000 0.250200 4 No log 2.031242 0.055900 0.120600 0.046900 0.013800 0.038100 0.090300 0.105900 0.225600 0.266100 0.130200 0.228100 0.330000 0.191000 0.572100 0.010600 0.157000 0.014600 0.235300 0.001700 0.052300 0.061800 0.313800 5 3.889400 1.883433 0.089700 0.201800 0.067300 0.022800 0.065300 0.129500 0.136000 0.272200 0.303700 0.112900 0.312500 0.424600 0.300200 0.585100 0.032700 0.202500 0.031300 0.271000 0.008700 0.126200 0.075500 0.333800 6 3.889400 1.807503 0.118500 0.270900 0.090200 0.034900 0.076700 0.152500 0.146100 0.297800 0.325400 0.171700 0.283700 0.545900 0.396900 0.554500 0.043000 0.262000 0.054500 0.271900 0.020300 0.230800 0.077600 0.308000 7 3.889400 1.716169 0.143500 0.307700 0.123200 0.045800 0.097800 0.258300 0.165300 0.327700 0.352600 0.140900 0.336700 0.599400 0.442900 0.620700 0.069400 0.301300 0.081600 0.292000 0.011000 0.230800 0.112700 0.318200 8 3.889400 1.679014 0.153000 0.355800 0.127900 0.038700 0.115600 0.291600 0.176000 0.322500 0.349700 0.135600 0.326100 0.643700 0.431700 0.582900 0.069800 0.265800 0.088600 0.274600 0.028300 0.280000 0.146700 0.345300 9 3.889400 1.618239 0.172100 0.375300 0.137600 0.046100 0.141700 0.308500 0.194000 0.356200 0.386200 0.162400 0.359200 0.677700 0.469800 0.623900 0.102100 0.317700 0.099100 0.290200 0.029300 0.335400 0.160200 0.364000 10 1.599700 1.572512 0.179500 0.400400 0.147200 0.056500 0.141700 0.316700 0.213100 0.357600 0.381300 0.197900 0.344300 0.638500 0.466900 0.623900 0.101300 0.311400 0.104700 0.279500 0.051600 0.338500 0.173000 0.353300 11 1.599700 1.528889 0.192200 0.415000 0.160800 0.053700 0.150500 0.378000 0.211500 0.371700 0.397800 0.204900 0.374600 0.684800 0.491900 0.632400 0.131200 0.346800 0.122000 0.300900 0.038400 0.344600 0.177500 0.364400 12 1.599700 1.517532 0.198300 0.429800 0.159800 0.066400 0.162900 0.383300 0.220700 0.382100 0.405400 0.214800 0.383200 0.672900 0.469000 0.610400 0.167800 0.379700 0.119700 0.307100 0.038100 0.335400 0.196800 0.394200 13 1.599700 1.488849 0.209800 0.452300 0.172300 0.094900 0.171100 0.437800 0.222000 0.379800 0.411500 0.203800 0.397300 0.707500 0.470700 0.620700 0.186900 0.407600 0.124200 0.306700 0.059300 0.355400 0.207700 0.367100 14 1.599700 1.482210 0.228900 0.482600 0.187800 0.083600 0.191800 0.444100 0.225900 0.376900 0.407400 0.182500 0.384800 0.700600 0.512100 0.640100 0.175000 0.363300 0.144300 0.300000 0.083100 0.363100 0.229900 0.370700 15 1.326800 1.475198 0.216300 0.455600 0.174900 0.088500 0.183500 0.424400 0.226900 0.373400 0.404300 0.199200 0.396400 0.677800 0.496300 0.633800 0.166300 0.392400 0.128900 0.312900 0.085200 
0.312300 0.205000 0.370200 16 1.326800 1.459697 0.233200 0.504200 0.192200 0.096000 0.202000 0.430800 0.239100 0.382400 0.412600 0.219500 0.403100 0.670400 0.485200 0.625200 0.196500 0.410100 0.135700 0.299600 0.123100 0.356900 0.225300 0.371100 17 1.326800 1.407340 0.243400 0.511900 0.204500 0.121000 0.215700 0.468000 0.246200 0.394600 0.424200 0.225900 0.416100 0.705200 0.494900 0.638300 0.224900 0.430400 0.157200 0.317900 0.115700 0.369200 0.224200 0.365300 18 1.326800 1.419522 0.245100 0.521500 0.210000 0.116100 0.211500 0.489900 0.255400 0.391600 0.419700 0.198800 0.421200 0.701400 0.501800 0.634200 0.226700 0.410100 0.154400 0.321400 0.105900 0.352300 0.236700 0.380400 19 1.158600 1.398764 0.253600 0.519200 0.213600 0.135200 0.207700 0.491900 0.257300 0.397300 0.428000 0.241400 0.401800 0.703500 0.509700 0.631100 0.236700 0.441800 0.155900 0.330800 0.128100 0.352300 0.237500 0.384000 20 1.158600 1.390591 0.248800 0.520200 0.216600 0.127500 0.211400 0.471900 0.258300 0.407000 0.429100 0.240300 0.407600 0.708500 0.505800 0.623400 0.235500 0.431600 0.150000 0.325000 0.125700 0.375400 0.227200 0.390200 21 1.158600 1.360608 0.262700 0.544800 0.222100 0.134700 0.230000 0.487500 0.269500 0.413300 0.436300 0.236200 0.419100 0.709300 0.514100 0.637400 0.257200 0.450600 0.165100 0.338400 0.139400 0.372300 0.237700 0.382700 22 1.158600 1.368296 0.262800 0.542400 0.236400 0.137400 0.228100 0.498500 0.266500 0.409000 0.433000 0.239900 0.418500 0.697500 0.520500 0.641000 0.257500 0.455700 0.162600 0.334800 0.140200 0.353800 0.233200 0.379600 23 1.158600 1.368176 0.264800 0.541100 0.233100 0.138200 0.223900 0.498700 0.272300 0.407400 0.434400 0.233100 0.418300 0.702000 0.524400 0.642300 0.262300 0.444300 0.159700 0.335300 0.140500 0.366200 0.236900 0.384000 24 1.049700 1.355271 0.269700 0.549200 0.239100 0.134700 0.229900 0.519200 0.274800 0.412700 0.437600 0.245400 0.417200 0.711200 0.523200 0.644100 0.272100 0.440500 0.166700 0.341500 0.137700 0.373800 0.249000 0.388000 25 1.049700 1.355180 0.272500 0.547900 0.243800 0.149700 0.229900 0.523100 0.272500 0.415700 0.442200 0.256200 0.420200 0.705800 0.523900 0.639600 0.271700 0.451900 0.166300 0.346900 0.153700 0.383100 0.247000 0.389300 26 1.049700 1.349337 0.275600 0.556300 0.246400 0.146700 0.234800 0.516300 0.274200 0.418300 0.440900 0.248700 0.418900 0.705800 0.523200 0.636500 0.274700 0.440500 0.172400 0.349100 0.155600 0.384600 0.252300 0.393800 27 1.049700 1.350782 0.275200 0.548700 0.246800 0.147300 0.236400 0.527200 0.280100 0.416200 0.442600 0.253400 0.424000 0.710300 0.526600 0.640100 0.273200 0.445600 0.167000 0.346900 0.160100 0.387700 0.249200 0.392900 28 1.049700 1.346533 0.277000 0.552800 0.252900 0.147400 0.240000 0.527600 0.280900 0.420900 0.444100 0.255500 0.424500 0.711200 0.530200 0.646800 0.277400 0.441800 0.170900 0.346900 0.156600 0.389200 0.249600 0.396000 29 0.993700 1.346575 0.277100 0.554800 0.252900 0.148400 0.239700 0.523600 0.278400 0.420000 0.443300 0.256300 0.424000 0.705600 0.529600 0.647300 0.273900 0.439200 0.174300 0.348700 0.157600 0.386200 0.250100 0.395100 30 0.993700 1.346446 0.277400 0.554700 0.252700 0.147900 0.240800 0.523600 0.278800 0.420400 0.443300 0.256100 0.424200 0.705500 0.530100 0.646800 0.275600 0.440500 0.174500 0.348700 0.157300 0.386200 0.249200 0.394200

If you have set `push_to_hub` to `True` in the `training_args`, the training checkpoints are pushed to the Hugging Face Hub. Upon training completion, push the final model to the Hub as well by calling the [`~transformers.Trainer.push_to_hub`] method.

Python
>>> trainer.push_to_hub()\n
## Evaluate

Python
>>> from pprint import pprint\n\n>>> metrics = trainer.evaluate(eval_dataset=cppe5[\"test\"], metric_key_prefix=\"test\")\n>>> pprint(metrics)\n{'epoch': 30.0,\n  'test_loss': 1.0877351760864258,\n  'test_map': 0.4116,\n  'test_map_50': 0.741,\n  'test_map_75': 0.3663,\n  'test_map_Coverall': 0.5937,\n  'test_map_Face_Shield': 0.5863,\n  'test_map_Gloves': 0.3416,\n  'test_map_Goggles': 0.1468,\n  'test_map_Mask': 0.3894,\n  'test_map_large': 0.5637,\n  'test_map_medium': 0.3257,\n  'test_map_small': 0.3589,\n  'test_mar_1': 0.323,\n  'test_mar_10': 0.5237,\n  'test_mar_100': 0.5587,\n  'test_mar_100_Coverall': 0.6756,\n  'test_mar_100_Face_Shield': 0.7294,\n  'test_mar_100_Gloves': 0.4721,\n  'test_mar_100_Goggles': 0.4125,\n  'test_mar_100_Mask': 0.5038,\n  'test_mar_large': 0.7283,\n  'test_mar_medium': 0.4901,\n  'test_mar_small': 0.4469,\n  'test_runtime': 1.6526,\n  'test_samples_per_second': 17.548,\n  'test_steps_per_second': 2.42}\n
These results can be further improved by adjusting the hyperparameters in [`TrainingArguments`]. Give it a go!

## Inference

Now that you have finetuned a model, evaluated it, and uploaded it to the Hugging Face Hub, you can use it for inference.

Python
>>> import torch\n>>> import requests\n\n>>> from PIL import Image, ImageDraw\n>>> from transformers import AutoImageProcessor, AutoModelForObjectDetection\n\n>>> url = \"https://images.pexels.com/photos/8413299/pexels-photo-8413299.jpeg?auto=compress&cs=tinysrgb&w=630&h=375&dpr=2\"\n>>> image = Image.open(requests.get(url, stream=True).raw)\n
Load the model and image processor from the Hugging Face Hub (skip this step if you want to use the model already trained in this session):

Python
>>> from accelerate.test_utils.testing import get_backend\n# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)\n>>> device, _, _ = get_backend()\n>>> model_repo = \"qubvel-hf/detr_finetuned_cppe5\"\n\n>>> image_processor = AutoImageProcessor.from_pretrained(model_repo)\n>>> model = AutoModelForObjectDetection.from_pretrained(model_repo)\n>>> model = model.to(device)\n
And detect bounding boxes:

Python
>>> with torch.no_grad():\n...     inputs = image_processor(images=[image], return_tensors=\"pt\")\n...     outputs = model(**inputs.to(device))\n...     target_sizes = torch.tensor([[image.size[1], image.size[0]]])\n...     results = image_processor.post_process_object_detection(outputs, threshold=0.3, target_sizes=target_sizes)[0]\n\n>>> for score, label, box in zip(results[\"scores\"], results[\"labels\"], results[\"boxes\"]):\n...     box = [round(i, 2) for i in box.tolist()]\n...     print(\n...         f\"Detected {model.config.id2label[label.item()]} with confidence \"\n...         f\"{round(score.item(), 3)} at location {box}\"\n...     )\nDetected Gloves with confidence 0.683 at location [244.58, 124.33, 300.35, 185.13]\nDetected Mask with confidence 0.517 at location [143.73, 64.58, 219.57, 125.89]\nDetected Gloves with confidence 0.425 at location [179.15, 155.57, 262.4, 226.35]\nDetected Coverall with confidence 0.407 at location [307.13, -1.18, 477.82, 318.06]\nDetected Coverall with confidence 0.391 at location [68.61, 126.66, 309.03, 318.89]\n
Let's plot the result:

Python
>>> draw = ImageDraw.Draw(image)\n\n>>> for score, label, box in zip(results[\"scores\"], results[\"labels\"], results[\"boxes\"]):\n...     box = [round(i, 2) for i in box.tolist()]\n...     x, y, x2, y2 = tuple(box)\n...     draw.rectangle((x, y, x2, y2), outline=\"red\", width=1)\n...     draw.text((x, y), model.config.id2label[label.item()], fill=\"white\")\n\n>>> image\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/","title":"How to Train RT-DETR on Custom Dataset with Roboflow, HuggingFace and Albumentations","text":""},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#how-to-train-rt-detr-on-custom-dataset","title":"How to Train RT-DETR on Custom Dataset","text":"

RT-DETR, short for \"Real-Time DEtection TRansformer\", is a computer vision model developed by Peking University and Baidu. In their paper, \"DETRs Beat YOLOs on Real-time Object Detection\" the authors claim that RT-DETR can outperform YOLO models in object detection, both in terms of speed and accuracy. The model has been released under the Apache 2.0 license, making it a great option, especially for enterprise projects.

Recently, RT-DETR was added to the transformers library, significantly simplifying its fine-tuning process. In this tutorial, we will show you how to train RT-DETR on a custom dataset.

"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#setup","title":"Setup","text":""},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#configure-your-api-keys","title":"Configure your API keys","text":"

To fine-tune RT-DETR, you need to provide your HuggingFace Token and Roboflow API key. Follow these steps (a short sketch of how these secrets are read in code follows the list):

  • Open your HuggingFace Settings page. Click Access Tokens, then New Token, to generate a new token.
  • Go to your Roboflow Settings page. Click Copy. This will place your private key in the clipboard.
  • In Colab, go to the left pane and click on Secrets (\ud83d\udd11).
    • Store HuggingFace Access Token under the name HF_TOKEN.
    • Store Roboflow API Key under the name ROBOFLOW_API_KEY.
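
For reference, here is a minimal sketch (added for illustration, assuming the secret names above) of how these stored secrets are typically read and used for authentication in the notebook:

Python
from google.colab import userdata\nfrom huggingface_hub import login\n\n# Read the secrets stored in the Colab Secrets pane.\nHF_TOKEN = userdata.get(\"HF_TOKEN\")\nROBOFLOW_API_KEY = userdata.get(\"ROBOFLOW_API_KEY\")\n\n# Authenticate with the Hugging Face Hub so the fine-tuned model can be pushed later.\nlogin(token=HF_TOKEN)\n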
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#select-the-runtime","title":"Select the runtime","text":"

Let's make sure that we have access to a GPU. We can use the nvidia-smi command to do that. In case of any problems, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to L4 GPU, and then click Save.

Python
!nvidia-smi\n
Thu Jul 11 09:20:53 2024       \n+---------------------------------------------------------------------------------------+\n| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |\n|-----------------------------------------+----------------------+----------------------+\n| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |\n| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |\n|                                         |                      |               MIG M. |\n|=========================================+======================+======================|\n|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |\n| N/A   65C    P8              11W /  70W |      0MiB / 15360MiB |      0%      Default |\n|                                         |                      |                  N/A |\n+-----------------------------------------+----------------------+----------------------+\n\n+---------------------------------------------------------------------------------------+\n| Processes:                                                                            |\n|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |\n|        ID   ID                                                             Usage      |\n|=======================================================================================|\n|  No running processes found                                                           |\n+---------------------------------------------------------------------------------------+\n

NOTE: To make it easier for us to manage datasets, images, and models, we create a HOME constant.

Python
import os\nHOME = os.getcwd()\nprint(\"HOME:\", HOME)\n
HOME: /content\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#install-dependencies","title":"Install dependencies","text":"Python
!pip install -q git+https://github.com/huggingface/transformers.git\n!pip install -q git+https://github.com/roboflow/supervision.git\n!pip install -q accelerate\n!pip install -q roboflow\n!pip install -q torchmetrics\n!pip install -q \"albumentations>=1.4.5\"\n
  Installing build dependencies ... done\n  Getting requirements to build wheel ... done\n  Preparing metadata (pyproject.toml) ... done\n  Building wheel for transformers (pyproject.toml) ... done\n  Installing build dependencies ... done\n  Getting requirements to build wheel ... done\n  Preparing metadata (pyproject.toml) ... done\n  Building wheel for supervision (pyproject.toml) ... done\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#imports","title":"Imports","text":"Python
import torch\nimport requests\n\nimport numpy as np\nimport supervision as sv\nimport albumentations as A\n\nfrom PIL import Image\nfrom pprint import pprint\nfrom roboflow import Roboflow\nfrom dataclasses import dataclass, replace\nfrom google.colab import userdata\nfrom torch.utils.data import Dataset\nfrom transformers import (\n    AutoImageProcessor,\n    AutoModelForObjectDetection,\n    TrainingArguments,\n    Trainer\n)\nfrom torchmetrics.detection.mean_ap import MeanAveragePrecision\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#inference-with-pre-trained-rt-detr-model","title":"Inference with pre-trained RT-DETR model","text":"Python
# @title Load model\n\nCHECKPOINT = \"PekingU/rtdetr_r50vd_coco_o365\"\nDEVICE = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\nmodel = AutoModelForObjectDetection.from_pretrained(CHECKPOINT).to(DEVICE)\nprocessor = AutoImageProcessor.from_pretrained(CHECKPOINT)\n
config.json:   0%|          | 0.00/5.11k [00:00<?, ?B/s]\n\n\n\nmodel.safetensors:   0%|          | 0.00/172M [00:00<?, ?B/s]\n\n\n\npreprocessor_config.json:   0%|          | 0.00/841 [00:00<?, ?B/s]\n
Python
# @title Run inference\n\nURL = \"https://media.roboflow.com/notebooks/examples/dog.jpeg\"\n\nimage = Image.open(requests.get(URL, stream=True).raw)\ninputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\nwith torch.no_grad():\n    outputs = model(**inputs)\n\nw, h = image.size\nresults = processor.post_process_object_detection(\n    outputs, target_sizes=[(h, w)], threshold=0.3)\n
Python
# @title Display result with NMS\n\ndetections = sv.Detections.from_transformers(results[0])\nlabels = [\n    model.config.id2label[class_id]\n    for class_id\n    in detections.class_id\n]\n\nannotated_image = image.copy()\nannotated_image = sv.BoundingBoxAnnotator().annotate(annotated_image, detections)\nannotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels=labels)\nannotated_image.thumbnail((600, 600))\nannotated_image\n
Python
# @title Display result with NMS\n\ndetections = sv.Detections.from_transformers(results[0]).with_nms(threshold=0.1)\nlabels = [\n    model.config.id2label[class_id]\n    for class_id\n    in detections.class_id\n]\n\nannotated_image = image.copy()\nannotated_image = sv.BoundingBoxAnnotator().annotate(annotated_image, detections)\nannotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels=labels)\nannotated_image.thumbnail((600, 600))\nannotated_image\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#fine-tune-rt-detr-on-custom-dataset","title":"Fine-tune RT-DETR on custom dataset","text":"Python
# @title Download dataset from Roboflow Universe\n\nROBOFLOW_API_KEY = userdata.get('ROBOFLOW_API_KEY')\nrf = Roboflow(api_key=ROBOFLOW_API_KEY)\n\nproject = rf.workspace(\"roboflow-jvuqo\").project(\"poker-cards-fmjio\")\nversion = project.version(4)\ndataset = version.download(\"coco\")\n
loading Roboflow workspace...\nloading Roboflow project...\n\n\nDownloading Dataset Version Zip in poker-cards-4 to coco:: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 39123/39123 [00:01<00:00, 27288.54it/s]\n\n\n\n\n\nExtracting Dataset Version Zip to poker-cards-4 in coco:: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 907/907 [00:00<00:00, 2984.59it/s]\n
Python
ds_train = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/train\",\n    annotations_path=f\"{dataset.location}/train/_annotations.coco.json\",\n)\nds_valid = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/valid\",\n    annotations_path=f\"{dataset.location}/valid/_annotations.coco.json\",\n)\nds_test = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/test\",\n    annotations_path=f\"{dataset.location}/test/_annotations.coco.json\",\n)\n\nprint(f\"Number of training images: {len(ds_train)}\")\nprint(f\"Number of validation images: {len(ds_valid)}\")\nprint(f\"Number of test images: {len(ds_test)}\")\n
Number of training images: 811\nNumber of validation images: 44\nNumber of test images: 44\n
Python
# @title Display dataset sample\n\nGRID_SIZE = 5\n\ndef annotate(image, annotations, classes):\n    labels = [\n        classes[class_id]\n        for class_id\n        in annotations.class_id\n    ]\n\n    bounding_box_annotator = sv.BoundingBoxAnnotator()\n    label_annotator = sv.LabelAnnotator(text_scale=1, text_thickness=2)\n\n    annotated_image = image.copy()\n    annotated_image = bounding_box_annotator.annotate(annotated_image, annotations)\n    annotated_image = label_annotator.annotate(annotated_image, annotations, labels=labels)\n    return annotated_image\n\nannotated_images = []\nfor i in range(GRID_SIZE * GRID_SIZE):\n    _, image, annotations = ds_train[i]\n    annotated_image = annotate(image, annotations, ds_train.classes)\n    annotated_images.append(annotated_image)\n\ngrid = sv.create_tiles(\n    annotated_images,\n    grid_size=(GRID_SIZE, GRID_SIZE),\n    single_tile_size=(400, 400),\n    tile_padding_color=sv.Color.WHITE,\n    tile_margin_color=sv.Color.WHITE\n)\nsv.plot_image(grid, size=(10, 10))\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#preprocess-the-data","title":"Preprocess the data","text":"

To finetune a model, you must preprocess the data you plan to use to match precisely the approach used for the pre-trained model. AutoImageProcessor takes care of processing image data to create pixel_values, pixel_mask, and labels that a DETR model can train with. The image processor has some attributes that you won't have to worry about:

  • image_mean = [0.485, 0.456, 0.406]
  • image_std = [0.229, 0.224, 0.225]

These are the mean and standard deviation used to normalize images during the model pre-training. These values are crucial to replicate when doing inference or finetuning a pre-trained image model.

Instantiate the image processor from the same checkpoint as the model you want to finetune.

Python
IMAGE_SIZE = 480\n\nprocessor = AutoImageProcessor.from_pretrained(\n    CHECKPOINT,\n    do_resize=True,\n    size={\"width\": IMAGE_SIZE, \"height\": IMAGE_SIZE},\n)\n

Before passing the images to the processor, apply two preprocessing transformations to the dataset:

  • Augmenting images
  • Reformatting annotations to meet RT-DETR expectations

First, to make sure the model does not overfit on the training data, you can apply image augmentation with any data augmentation library. Here we use Albumentations. This library ensures that transformations affect the image and update the bounding boxes accordingly.

Python
train_augmentation_and_transform = A.Compose(\n    [\n        A.Perspective(p=0.1),\n        A.HorizontalFlip(p=0.5),\n        A.RandomBrightnessContrast(p=0.5),\n        A.HueSaturationValue(p=0.1),\n    ],\n    bbox_params=A.BboxParams(\n        format=\"pascal_voc\",\n        label_fields=[\"category\"],\n        clip=True,\n        min_area=25\n    ),\n)\n\nvalid_transform = A.Compose(\n    [A.NoOp()],\n    bbox_params=A.BboxParams(\n        format=\"pascal_voc\",\n        label_fields=[\"category\"],\n        clip=True,\n        min_area=1\n    ),\n)\n
Python
# @title Visualize some augmented images\n\nIMAGE_COUNT = 5\n\nfor i in range(IMAGE_COUNT):\n    _, image, annotations = ds_train[i]\n\n    output = train_augmentation_and_transform(\n        image=image,\n        bboxes=annotations.xyxy,\n        category=annotations.class_id\n    )\n\n    augmented_image = output[\"image\"]\n    augmented_annotations = replace(\n        annotations,\n        xyxy=np.array(output[\"bboxes\"]),\n        class_id=np.array(output[\"category\"])\n    )\n\n    annotated_images = [\n        annotate(image, annotations, ds_train.classes),\n        annotate(augmented_image, augmented_annotations, ds_train.classes)\n    ]\n    grid = sv.create_tiles(\n        annotated_images,\n        titles=['original', 'augmented'],\n        titles_scale=0.5,\n        single_tile_size=(400, 400),\n        tile_padding_color=sv.Color.WHITE,\n        tile_margin_color=sv.Color.WHITE\n    )\n    sv.plot_image(grid, size=(6, 6))\n

The processor expects the annotations to be in the following format: {'image_id': int, 'annotations': List[Dict]}, where each dictionary is a COCO object annotation. Let's add a function to reformat annotations for a single example:

Python
class PyTorchDetectionDataset(Dataset):\n    def __init__(self, dataset: sv.DetectionDataset, processor, transform: A.Compose = None):\n        self.dataset = dataset\n        self.processor = processor\n        self.transform = transform\n\n    @staticmethod\n    def annotations_as_coco(image_id, categories, boxes):\n        annotations = []\n        for category, bbox in zip(categories, boxes):\n            x1, y1, x2, y2 = bbox\n            formatted_annotation = {\n                \"image_id\": image_id,\n                \"category_id\": category,\n                \"bbox\": [x1, y1, x2 - x1, y2 - y1],\n                \"iscrowd\": 0,\n                \"area\": (x2 - x1) * (y2 - y1),\n            }\n            annotations.append(formatted_annotation)\n\n        return {\n            \"image_id\": image_id,\n            \"annotations\": annotations,\n        }\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx):\n        _, image, annotations = self.dataset[idx]\n\n        # Convert image to RGB numpy array\n        image = image[:, :, ::-1]\n        boxes = annotations.xyxy\n        categories = annotations.class_id\n\n        if self.transform:\n            transformed = self.transform(\n                image=image,\n                bboxes=boxes,\n                category=categories\n            )\n            image = transformed[\"image\"]\n            boxes = transformed[\"bboxes\"]\n            categories = transformed[\"category\"]\n\n\n        formatted_annotations = self.annotations_as_coco(\n            image_id=idx, categories=categories, boxes=boxes)\n        result = self.processor(\n            images=image, annotations=formatted_annotations, return_tensors=\"pt\")\n\n        # Image processor expands batch dimension, lets squeeze it\n        result = {k: v[0] for k, v in result.items()}\n\n        return result\n

Now you can combine the image and annotation transformations to use on a batch of examples:

Python
pytorch_dataset_train = PyTorchDetectionDataset(\n    ds_train, processor, transform=train_augmentation_and_transform)\npytorch_dataset_valid = PyTorchDetectionDataset(\n    ds_valid, processor, transform=valid_transform)\npytorch_dataset_test = PyTorchDetectionDataset(\n    ds_test, processor, transform=valid_transform)\n\npytorch_dataset_train[15]\n
{'pixel_values': tensor([[[0.0745, 0.0745, 0.0745,  ..., 0.2431, 0.2471, 0.2471],\n          [0.0745, 0.0745, 0.0745,  ..., 0.2510, 0.2549, 0.2549],\n          [0.0667, 0.0706, 0.0706,  ..., 0.2588, 0.2588, 0.2588],\n          ...,\n          [0.0118, 0.0118, 0.0118,  ..., 0.0510, 0.0549, 0.0510],\n          [0.0157, 0.0196, 0.0235,  ..., 0.0549, 0.0627, 0.0549],\n          [0.0235, 0.0275, 0.0314,  ..., 0.0549, 0.0627, 0.0549]],\n\n         [[0.0549, 0.0549, 0.0549,  ..., 0.3137, 0.3176, 0.3176],\n          [0.0549, 0.0549, 0.0549,  ..., 0.3216, 0.3255, 0.3255],\n          [0.0471, 0.0510, 0.0510,  ..., 0.3294, 0.3294, 0.3294],\n          ...,\n          [0.0000, 0.0000, 0.0000,  ..., 0.0353, 0.0392, 0.0353],\n          [0.0000, 0.0000, 0.0039,  ..., 0.0392, 0.0471, 0.0392],\n          [0.0000, 0.0039, 0.0078,  ..., 0.0392, 0.0471, 0.0392]],\n\n         [[0.0431, 0.0431, 0.0431,  ..., 0.3922, 0.3961, 0.3961],\n          [0.0431, 0.0431, 0.0431,  ..., 0.4000, 0.4039, 0.4039],\n          [0.0353, 0.0392, 0.0392,  ..., 0.4078, 0.4078, 0.4078],\n          ...,\n          [0.0000, 0.0000, 0.0000,  ..., 0.0314, 0.0353, 0.0314],\n          [0.0000, 0.0000, 0.0039,  ..., 0.0353, 0.0431, 0.0353],\n          [0.0000, 0.0039, 0.0078,  ..., 0.0353, 0.0431, 0.0353]]]),\n 'labels': {'size': tensor([480, 480]), 'image_id': tensor([15]), 'class_labels': tensor([36,  4, 44, 52, 48]), 'boxes': tensor([[0.7891, 0.4437, 0.2094, 0.3562],\n         [0.3984, 0.6484, 0.3187, 0.3906],\n         [0.5891, 0.4070, 0.2219, 0.3859],\n         [0.3484, 0.2812, 0.2625, 0.4094],\n         [0.1602, 0.5023, 0.2672, 0.4109]]), 'area': tensor([17185.5000, 28687.5000, 19729.1250, 24759.0000, 25297.3125]), 'iscrowd': tensor([0, 0, 0, 0, 0]), 'orig_size': tensor([640, 640])}}\n

You have successfully augmented the images and prepared their annotations. In the final step, create a custom collate_fn to batch images together.

Python
def collate_fn(batch):\n    data = {}\n    data[\"pixel_values\"] = torch.stack([x[\"pixel_values\"] for x in batch])\n    data[\"labels\"] = [x[\"labels\"] for x in batch]\n    return data\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#preparing-function-to-compute-map","title":"Preparing function to compute mAP","text":"Python
id2label = {id: label for id, label in enumerate(ds_train.classes)}\nlabel2id = {label: id for id, label in enumerate(ds_train.classes)}\n\n\n@dataclass\nclass ModelOutput:\n    logits: torch.Tensor\n    pred_boxes: torch.Tensor\n\n\nclass MAPEvaluator:\n\n    def __init__(self, image_processor, threshold=0.00, id2label=None):\n        self.image_processor = image_processor\n        self.threshold = threshold\n        self.id2label = id2label\n\n    def collect_image_sizes(self, targets):\n        \"\"\"Collect image sizes across the dataset as list of tensors with shape [batch_size, 2].\"\"\"\n        image_sizes = []\n        for batch in targets:\n            batch_image_sizes = torch.tensor(np.array([x[\"size\"] for x in batch]))\n            image_sizes.append(batch_image_sizes)\n        return image_sizes\n\n    def collect_targets(self, targets, image_sizes):\n        post_processed_targets = []\n        for target_batch, image_size_batch in zip(targets, image_sizes):\n            for target, (height, width) in zip(target_batch, image_size_batch):\n                boxes = target[\"boxes\"]\n                boxes = sv.xcycwh_to_xyxy(boxes)\n                boxes = boxes * np.array([width, height, width, height])\n                boxes = torch.tensor(boxes)\n                labels = torch.tensor(target[\"class_labels\"])\n                post_processed_targets.append({\"boxes\": boxes, \"labels\": labels})\n        return post_processed_targets\n\n    def collect_predictions(self, predictions, image_sizes):\n        post_processed_predictions = []\n        for batch, target_sizes in zip(predictions, image_sizes):\n            batch_logits, batch_boxes = batch[1], batch[2]\n            output = ModelOutput(logits=torch.tensor(batch_logits), pred_boxes=torch.tensor(batch_boxes))\n            post_processed_output = self.image_processor.post_process_object_detection(\n                output, threshold=self.threshold, target_sizes=target_sizes\n            )\n            post_processed_predictions.extend(post_processed_output)\n        return post_processed_predictions\n\n    @torch.no_grad()\n    def __call__(self, evaluation_results):\n\n        predictions, targets = evaluation_results.predictions, evaluation_results.label_ids\n\n        image_sizes = self.collect_image_sizes(targets)\n        post_processed_targets = self.collect_targets(targets, image_sizes)\n        post_processed_predictions = self.collect_predictions(predictions, image_sizes)\n\n        evaluator = MeanAveragePrecision(box_format=\"xyxy\", class_metrics=True)\n        evaluator.warn_on_many_detections = False\n        evaluator.update(post_processed_predictions, post_processed_targets)\n\n        metrics = evaluator.compute()\n\n        # Replace list of per class metrics with separate metric for each class\n        classes = metrics.pop(\"classes\")\n        map_per_class = metrics.pop(\"map_per_class\")\n        mar_100_per_class = metrics.pop(\"mar_100_per_class\")\n        for class_id, class_map, class_mar in zip(classes, map_per_class, mar_100_per_class):\n            class_name = id2label[class_id.item()] if id2label is not None else class_id.item()\n            metrics[f\"map_{class_name}\"] = class_map\n            metrics[f\"mar_100_{class_name}\"] = class_mar\n\n        metrics = {k: round(v.item(), 4) for k, v in metrics.items()}\n\n        return metrics\n\neval_compute_metrics_fn = MAPEvaluator(image_processor=processor, threshold=0.01, id2label=id2label)\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#training-the-detection-model","title":"Training the detection model","text":"

You have done most of the heavy lifting in the previous sections, so now you are ready to train your model! The images in this dataset are still quite large, even after resizing. This means that finetuning this model will require at least one GPU.

Training involves the following steps:

  • Load the model with AutoModelForObjectDetection using the same checkpoint as in the preprocessing.
  • Define your training hyperparameters in TrainingArguments.
  • Pass the training arguments to Trainer along with the model, dataset, image processor, and data collator.
  • Call train() to finetune your model.

When loading the model from the same checkpoint that you used for the preprocessing, remember to pass the label2id and id2label maps that you created earlier from the dataset's metadata. Additionally, we specify ignore_mismatched_sizes=True to replace the existing classification head with a new one.

Python
model = AutoModelForObjectDetection.from_pretrained(\n    CHECKPOINT,\n    id2label=id2label,\n    label2id=label2id,\n    anchor_image_size=None,\n    ignore_mismatched_sizes=True,\n)\n
Some weights of RTDetrForObjectDetection were not initialized from the model checkpoint at PekingU/rtdetr_r50vd_coco_o365 and are newly initialized because the shapes did not match:\n- model.decoder.class_embed.0.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.0.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.1.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.1.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.2.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.2.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.3.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.3.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.4.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.4.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.5.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.5.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.denoising_class_embed.weight: found shape torch.Size([81, 256]) in the checkpoint and torch.Size([54, 256]) in the model instantiated\n- model.enc_score_head.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.enc_score_head.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\nYou should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n

In the TrainingArguments, use output_dir to specify where to save your model, then configure hyperparameters as you see fit. With num_train_epochs=10, training takes about 15 minutes on a Google Colab T4 GPU; increase the number of epochs to get better results.

Important notes:

  • Do not remove unused columns because this will drop the image column. Without the image column, you can't create pixel_values. For this reason, set remove_unused_columns to False.
  • Set eval_do_concat_batches=False to get proper evaluation results. Images have different numbers of target boxes; if batches were concatenated, we would not be able to determine which boxes belong to which image.
Python
training_args = TrainingArguments(\n    output_dir=f\"{dataset.name.replace(' ', '-')}-finetune\",\n    num_train_epochs=20,\n    max_grad_norm=0.1,\n    learning_rate=5e-5,\n    warmup_steps=300,\n    per_device_train_batch_size=16,\n    dataloader_num_workers=2,\n    metric_for_best_model=\"eval_map\",\n    greater_is_better=True,\n    load_best_model_at_end=True,\n    eval_strategy=\"epoch\",\n    save_strategy=\"epoch\",\n    save_total_limit=2,\n    remove_unused_columns=False,\n    eval_do_concat_batches=False,\n)\n

Finally, bring everything together, and call train():

Python
trainer = Trainer(\n    model=model,\n    args=training_args,\n    train_dataset=pytorch_dataset_train,\n    eval_dataset=pytorch_dataset_valid,\n    tokenizer=processor,\n    data_collator=collate_fn,\n    compute_metrics=eval_compute_metrics_fn,\n)\n\ntrainer.train()\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#evaluate","title":"Evaluate","text":"Python
# @title Collect predictions\n\ntargets = []\npredictions = []\n\nfor i in range(len(ds_test)):\n    path, source_image, annotations = ds_test[i]\n\n    image = Image.open(path)\n    inputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\n    with torch.no_grad():\n        outputs = model(**inputs)\n\n    w, h = image.size\n    results = processor.post_process_object_detection(\n        outputs, target_sizes=[(h, w)], threshold=0.3)\n\n    detections = sv.Detections.from_transformers(results[0])\n\n    targets.append(annotations)\n    predictions.append(detections)\n
Python
# @title Calculate mAP\nmean_average_precision = sv.MeanAveragePrecision.from_detections(\n    predictions=predictions,\n    targets=targets,\n)\n\nprint(f\"map50_95: {mean_average_precision.map50_95:.2f}\")\nprint(f\"map50: {mean_average_precision.map50:.2f}\")\nprint(f\"map75: {mean_average_precision.map75:.2f}\")\n
map50_95: 0.89\nmap50: 0.94\nmap75: 0.94\n
Python
# @title Calculate Confusion Matrix\nconfusion_matrix = sv.ConfusionMatrix.from_detections(\n    predictions=predictions,\n    targets=targets,\n    classes=ds_test.classes\n)\n\n_ = confusion_matrix.plot()\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#save-fine-tuned-model-on-hard-drive","title":"Save fine-tuned model on hard drive","text":"Python
model.save_pretrained(\"/content/rt-detr/\")\nprocessor.save_pretrained(\"/content/rt-detr/\")\n
['/content/rt-detr/preprocessor_config.json']\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#inference-with-fine-tuned-rt-detr-model","title":"Inference with fine-tuned RT-DETR model","text":"Python
IMAGE_COUNT = 5\n\nfor i in range(IMAGE_COUNT):\n    path, source_image, annotations = ds_test[i]\n\n    image = Image.open(path)\n    inputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\n    with torch.no_grad():\n        outputs = model(**inputs)\n\n    w, h = image.size\n    results = processor.post_process_object_detection(\n        outputs, target_sizes=[(h, w)], threshold=0.3)\n\n    detections = sv.Detections.from_transformers(results[0]).with_nms(threshold=0.1)\n\n    annotated_images = [\n        annotate(source_image, annotations, ds_train.classes),\n        annotate(source_image, detections, ds_train.classes)\n    ]\n    grid = sv.create_tiles(\n        annotated_images,\n        titles=['ground truth', 'prediction'],\n        titles_scale=0.5,\n        single_tile_size=(400, 400),\n        tile_padding_color=sv.Color.WHITE,\n        tile_margin_color=sv.Color.WHITE\n    )\n    sv.plot_image(grid, size=(6, 6))\n
"},{"location":"introduction/image_augmentation/","title":"What is image augmentation and how it can improve the performance of deep neural networks","text":"

Deep neural networks require a lot of training data to obtain good results and prevent overfitting. However, it is often very difficult to get enough training samples. Multiple reasons could make it very hard or even impossible to gather enough data:

  • To make a training dataset, you need to obtain images and then label them. For example, you need to assign correct class labels if you have an image classification task. For an object detection task, you need to draw bounding boxes around objects. For a semantic segmentation task, you need to assign a correct class to each input image pixel. This process requires manual labor, and sometimes it could be very costly to label the training data. For example, to correctly label medical images, you need expensive domain experts.

  • Sometimes even collecting training images could be hard. There are many legal restrictions for working with healthcare data, and obtaining it requires a lot of effort. Sometimes getting the training images is more feasible, but it will cost a lot of money. For example, to get satellite images, you need to pay a satellite operator to take those photos. To get images for road scene recognition, you need an operator that will drive a car and collect the required data.

"},{"location":"introduction/image_augmentation/#image-augmentation-to-the-rescue","title":"Image augmentation to the rescue","text":"

Image augmentation is a process of creating new training examples from the existing ones. To make a new sample, you slightly change the original image. For instance, you could make a new image a little brighter; you could cut a piece from the original image; you could make a new image by mirroring the original one, etc.

Here are some examples of transformations of the original image that will create a new training sample.

By applying those transformations to the original training dataset, you could create an almost infinite amount of new training samples.
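
For illustration, here is a minimal sketch of such a pipeline; the random image array and the three chosen transforms are placeholders, not a recommended recipe:

Python
import albumentations as A\nimport numpy as np\n\n# A placeholder image; in practice you would load a real photo as a NumPy array.\nimage = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)\n\ntransform = A.Compose([\n    A.RandomBrightnessContrast(p=0.5),    # make the image a little brighter or darker\n    A.RandomCrop(height=224, width=224),  # cut a piece from the original image\n    A.HorizontalFlip(p=0.5),              # mirror the original image\n])\n\nnew_training_sample = transform(image=image)[\"image\"]\n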

"},{"location":"introduction/image_augmentation/#how-much-does-image-augmentation-improves-the-quality-and-performance-of-deep-neural-networks","title":"How much does image augmentation improves the quality and performance of deep neural networks","text":"

Basic augmentation techniques were used in almost all papers that describe state-of-the-art models for image recognition.

AlexNet was the first model to demonstrate the exceptional capabilities of deep neural networks for image recognition. For training, the authors used a set of basic image augmentation techniques. They resized the original images to a fixed size of 256 by 256 pixels, and then they cropped patches of size 224 by 224 pixels, as well as their horizontal reflections, from those resized images. They also altered the intensities of the RGB channels in the images.

Successive state-of-the-art models such as Inception, ResNet, and EfficientNet also used image augmentation techniques for training.

In 2018 Google published a paper about AutoAugment - an algorithm that automatically discovers the best set of augmentations for the dataset. They showed that a custom set of augmentations improves the performance of the model.

Here is a comparison between a model that used only the base set of augmentations and a model that used a specific set of augmentations discovered by AutoAugment. The table shows Top-1 accuracy (%) on the ImageNet validation set; higher is better.

Model                 Base augmentations   AutoAugment augmentations
ResNet-50             76.3                 77.6
ResNet-200            78.5                 80.0
AmoebaNet-B (6,190)   82.2                 82.8
AmoebaNet-C (6,228)   83.1                 83.5

The table demonstrates that a diverse set of image augmentations improves the performance of neural networks compared to a base set with only a few most popular transformation techniques.

Augmentations help to fight overfitting and improve the performance of deep neural networks for computer vision tasks such as classification, segmentation, and object detection. The best part is that image augmentation libraries such as Albumentations make it possible to add image augmentations to any computer vision pipeline with minimal effort.

"},{"location":"introduction/why_albumentations/","title":"Why Albumentations","text":""},{"location":"introduction/why_albumentations/#a-single-interface-to-work-with-images-masks-bounding-boxes-and-key-points","title":"A single interface to work with images, masks, bounding boxes, and key points","text":"

Albumentations provides a single interface to work with different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, pose estimation, etc.

"},{"location":"introduction/why_albumentations/#battle-tested","title":"Battle-tested","text":"

The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.

"},{"location":"introduction/why_albumentations/#high-performance","title":"High performance","text":"

Albumentations is optimized for maximum speed and performance. Under the hood, the library uses highly optimized functions from OpenCV and NumPy for data processing. We have a regularly updated benchmark that compares the speed of popular image augmentation libraries for the most common image transformations. Albumentations demonstrates the best performance in most cases.

"},{"location":"introduction/why_albumentations/#diverse-set-of-supported-augmentations","title":"Diverse set of supported augmentations","text":"

Albumentations supports more than 60 different image augmentations.

"},{"location":"introduction/why_albumentations/#extensibility","title":"Extensibility","text":"

Albumentations allows you to easily add new augmentations and use them in computer vision pipelines through a single interface, along with the built-in transformations.
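
As a small illustrative sketch (one possible way, not the only one), a custom function can be plugged into a pipeline through A.Lambda; the function below is hypothetical:

Python
import albumentations as A\n\ndef invert_red_channel(image, **kwargs):\n    # Hypothetical custom operation: invert only the red channel of an RGB uint8 image.\n    image = image.copy()\n    image[..., 0] = 255 - image[..., 0]\n    return image\n\ntransform = A.Compose([\n    A.Lambda(image=invert_red_channel, p=1.0),  # custom augmentation\n    A.HorizontalFlip(p=0.5),                    # built-in augmentation\n])\n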

"},{"location":"introduction/why_albumentations/#rigorous-testing","title":"Rigorous testing","text":"

Bugs in the augmentation pipeline could silently corrupt the input data. They can easily go unnoticed, but the performance of the models trained with incorrect data will degrade. Albumentations has an extensive test suite that helps to discover bugs during development.

"},{"location":"introduction/why_albumentations/#it-is-open-source-and-mit-licensed","title":"It is open source and MIT licensed","text":"

You can find the source code on GitHub.

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/","title":"Why you need a dedicated library for image augmentation","text":"

At first glance, image augmentations look very simple; you apply basic transformations to an image: mirroring, cropping, changing brightness and contrast, etc.

There are a lot of libraries that can perform such image transformations. Here is an example of how you could use Pillow, a popular image processing library for Python, to make simple augmentations.

Python
from PIL import Image, ImageEnhance\n\nimage = Image.open(\"parrot.jpg\")\n\nmirrored_image = image.transpose(Image.FLIP_LEFT_RIGHT)\n\nrotated_image = image.rotate(45)\n\nbrightness_enhancer = ImageEnhance.Brightness(image)\nbrighter_image = brightness_enhancer.enhance(factor=1.5)\n

However, this approach has many limitations, and it doesn't handle all the cases that arise in image augmentation. A dedicated image augmentation library such as Albumentations gives you a lot of advantages.

Here is a list of a few pitfalls that augmentation libraries handle very well.

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#the-need-to-apply-the-same-transform-to-an-image-and-for-labels-for-segmentation-object-detection-and-keypoint-detection-tasks","title":"The need to apply the same transform to an image and for labels for segmentation, object detection, and keypoint detection tasks.","text":"

For image classification, you need to modify only an input image and keep output labels intact because output labels are invariant to image modifications.

Note

There are some exceptions to this rule. For example, an image could contain a cat and have an assigned label cat. During image augmentation, if you crop a part of an image that doesn't have a cat on it, then the output label cat becomes wrong and misleading. Usually, you deal with those situations by deciding which augmentations you can apply to a dataset without risking problems with incorrect labels.

For segmentation, you need to apply some transformations both to an input image and an output mask. You also have to use the same parameters both for the image transformation and the mask transformation.
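
A minimal sketch of how this looks with Albumentations (the image and mask arrays are assumed to be NumPy arrays loaded elsewhere):

Python
import albumentations as A\n\ntransform = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.3),\n])\n\n# The same spatial parameters are applied to both targets.\naugmented = transform(image=image, mask=mask)\naugmented_image = augmented[\"image\"]\naugmented_mask = augmented[\"mask\"]\n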

Let's look at an example of a semantic segmentation task from Inria Aerial Image Labeling Dataset. The dataset contains aerial photos as well as masks for those photos. Each pixel of the mask is marked either as 1 if the pixel belongs to the class building and 0 otherwise.

There are two types of image augmentations: pixel-level augmentations and spatial-level augmentations.

Pixel-level augmentations change the values of the pixels of the original image, but they don't change the output mask. Image transformations such as changing the brightness or contrast, or adjusting the values of the RGB palette of the image, are pixel-level augmentations.

We modify the input image by adjusting its brightness, but we keep the output mask intact.

In contrast, spatial-level augmentations change both the image and the mask. When you apply image transformations such as mirroring, rotation, or cropping a part of the input image, you also need to apply the same transformation to the output label to preserve its correctness.

We rotate both the input image and the output mask. We use the same set of transformations with the same parameters, both for the image and the mask.

The same is true for object detection tasks. For pixel-level augmentations, you only need to change the input image. With spatial-level augmentations, you need to apply the same transformation not only to the image but to the bounding box coordinates as well. After applying spatial-level augmentations, you need to update the coordinates of the bounding boxes so they represent the correct locations of objects in the augmented image.

Pixel-level augmentations such as brightness adjustment change only the input image but not the coordinates of bounding boxes. Spatial-level augmentations such as mirroring and cropping a part of the image change both the input image and the bounding boxes' coordinates.

Albumentations knows how to correctly apply transformations both to the input data and to the output labels.
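
A minimal sketch for bounding boxes (the box coordinates and class labels below are made-up placeholders, and the format name must match your annotations):

Python
import albumentations as A\n\ntransform = A.Compose(\n    [\n        A.HorizontalFlip(p=0.5),\n        A.RandomBrightnessContrast(p=0.3),\n    ],\n    bbox_params=A.BboxParams(format=\"pascal_voc\", label_fields=[\"class_labels\"]),\n)\n\naugmented = transform(\n    image=image,\n    bboxes=[[23, 74, 295, 388]],  # example box in (x_min, y_min, x_max, y_max) format\n    class_labels=[\"dog\"],\n)\nnew_boxes = augmented[\"bboxes\"]  # coordinates updated to match the augmented image\n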

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#working-with-probabilities","title":"Working with probabilities","text":"

During training, you usually want to apply augmentations with a probability of less than 100%, since you also need to keep the original images in your training pipeline. It is also beneficial to be able to control the magnitude of image augmentation, that is, how much the augmentation changes the original image. If the original dataset is large, you could apply only the basic augmentations with a probability of around 10-30% and a small magnitude of changes. If the dataset is small, you need to act more aggressively with augmentations to prevent overfitting of neural networks, so you usually need to increase the probability of applying each augmentation to 40-50% and increase the magnitude of the changes the augmentations make to the image.

Image augmentation libraries allow you to set the required probabilities and the magnitude of values for each transformation.
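
For illustration, a sketch of a conservative setup for a large dataset next to a more aggressive one for a small dataset; the exact values are arbitrary examples, not recommendations:

Python
import albumentations as A\n\n# Large dataset: low probability, small magnitude of changes.\nconservative_transform = A.Compose([\n    A.RandomBrightnessContrast(brightness_limit=0.1, contrast_limit=0.1, p=0.2),\n    A.HorizontalFlip(p=0.3),\n])\n\n# Small dataset: higher probability, larger magnitude of changes.\naggressive_transform = A.Compose([\n    A.RandomBrightnessContrast(brightness_limit=0.4, contrast_limit=0.4, p=0.5),\n    A.HorizontalFlip(p=0.5),\n])\n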

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#declarative-definition-of-the-augmentation-pipeline-and-unified-interface","title":"Declarative definition of the augmentation pipeline and unified interface","text":"

Usually, you want to apply not a single augmentation, but a set of augmentations with specific parameters such as probability and magnitude of changes. Augmentation libraries allow you to declare such a pipeline in a single place and then use it for image transformation through a unified interface. Some libraries can store and load transformation parameters to formats such as JSON, YAML, etc.

Here is an example definition of an augmentation pipeline. This pipeline will first crop a random 512px x 512px part of the input image. Then with probability 30%, it will randomly change brightness and contrast of that crop. Finally, with probability 50%, it will horizontally flip the resulting image.

Python
import albumentations as A\n\ntransform = A.Compose([\n    A.RandomCrop(512, 512),\n    A.RandomBrightnessContrast(p=0.3),\n    A.HorizontalFlip(p=0.5),\n])\n
"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#rigorous-testing","title":"Rigorous testing","text":"

A bug in the augmentation pipeline can easily go unnoticed: a buggy pipeline can silently corrupt the input data. There won't be any exceptions or code failures, but the performance of the trained neural networks will degrade because they received garbage input during training. Augmentation libraries usually have large test suites that catch regressions during development. A large user base also helps to find unnoticed bugs and report them to the developers.

"}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Welcome to Albumentations documentation","text":"

Albumentations is a fast and flexible image augmentation library. The library is widely used in industry, deep learning research, machine learning competitions, and open source projects. Albumentations is written in Python, and it is licensed under the MIT license. The source code is available at https://github.com/albumentations-team/albumentations.

If you are new to image augmentation, start with our \"Learning Path\" for beginners. It describes what image augmentation is, how it can boost deep neural networks' performance, and why you should use Albumentations.

For hands-on experience, check out our \"Quick Start Guide\" and \"Examples\" sections. They show how you can use the library for different computer vision tasks: image classification, semantic segmentation, instance segmentation, object detection, and keypoint detection. Each example includes a link to Google Colab, where you can run the code by yourself.

You can also visit explore.albumentations.ai to visually explore and experiment with different augmentations in your browser. This interactive tool helps you better understand how each transform affects images before implementing it in your code.

\"API Reference\" contains the description of Albumentations' methods and classes.

"},{"location":"#quick-start-guide","title":"Quick Start Guide","text":"
  • Installation
  • Frequently Asked Questions
  • Your First Augmentation Pipeline
"},{"location":"#working-with-multi-dimensional-data","title":"Working with Multi-dimensional Data","text":""},{"location":"#volumetric-data-3d","title":"Volumetric Data (3D)","text":"
  • Introduction to 3D (Volumetric) Image Augmentation
  • Available 3D Transforms
"},{"location":"#video-and-sequential-data","title":"Video and Sequential Data","text":"
  • Video Frame Augmentation
"},{"location":"#learning-path","title":"Learning Path","text":""},{"location":"#beginners","title":"Beginners","text":"
  • What is Image Augmentation?
  • Why Choose Albumentations?
  • Basic Image Classification
"},{"location":"#intermediate","title":"Intermediate","text":"
  • Semantic Segmentation
  • Object Detection
  • Keypoint Detection
  • Multi-target Augmentation
"},{"location":"#advanced","title":"Advanced","text":"
  • Pipeline Configuration
  • Debugging with ReplayCompose
  • Serialization
"},{"location":"#framework-integration","title":"Framework Integration","text":"
  • PyTorch
  • TensorFlow
  • HuggingFace
  • Roboflow
  • Voxel51
"},{"location":"#library-comparisons","title":"Library Comparisons","text":"
  • Transform Library Comparison - Find equivalent transforms between Albumentations and other libraries (torchvision, Kornia)
  • Migration from torchvision - Step-by-step migration guide
"},{"location":"#examples","title":"Examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Using Albumentations for a semantic segmentation task
  • Using Albumentations to augment keypoints
  • Applying the same augmentation with the same parameters to multiple images, masks, bounding boxes, or keypoints
  • Weather augmentations in Albumentations
  • Example of applying XYMasking transform
  • Example of applying ChromaticAberration transform
  • Example of applying Morphological transform
  • Example of applying D4 transform
  • Example of applying RandomGridShuffle transform
  • Example of applying OverlayElements transform
  • Example of applying TextImage transform
  • Debugging an augmentation pipeline with ReplayCompose
  • How to save and load parameters of an augmentation pipeline
  • Showcase. Cool augmentation examples on diverse set of images from various real-world tasks.
  • How to save and load transforms to HuggingFace Hub.
"},{"location":"#examples-of-how-to-use-albumentations-with-different-deep-learning-frameworks","title":"Examples of how to use Albumentations with different deep learning frameworks","text":"
  • PyTorch and Albumentations for image classification
  • PyTorch and Albumentations for semantic segmentation
  • Using Albumentations with Tensorflow
"},{"location":"#external-resources","title":"External resources","text":"
  • Blog posts, podcasts, talks, and videos about Albumentations
  • Books that mention Albumentations
  • Online courses that cover Albumentations
"},{"location":"#other-topics","title":"Other topics","text":"
  • Contributing
"},{"location":"#api-reference","title":"API Reference","text":"
  • Full API Reference on a single page
  • Index
  • Core API (albumentations.core)
  • Augmentations (albumentations.augmentations)
  • PyTorch Helpers (albumentations.pytorch)
"},{"location":"CONTRIBUTING/","title":"Contributing to Albumentations","text":"

Thank you for your interest in contributing to Albumentations! This guide will help you get started with contributing to our image augmentation library.

"},{"location":"CONTRIBUTING/#quick-start","title":"Quick Start","text":"

For small changes (e.g., bug fixes), feel free to submit a PR directly.

For larger changes:

  1. Create an issue outlining your proposed change
  2. Join our Discord community to discuss your idea
"},{"location":"CONTRIBUTING/#contribution-guides","title":"Contribution Guides","text":"

We've organized our contribution guidelines into focused documents:

  • Environment Setup Guide - How to set up your development environment
  • Coding Guidelines - Code style, best practices, and technical requirements
"},{"location":"CONTRIBUTING/#contribution-process","title":"Contribution Process","text":"
  1. Find an Issue: Look for open issues or propose a new one. For newcomers, look for issues labeled \"good first issue\"
  2. Set Up: Follow our Environment Setup Guide
  3. Create a Branch: git checkout -b feature/my-new-feature
  4. Make Changes: Write code following our Coding Guidelines
  5. Test: Add tests and ensure all tests pass
  6. Submit: Open a Pull Request with a clear description of your changes
"},{"location":"CONTRIBUTING/#code-review-process","title":"Code Review Process","text":"
  1. Maintainers will review your contribution
  2. Address any feedback or questions
  3. Once approved, your code will be merged
"},{"location":"CONTRIBUTING/#project-structure","title":"Project Structure","text":"
  • albumentations/ - Main source code
  • tests/ - Test suite
  • docs/ - Documentation
"},{"location":"CONTRIBUTING/#getting-help","title":"Getting Help","text":"
  • Join our Discord community
  • Open a GitHub issue
  • Ask questions in your pull request
"},{"location":"CONTRIBUTING/#license","title":"License","text":"

By contributing, you agree that your contributions will be licensed under the project's MIT License.

"},{"location":"benchmarking_results/","title":"Benchmarking results","text":""},{"location":"benchmarking_results/#benchmarking-results_1","title":"Benchmarking results","text":""},{"location":"benchmarking_results/#system-information","title":"System Information","text":"
  • Platform: macOS-15.0.1-arm64-arm-64bit
  • Processor: arm
  • CPU Count: 10
  • Python Version: 3.12.7
"},{"location":"benchmarking_results/#benchmark-parameters","title":"Benchmark Parameters","text":"
  • Number of images: 1000
  • Runs per transform: 10
  • Max warmup iterations: 1000
"},{"location":"benchmarking_results/#library-versions","title":"Library Versions","text":"
  • albumentations: 1.4.20
  • augly: 1.0.0
  • imgaug: 0.4.0
  • kornia: 0.7.3
  • torchvision: 0.20.0
"},{"location":"benchmarking_results/#performance-comparison","title":"Performance Comparison","text":"

Each number is the count of uint8 RGB images processed per second on a single CPU core. Higher is better.

Transform albumentations1.4.20 augly1.0.0 imgaug0.4.0 kornia0.7.3 torchvision0.20.0 HorizontalFlip 8618 \u00b1 1233 4807 \u00b1 818 6042 \u00b1 788 390 \u00b1 106 914 \u00b1 67 VerticalFlip 22847 \u00b1 2031 9153 \u00b1 1291 10931 \u00b1 1844 1212 \u00b1 402 3198 \u00b1 200 Rotate 1146 \u00b1 79 1119 \u00b1 41 1136 \u00b1 218 143 \u00b1 11 181 \u00b1 11 Affine 682 \u00b1 192 - 774 \u00b1 97 147 \u00b1 9 130 \u00b1 12 Equalize 892 \u00b1 61 - 581 \u00b1 54 152 \u00b1 19 479 \u00b1 12 RandomCrop80 47341 \u00b1 20523 25272 \u00b1 1822 11503 \u00b1 441 1510 \u00b1 230 32109 \u00b1 1241 ShiftRGB 2349 \u00b1 76 - 1582 \u00b1 65 - - Resize 2316 \u00b1 166 611 \u00b1 78 1806 \u00b1 63 232 \u00b1 24 195 \u00b1 4 RandomGamma 8675 \u00b1 274 - 2318 \u00b1 269 108 \u00b1 13 - Grayscale 3056 \u00b1 47 2720 \u00b1 932 1681 \u00b1 156 289 \u00b1 75 1838 \u00b1 130 RandomPerspective 412 \u00b1 38 - 554 \u00b1 22 86 \u00b1 11 96 \u00b1 5 GaussianBlur 1728 \u00b1 89 242 \u00b1 4 1090 \u00b1 65 176 \u00b1 18 79 \u00b1 3 MedianBlur 868 \u00b1 60 - 813 \u00b1 30 5 \u00b1 0 - MotionBlur 4047 \u00b1 67 - 612 \u00b1 18 73 \u00b1 2 - Posterize 9094 \u00b1 301 - 2097 \u00b1 68 430 \u00b1 49 3196 \u00b1 185 JpegCompression 918 \u00b1 23 778 \u00b1 5 459 \u00b1 35 71 \u00b1 3 625 \u00b1 17 GaussianNoise 166 \u00b1 12 67 \u00b1 2 206 \u00b1 11 75 \u00b1 1 - Elastic 201 \u00b1 5 - 235 \u00b1 20 1 \u00b1 0 2 \u00b1 0 Clahe 454 \u00b1 22 - 335 \u00b1 43 94 \u00b1 9 - CoarseDropout 13368 \u00b1 744 - 671 \u00b1 38 536 \u00b1 87 - Blur 5267 \u00b1 543 246 \u00b1 3 3807 \u00b1 325 - - ColorJitter 628 \u00b1 55 255 \u00b1 13 - 55 \u00b1 18 46 \u00b1 2 Brightness 8956 \u00b1 300 1163 \u00b1 86 - 472 \u00b1 101 429 \u00b1 20 Contrast 8879 \u00b1 1426 736 \u00b1 79 - 425 \u00b1 52 335 \u00b1 35 RandomResizedCrop 2828 \u00b1 186 - - 287 \u00b1 58 511 \u00b1 10 Normalize 1196 \u00b1 56 - - 626 \u00b1 40 519 \u00b1 12 PlankianJitter 2204 \u00b1 385 - - 813 \u00b1 211 -"},{"location":"faq/","title":"Frequently Asked Questions","text":"

This FAQ covers common questions about Albumentations, from basic setup to advanced usage. You'll find information about:

  • Installation troubleshooting and configuration
  • Working with different data formats (images, video, volumetric data)
  • Advanced usage patterns and best practices
  • Integration with other tools and migration from other libraries

If you don't find an answer to your question, please check our GitHub Issues or join our Discord community.

"},{"location":"faq/#installation","title":"Installation","text":""},{"location":"faq/#i-am-receiving-an-error-message-failed-building-wheel-for-imagecodecs-when-i-am-trying-to-install-albumentations-how-can-i-fix-the-problem","title":"I am receiving an error message Failed building wheel for imagecodecs when I am trying to install Albumentations. How can I fix the problem?","text":"

Try to update pip by running the following command:

Bash
python -m pip install --upgrade pip\n
"},{"location":"faq/#how-to-disable-automatic-checks-for-new-versions","title":"How to disable automatic checks for new versions?","text":"

To disable automatic checks for new versions, set the environment variable NO_ALBUMENTATIONS_UPDATE to 1.
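
For example, one way to set the variable from Python before Albumentations is imported:

Python
import os\n\nos.environ[\"NO_ALBUMENTATIONS_UPDATE\"] = \"1\"\n\nimport albumentations as A  # imported after the variable is set\n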

"},{"location":"faq/#how-to-make-albumentations-use-one-cpu-core","title":"How to make Albumentations use one CPU core?","text":"

Albumentations does not use multithreading by default, but the libraries it depends on (like OpenCV) may use multithreading. To make Albumentations use one CPU core, you can set the following environment variables:

Python
os.environ[\"OMP_NUM_THREADS\"] = \"1\"\nos.environ[\"OPENBLAS_NUM_THREADS\"] = \"1\"\nos.environ[\"MKL_NUM_THREADS\"] = \"1\"\nos.environ[\"VECLIB_MAXIMUM_THREADS\"] = \"1\"\nos.environ[\"NUMEXPR_NUM_THREADS\"] = \"1\"\n
"},{"location":"faq/#data-formats-and-basic-usage","title":"Data Formats and Basic Usage","text":""},{"location":"faq/#supported-image-types","title":"Supported Image Types","text":"

Albumentations works with images of type uint8 and float32. uint8 images should be in the [0, 255] range, and float32 images should be in the [0, 1] range. If float32 images lie outside of the [0, 1] range, they will be automatically clipped to the [0, 1] range.
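
For instance (the arrays below are random placeholders):

Python
import albumentations as A\nimport numpy as np\n\ntransform = A.Compose([A.HorizontalFlip(p=1.0)])\n\nimage_uint8 = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)  # values in [0, 255]\nimage_float32 = np.random.rand(100, 100, 3).astype(np.float32)  # values in [0, 1]\n\nflipped_uint8 = transform(image=image_uint8)[\"image\"]\nflipped_float32 = transform(image=image_float32)[\"image\"]\n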

"},{"location":"faq/#why-do-you-call-cv2cvtcolorimage-cv2color_bgr2rgb-in-your-examples","title":"Why do you call cv2.cvtColor(image, cv2.COLOR_BGR2RGB) in your examples?","text":"

For historical reasons, OpenCV reads an image in BGR format (so color channels of the image have the following order: Blue, Green, Red). Albumentations uses the most common and popular RGB image format. So when using OpenCV, we need to convert the image format to RGB explicitly.
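
For example (the file name is a placeholder):

Python
import cv2\n\nimage = cv2.imread(\"image.jpg\")  # OpenCV loads the image in BGR channel order\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB before augmentation\n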

"},{"location":"faq/#how-to-have-reproducible-augmentations","title":"How to have reproducible augmentations?","text":"

To have reproducible augmentations, set the seed parameter in your transform pipeline. This will ensure that the same random parameters are used for each augmentation, resulting in the same output for the same input.

Python
transform = A.Compose([\n    A.RandomCrop(height=256, width=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], seed=42)\n
"},{"location":"faq/#working-with-different-data-types","title":"Working with Different Data Types","text":""},{"location":"faq/#how-to-process-video-data-with-albumentations","title":"How to process video data with Albumentations?","text":"

Albumentations can process video data by treating it as a sequence of frames in numpy array format:
  • (N, H, W) - Grayscale video (N frames)
  • (N, H, W, C) - Color video (N frames)

When you pass a video array, Albumentations will apply the same transform with identical parameters to each frame, ensuring temporal consistency.

Python
video = np.random.rand(32, 256, 256, 3) # 32 RGB frames\n\ntransform = A.Compose([\n  A.RandomCrop(height=224, width=224),\n  A.HorizontalFlip(p=0.5)\n], seed=42)\n\ntransformed = transform(image=video)['image']\n

See Working with Video Data for more info.

"},{"location":"faq/#how-to-process-volumetric-data-with-albumentations","title":"How to process volumetric data with Albumentations?","text":"

Albumentations can process volumetric data by treating it as a sequence of 2D slices. When you pass volumetric data as a numpy array, Albumentations will apply the same transform with identical parameters to each slice, ensuring consistency across the volume.

See Working with Volumetric Data (3D) for more info.

"},{"location":"faq/#my-computer-vision-pipeline-works-with-a-sequence-of-images-i-want-to-apply-the-same-augmentations-with-the-same-parameters-to-each-image-in-the-sequence-can-albumentations-do-it","title":"My computer vision pipeline works with a sequence of images. I want to apply the same augmentations with the same parameters to each image in the sequence. Can Albumentations do it?","text":"

Yes. You can define additional images, masks, bounding boxes, or keypoints through the additional_targets argument to Compose. You can then pass those additional targets to the augmentation pipeline, and Albumentations will augment them in the same way. See this example for more info.
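
A minimal sketch of the additional_targets approach (the target name image1 is arbitrary, and frame0/frame1 are placeholder NumPy arrays):

Python
import albumentations as A\n\ntransform = A.Compose(\n    [\n        A.RandomCrop(height=224, width=224),\n        A.HorizontalFlip(p=0.5),\n    ],\n    additional_targets={\"image1\": \"image\"},\n)\n\n# Both frames receive the same transform with the same random parameters.\naugmented = transform(image=frame0, image1=frame1)\naugmented_frame0 = augmented[\"image\"]\naugmented_frame1 = augmented[\"image1\"]\n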

But if you only need to augment a sequence of images, you may simply use the images target, which accepts list[numpy.ndarray] or np.ndarray with shape (N, H, W, C) / (N, H, W).

"},{"location":"faq/#advanced-usage","title":"Advanced Usage","text":""},{"location":"faq/#how-can-i-find-which-augmentations-were-applied-to-the-input-data-and-which-parameters-they-used","title":"How can I find which augmentations were applied to the input data and which parameters they used?","text":"

You may pass save_applied_params=True to Compose to save the parameters of the applied augmentations. You can access them later using applied_transforms.

Python
transform = A.Compose([\n    A.RandomCrop(256, 256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.5),\n    A.RandomGamma(p=0.5),\n    A.Normalize(),\n], save_applied_params=True, seed=42)\n\ntransformed = transform(image=image)['image']\n\nprint(transform[\"applied_transforms\"])\n
"},{"location":"faq/#how-to-perform-balanced-scaling","title":"How to perform balanced scaling?","text":"

The default scaling logic in RandomScale, ShiftScaleRotate, and Affine transformations is biased towards upscaling.

For example, if scale_limit = (0.5, 2), a user might expect that the image will be scaled down in half of the cases and scaled up in the other half. However, in reality, the image will be scaled up in about two thirds of the cases and scaled down in only about one third of the cases. This is because the default behavior samples uniformly from the interval [0.5, 2], and the downscaling interval [0.5, 1] is only half as long as the upscaling interval [1, 2].

To achieve balanced scaling, you can use Affine with balanced_scale=True, which ensures that the probability of scaling up and scaling down is equal.

Python
balanced_scale_transform = A.Affine(scale=(0.5, 2), balanced_scale=True)\n

or use OneOf transform as follows:

Python
balanced_scale_transform = A.OneOf([\n  A.Affine(scale=(0.5, 1), p=0.5),\n  A.Affine(scale=(1, 2), p=0.5)])\n

This approach ensures that, on average, half of the samples will be upscaled and half will be downscaled.

"},{"location":"faq/#augmentations-have-a-parameter-named-p-that-sets-the-probability-of-applying-that-augmentation-how-does-p-work-in-nested-containers","title":"Augmentations have a parameter named p that sets the probability of applying that augmentation. How does p work in nested containers?","text":"

The p parameter sets the probability of applying a specific augmentation. When augmentations are nested within a top-level container like Compose, the effective probability of each augmentation is the product of the container's probability and the augmentation's probability.

Let's look at an example when a container Compose contains one augmentation Resize:

Python
transform = A.Compose([\n    A.Resize(height=256, width=256, p=1.0),\n], p=0.9)\n

In this case, Resize has a 90% chance to be applied. This is because there is a 90% chance for Compose to be applied (p=0.9). If Compose is applied, then Resize is applied with 100% probability (p=1.0).

To visualize:

  • Probability of Compose being applied: 0.9
  • Probability of Resize being applied given Compose is applied: 1.0
  • Effective probability of Resize being applied: 0.9 * 1.0 = 0.9 (or 90%)

This means that the effective probability of Resize being applied is the product of the probabilities of Compose and Resize, which is 0.9 * 1.0 = 0.9 or 90%. This principle applies to other transformations as well, where the overall probability is the product of the individual probabilities within the transformation pipeline.

Here\u2019s another example:

Python
transform = A.Compose([\n    A.Resize(height=256, width=256, p=0.5),\n], p=0.9)\n

In this example, Resize has an effective probability of being applied as 0.9 * 0.5 = 0.45 or 45%. This is because Compose is applied 90% of the time, and within that 90%, Resize is applied 50% of the time.

"},{"location":"faq/#i-created-annotations-for-bounding-boxes-using-labeling-service-or-labeling-software-how-can-i-use-those-annotations-in-albumentations","title":"I created annotations for bounding boxes using labeling service or labeling software. How can I use those annotations in Albumentations?","text":"

You need to convert those annotations to one of the formats supported by Albumentations. For the list of formats, please refer to this article. Consult the documentation of the labeling service to see how you can export annotations in those formats.
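
After converting, you declare the chosen format in BboxParams so Albumentations can interpret the coordinates. A sketch assuming COCO-format boxes and a class_labels list (both placeholders):

Python
import albumentations as A\n\ntransform = A.Compose(\n    [A.HorizontalFlip(p=0.5)],\n    bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"class_labels\"]),\n)\n\n# coco_bboxes: list of [x_min, y_min, width, height] boxes exported from the labeling tool.\naugmented = transform(image=image, bboxes=coco_bboxes, class_labels=class_labels)\n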

"},{"location":"faq/#integration-and-migration","title":"Integration and Migration","text":""},{"location":"faq/#how-to-save-and-load-augmentation-transforms-to-huggingface-hub","title":"How to save and load augmentation transforms to HuggingFace Hub?","text":"Python
import albumentations as A\nimport numpy as np\n\ntransform = A.Compose([\n    A.RandomCrop(256, 256),\n    A.HorizontalFlip(),\n    A.RandomBrightnessContrast(),\n    A.RGBShift(),\n    A.Normalize(),\n])\n\ntransform.save_pretrained(\"qubvel-hf/albu\", key=\"train\")\n# The 'key' parameter specifies the context or purpose of the saved transform,\n# allowing for organized and context-specific retrieval.\n# ^ this will save the transform to a directory \"qubvel-hf/albu\" with filename \"albumentations_config_train.json\"\n\ntransform.save_pretrained(\"qubvel-hf/albu\", key=\"train\", push_to_hub=True)\n# ^ this will save the transform to a directory \"qubvel-hf/albu\" with filename \"albumentations_config_train.json\"\n# + push the transform to the Hub to the repository \"qubvel-hf/albu\"\n\ntransform.push_to_hub(\"qubvel-hf/albu\", key=\"train\")\n# Use `save_pretrained` to save the transform locally and optionally push to the Hub.\n# Use `push_to_hub` to directly push the transform to the Hub without saving it locally.\n# ^ this will push the transform to the Hub to the repository \"qubvel-hf/albu\" (without saving it locally)\n\nloaded_transform = A.Compose.from_pretrained(\"qubvel-hf/albu\", key=\"train\")\n# ^ this will load the transform from local folder if exist or from the Hub repository \"qubvel-hf/albu\"\n

See this example for more info.

"},{"location":"faq/#how-do-i-migrate-from-other-augmentation-libraries-to-albumentations","title":"How do I migrate from other augmentation libraries to Albumentations?","text":"

If you're migrating from other libraries like torchvision or Kornia, you can refer to our Library Comparison & Benchmarks guide. This guide provides:

  1. Mapping tables showing equivalent transforms between libraries
  2. Performance benchmarks demonstrating Albumentations' speed advantages
  3. Code examples for common migration scenarios
  4. Key differences in implementation and parameter handling

For a quick visual comparison of different augmentations, you can also use our interactive tool at explore.albumentations.ai to see how transforms affect images before implementing them.
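
As a rough illustration of such a migration (a sketch, not a complete equivalence; numpy_image is a placeholder array):

Python
# torchvision pipeline (operates on PIL images / tensors)\nfrom torchvision import transforms\n\ntorchvision_transform = transforms.Compose([\n    transforms.RandomHorizontalFlip(p=0.5),\n    transforms.ColorJitter(brightness=0.2, contrast=0.2),\n])\n\n# Roughly equivalent Albumentations pipeline (operates on NumPy arrays, called with keyword arguments)\nimport albumentations as A\n\nalbumentations_transform = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.ColorJitter(brightness=0.2, contrast=0.2, p=1.0),\n])\n\naugmented_image = albumentations_transform(image=numpy_image)[\"image\"]\n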

For specific migration examples, see:

  • Migrating from torchvision
  • Performance comparison with other libraries
"},{"location":"frameworks_and_libraries/","title":"Frameworks and libraries that use Albumentations","text":""},{"location":"frameworks_and_libraries/#mmdetection","title":"MMDetection","text":"

https://github.com/open-mmlab/mmdetection

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

  • To install MMDetection with Albumentations follow the installation instructions.
  • MMDetection has an example config with augmentations from Albumentations.
"},{"location":"frameworks_and_libraries/#yolov5","title":"YOLOv5","text":"

https://github.com/ultralytics/yolov5

YOLOv5 \ud83d\ude80 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research into future vision AI methods, incorporating lessons learned and best practices evolved over thousands of hours of research and development.

  • To use Albumentations along with YOLOv5, simply pip install -U albumentations and then update the augmentation pipeline as you see fit in the Albumentations class in utils/augmentations.py. An example is available in the YOLOv5 repository.
"},{"location":"frameworks_and_libraries/#other-frameworks-and-libraries","title":"Other frameworks and libraries","text":"

Other frameworks and libraries that use Albumentations can be found on GitHub.

"},{"location":"api_reference/","title":"Index","text":"
  • Full API Reference on a single page
  • Core API (albumentations.core)
    • Composition API (albumentations.core.composition)
    • Serialization API (albumentations.core.serialization)
    • Transforms Interface (albumentations.core.transforms_interface)
    • Helper functions for working with bounding boxes (albumentations.core.bbox_utils)
    • Helper functions for working with keypoints (albumentations.core.keypoints_utils)
  • Augmentations (albumentations.augmentations)
    • Transforms (albumentations.augmentations.transforms)
    • Functional transforms (albumentations.augmentations.functional)
  • PyTorch Helpers (albumentations.pytorch)
    • Transforms (albumentations.pytorch.transforms)
"},{"location":"api_reference/full_reference/","title":"Full API Reference on a single page","text":""},{"location":"api_reference/full_reference/#transform-types","title":"Transform Types","text":""},{"location":"api_reference/full_reference/#1-pixel-level-transforms","title":"1. Pixel-level transforms","text":"

Transforms that modify pixel values without changing spatial relationships. These can be safely applied to any target as they only affect the input image, leaving other targets (masks, bounding boxes, keypoints) unchanged.

  • AdditiveNoise
  • AdvancedBlur
  • AutoContrast
  • Blur
  • CLAHE
  • ChannelDropout
  • ChannelShuffle
  • ChromaticAberration
  • ColorJitter
  • Defocus
  • Downscale
  • Emboss
  • Equalize
  • FDA
  • FancyPCA
  • FromFloat
  • GaussNoise
  • GaussianBlur
  • GlassBlur
  • HistogramMatching
  • HueSaturationValue
  • ISONoise
  • Illumination
  • ImageCompression
  • InvertImg
  • MedianBlur
  • MotionBlur
  • MultiplicativeNoise
  • Normalize
  • PixelDistributionAdaptation
  • PlanckianJitter
  • PlasmaBrightnessContrast
  • PlasmaShadow
  • Posterize
  • RGBShift
  • RandomBrightnessContrast
  • RandomFog
  • RandomGamma
  • RandomGravel
  • RandomRain
  • RandomShadow
  • RandomSnow
  • RandomSunFlare
  • RandomToneCurve
  • RingingOvershoot
  • SaltAndPepper
  • Sharpen
  • ShotNoise
  • Solarize
  • Spatter
  • Superpixels
  • TemplateTransform
  • TextImage
  • ToFloat
  • ToGray
  • ToRGB
  • ToSepia
  • UnsharpMask
  • ZoomBlur
"},{"location":"api_reference/full_reference/#2-spatial-level-transforms","title":"2. Spatial-level transforms","text":"

Transforms that modify the spatial arrangement of pixels/features. Different targets have different spatial transform support - see the compatibility table below:

Transform Image Mask BBoxes Keypoints Volume Mask3D Affine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 AtLeastOneBBoxRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 BBoxSafeRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CenterCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CoarseDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Crop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropAndPad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropNonEmptyMaskIfExists \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 D4 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ElasticTransform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Erasing \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 FrequencyMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridElasticDeform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 HorizontalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Lambda \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 LongestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 MaskDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Morphological \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 NoOp \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OpticalDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OverlayElements \u2713 \u2713 Pad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PadIfNeeded \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Perspective \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PiecewiseAffine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PixelDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropFromBorders \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropNearBBox \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomGridShuffle \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomResizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomRotate90 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomScale \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedBBoxSafeCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Resize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Rotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SafeRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ShiftScaleRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SmallestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ThinPlateSpline \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeReverse \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Transpose \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 VerticalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 XYMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713"},{"location":"api_reference/full_reference/#3-volumetric-3d-transforms","title":"3. Volumetric (3D) transforms","text":"

Transforms designed for three-dimensional data (D, H, W). These operate on volumes and their corresponding 3D masks, supporting both single-channel and multi-channel data.

Transform Image Mask BBoxes Keypoints Volume Mask3D CenterCrop3D \u2713 \u2713 \u2713 CoarseDropout3D \u2713 \u2713 \u2713 CubicSymmetry \u2713 \u2713 \u2713 Pad3D \u2713 \u2713 \u2713 PadIfNeeded3D \u2713 \u2713 \u2713 RandomCrop3D \u2713 \u2713 \u2713"},{"location":"api_reference/augmentations/","title":"Index","text":"
  • Transforms (albumentations.augmentations.transforms)
  • Blur transforms (albumentations.augmentations.blur)
  • Crop transforms (albumentations.augmentations.crops)
  • Dropout transforms (albumentations.augmentations.dropout)
  • Geometric transforms (albumentations.augmentations.geometric)
  • Domain adaptation transforms (albumentations.augmentations.domain_adaptation)
  • Functional transforms (albumentations.augmentations.functional)
"},{"location":"api_reference/augmentations/domain_adaptation/","title":"Domain adaptation transforms (augmentations.domain_adaptation)","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional","title":"functional","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.apply_histogram","title":"def apply_histogram (img, reference_image, blend_ratio) [view source on GitHub]","text":"

Apply histogram matching to an input image using a reference image and blend the result.

This function performs histogram matching between the input image and a reference image, then blends the result with the original input image based on the specified blend ratio.

Parameters:

Name Type Description img np.ndarray

The input image to be transformed. Can be either grayscale or RGB. Supported dtypes: uint8, float32 (values should be in [0, 1] range).

reference_image np.ndarray

The reference image used for histogram matching. Should have the same number of channels as the input image. Supported dtypes: uint8, float32 (values should be in [0, 1] range).

blend_ratio float

The ratio for blending the matched image with the original image. Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.

Returns:

Type Description np.ndarray

The transformed image after histogram matching and blending. The output will have the same shape and dtype as the input image.

Supported image types:
  • Grayscale images: 2D arrays
  • RGB images: 3D arrays with 3 channels
  • Multispectral images: 3D arrays with more than 3 channels

Note

  • If the input and reference images have different sizes, the reference image will be resized to match the input image's dimensions.
  • The function uses a custom implementation of histogram matching based on OpenCV and NumPy.
  • The @clipped and @preserve_channel_dim decorators ensure the output is within the valid range and maintains the original number of dimensions.
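
A minimal usage sketch of this functional helper, assuming it is imported from albumentations.augmentations.domain_adaptation.functional (the module shown under "Source code" below) and using random arrays in place of real images; in practice the HistogramMatching transform calls this function internally:

Python
import numpy as np
from albumentations.augmentations.domain_adaptation.functional import apply_histogram

img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
# The reference may have a different size; it is resized to match the input internally.
reference = np.random.randint(0, 256, (120, 80, 3), dtype=np.uint8)

# Blend 70% of the histogram-matched result with 30% of the original image.
matched = apply_histogram(img, reference, blend_ratio=0.7)
assert matched.shape == img.shape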
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef apply_histogram(img: np.ndarray, reference_image: np.ndarray, blend_ratio: float) -> np.ndarray:\n    \"\"\"Apply histogram matching to an input image using a reference image and blend the result.\n\n    This function performs histogram matching between the input image and a reference image,\n    then blends the result with the original input image based on the specified blend ratio.\n\n    Args:\n        img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        reference_image (np.ndarray): The reference image used for histogram matching.\n            Should have the same number of channels as the input image.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        blend_ratio (float): The ratio for blending the matched image with the original image.\n            Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.\n\n    Returns:\n        np.ndarray: The transformed image after histogram matching and blending.\n            The output will have the same shape and dtype as the input image.\n\n    Supported image types:\n        - Grayscale images: 2D arrays\n        - RGB images: 3D arrays with 3 channels\n        - Multispectral images: 3D arrays with more than 3 channels\n\n    Note:\n        - If the input and reference images have different sizes, the reference image\n          will be resized to match the input image's dimensions.\n        - The function uses a custom implementation of histogram matching based on OpenCV and NumPy.\n        - The @clipped and @preserve_channel_dim decorators ensure the output is within\n          the valid range and maintains the original number of dimensions.\n    \"\"\"\n    # Resize reference image only if necessary\n    if img.shape[:2] != reference_image.shape[:2]:\n        reference_image = cv2.resize(reference_image, dsize=(img.shape[1], img.shape[0]))\n\n    img = np.squeeze(img)\n    reference_image = np.squeeze(reference_image)\n\n    # Match histograms between the images\n    matched = match_histograms(img, reference_image)\n\n    # Blend the original image and the matched image\n    return add_weighted(matched, blend_ratio, img, 1 - blend_ratio)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.fourier_domain_adaptation","title":"def fourier_domain_adaptation (img, target_img, beta) [view source on GitHub]","text":"

Apply Fourier Domain Adaptation to the input image using a target image.

This function performs domain adaptation in the frequency domain by modifying the amplitude spectrum of the source image based on the target image's amplitude spectrum. It preserves the phase information of the source image, which helps maintain its content while adapting its style to match the target image.

Parameters:

Name Type Description img np.ndarray

The source image to be adapted. Can be grayscale or RGB.

target_img np.ndarray

The target image used as a reference for adaptation. Should have the same dimensions as the source image.

beta float

The adaptation strength, typically in the range [0, 1]. Higher values result in stronger adaptation towards the target image's style.

Returns:

Type Description np.ndarray

The adapted image with the same shape and type as the input image.

Exceptions:

Type Description ValueError

If the source and target images have different shapes.

Note

  • Both input images are converted to float32 for processing.
  • The function handles both grayscale (2D) and color (3D) images.
  • For grayscale images, an extra dimension is added to facilitate uniform processing.
  • The adaptation is performed channel-wise for color images.
  • The output is clipped to the valid range and preserves the original number of channels.

The adaptation process involves the following steps for each channel:

  1. Compute the 2D Fourier Transform of both source and target images.
  2. Shift the zero frequency component to the center of the spectrum.
  3. Extract amplitude and phase information from the source image's spectrum.
  4. Mutate the source amplitude using the target amplitude and the beta parameter.
  5. Combine the mutated amplitude with the original phase.
  6. Perform the inverse Fourier Transform to obtain the adapted channel.

The low_freq_mutate function (not shown here) is responsible for the actual amplitude mutation, focusing on low-frequency components which carry style information.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n>>> assert adapted_img.shape == source_img.shape\n

References

  • \"FDA: Fourier Domain Adaptation for Semantic Segmentation\" (Yang and Soatto, 2020, CVPR) https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef fourier_domain_adaptation(img: np.ndarray, target_img: np.ndarray, beta: float) -> np.ndarray:\n    \"\"\"Apply Fourier Domain Adaptation to the input image using a target image.\n\n    This function performs domain adaptation in the frequency domain by modifying the amplitude\n    spectrum of the source image based on the target image's amplitude spectrum. It preserves\n    the phase information of the source image, which helps maintain its content while adapting\n    its style to match the target image.\n\n    Args:\n        img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.\n        target_img (np.ndarray): The target image used as a reference for adaptation.\n            Should have the same dimensions as the source image.\n        beta (float): The adaptation strength, typically in the range [0, 1].\n            Higher values result in stronger adaptation towards the target image's style.\n\n    Returns:\n        np.ndarray: The adapted image with the same shape and type as the input image.\n\n    Raises:\n        ValueError: If the source and target images have different shapes.\n\n    Note:\n        - Both input images are converted to float32 for processing.\n        - The function handles both grayscale (2D) and color (3D) images.\n        - For grayscale images, an extra dimension is added to facilitate uniform processing.\n        - The adaptation is performed channel-wise for color images.\n        - The output is clipped to the valid range and preserves the original number of channels.\n\n    The adaptation process involves the following steps for each channel:\n    1. Compute the 2D Fourier Transform of both source and target images.\n    2. Shift the zero frequency component to the center of the spectrum.\n    3. Extract amplitude and phase information from the source image's spectrum.\n    4. Mutate the source amplitude using the target amplitude and the beta parameter.\n    5. Combine the mutated amplitude with the original phase.\n    6. 
Perform the inverse Fourier Transform to obtain the adapted channel.\n\n    The `low_freq_mutate` function (not shown here) is responsible for the actual\n    amplitude mutation, focusing on low-frequency components which carry style information.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n        >>> assert adapted_img.shape == source_img.shape\n\n    References:\n        - \"FDA: Fourier Domain Adaptation for Semantic Segmentation\"\n          (Yang and Soatto, 2020, CVPR)\n          https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n    \"\"\"\n    src_img = img.astype(np.float32)\n    trg_img = target_img.astype(np.float32)\n\n    if src_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        src_img = np.expand_dims(src_img, axis=-1)\n    if trg_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        trg_img = np.expand_dims(trg_img, axis=-1)\n\n    num_channels = src_img.shape[-1]\n\n    # Prepare container for the output image\n    src_in_trg = np.zeros_like(src_img)\n\n    for channel_id in range(num_channels):\n        # Perform FFT on each channel\n        fft_src = np.fft.fft2(src_img[:, :, channel_id])\n        fft_trg = np.fft.fft2(trg_img[:, :, channel_id])\n\n        # Shift the zero frequency component to the center\n        fft_src_shifted = np.fft.fftshift(fft_src)\n        fft_trg_shifted = np.fft.fftshift(fft_trg)\n\n        # Extract amplitude and phase\n        amp_src, pha_src = np.abs(fft_src_shifted), np.angle(fft_src_shifted)\n        amp_trg = np.abs(fft_trg_shifted)\n\n        # Mutate the amplitude part of the source with the target\n        mutated_amp = low_freq_mutate(amp_src.copy(), amp_trg, beta)\n\n        # Combine the mutated amplitude with the original phase\n        fft_src_mutated = np.fft.ifftshift(mutated_amp * np.exp(1j * pha_src))\n\n        # Perform inverse FFT\n        src_in_trg_channel = np.fft.ifft2(fft_src_mutated)\n\n        # Store the result in the corresponding channel of the output image\n        src_in_trg[:, :, channel_id] = np.real(src_in_trg_channel)\n\n    return src_in_trg\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.functional.match_histograms","title":"def match_histograms (image, reference) [view source on GitHub]","text":"

Adjust an image so that its cumulative histogram matches that of another.

The adjustment is applied separately for each channel.

Parameters:

Name Type Description image np.ndarray

Input image. Can be gray-scale or in color.

reference np.ndarray

Image to match histogram of. Must have the same number of channels as image.

channel_axis

If None, the image is assumed to be a grayscale (single channel) image. Otherwise, this parameter indicates which axis of the array corresponds to channels.

Returns:

Type Description np.ndarray

Transformed input image.

Exceptions:

Type Description ValueError

Thrown when the number of channels in the input image and the reference differ.
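
A minimal usage sketch, assuming the function is imported from albumentations.augmentations.domain_adaptation.functional as indicated by the source path below; random uint8 arrays stand in for real images:

Python
import numpy as np
from albumentations.augmentations.domain_adaptation.functional import match_histograms

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
reference = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)

# Per-channel cumulative-CDF matching; the result keeps the input's shape and dtype.
matched = match_histograms(image, reference)
assert matched.shape == image.shape and matched.dtype == image.dtype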

Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef match_histograms(image: np.ndarray, reference: np.ndarray) -> np.ndarray:\n    \"\"\"Adjust an image so that its cumulative histogram matches that of another.\n\n    The adjustment is applied separately for each channel.\n\n    Args:\n        image: Input image. Can be gray-scale or in color.\n        reference: Image to match histogram of. Must have the same number of channels as image.\n        channel_axis: If None, the image is assumed to be a grayscale (single channel) image.\n            Otherwise, this parameter indicates which axis of the array corresponds to channels.\n\n    Returns:\n        np.ndarray: Transformed input image.\n\n    Raises:\n        ValueError: Thrown when the number of channels in the input image and the reference differ.\n    \"\"\"\n    if reference.dtype != np.uint8:\n        reference = from_float(reference, np.uint8)\n\n    if image.ndim != reference.ndim:\n        raise ValueError(\"Image and reference must have the same number of dimensions.\")\n\n    # Expand dimensions for grayscale images\n    if image.ndim == 2:\n        image = np.expand_dims(image, axis=-1)\n    if reference.ndim == 2:\n        reference = np.expand_dims(reference, axis=-1)\n\n    matched = np.empty(image.shape, dtype=np.uint8)\n\n    num_channels = image.shape[-1]\n\n    for channel in range(num_channels):\n        matched_channel = _match_cumulative_cdf(image[..., channel], reference[..., channel]).astype(np.uint8)\n        matched[..., channel] = matched_channel\n\n    return matched\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms","title":"transforms","text":""},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.FDA","title":"class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Fourier Domain Adaptation (FDA) for simple "style transfer" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has been shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.

Parameters:

Name Type Description reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the swapping extent of frequency components. If a single value is provided, beta will be sampled from the uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Reference

  • https://github.com/YanchaoYang/FDA
  • https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n>>> result = aug(image=image)\n

Note

FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target domain samples are unavailable. It enables significant improvements in model generalization by aligning the low-level statistics of source and target images through a simple yet effective Fourier-based method.

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class FDA(ImageOnlyTransform):\n    \"\"\"Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation\n    (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source\n    and target datasets, effectively adapting images from one domain to closely resemble those from another without\n    altering their semantic content.\n\n    This transform is particularly beneficial in scenarios where the training (source) and testing (target) images\n    come from different distributions, such as synthetic versus real images, or day versus night scenes.\n    Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain\n    alignment by swapping low-frequency components of the Fourier transform between the source and target images.\n    This technique has shown to improve the performance of models on the target domain, particularly for tasks\n    like semantic segmentation, without additional training for domain invariance.\n\n    The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more\n    of the original image's characteristics and higher values leading to more pronounced adaptation effects.\n    It is recommended to use beta values less than 0.3 to avoid introducing artifacts.\n\n    Args:\n        reference_images (Sequence[Any]): Sequence of objects to be converted into images by `read_fn`. This typically\n            involves paths to images that serve as target domain examples for adaptation.\n        beta_limit (tuple[float, float] | float): Coefficient beta from the paper, controlling the swapping extent of\n            frequency components. If one value is provided beta will be sampled from uniform\n            distribution [0, beta_limit]. Values should be less than 0.5.\n        read_fn (Callable): User-defined function for reading images. It takes an element from `reference_images` and\n            returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a\n            numpy array.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        - https://github.com/YanchaoYang/FDA\n        - https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n        >>> result = aug(image=image)\n\n    Note:\n        FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target\n        domain samples are unavailable. 
It enables significant improvements in model generalization by aligning\n        the low-level statistics of source and target images through a simple yet effective Fourier-based method.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        read_fn: Callable[[Any], np.ndarray]\n        beta_limit: ZeroOneRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_ranges(cls, value: tuple[float, float]) -> tuple[float, float]:\n            bounds = 0, MAX_BETA_LIMIT\n            if not bounds[0] <= value[0] <= value[1] <= bounds[1]:\n                raise ValueError(f\"Values should be in the range {bounds} got {value} \")\n            return value\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        beta_limit: ScaleFloatType = (0, 0.1),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        target_image: np.ndarray,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fourier_domain_adaptation(img, target_image, beta)\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n        target_img = self.read_fn(self.py_random.choice(self.reference_images))\n        target_img = cv2.resize(target_img, dsize=(width, height))\n\n        return {\"target_image\": target_img, \"beta\": self.py_random.uniform(*self.beta_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"reference_images\", \"beta_limit\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"FDA can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.HistogramMatching","title":"class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for:

  • Normalizing images from different sources or captured under varying conditions.
  • Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important.
  • Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where:

  • 0 means no blending (original image is returned)
  • 1 means full histogram matching

A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.HistogramMatching(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     read_fn=lambda x: x,\n...     p=1\n... )\n>>> result = transform(image=image)\n>>> matched_image = result[\"image\"]\n

References

  • Histogram Matching in scikit-image: https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class HistogramMatching(ImageOnlyTransform):\n    \"\"\"Adjust the pixel values of an input image to match the histogram of a reference image.\n\n    This transform applies histogram matching, a technique that modifies the distribution of pixel\n    intensities in the input image to closely resemble that of a reference image. This process is\n    performed independently for each channel in multi-channel images, provided both the input and\n    reference images have the same number of channels.\n\n    Histogram matching is particularly useful for:\n    - Normalizing images from different sources or captured under varying conditions.\n    - Preparing images for feature matching or other computer vision tasks where consistent\n      tone and contrast are important.\n    - Simulating different lighting or camera conditions in a controlled manner.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of reference image sources. These can be\n            file paths, URLs, or any objects that can be converted to images by the `read_fn`.\n        blend_ratio (tuple[float, float]): Range for the blending factor between the original\n            and the matched image. Must be two floats between 0 and 1, where:\n            - 0 means no blending (original image is returned)\n            - 1 means full histogram matching\n            A random value within this range is chosen for each application.\n            Default: (0.5, 1.0)\n        read_fn (Callable[[Any], np.ndarray]): A function that takes an element from\n            `reference_images` and returns a numpy array representing the image.\n            Default: read_rgb_image (reads image file from disk)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform cannot be directly serialized due to its dependency on external image data.\n        - The effectiveness of the matching depends on the similarity between the input and reference images.\n        - For best results, choose reference images that represent the desired tone and contrast.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.HistogramMatching(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     read_fn=lambda x: x,\n        ...     p=1\n        ... 
)\n        >>> result = transform(image=image)\n        >>> matched_image = result[\"image\"]\n\n    References:\n        - Histogram Matching in scikit-image:\n          https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.5, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n\n    def apply(\n        self: np.ndarray,\n        img: np.ndarray,\n        reference_image: np.ndarray,\n        blend_ratio: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return apply_histogram(img, reference_image, blend_ratio)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"HistogramMatching can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.PixelDistributionAdaptation","title":"class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub]","text":"

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps:

  1. Adjusting the original image to a standard distribution space using a selected transform.
  2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.

Parameters:

Name Type Description reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal[\"pca\", \"standard\", \"minmax\"]

Specifies the type of statistical transformation to apply:

  • "pca": Principal Component Analysis
  • "standard": StandardScaler (zero mean and unit variance)
  • "minmax": MinMaxScaler (scales to a fixed range, usually [0, 1])

Default: "pca"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.PixelDistributionAdaptation(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     transform_type=\"standard\",\n...     read_fn=lambda x: x,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> adapted_image = result[\"image\"]\n

References

  • https://github.com/arsenyinfo/qudida
  • https://arxiv.org/abs/1911.11483

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class PixelDistributionAdaptation(ImageOnlyTransform):\n    \"\"\"Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image\n    with that of a reference image. This process involves fitting a simple statistical transformation\n    (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images,\n    transforming the original image with the transformation trained on it, and then applying the inverse\n    transformation using the transform fitted on the reference image. The result is an adapted image\n    that retains the original content while mimicking the pixel value distribution of the reference domain.\n\n    The process can be visualized as two main steps:\n    1. Adjusting the original image to a standard distribution space using a selected transform.\n    2. Moving the adjusted image into the distribution space of the reference image by applying the inverse\n       of the transform fitted on the reference image.\n\n    This technique is especially useful in scenarios where images from different domains (e.g., synthetic\n    vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in\n    image processing tasks.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of objects (typically image paths) that will be\n            converted into images by `read_fn`. These images serve as references for the domain adaptation.\n        blend_ratio (tuple[float, float]): Specifies the minimum and maximum blend ratio for mixing\n            the adapted image with the original. This enhances the diversity of the output images.\n            Values should be in the range [0, 1]. Default: (0.25, 1.0)\n        read_fn (Callable): A user-defined function for reading and converting the objects in\n            `reference_images` into numpy arrays. By default, it assumes these objects are image paths.\n        transform_type (Literal[\"pca\", \"standard\", \"minmax\"]): Specifies the type of statistical\n            transformation to apply.\n            - \"pca\": Principal Component Analysis\n            - \"standard\": StandardScaler (zero mean and unit variance)\n            - \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])\n            Default: \"pca\"\n        p (float): The probability of applying the transform to any given image. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effectiveness of the adaptation depends on the similarity between the input and reference domains.\n        - PCA transformation may alter color relationships more significantly than other methods.\n        - StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.\n        - The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.\n        - This transform cannot be directly serialized due to its dependency on external image data.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PixelDistributionAdaptation(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     
transform_type=\"standard\",\n        ...     read_fn=lambda x: x,\n        ...     p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> adapted_image = result[\"image\"]\n\n    References:\n        - https://github.com/arsenyinfo/qudida\n        - https://arxiv.org/abs/1911.11483\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.25, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"] = \"pca\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n        self.transform_type = transform_type\n\n    def apply(self, img: np.ndarray, reference_image: np.ndarray, blend_ratio: float, **params: Any) -> np.ndarray:\n        return adapt_pixel_distribution(\n            img,\n            ref=reference_image,\n            weight=blend_ratio,\n            transform_type=self.transform_type,\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\", \"transform_type\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"PixelDistributionAdaptation can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms.TemplateTransform","title":"class TemplateTransform (templates, img_weight=(0.5, 0.5), template_weight=None, template_transform=None, name=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply blending of input image with specified templates.

This transform overlays one or more template images onto the input image using alpha blending. It allows for creating complex composite images or simulating various visual effects.

Parameters:

Name Type Description templates numpy array | list[np.ndarray]

Images to use as templates for the transform. If a single numpy array is provided, it will be used as the only template. If a list of numpy arrays is provided, one will be randomly chosen for each application.

img_weight tuple[float, float] | float

Weight of the original image in the blend. If a single float, that value will always be used. If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application. To use a fixed weight, use (weight, weight). Default: (0.5, 0.5).

template_transform A.Compose | None

A composition of Albumentations transforms to apply to the template before blending. This should be an instance of A.Compose containing one or more Albumentations transforms. Default: None.

name str | None

Name of the transform instance. Used for serialization purposes. Default: None.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The template(s) must have the same number of channels as the input image or be single-channel.
  • If a single-channel template is used with a multi-channel image, the template will be replicated across all channels.
  • The template(s) will be resized to match the input image size if they differ.
  • To make this transform serializable, provide a name when initializing it.

Mathematical Formulation: Given:

  • I: Input image
  • T: Template image
  • w_i: Weight of input image (sampled from img_weight)

The blended image B is computed as:

B = w_i * I + (1 - w_i) * T

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-a-single-template","title":"Apply template transform with a single template","text":"Python
>>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-multiple-templates-and-custom-weights","title":"Apply template transform with multiple templates and custom weights","text":"Python
>>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n>>> transform = A.TemplateTransform(\n...     templates=templates,\n...     img_weight=(0.3, 0.7),\n...     name=\"multi_template_transform\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-additional-transforms-on-the-template","title":"Apply template transform with additional transforms on the template","text":"Python
>>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n>>> transform = A.TemplateTransform(\n...     templates=template,\n...     img_weight=0.6,\n...     template_transform=template_transform,\n...     name=\"transformed_template\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Image blending: https://en.wikipedia.org/wiki/Image_blending

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class TemplateTransform(ImageOnlyTransform):\n    \"\"\"Apply blending of input image with specified templates.\n\n    This transform overlays one or more template images onto the input image using alpha blending.\n    It allows for creating complex composite images or simulating various visual effects.\n\n    Args:\n        templates (numpy array | list[np.ndarray]): Images to use as templates for the transform.\n            If a single numpy array is provided, it will be used as the only template.\n            If a list of numpy arrays is provided, one will be randomly chosen for each application.\n\n        img_weight (tuple[float, float]  | float): Weight of the original image in the blend.\n            If a single float, that value will always be used.\n            If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application.\n            To use a fixed weight, use (weight, weight).\n            Default: (0.5, 0.5).\n\n        template_transform (A.Compose | None): A composition of Albumentations transforms to apply to the template\n            before blending.\n            This should be an instance of A.Compose containing one or more Albumentations transforms.\n            Default: None.\n\n        name (str | None): Name of the transform instance. Used for serialization purposes.\n            Default: None.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The template(s) must have the same number of channels as the input image or be single-channel.\n        - If a single-channel template is used with a multi-channel image, the template will be replicated across\n          all channels.\n        - The template(s) will be resized to match the input image size if they differ.\n        - To make this transform serializable, provide a name when initializing it.\n\n    Mathematical Formulation:\n        Given:\n        - I: Input image\n        - T: Template image\n        - w_i: Weight of input image (sampled from img_weight)\n\n        The blended image B is computed as:\n\n        B = w_i * I + (1 - w_i) * T\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply template transform with a single template\n        >>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with multiple templates and custom weights\n        >>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n        >>> transform = A.TemplateTransform(\n        ...     templates=templates,\n        ...     img_weight=(0.3, 0.7),\n        ...     name=\"multi_template_transform\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with additional transforms on the template\n        >>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n        >>> transform = A.TemplateTransform(\n        ...     templates=template,\n        ...     img_weight=0.6,\n        ...     template_transform=template_transform,\n        ...     
name=\"transformed_template\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Image blending: https://en.wikipedia.org/wiki/Image_blending\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        templates: np.ndarray | Sequence[np.ndarray]\n        img_weight: ZeroOneRangeType\n        template_weight: ZeroOneRangeType | None = Field(\n            deprecated=\"Template_weight is deprecated. Computed automatically as (1 - img_weight)\",\n        )\n        template_transform: Compose | BasicTransform | None = None\n        name: str | None\n\n        @field_validator(\"templates\")\n        @classmethod\n        def validate_templates(cls, v: np.ndarray | list[np.ndarray]) -> list[np.ndarray]:\n            if isinstance(v, np.ndarray):\n                return [v]\n            if isinstance(v, list):\n                if not all(isinstance(item, np.ndarray) for item in v):\n                    msg = \"All templates must be numpy arrays.\"\n                    raise ValueError(msg)\n                return v\n            msg = \"Templates must be a numpy array or a list of numpy arrays.\"\n            raise TypeError(msg)\n\n    def __init__(\n        self,\n        templates: np.ndarray | list[np.ndarray],\n        img_weight: ScaleFloatType = (0.5, 0.5),\n        template_weight: None = None,\n        template_transform: Compose | BasicTransform | None = None,\n        name: str | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.templates = templates\n        self.img_weight = cast(tuple[float, float], img_weight)\n        self.template_transform = template_transform\n        self.name = name\n\n    def apply(\n        self,\n        img: np.ndarray,\n        template: np.ndarray,\n        img_weight: float,\n        **params: Any,\n    ) -> np.ndarray:\n        if img_weight == 0:\n            return template\n        if img_weight == 1:\n            return img\n\n        return add_weighted(img, img_weight, template, 1 - img_weight)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"img_weight\": self.py_random.uniform(*self.img_weight),\n        }\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        template = self.py_random.choice(self.templates)\n\n        if self.template_transform is not None:\n            template = self.template_transform(image=template)[\"image\"]\n\n        if get_num_channels(template) not in [1, get_num_channels(image)]:\n            msg = (\n                \"Template must be a single channel or \"\n                \"has the same number of channels as input \"\n                f\"image ({get_num_channels(image)}), got {get_num_channels(template)}\"\n            )\n            raise ValueError(msg)\n\n        if template.dtype != image.dtype:\n            msg = \"Image and template must be the same image type\"\n            raise ValueError(msg)\n\n        if image.shape[:2] != template.shape[:2]:\n            template = fgeometric.resize(template, image.shape[:2], interpolation=cv2.INTER_AREA)\n\n        if get_num_channels(template) == 1 and get_num_channels(image) > 1:\n            # Replicate single 
channel template across all channels to match input image\n            template = cv2.merge([template] * get_num_channels(image))\n        # in order to support grayscale image with dummy dim\n        template = template.reshape(image.shape)\n\n        return {\"template\": template}\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a TemplateTransform serializable you should provide the `name` argument, \"\n                \"e.g. `TemplateTransform(name='my_transform', ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n
"},{"location":"api_reference/augmentations/functional/","title":"Functional transforms (augmentations.functional)","text":""},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_fog","title":"def add_fog (img, fog_intensity, alpha_coef, fog_particle_positions, fog_particle_radiuses) [view source on GitHub]","text":"

Add fog to the input image.

Parameters:

Name Type Description img np.ndarray

Input image.

fog_intensity float

Intensity of the fog effect, between 0 and 1.

alpha_coef float

Base alpha (transparency) value for fog particles.

fog_particle_positions list[tuple[int, int]]

List of (x, y) coordinates for fog particles.

fog_particle_radiuses list[int]

List of radiuses for each fog particle.

Returns:

Type Description np.ndarray

Image with added fog effect.
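
A minimal usage sketch, assuming the helper is imported from albumentations.augmentations.functional (the module shown under "Source code" below); the particle layout, which a higher-level transform such as RandomFog would normally sample, is generated by hand here for illustration:

Python
import random
import numpy as np
from albumentations.augmentations.functional import add_fog

img = np.random.randint(0, 256, (200, 200, 3), dtype=np.uint8)
height, width = img.shape[:2]

# Hand-rolled fog particle layout for illustration only.
positions = [(random.randint(0, width - 1), random.randint(0, height - 1)) for _ in range(50)]
radiuses = [random.randint(5, 20) for _ in range(50)]

foggy = add_fog(
    img,
    fog_intensity=0.5,
    alpha_coef=0.08,
    fog_particle_positions=positions,
    fog_particle_radiuses=radiuses,
)
assert foggy.shape == img.shape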

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\n@preserve_channel_dim\ndef add_fog(\n    img: np.ndarray,\n    fog_intensity: float,\n    alpha_coef: float,\n    fog_particle_positions: list[tuple[int, int]],\n    fog_particle_radiuses: list[int],\n) -> np.ndarray:\n    \"\"\"Add fog to the input image.\n\n    Args:\n        img (np.ndarray): Input image.\n        fog_intensity (float): Intensity of the fog effect, between 0 and 1.\n        alpha_coef (float): Base alpha (transparency) value for fog particles.\n        fog_particle_positions (list[tuple[int, int]]): List of (x, y) coordinates for fog particles.\n        fog_particle_radiuses (list[int]): List of radiuses for each fog particle.\n\n    Returns:\n        np.ndarray: Image with added fog effect.\n    \"\"\"\n    height, width = img.shape[:2]\n    num_channels = get_num_channels(img)\n\n    fog_layer = np.zeros((height, width, num_channels), dtype=np.uint8)\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    for (x, y), radius in zip(fog_particle_positions, fog_particle_radiuses):\n        color = max_value if num_channels == 1 else (max_value,) * num_channels\n        cv2.circle(\n            fog_layer,\n            center=(x, y),\n            radius=radius,\n            color=color,\n            thickness=-1,\n        )\n\n    # Apply gaussian blur to the fog layer\n    fog_layer = cv2.GaussianBlur(fog_layer, (25, 25), 0)\n\n    # Blend the fog layer with the original image\n    alpha = np.mean(fog_layer, axis=2, keepdims=True) / max_value * alpha_coef * fog_intensity\n\n    result = img * (1 - alpha) + fog_layer * alpha\n\n    return clip(result, np.uint8, inplace=True)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_rain","title":"def add_rain (img, slant, drop_length, drop_width, drop_color, blur_value, brightness_coefficient, rain_drops) [view source on GitHub]","text":"

Adds rain drops to the image.

Parameters:

Name Type Description img np.ndarray

Input image.

slant int

The angle of the rain drops.

drop_length int

The length of each rain drop.

drop_width int

The width of each rain drop.

drop_color tuple[int, int, int]

The color of the rain drops in RGB format.

blur_value int

The size of the kernel used to blur the image. Rainy views are blurry.

brightness_coefficient float

Coefficient to adjust the brightness of the image. Rainy days are usually shady.

rain_drops list[tuple[int, int]]

A list of tuples where each tuple represents the (x, y) coordinates of the starting point of a rain drop.

Returns:

Type Description np.ndarray

Image with rain effect added.

Reference

https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
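
A minimal usage sketch, assuming the helper is imported from albumentations.augmentations.functional as shown under "Source code" below; the drop start points, which a higher-level transform such as RandomRain would normally sample, are generated by hand here:

Python
import random
import numpy as np
from albumentations.augmentations.functional import add_rain

img = np.random.randint(0, 256, (200, 200, 3), dtype=np.uint8)
height, width = img.shape[:2]

# Starting point (x, y) of each rain streak, sampled by hand for illustration.
rain_drops = [(random.randint(0, width - 1), random.randint(0, height - 1)) for _ in range(150)]

rainy = add_rain(
    img,
    slant=5,
    drop_length=20,
    drop_width=1,
    drop_color=(200, 200, 200),
    blur_value=3,
    brightness_coefficient=0.7,
    rain_drops=rain_drops,
)
assert rainy.shape == img.shape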

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef add_rain(\n    img: np.ndarray,\n    slant: int,\n    drop_length: int,\n    drop_width: int,\n    drop_color: tuple[int, int, int],\n    blur_value: int,\n    brightness_coefficient: float,\n    rain_drops: list[tuple[int, int]],\n) -> np.ndarray:\n    \"\"\"Adds rain drops to the image.\n\n    Args:\n        img (np.ndarray): Input image.\n        slant (int): The angle of the rain drops.\n        drop_length (int): The length of each rain drop.\n        drop_width (int): The width of each rain drop.\n        drop_color (tuple[int, int, int]): The color of the rain drops in RGB format.\n        blur_value (int): The size of the kernel used to blur the image. Rainy views are blurry.\n        brightness_coefficient (float): Coefficient to adjust the brightness of the image. Rainy days are usually shady.\n        rain_drops (list[tuple[int, int]]): A list of tuples where each tuple represents the (x, y)\n            coordinates of the starting point of a rain drop.\n\n    Returns:\n        np.ndarray: Image with rain effect added.\n\n    Reference:\n        https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    img = img.copy()\n    for rain_drop_x0, rain_drop_y0 in rain_drops:\n        rain_drop_x1 = rain_drop_x0 + slant\n        rain_drop_y1 = rain_drop_y0 + drop_length\n\n        cv2.line(\n            img,\n            (rain_drop_x0, rain_drop_y0),\n            (rain_drop_x1, rain_drop_y1),\n            drop_color,\n            drop_width,\n        )\n\n    img = cv2.blur(img, (blur_value, blur_value))  # rainy view are blurry\n    image_hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)\n    image_hsv[:, :, 2] *= brightness_coefficient\n\n    return cv2.cvtColor(image_hsv.astype(np.uint8), cv2.COLOR_HSV2RGB)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_shadow","title":"def add_shadow (img, vertices_list, intensities) [view source on GitHub]","text":"

Add shadows to the image by reducing the intensity of the pixel values in specified regions.

Parameters:

Name Type Description img np.ndarray

Input image. Multichannel images are supported.

vertices_list list[np.ndarray]

List of vertices for shadow polygons.

intensities np.ndarray

Array of shadow intensities. Range is [0, 1].

Returns:

Type Description np.ndarray

Image with shadows added.

Reference

https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef add_shadow(\n    img: np.ndarray,\n    vertices_list: list[np.ndarray],\n    intensities: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Add shadows to the image by reducing the intensity of the pixel values in specified regions.\n\n    Args:\n        img (np.ndarray): Input image. Multichannel images are supported.\n        vertices_list (list[np.ndarray]): List of vertices for shadow polygons.\n        intensities (np.ndarray): Array of shadow intensities. Range is [0, 1].\n\n    Returns:\n        np.ndarray: Image with shadows added.\n\n    Reference:\n        https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    num_channels = get_num_channels(img)\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    img_shadowed = img.copy()\n\n    # Iterate over the vertices and intensity list\n    for vertices, shadow_intensity in zip(vertices_list, intensities):\n        # Create mask for the current shadow polygon\n        mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)\n        cv2.fillPoly(mask, [vertices], (max_value,))\n\n        # Duplicate the mask to have the same number of channels as the image\n        mask = np.repeat(mask, num_channels, axis=2)\n\n        # Apply shadow to the channels directly\n        # It could be tempting to convert to HLS and apply the shadow to the L channel, but it creates artifacts\n        shadowed_indices = mask[:, :, 0] == max_value\n        darkness = 1 - shadow_intensity\n        img_shadowed[shadowed_indices] = clip(\n            img_shadowed[shadowed_indices] * darkness,\n            np.uint8,\n            inplace=True,\n        )\n\n    return img_shadowed\n
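
Example: a minimal usage sketch (the single square polygon and the 0.5 intensity are illustrative; vertices are given as integer pixel coordinates, as expected by cv2.fillPoly in the source above).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> vertices_list = [np.array([[10, 10], [60, 10], [60, 60], [10, 60]], dtype=np.int32)]  # one square shadow polygon\n>>> intensities = np.array([0.5])  # one intensity per polygon, in [0, 1]\n>>> shadowed_image = A.functional.add_shadow(image, vertices_list=vertices_list, intensities=intensities)\n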
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_snow_bleach","title":"def add_snow_bleach (img, snow_point, brightness_coeff) [view source on GitHub]","text":"

Adds a simple snow effect to the image by bleaching out pixels.

This function simulates a basic snow effect by increasing the brightness of pixels whose lightness is below a threshold derived from snow_point. It operates in the HLS color space and modifies the lightness channel.

Parameters:

Name Type Description img np.ndarray

Input image. Can be either RGB uint8 or float32.

snow_point float

A float in the range [0, 1], scaled and adjusted internally to set the lightness threshold for pixel modification. Higher values raise the threshold, so more pixels are brightened and the snow effect is stronger.

brightness_coeff float

Coefficient applied to increase the brightness of pixels below the snow_point threshold. Larger values lead to more pronounced snow effects. Should be greater than 1.0 for a visible effect.

Returns:

Type Description np.ndarray

Image with simulated snow effect. The output has the same dtype as the input.

Note

  • This function converts the image to the HLS color space to modify the lightness channel.
  • The snow effect is created by selectively increasing the brightness of pixels.
  • This method tends to create a 'bleached' look, which may not be as realistic as more advanced snow simulation techniques.
  • The function automatically handles both uint8 and float32 input images.

The snow effect is created through the following steps:

  1. Convert the image from RGB to HLS color space.
  2. Adjust the snow_point threshold.
  3. Increase the lightness of pixels below the threshold.
  4. Convert the image back to RGB.

Mathematical Formulation: Let L be the lightness channel in HLS space. For each pixel (i, j): If L[i, j] < snow_point: L[i, j] = L[i, j] * brightness_coeff

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> snowy_image = A.functional.add_snow_bleach(image, snow_point=0.5, brightness_coeff=1.5)\n

References

  • HLS Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV
  • Original implementation: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef add_snow_bleach(\n    img: np.ndarray,\n    snow_point: float,\n    brightness_coeff: float,\n) -> np.ndarray:\n    \"\"\"Adds a simple snow effect to the image by bleaching out pixels.\n\n    This function simulates a basic snow effect by increasing the brightness of pixels\n    that are above a certain threshold (snow_point). It operates in the HLS color space\n    to modify the lightness channel.\n\n    Args:\n        img (np.ndarray): Input image. Can be either RGB uint8 or float32.\n        snow_point (float): A float in the range [0, 1], scaled and adjusted to determine\n            the threshold for pixel modification. Higher values result in less snow effect.\n        brightness_coeff (float): Coefficient applied to increase the brightness of pixels\n            below the snow_point threshold. Larger values lead to more pronounced snow effects.\n            Should be greater than 1.0 for a visible effect.\n\n    Returns:\n        np.ndarray: Image with simulated snow effect. The output has the same dtype as the input.\n\n    Note:\n        - This function converts the image to the HLS color space to modify the lightness channel.\n        - The snow effect is created by selectively increasing the brightness of pixels.\n        - This method tends to create a 'bleached' look, which may not be as realistic as more\n          advanced snow simulation techniques.\n        - The function automatically handles both uint8 and float32 input images.\n\n    The snow effect is created through the following steps:\n    1. Convert the image from RGB to HLS color space.\n    2. Adjust the snow_point threshold.\n    3. Increase the lightness of pixels below the threshold.\n    4. Convert the image back to RGB.\n\n    Mathematical Formulation:\n        Let L be the lightness channel in HLS space.\n        For each pixel (i, j):\n        If L[i, j] < snow_point:\n            L[i, j] = L[i, j] * brightness_coeff\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> snowy_image = A.functional.add_snow_v1(image, snow_point=0.5, brightness_coeff=1.5)\n\n    References:\n        - HLS Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV\n        - Original implementation: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    snow_point *= max_value / 2\n    snow_point += max_value / 3\n\n    image_hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)\n    image_hls = np.array(image_hls, dtype=np.float32)\n\n    image_hls[:, :, 1][image_hls[:, :, 1] < snow_point] *= brightness_coeff\n\n    image_hls[:, :, 1] = clip(image_hls[:, :, 1], np.uint8, inplace=True)\n\n    image_hls = np.array(image_hls, dtype=np.uint8)\n\n    return cv2.cvtColor(image_hls, cv2.COLOR_HLS2RGB)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_snow_texture","title":"def add_snow_texture (img, snow_point, brightness_coeff, snow_texture, sparkle_mask) [view source on GitHub]","text":"

Add a realistic snow effect to the input image.

This function simulates snowfall by applying multiple visual effects to the image, including brightness adjustment, snow texture overlay, depth simulation, and color tinting. The result is a more natural-looking snow effect compared to simple pixel bleaching methods.

Parameters:

Name Type Description img np.ndarray

Input image in RGB format.

snow_point float

Coefficient that controls the amount and intensity of snow. Should be in the range [0, 1], where 0 means no snow and 1 means maximum snow effect.

brightness_coeff float

Coefficient for brightness adjustment to simulate the reflective nature of snow. Should be in the range [0, 1], where higher values result in a brighter image.

snow_texture np.ndarray

Snow texture.

sparkle_mask np.ndarray

Sparkle mask.

Returns:

Type Description np.ndarray

Image with added snow effect. The output has the same dtype as the input.

Note

  • The function first converts the image to HSV color space for better control over brightness and color adjustments.
  • A snow texture is generated using Gaussian noise and then filtered for a more natural appearance.
  • A depth effect is simulated, with more snow at the top of the image and less at the bottom.
  • A slight blue tint is added to simulate the cool color of snow.
  • Random sparkle effects are added to simulate light reflecting off snow crystals.

The snow effect is created through the following steps:

  1. Brightness adjustment in HSV space
  2. Generation of a snow texture using Gaussian noise
  3. Application of a depth effect to the snow texture
  4. Blending of the snow texture with the original image
  5. Addition of a cool blue tint
  6. Addition of sparkle effects

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> snow_texture, sparkle_mask = A.functional.generate_snow_textures(image.shape[:2], np.random.default_rng(42))\n>>> snowy_image = A.functional.add_snow_texture(image, snow_point=0.5, brightness_coeff=0.2, snow_texture=snow_texture, sparkle_mask=sparkle_mask)\n

Note

This function works with both uint8 and float32 image types, automatically handling the conversion between them.

References

  • Perlin Noise: https://en.wikipedia.org/wiki/Perlin_noise
  • HSV Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV
Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef add_snow_texture(\n    img: np.ndarray,\n    snow_point: float,\n    brightness_coeff: float,\n    snow_texture: np.ndarray,\n    sparkle_mask: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Add a realistic snow effect to the input image.\n\n    This function simulates snowfall by applying multiple visual effects to the image,\n    including brightness adjustment, snow texture overlay, depth simulation, and color tinting.\n    The result is a more natural-looking snow effect compared to simple pixel bleaching methods.\n\n    Args:\n        img (np.ndarray): Input image in RGB format.\n        snow_point (float): Coefficient that controls the amount and intensity of snow.\n            Should be in the range [0, 1], where 0 means no snow and 1 means maximum snow effect.\n        brightness_coeff (float): Coefficient for brightness adjustment to simulate the\n            reflective nature of snow. Should be in the range [0, 1], where higher values\n            result in a brighter image.\n        snow_texture (np.ndarray): Snow texture.\n        sparkle_mask (np.ndarray): Sparkle mask.\n\n    Returns:\n        np.ndarray: Image with added snow effect. The output has the same dtype as the input.\n\n    Note:\n        - The function first converts the image to HSV color space for better control over\n          brightness and color adjustments.\n        - A snow texture is generated using Gaussian noise and then filtered for a more\n          natural appearance.\n        - A depth effect is simulated, with more snow at the top of the image and less at the bottom.\n        - A slight blue tint is added to simulate the cool color of snow.\n        - Random sparkle effects are added to simulate light reflecting off snow crystals.\n\n    The snow effect is created through the following steps:\n    1. Brightness adjustment in HSV space\n    2. Generation of a snow texture using Gaussian noise\n    3. Application of a depth effect to the snow texture\n    4. Blending of the snow texture with the original image\n    5. Addition of a cool blue tint\n    6. 
Addition of sparkle effects\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> snowy_image = A.functional.add_snow_v2(image, snow_coeff=0.5, brightness_coeff=0.2)\n\n    Note:\n        This function works with both uint8 and float32 image types, automatically\n        handling the conversion between them.\n\n    References:\n        - Perlin Noise: https://en.wikipedia.org/wiki/Perlin_noise\n        - HSV Color Space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[np.uint8]\n\n    # Convert to HSV for better color control\n    img_hsv = cv2.cvtColor(img, cv2.COLOR_RGB2HSV).astype(np.float32)\n\n    # Increase brightness\n    img_hsv[:, :, 2] = np.clip(\n        img_hsv[:, :, 2] * (1 + brightness_coeff * snow_point),\n        0,\n        max_value,\n    )\n\n    # Generate snow texture\n    snow_texture = cv2.GaussianBlur(snow_texture, (0, 0), sigmaX=1, sigmaY=1)\n\n    # Create depth effect for snow simulation\n    # More snow accumulates at the top of the image, gradually decreasing towards the bottom\n    # This simulates natural snow distribution on surfaces\n    # The effect is achieved using a linear gradient from 1 (full snow) to 0.2 (less snow)\n    rows = img.shape[0]\n    depth_effect = np.linspace(1, 0.2, rows)[:, np.newaxis]\n    snow_texture *= depth_effect\n\n    # Apply snow texture\n    snow_layer = (np.dstack([snow_texture] * 3) * max_value * snow_point).astype(\n        np.float32,\n    )\n\n    # Blend snow with original image\n    img_with_snow = cv2.add(img_hsv, snow_layer)\n\n    # Add a slight blue tint to simulate cool snow color\n    blue_tint = np.full_like(img_with_snow, (0.6, 0.75, 1))  # Slight blue in HSV\n\n    img_with_snow = cv2.addWeighted(\n        img_with_snow,\n        0.85,\n        blue_tint,\n        0.15 * snow_point,\n        0,\n    )\n\n    # Convert back to RGB\n    img_with_snow = cv2.cvtColor(img_with_snow.astype(np.uint8), cv2.COLOR_HSV2RGB)\n\n    # Add some sparkle effects for snow glitter\n    img_with_snow[sparkle_mask] = [max_value, max_value, max_value]\n\n    return img_with_snow\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_sun_flare_overlay","title":"def add_sun_flare_overlay (img, flare_center, src_radius, src_color, circles) [view source on GitHub]","text":"

Add a sun flare effect to an image using a simple overlay technique.

This function creates a basic sun flare effect by overlaying multiple semi-transparent circles of varying sizes and intensities on the input image. The effect simulates a simple lens flare caused by bright light sources.

Parameters:

Name Type Description img np.ndarray

The input image.

flare_center tuple[float, float]

(x, y) coordinates of the flare center in pixel coordinates.

src_radius int

The radius of the main sun circle in pixels.

src_color tuple[int, ...]

The color of the sun, represented as a tuple of RGB values.

circles list[Any]

A list of tuples, each representing a circle that contributes to the flare effect. Each tuple contains:

  • alpha (float): The transparency of the circle (0.0 to 1.0).
  • center (tuple[int, int]): (x, y) coordinates of the circle center.
  • radius (int): The radius of the circle.
  • color (tuple[int, int, int]): RGB color of the circle.

Returns:

Type Description np.ndarray

The output image with the sun flare effect added.

Note

  • This function uses a simple alpha blending technique to overlay flare elements.
  • The main sun is created as a gradient circle, fading from the center outwards.
  • Additional flare circles are added along an imaginary line from the sun's position.
  • This method is computationally efficient but may produce less realistic results compared to more advanced techniques.

The flare effect is created through the following steps:

  1. Create an overlay image and output image as copies of the input.
  2. Add smaller flare circles to the overlay.
  3. Blend the overlay with the output image using alpha compositing.
  4. Add the main sun circle with a radial gradient.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> flare_center = (50, 50)\n>>> src_radius = 20\n>>> src_color = (255, 255, 200)\n>>> circles = [\n...     (0.1, (60, 60), 5, (255, 200, 200)),\n...     (0.2, (70, 70), 3, (200, 255, 200))\n... ]\n>>> flared_image = A.functional.add_sun_flare_overlay(\n...     image, flare_center, src_radius, src_color, circles\n... )\n

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\n@maybe_process_in_chunks\ndef add_sun_flare_overlay(\n    img: np.ndarray,\n    flare_center: tuple[float, float],\n    src_radius: int,\n    src_color: tuple[int, ...],\n    circles: list[Any],\n) -> np.ndarray:\n    \"\"\"Add a sun flare effect to an image using a simple overlay technique.\n\n    This function creates a basic sun flare effect by overlaying multiple semi-transparent\n    circles of varying sizes and intensities on the input image. The effect simulates\n    a simple lens flare caused by bright light sources.\n\n    Args:\n        img (np.ndarray): The input image.\n        flare_center (tuple[float, float]): (x, y) coordinates of the flare center\n            in pixel coordinates.\n        src_radius (int): The radius of the main sun circle in pixels.\n        src_color (tuple[int, ...]): The color of the sun, represented as a tuple of RGB values.\n        circles (list[Any]): A list of tuples, each representing a circle that contributes\n            to the flare effect. Each tuple contains:\n            - alpha (float): The transparency of the circle (0.0 to 1.0).\n            - center (tuple[int, int]): (x, y) coordinates of the circle center.\n            - radius (int): The radius of the circle.\n            - color (tuple[int, int, int]): RGB color of the circle.\n\n    Returns:\n        np.ndarray: The output image with the sun flare effect added.\n\n    Note:\n        - This function uses a simple alpha blending technique to overlay flare elements.\n        - The main sun is created as a gradient circle, fading from the center outwards.\n        - Additional flare circles are added along an imaginary line from the sun's position.\n        - This method is computationally efficient but may produce less realistic results\n          compared to more advanced techniques.\n\n    The flare effect is created through the following steps:\n    1. Create an overlay image and output image as copies of the input.\n    2. Add smaller flare circles to the overlay.\n    3. Blend the overlay with the output image using alpha compositing.\n    4. Add the main sun circle with a radial gradient.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> flare_center = (50, 50)\n        >>> src_radius = 20\n        >>> src_color = (255, 255, 200)\n        >>> circles = [\n        ...     (0.1, (60, 60), 5, (255, 200, 200)),\n        ...     (0.2, (70, 70), 3, (200, 255, 200))\n        ... ]\n        >>> flared_image = A.functional.add_sun_flare_overlay(\n        ...     image, flare_center, src_radius, src_color, circles\n        ... 
)\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n    \"\"\"\n    overlay = img.copy()\n    output = img.copy()\n\n    weighted_brightness = 0.0\n    total_radius_length = 0.0\n\n    for alpha, (x, y), rad3, circle_color in circles:\n        weighted_brightness += alpha * rad3\n        total_radius_length += rad3\n        cv2.circle(overlay, (x, y), rad3, circle_color, -1)\n        output = add_weighted(overlay, alpha, output, 1 - alpha)\n\n    point = [int(x) for x in flare_center]\n\n    overlay = output.copy()\n    num_times = src_radius // 10\n\n    # max_alpha is calculated using weighted_brightness and total_radii_length times 5\n    # meaning the higher the alpha with larger area, the brighter the bright spot will be\n    # for list of alphas in range [0.05, 0.2], the max_alpha should below 1\n    max_alpha = weighted_brightness / total_radius_length * 5\n    alpha = np.linspace(0.0, min(max_alpha, 1.0), num=num_times)\n\n    rad = np.linspace(1, src_radius, num=num_times)\n\n    for i in range(num_times):\n        cv2.circle(overlay, point, int(rad[i]), src_color, -1)\n        alp = alpha[num_times - i - 1] * alpha[num_times - i - 1] * alpha[num_times - i - 1]\n        output = add_weighted(overlay, alp, output, 1 - alp)\n\n    return output\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.add_sun_flare_physics_based","title":"def add_sun_flare_physics_based (img, flare_center, src_radius, src_color, circles) [view source on GitHub]","text":"

Add a more realistic sun flare effect to the image.

This function creates a complex sun flare effect by simulating various optical phenomena that occur in real camera lenses when capturing bright light sources. The result is a more realistic and physically plausible lens flare effect.

Parameters:

Name Type Description img np.ndarray

Input image.

flare_center tuple[int, int]

(x, y) coordinates of the sun's center in pixels.

src_radius int

Radius of the main sun circle in pixels.

src_color tuple[int, int, int]

Color of the sun in RGB format.

circles list[Any]

List of tuples, each representing a flare circle with parameters (alpha, center, size, color):

  • alpha (float): Transparency of the circle (0.0 to 1.0).
  • center (tuple[int, int]): (x, y) coordinates of the circle center.
  • size (float): Size factor for the circle radius.
  • color (tuple[int, int, int]): RGB color of the circle.

Returns:

Type Description np.ndarray

Image with added sun flare effect.

Note

This function implements several techniques to create a more realistic flare:

  1. Separate flare layer: Allows for complex manipulations of the flare effect.
  2. Lens diffraction spikes: Simulates light diffraction in the camera aperture.
  3. Radial gradient mask: Creates natural fading of the flare from the center.
  4. Gaussian blur: Softens the flare for a more natural glow effect.
  5. Chromatic aberration: Simulates color fringing often seen in real lens flares.
  6. Screen blending: Provides a more realistic blending of the flare with the image.

The flare effect is created through the following steps:

  1. Create a separate flare layer.
  2. Add the main sun circle and diffraction spikes to the flare layer.
  3. Add additional flare circles based on the input parameters.
  4. Apply Gaussian blur to soften the flare.
  5. Create and apply a radial gradient mask for natural fading.
  6. Simulate chromatic aberration by applying different blurs to color channels.
  7. Blend the flare with the original image using screen blending mode.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n>>> flare_center = (500, 500)\n>>> src_radius = 50\n>>> src_color = (255, 255, 200)\n>>> circles = [\n...     (0.1, (550, 550), 10, (255, 200, 200)),\n...     (0.2, (600, 600), 5, (200, 255, 200))\n... ]\n>>> flared_image = A.functional.add_sun_flare_physics_based(\n...     image, flare_center, src_radius, src_color, circles\n... )\n

References

  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
  • Diffraction: https://en.wikipedia.org/wiki/Diffraction
  • Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
  • Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef add_sun_flare_physics_based(\n    img: np.ndarray,\n    flare_center: tuple[int, int],\n    src_radius: int,\n    src_color: tuple[int, int, int],\n    circles: list[Any],\n) -> np.ndarray:\n    \"\"\"Add a more realistic sun flare effect to the image.\n\n    This function creates a complex sun flare effect by simulating various optical phenomena\n    that occur in real camera lenses when capturing bright light sources. The result is a\n    more realistic and physically plausible lens flare effect.\n\n    Args:\n        img (np.ndarray): Input image.\n        flare_center (tuple[int, int]): (x, y) coordinates of the sun's center in pixels.\n        src_radius (int): Radius of the main sun circle in pixels.\n        src_color (tuple[int, int, int]): Color of the sun in RGB format.\n        circles (list[Any]): List of tuples, each representing a flare circle with parameters:\n            (alpha, center, size, color)\n            - alpha (float): Transparency of the circle (0.0 to 1.0).\n            - center (tuple[int, int]): (x, y) coordinates of the circle center.\n            - size (float): Size factor for the circle radius.\n            - color (tuple[int, int, int]): RGB color of the circle.\n\n    Returns:\n        np.ndarray: Image with added sun flare effect.\n\n    Note:\n        This function implements several techniques to create a more realistic flare:\n        1. Separate flare layer: Allows for complex manipulations of the flare effect.\n        2. Lens diffraction spikes: Simulates light diffraction in camera aperture.\n        3. Radial gradient mask: Creates natural fading of the flare from the center.\n        4. Gaussian blur: Softens the flare for a more natural glow effect.\n        5. Chromatic aberration: Simulates color fringing often seen in real lens flares.\n        6. Screen blending: Provides a more realistic blending of the flare with the image.\n\n    The flare effect is created through the following steps:\n    1. Create a separate flare layer.\n    2. Add the main sun circle and diffraction spikes to the flare layer.\n    3. Add additional flare circles based on the input parameters.\n    4. Apply Gaussian blur to soften the flare.\n    5. Create and apply a radial gradient mask for natural fading.\n    6. Simulate chromatic aberration by applying different blurs to color channels.\n    7. Blend the flare with the original image using screen blending mode.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n        >>> flare_center = (500, 500)\n        >>> src_radius = 50\n        >>> src_color = (255, 255, 200)\n        >>> circles = [\n        ...     (0.1, (550, 550), 10, (255, 200, 200)),\n        ...     (0.2, (600, 600), 5, (200, 255, 200))\n        ... ]\n        >>> flared_image = A.functional.add_sun_flare_physics_based(\n        ...     image, flare_center, src_radius, src_color, circles\n        ... 
)\n\n    References:\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n        - Diffraction: https://en.wikipedia.org/wiki/Diffraction\n        - Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration\n        - Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen\n    \"\"\"\n    output = img.copy()\n    height, width = img.shape[:2]\n\n    # Create a separate flare layer\n    flare_layer = np.zeros_like(img, dtype=np.float32)\n\n    # Add the main sun\n    cv2.circle(flare_layer, flare_center, src_radius, src_color, -1)\n\n    # Add lens diffraction spikes\n    for angle in [0, 45, 90, 135]:\n        end_point = (\n            int(flare_center[0] + np.cos(np.radians(angle)) * max(width, height)),\n            int(flare_center[1] + np.sin(np.radians(angle)) * max(width, height)),\n        )\n        cv2.line(flare_layer, flare_center, end_point, src_color, 2)\n\n    # Add flare circles\n    for _, center, size, color in circles:\n        cv2.circle(flare_layer, center, int(size**0.33), color, -1)\n\n    # Apply gaussian blur to soften the flare\n    flare_layer = cv2.GaussianBlur(flare_layer, (0, 0), sigmaX=15, sigmaY=15)\n\n    # Create a radial gradient mask\n    y, x = np.ogrid[:height, :width]\n    mask = np.sqrt((x - flare_center[0]) ** 2 + (y - flare_center[1]) ** 2)\n    mask = 1 - np.clip(mask / (max(width, height) * 0.7), 0, 1)\n    mask = np.dstack([mask] * 3)\n\n    # Apply the mask to the flare layer\n    flare_layer *= mask\n\n    # Add chromatic aberration\n    channels = list(cv2.split(flare_layer))\n    channels[0] = cv2.GaussianBlur(\n        channels[0],\n        (0, 0),\n        sigmaX=3,\n        sigmaY=3,\n    )  # Blue channel\n    channels[2] = cv2.GaussianBlur(\n        channels[2],\n        (0, 0),\n        sigmaX=5,\n        sigmaY=5,\n    )  # Red channel\n    flare_layer = cv2.merge(channels)\n\n    # Blend the flare with the original image using screen blending\n    return 255 - ((255 - output) * (255 - flare_layer) / 255)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_corner_illumination","title":"def apply_corner_illumination (img, intensity, corner) [view source on GitHub]","text":"

Apply corner-based illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_corner_illumination(\n    img: np.ndarray,\n    intensity: float,\n    corner: Literal[0, 1, 2, 3],\n) -> np.ndarray:\n    \"\"\"Apply corner-based illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create distance map coordinates\n    y, x = np.ogrid[:height, :width]\n\n    # Adjust coordinates based on corner\n    if corner == 1:  # top-right\n        x = width - 1 - x\n    elif corner == 2:  # bottom-right\n        x = width - 1 - x\n        y = height - 1 - y\n    elif corner == 3:  # bottom-left\n        y = height - 1 - y\n\n    # Calculate normalized distance\n    distance = np.sqrt(x * x + y * y) / np.sqrt(height * height + width * width)\n    pattern = 1 - distance  # Invert so corner is brightest\n\n    return apply_illumination_pattern(result, pattern, intensity)\n
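
Example: a minimal usage sketch (corner=0 selects the top-left corner, per the source above; the 0.1 intensity is an illustrative value).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> lit_image = A.functional.apply_corner_illumination(image, intensity=0.1, corner=0)  # brighten towards the top-left corner\n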
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_gaussian_illumination","title":"def apply_gaussian_illumination (img, intensity, center, sigma) [view source on GitHub]","text":"

Apply gaussian illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_gaussian_illumination(\n    img: np.ndarray,\n    intensity: float,\n    center: tuple[float, float],\n    sigma: float,\n) -> np.ndarray:\n    \"\"\"Apply gaussian illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create coordinate grid\n    y, x = np.ogrid[:height, :width]\n\n    # Calculate gaussian pattern\n    center_x = width * center[0]\n    center_y = height * center[1]\n    sigma_pixels = max(height, width) * sigma\n    gaussian = np.exp(\n        -((x - center_x) ** 2 + (y - center_y) ** 2) / (2 * sigma_pixels**2),\n    )\n\n    return apply_illumination_pattern(result, gaussian, intensity)\n
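
Example: a minimal usage sketch (center is given as fractions of width and height, and sigma as a fraction of the longer side, following the source above; the values are illustrative).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> lit_image = A.functional.apply_gaussian_illumination(image, intensity=0.15, center=(0.5, 0.5), sigma=0.25)  # bright spot at the image center\n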
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_illumination_pattern","title":"def apply_illumination_pattern (img, pattern, intensity) [view source on GitHub]","text":"

Apply illumination pattern to image.

Parameters:

Name Type Description img np.ndarray

Input image

pattern np.ndarray

Illumination pattern of shape (H, W)

intensity float

Effect strength (-0.2 to 0.2)

Returns:

Type Description np.ndarray

Image with applied illumination

Source code in albumentations/augmentations/functional.py Python
def apply_illumination_pattern(\n    img: np.ndarray,\n    pattern: np.ndarray,\n    intensity: float,\n) -> np.ndarray:\n    \"\"\"Apply illumination pattern to image.\n\n    Args:\n        img: Input image\n        pattern: Illumination pattern of shape (H, W)\n        intensity: Effect strength (-0.2 to 0.2)\n\n    Returns:\n        Image with applied illumination\n    \"\"\"\n    if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n        pattern = pattern[..., np.newaxis]\n    return img * (1 + intensity * pattern)\n
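
Example: a minimal usage sketch (a hand-made horizontal gradient stands in for a real illumination pattern; the function returns an unclipped result, so a float image in [0, 1] keeps the output easy to interpret).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)\n>>> pattern = np.tile(np.linspace(0.0, 1.0, 100, dtype=np.float32), (100, 1))  # (H, W) gradient, brighter towards the right\n>>> lit_image = A.functional.apply_illumination_pattern(image, pattern, intensity=0.1)\n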
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_linear_illumination","title":"def apply_linear_illumination (img, intensity, angle) [view source on GitHub]","text":"

Apply linear gradient illumination effect.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_linear_illumination(\n    img: np.ndarray,\n    intensity: float,\n    angle: float,\n) -> np.ndarray:\n    \"\"\"Apply linear gradient illumination effect.\"\"\"\n    result, height, width = prepare_illumination_input(img)\n\n    # Create gradient coordinates\n    y, x = np.ogrid[:height, :width]\n\n    # Calculate gradient direction\n    angle_rad = np.deg2rad(angle)\n    dx, dy = np.cos(angle_rad), np.sin(angle_rad)\n\n    # Create normalized gradient\n    gradient = (x * dx + y * dy) / np.sqrt(height * height + width * width)\n    gradient = (gradient + 1) / 2  # Normalize to [0, 1]\n\n    return apply_illumination_pattern(result, gradient, intensity)\n
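
Example: a minimal usage sketch (angle is in degrees; the 0.1 intensity is an illustrative value).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> lit_image = A.functional.apply_linear_illumination(image, intensity=0.1, angle=45.0)  # diagonal brightness gradient\n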
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_plasma_brightness_contrast","title":"def apply_plasma_brightness_contrast (img, brightness_factor, contrast_factor, plasma_pattern) [view source on GitHub]","text":"

Apply plasma-based brightness and contrast adjustments.

The plasma pattern is used to create spatially-varying adjustments:

  1. Brightness is modified by adding the pattern * brightness_factor.
  2. Contrast is modified by interpolating between mean and original using the pattern * contrast_factor.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_plasma_brightness_contrast(\n    img: np.ndarray,\n    brightness_factor: float,\n    contrast_factor: float,\n    plasma_pattern: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply plasma-based brightness and contrast adjustments.\n\n    The plasma pattern is used to create spatially-varying adjustments:\n    1. Brightness is modified by adding the pattern * brightness_factor\n    2. Contrast is modified by interpolating between mean and original\n       using the pattern * contrast_factor\n    \"\"\"\n    result = img.copy()\n\n    max_value = MAX_VALUES_BY_DTYPE[img.dtype]\n\n    # Expand plasma pattern to match image dimensions\n    plasma_pattern = plasma_pattern[..., np.newaxis] if img.ndim > MONO_CHANNEL_DIMENSIONS else plasma_pattern\n\n    # Apply brightness adjustment\n    if brightness_factor != 0:\n        brightness_adjustment = plasma_pattern * brightness_factor * max_value\n        result = np.clip(result + brightness_adjustment, 0, max_value)\n\n    # Apply contrast adjustment\n    if contrast_factor != 0:\n        mean = result.mean()\n        contrast_weights = plasma_pattern * contrast_factor + 1\n        result = np.clip(mean + (result - mean) * contrast_weights, 0, max_value)\n\n    return result\n
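
Example: a minimal usage sketch that builds the required pattern with generate_plasma_pattern (documented later on this page); the size, roughness, and factor values are illustrative.

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> rng = np.random.default_rng(42)\n>>> plasma = A.functional.generate_plasma_pattern(target_shape=(100, 100), size=64, roughness=3.0, random_generator=rng)\n>>> adjusted = A.functional.apply_plasma_brightness_contrast(image, brightness_factor=0.2, contrast_factor=0.2, plasma_pattern=plasma)\n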
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_plasma_shadow","title":"def apply_plasma_shadow (img, intensity, plasma_pattern) [view source on GitHub]","text":"

Apply plasma-based shadow effect by darkening.

Parameters:

Name Type Description img np.ndarray

Input image

intensity float

Shadow intensity in [0, 1]

plasma_pattern np.ndarray

Generated plasma pattern of shape (H, W)

Returns:

Type Description np.ndarray

Image with applied shadow effect

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef apply_plasma_shadow(\n    img: np.ndarray,\n    intensity: float,\n    plasma_pattern: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply plasma-based shadow effect by darkening.\n\n    Args:\n        img: Input image\n        intensity: Shadow intensity in [0, 1]\n        plasma_pattern: Generated plasma pattern of shape (H, W)\n\n    Returns:\n        Image with applied shadow effect\n    \"\"\"\n    result = img.copy()\n\n    # Expand dimensions to match image\n    plasma_pattern = plasma_pattern[..., np.newaxis] if img.ndim > MONO_CHANNEL_DIMENSIONS else plasma_pattern\n\n    # Apply shadow by darkening (multiplying by values < 1)\n    shadow_mask = 1 - plasma_pattern * intensity\n\n    return result * shadow_mask\n
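
Example: a minimal usage sketch (the plasma pattern is produced with generate_plasma_pattern, documented later on this page; the 0.5 intensity is an illustrative value).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> plasma = A.functional.generate_plasma_pattern(target_shape=(100, 100), size=64, roughness=3.0, random_generator=np.random.default_rng(0))\n>>> shadowed = A.functional.apply_plasma_shadow(image, intensity=0.5, plasma_pattern=plasma)\n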
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.apply_salt_and_pepper","title":"def apply_salt_and_pepper (img, salt_mask, pepper_mask) [view source on GitHub]","text":"

Apply salt and pepper noise to image using pre-computed masks.

Parameters:

Name Type Description img np.ndarray

Input image

salt_mask np.ndarray

Boolean mask for salt (white) noise

pepper_mask np.ndarray

Boolean mask for pepper (black) noise

Returns:

Type Description np.ndarray

Image with applied salt and pepper noise

Source code in albumentations/augmentations/functional.py Python
def apply_salt_and_pepper(\n    img: np.ndarray,\n    salt_mask: np.ndarray,\n    pepper_mask: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply salt and pepper noise to image using pre-computed masks.\n\n    Args:\n        img: Input image\n        salt_mask: Boolean mask for salt (white) noise\n        pepper_mask: Boolean mask for pepper (black) noise\n\n    Returns:\n        Image with applied salt and pepper noise\n    \"\"\"\n    result = img.copy()\n\n    result[salt_mask] = MAX_VALUES_BY_DTYPE[img.dtype]\n    result[pepper_mask] = 0\n    return result\n
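
Example: a minimal usage sketch (the masks here are drawn with a 2% probability each; in practice they are pre-computed by the caller, so this mask construction is only illustrative).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> rng = np.random.default_rng(42)\n>>> salt_mask = rng.random(image.shape[:2]) < 0.02   # pixels set to the maximum value\n>>> pepper_mask = rng.random(image.shape[:2]) < 0.02  # pixels set to zero\n>>> noisy = A.functional.apply_salt_and_pepper(image, salt_mask=salt_mask, pepper_mask=pepper_mask)\n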
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.auto_contrast","title":"def auto_contrast (img) [view source on GitHub]","text":"

Apply auto contrast to the image.

Auto contrast enhances image contrast by stretching the intensity range to use the full range while preserving relative intensities.

Parameters:

Name Type Description img np.ndarray

Input image in uint8 or float32 format.

Returns:

Type Description np.ndarray

Contrast-enhanced image in the same dtype as input.

Note

The function:

  1. Computes histogram for each channel
  2. Creates cumulative distribution
  3. Normalizes to full intensity range
  4. Uses lookup table for scaling

Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef auto_contrast(img: np.ndarray) -> np.ndarray:\n    \"\"\"Apply auto contrast to the image.\n\n    Auto contrast enhances image contrast by stretching the intensity range\n    to use the full range while preserving relative intensities.\n\n    Args:\n        img: Input image in uint8 or float32 format.\n\n    Returns:\n        Contrast-enhanced image in the same dtype as input.\n\n    Note:\n        The function:\n        1. Computes histogram for each channel\n        2. Creates cumulative distribution\n        3. Normalizes to full intensity range\n        4. Uses lookup table for scaling\n    \"\"\"\n    result = img.copy()\n    num_channels = get_num_channels(img)\n    max_value = MAX_VALUES_BY_DTYPE[img.dtype]\n\n    for i in range(num_channels):\n        channel = img[..., i] if img.ndim > MONO_CHANNEL_DIMENSIONS else img\n\n        # Compute histogram\n        hist = np.histogram(channel.flatten(), bins=256, range=(0, max_value))[0]\n\n        # Calculate cumulative distribution\n        cdf = hist.cumsum()\n\n        # Find the minimum and maximum non-zero values in the CDF\n        if cdf[cdf > 0].size == 0:\n            continue  # Skip if the channel is constant or empty\n\n        cdf_min = cdf[cdf > 0].min()\n        cdf_max = cdf.max()\n\n        if cdf_min == cdf_max:\n            continue\n\n        # Normalize CDF\n        cdf = (cdf - cdf_min) * max_value / (cdf_max - cdf_min)\n\n        # Create lookup table\n        lut = np.clip(np.around(cdf), 0, max_value).astype(np.uint8)\n\n        # Apply lookup table\n        if img.ndim > MONO_CHANNEL_DIMENSIONS:\n            result[..., i] = sz_lut(channel, lut)\n        else:\n            result = sz_lut(channel, lut)\n\n    return result\n
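
Example: a minimal usage sketch.

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> contrasted = A.functional.auto_contrast(image)\n>>> assert contrasted.shape == image.shape\n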
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.clahe","title":"def clahe (img, clip_limit, tile_grid_size) [view source on GitHub]","text":"

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.

This function enhances the contrast of the input image using CLAHE. For color images, it converts the image to the LAB color space, applies CLAHE to the L channel, and then converts the image back to RGB.

Parameters:

Name Type Description img np.ndarray

Input image. Can be grayscale (2D array) or RGB (3D array).

clip_limit float

Threshold for contrast limiting. Higher values give more contrast.

tile_grid_size tuple[int, int]

Size of grid for histogram equalization. Width and height of the grid.

Returns:

Type Description np.ndarray

Image with CLAHE applied. The output has the same dtype as the input.

Note

  • If the input image is float32, it's temporarily converted to uint8 for processing and then converted back to float32.
  • For color images, CLAHE is applied only to the luminance channel in the LAB color space.

Exceptions:

Type Description ValueError

If the input image is not 2D or 3D.

Examples:

Python
>>> import numpy as np\n>>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> result = clahe(img, clip_limit=2.0, tile_grid_size=(8, 8))\n>>> assert result.shape == img.shape\n>>> assert result.dtype == img.dtype\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef clahe(\n    img: np.ndarray,\n    clip_limit: float,\n    tile_grid_size: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.\n\n    This function enhances the contrast of the input image using CLAHE. For color images,\n    it converts the image to the LAB color space, applies CLAHE to the L channel, and then\n    converts the image back to RGB.\n\n    Args:\n        img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).\n        clip_limit (float): Threshold for contrast limiting. Higher values give more contrast.\n        tile_grid_size (tuple[int, int]): Size of grid for histogram equalization.\n            Width and height of the grid.\n\n    Returns:\n        np.ndarray: Image with CLAHE applied. The output has the same dtype as the input.\n\n    Note:\n        - If the input image is float32, it's temporarily converted to uint8 for processing\n          and then converted back to float32.\n        - For color images, CLAHE is applied only to the luminance channel in the LAB color space.\n\n    Raises:\n        ValueError: If the input image is not 2D or 3D.\n\n    Example:\n        >>> import numpy as np\n        >>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> result = clahe(img, clip_limit=2.0, tile_grid_size=(8, 8))\n        >>> assert result.shape == img.shape\n        >>> assert result.dtype == img.dtype\n    \"\"\"\n    img = img.copy()\n    clahe_mat = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)\n\n    if is_grayscale_image(img):\n        return clahe_mat.apply(img)\n\n    img = cv2.cvtColor(img, cv2.COLOR_RGB2LAB)\n\n    img[:, :, 0] = clahe_mat.apply(img[:, :, 0])\n\n    return cv2.cvtColor(img, cv2.COLOR_LAB2RGB)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.diamond_step","title":"def diamond_step (pattern, y, x, half, grid_size, roughness, random_generator) [view source on GitHub]","text":"

Compute edge value during diamond step.

Source code in albumentations/augmentations/functional.py Python
def diamond_step(\n    pattern: np.ndarray,\n    y: int,\n    x: int,\n    half: int,\n    grid_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Compute edge value during diamond step.\"\"\"\n    points = []\n    if y >= half:\n        points.append(pattern[y - half, x])\n    if y + half <= grid_size:\n        points.append(pattern[y + half, x])\n    if x >= half:\n        points.append(pattern[y, x - half])\n    if x + half <= grid_size:\n        points.append(pattern[y, x + half])\n\n    return sum(points) / len(points) + random_offset(\n        half * 2,\n        grid_size,\n        roughness,\n        random_generator,\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.equalize","title":"def equalize (img, mask=None, mode='cv', by_channels=True) [view source on GitHub]","text":"

Apply histogram equalization to the input image.

This function enhances the contrast of the input image by equalizing its histogram. It supports both grayscale and color images, and can operate on individual channels or on the luminance channel of the image.

Parameters:

Name Type Description img np.ndarray

Input image. Can be grayscale (2D array) or RGB (3D array).

mask np.ndarray | None

Optional mask to apply the equalization selectively. If provided, must have the same shape as the input image. Default: None.

mode ImageMode

The backend to use for equalization. Can be either \"cv\" for OpenCV or \"pil\" for Pillow-style equalization. Default: \"cv\".

by_channels bool

If True, applies equalization to each channel independently. If False, converts the image to YCrCb color space and equalizes only the luminance channel. Only applicable to color images. Default: True.

Returns:

Type Description np.ndarray

Equalized image. The output has the same dtype as the input.

Exceptions:

Type Description ValueError

If the input image or mask have invalid shapes or types.

Note

  • If the input image is not uint8, it will be temporarily converted to uint8 for processing and then converted back to its original dtype.
  • For color images, when by_channels=False, the image is converted to YCrCb color space, equalized on the Y channel, and then converted back to RGB.
  • The function preserves the original number of channels in the image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> equalized = A.equalize(image, mode=\"cv\", by_channels=True)\n>>> assert equalized.shape == image.shape\n>>> assert equalized.dtype == image.dtype\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef equalize(\n    img: np.ndarray,\n    mask: np.ndarray | None = None,\n    mode: ImageMode = \"cv\",\n    by_channels: bool = True,\n) -> np.ndarray:\n    \"\"\"Apply histogram equalization to the input image.\n\n    This function enhances the contrast of the input image by equalizing its histogram.\n    It supports both grayscale and color images, and can operate on individual channels\n    or on the luminance channel of the image.\n\n    Args:\n        img (np.ndarray): Input image. Can be grayscale (2D array) or RGB (3D array).\n        mask (np.ndarray | None): Optional mask to apply the equalization selectively.\n            If provided, must have the same shape as the input image. Default: None.\n        mode (ImageMode): The backend to use for equalization. Can be either \"cv\" for\n            OpenCV or \"pil\" for Pillow-style equalization. Default: \"cv\".\n        by_channels (bool): If True, applies equalization to each channel independently.\n            If False, converts the image to YCrCb color space and equalizes only the\n            luminance channel. Only applicable to color images. Default: True.\n\n    Returns:\n        np.ndarray: Equalized image. The output has the same dtype as the input.\n\n    Raises:\n        ValueError: If the input image or mask have invalid shapes or types.\n\n    Note:\n        - If the input image is not uint8, it will be temporarily converted to uint8\n          for processing and then converted back to its original dtype.\n        - For color images, when by_channels=False, the image is converted to YCrCb\n          color space, equalized on the Y channel, and then converted back to RGB.\n        - The function preserves the original number of channels in the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> equalized = A.equalize(image, mode=\"cv\", by_channels=True)\n        >>> assert equalized.shape == image.shape\n        >>> assert equalized.dtype == image.dtype\n    \"\"\"\n    _check_preconditions(img, mask, by_channels)\n\n    function = _equalize_pil if mode == \"pil\" else _equalize_cv\n\n    if is_grayscale_image(img):\n        return function(img, _handle_mask(mask))\n\n    if not by_channels:\n        result_img = cv2.cvtColor(img, cv2.COLOR_RGB2YCrCb)\n        result_img[..., 0] = function(result_img[..., 0], _handle_mask(mask))\n        return cv2.cvtColor(result_img, cv2.COLOR_YCrCb2RGB)\n\n    result_img = np.empty_like(img)\n    for i in range(NUM_RGB_CHANNELS):\n        _mask = _handle_mask(mask, i)\n        result_img[..., i] = function(img[..., i], _mask)\n\n    return result_img\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.fancy_pca","title":"def fancy_pca (img, alpha_vector) [view source on GitHub]","text":"

Perform 'Fancy PCA' augmentation on an image with any number of channels.

Parameters:

Name Type Description img np.ndarray

Input image

alpha_vector np.ndarray

Vector of scale factors for each principal component. Should have the same length as the number of channels in the image.

Returns:

Type Description np.ndarray

Augmented image of the same shape, type, and range as the input.

Image types: uint8, float32

Number of channels: Any

Note

  • This function generalizes the Fancy PCA augmentation to work with any number of channels.
  • It preserves the original range of the image ([0, 255] for uint8, [0, 1] for float32).
  • For single-channel images, the augmentation is applied as a simple scaling of pixel intensity variation.
  • For multi-channel images, PCA is performed on the entire image, treating each pixel as a point in N-dimensional space (where N is the number of channels).
  • The augmentation preserves the correlation between channels while adding controlled noise.
  • Computation time may increase significantly for images with a large number of channels.

Reference

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).

Source code in albumentations/augmentations/functional.py Python
@float32_io\n@clipped\n@preserve_channel_dim\ndef fancy_pca(img: np.ndarray, alpha_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Perform 'Fancy PCA' augmentation on an image with any number of channels.\n\n    Args:\n        img (np.ndarray): Input image\n        alpha_vector (np.ndarray): Vector of scale factors for each principal component.\n                                   Should have the same length as the number of channels in the image.\n\n    Returns:\n        np.ndarray: Augmented image of the same shape, type, and range as the input.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This function generalizes the Fancy PCA augmentation to work with any number of channels.\n        - It preserves the original range of the image ([0, 255] for uint8, [0, 1] for float32).\n        - For single-channel images, the augmentation is applied as a simple scaling of pixel intensity variation.\n        - For multi-channel images, PCA is performed on the entire image, treating each pixel\n          as a point in N-dimensional space (where N is the number of channels).\n        - The augmentation preserves the correlation between channels while adding controlled noise.\n        - Computation time may increase significantly for images with a large number of channels.\n\n    Reference:\n        Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012).\n        ImageNet classification with deep convolutional neural networks.\n        In Advances in neural information processing systems (pp. 1097-1105).\n    \"\"\"\n    orig_shape = img.shape\n    num_channels = get_num_channels(img)\n\n    # Reshape image to 2D array of pixels\n    img_reshaped = img.reshape(-1, num_channels)\n\n    # Center the pixel values\n    img_mean = np.mean(img_reshaped, axis=0)\n    img_centered = img_reshaped - img_mean\n\n    if num_channels == 1:\n        # For grayscale images, apply a simple scaling\n        std_dev = np.std(img_centered)\n        noise = alpha_vector[0] * std_dev * img_centered\n    else:\n        # Compute covariance matrix\n        img_cov = np.cov(img_centered, rowvar=False)\n\n        # Compute eigenvectors & eigenvalues of the covariance matrix\n        eig_vals, eig_vecs = np.linalg.eigh(img_cov)\n\n        # Sort eigenvectors by eigenvalues in descending order\n        sort_perm = eig_vals[::-1].argsort()\n        eig_vals = eig_vals[sort_perm]\n        eig_vecs = eig_vecs[:, sort_perm]\n\n        # Create noise vector\n        noise = np.dot(\n            np.dot(eig_vecs, np.diag(alpha_vector * eig_vals)),\n            img_centered.T,\n        ).T\n\n    # Add noise to the image\n    img_pca = img_reshaped + noise\n\n    # Reshape back to original shape\n    img_pca = img_pca.reshape(orig_shape)\n\n    # Clip values to [0, 1] range\n    return np.clip(img_pca, 0, 1, out=img_pca)\n
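
Example: a minimal usage sketch (alpha_vector must have one entry per channel; the small values below are illustrative).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> alpha_vector = np.array([0.01, 0.005, -0.01])  # one scale factor per channel\n>>> augmented = A.functional.fancy_pca(image, alpha_vector=alpha_vector)\n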
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_constant_noise","title":"def generate_constant_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate one value per channel.

Source code in albumentations/augmentations/functional.py Python
def generate_constant_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate one value per channel.\"\"\"\n    num_channels = shape[-1] if len(shape) > MONO_CHANNEL_DIMENSIONS else 1\n    return sample_noise(\n        noise_type,\n        (num_channels,),\n        params,\n        max_value,\n        random_generator,\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_per_pixel_noise","title":"def generate_per_pixel_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate separate noise map for each channel.

Source code in albumentations/augmentations/functional.py Python
def generate_per_pixel_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate separate noise map for each channel.\"\"\"\n    return sample_noise(noise_type, shape, params, max_value, random_generator)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_plasma_pattern","title":"def generate_plasma_pattern (target_shape, size, roughness, random_generator) [view source on GitHub]","text":"

Generate a plasma fractal pattern using the Diamond-Square algorithm.

The Diamond-Square algorithm creates a natural-looking noise pattern by recursively subdividing a grid and adding random displacements at each step. The roughness parameter controls how quickly the random displacements decrease with each iteration.

Parameters:

Name Type Description target_shape tuple[int, int]

Final shape (height, width) of the pattern

size int

Initial size of the pattern grid. Will be rounded up to nearest power of 2. Larger values create more detailed patterns.

roughness float

Controls pattern roughness. Higher values create more rough/sharp transitions. Typical values are between 1.0 and 5.0.

random_generator np.random.Generator

NumPy random generator.

Returns:

Type Description np.ndarray

Normalized plasma pattern array of shape target_shape with values in [0, 1]

Source code in albumentations/augmentations/functional.py Python
def generate_plasma_pattern(\n    target_shape: tuple[int, int],\n    size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a plasma fractal pattern using the Diamond-Square algorithm.\n\n    The Diamond-Square algorithm creates a natural-looking noise pattern by recursively\n    subdividing a grid and adding random displacements at each step. The roughness\n    parameter controls how quickly the random displacements decrease with each iteration.\n\n    Args:\n        target_shape: Final shape (height, width) of the pattern\n        size: Initial size of the pattern grid. Will be rounded up to nearest power of 2.\n            Larger values create more detailed patterns.\n        roughness: Controls pattern roughness. Higher values create more rough/sharp transitions.\n            Typical values are between 1.0 and 5.0.\n        random_generator: NumPy random generator.\n\n    Returns:\n        Normalized plasma pattern array of shape target_shape with values in [0, 1]\n    \"\"\"\n    # Initialize grid\n    grid_size = get_grid_size(size, target_shape)\n    pattern = initialize_grid(grid_size, random_generator)\n\n    # Diamond-Square algorithm\n    step_size = grid_size\n    while step_size > 1:\n        half_step = step_size // 2\n\n        # Square step\n        for y in range(0, grid_size, step_size):\n            for x in range(0, grid_size, step_size):\n                if half_step > 0:\n                    pattern[y + half_step, x + half_step] = square_step(\n                        pattern,\n                        y,\n                        x,\n                        step_size,\n                        half_step,\n                        roughness,\n                        random_generator,\n                    )\n\n        # Diamond step\n        for y in range(0, grid_size + 1, half_step):\n            for x in range((y + half_step) % step_size, grid_size + 1, step_size):\n                pattern[y, x] = diamond_step(\n                    pattern,\n                    y,\n                    x,\n                    half_step,\n                    grid_size,\n                    roughness,\n                    random_generator,\n                )\n\n        step_size = half_step\n\n    min_pattern = pattern.min()\n\n    # Normalize to [0, 1] range\n    pattern = (pattern - min_pattern) / (pattern.max() - min_pattern)\n\n    return (\n        fgeometric.resize(pattern, target_shape, interpolation=cv2.INTER_LINEAR)\n        if pattern.shape != target_shape\n        else pattern\n    )\n
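
Example: a minimal usage sketch (the size and roughness values are illustrative; per the description above, the returned pattern matches target_shape and is normalized to [0, 1]).

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> rng = np.random.default_rng(42)\n>>> pattern = A.functional.generate_plasma_pattern(target_shape=(100, 100), size=64, roughness=3.0, random_generator=rng)\n>>> assert pattern.shape == (100, 100)  # values are normalized to [0, 1]\n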
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_shared_noise","title":"def generate_shared_noise (noise_type, shape, params, max_value, random_generator) [view source on GitHub]","text":"

Generate one noise map and broadcast to all channels.

Parameters:

Name Type Description noise_type Literal['uniform', 'gaussian', 'laplace', 'beta']

Type of noise distribution to use

shape tuple[int, ...]

Shape of the input image (H, W) or (H, W, C)

params dict[str, Any]

Parameters for the noise distribution

max_value float

Maximum value for the noise distribution

random_generator np.random.Generator

NumPy random generator instance

Returns:

Type Description np.ndarray

Noise array of shape (H, W) or (H, W, C) where the same noise pattern is shared across all channels

Source code in albumentations/augmentations/functional.py Python
def generate_shared_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    shape: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate one noise map and broadcast to all channels.\n\n    Args:\n        noise_type: Type of noise distribution to use\n        shape: Shape of the input image (H, W) or (H, W, C)\n        params: Parameters for the noise distribution\n        max_value: Maximum value for the noise distribution\n        random_generator: NumPy random generator instance\n\n    Returns:\n        Noise array of shape (H, W) or (H, W, C) where the same noise\n        pattern is shared across all channels\n    \"\"\"\n    # Generate noise for (H, W)\n    height, width = shape[:2]\n    noise_map = sample_noise(\n        noise_type,\n        (height, width),\n        params,\n        max_value,\n        random_generator,\n    )\n\n    # If input is multichannel, broadcast noise to all channels\n    if len(shape) > MONO_CHANNEL_DIMENSIONS:\n        return np.broadcast_to(noise_map[..., None], shape)\n    return noise_map\n
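A short usage sketch, assuming the import path above; the parameter keys follow the sample_gaussian helper documented later in this reference.

Python
import numpy as np
from albumentations.augmentations.functional import generate_shared_noise

rng = np.random.default_rng(0)
# One Gaussian noise map for a (64, 64, 3) image, shared across all three channels.
noise = generate_shared_noise(
    noise_type="gaussian",
    shape=(64, 64, 3),
    params={"mean_range": (0.0, 0.0), "std_range": (0.05, 0.1)},
    max_value=255.0,
    random_generator=rng,
)
print(noise.shape)  # (64, 64, 3) -- the same (64, 64) pattern broadcast to every channel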
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.generate_snow_textures","title":"def generate_snow_textures (img_shape, random_generator) [view source on GitHub]","text":"

Generate snow texture and sparkle mask.

Parameters:

Name Type Description img_shape tuple[int, int]

Image shape.

random_generator np.random.Generator

Random generator to use.

Returns:

Type Description tuple[np.ndarray, np.ndarray]

Tuple of (snow_texture, sparkle_mask) arrays.

Source code in albumentations/augmentations/functional.py Python
def generate_snow_textures(\n    img_shape: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate snow texture and sparkle mask.\n\n    Args:\n        img_shape (tuple[int, int]): Image shape.\n        random_generator (np.random.Generator): Random generator to use.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: Tuple of (snow_texture, sparkle_mask) arrays.\n    \"\"\"\n    # Generate base snow texture\n    snow_texture = random_generator.normal(size=img_shape[:2], loc=0.5, scale=0.3)\n    snow_texture = cv2.GaussianBlur(snow_texture, (0, 0), sigmaX=1, sigmaY=1)\n\n    # Generate sparkle mask\n    sparkle_mask = random_generator.random(img_shape[:2]) > 0.99\n\n    return snow_texture, sparkle_mask\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_fog_particle_radiuses","title":"def get_fog_particle_radiuses (img_shape, num_particles, fog_intensity, random_generator) [view source on GitHub]","text":"

Generate radiuses for fog particles.

Parameters:

Name Type Description img_shape tuple[int, int]

Image shape.

num_particles int

Number of fog particles.

fog_intensity float

Intensity of the fog effect, between 0 and 1.

random_generator np.random.Generator

Random generator to use.

Returns:

Type Description list[int]

List of radiuses for each fog particle.

Source code in albumentations/augmentations/functional.py Python
def get_fog_particle_radiuses(\n    img_shape: tuple[int, int],\n    num_particles: int,\n    fog_intensity: float,\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Generate radiuses for fog particles.\n\n    Args:\n        img_shape (tuple[int, int]): Image shape.\n        num_particles (int): Number of fog particles.\n        fog_intensity (float): Intensity of the fog effect, between 0 and 1.\n        random_generator (np.random.Generator): Random generator to use.\n\n    Returns:\n        list[int]: List of radiuses for each fog particle.\n    \"\"\"\n    height, width = img_shape[:2]\n    max_fog_radius = max(2, int(min(height, width) * 0.1 * fog_intensity))\n    min_radius = max(1, max_fog_radius // 2)\n\n    return [random_generator.integers(min_radius, max_fog_radius) for _ in range(num_particles)]\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_grid_size","title":"def get_grid_size (size, target_shape) [view source on GitHub]","text":"

Round up to nearest power of 2.

Source code in albumentations/augmentations/functional.py Python
def get_grid_size(size: int, target_shape: tuple[int, int]) -> int:\n    \"\"\"Round up to nearest power of 2.\"\"\"\n    return 2 ** int(np.ceil(np.log2(max(size, *target_shape))))\n
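A small illustration of the rounding behaviour, assuming the import path above:

Python
from albumentations.augmentations.functional import get_grid_size

print(get_grid_size(100, (256, 256)))  # 256 -- max(100, 256, 256) is already a power of 2
print(get_grid_size(300, (256, 256)))  # 512 -- max(300, 256, 256) rounds up to 2**9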
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.get_safe_brightness_contrast_params","title":"def get_safe_brightness_contrast_params (alpha, beta, max_value) [view source on GitHub]","text":"

Calculate safe alpha and beta values to prevent overflow/underflow.

For any pixel value x, we want: 0 <= alpha * x + beta <= max_value

Parameters:

Name Type Description alpha float

Contrast factor (1 means no change)

beta float

Brightness offset

max_value float

Maximum allowed value (255 for uint8, 1 for float32)

Returns:

Type Description tuple[float, float]

Safe (alpha, beta) values that prevent overflow/underflow

Source code in albumentations/augmentations/functional.py Python
def get_safe_brightness_contrast_params(\n    alpha: float,\n    beta: float,\n    max_value: float,\n) -> tuple[float, float]:\n    \"\"\"Calculate safe alpha and beta values to prevent overflow/underflow.\n\n    For any pixel value x, we want: 0 <= alpha * x + beta <= max_value\n\n    Args:\n        alpha: Contrast factor (1 means no change)\n        beta: Brightness offset\n        max_value: Maximum allowed value (255 for uint8, 1 for float32)\n\n    Returns:\n        tuple[float, float]: Safe (alpha, beta) values that prevent overflow/underflow\n    \"\"\"\n    if alpha > 0:\n        # For x = max_value: alpha * max_value + beta <= max_value\n        # For x = 0: beta >= 0\n        safe_beta = np.clip(beta, 0, max_value)\n        # From alpha * max_value + safe_beta <= max_value\n        safe_alpha = min(alpha, (max_value - safe_beta) / max_value)\n    else:\n        # For x = 0: beta <= max_value\n        # For x = max_value: alpha * max_value + beta >= 0\n        safe_beta = min(beta, max_value)\n        # From alpha * max_value + safe_beta >= 0\n        safe_alpha = max(alpha, -safe_beta / max_value)\n\n    return safe_alpha, safe_beta\n
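A worked example of the clamping logic, assuming the import path above; for uint8 images max_value is 255.

Python
from albumentations.augmentations.functional import get_safe_brightness_contrast_params

alpha, beta = get_safe_brightness_contrast_params(alpha=1.5, beta=30.0, max_value=255.0)
# beta stays 30 (already inside [0, 255]); alpha is reduced to (255 - 30) / 255 ≈ 0.882,
# so that alpha * 255 + beta never exceeds 255.
print(alpha, beta)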
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.grayscale_to_multichannel","title":"def grayscale_to_multichannel (grayscale_image, num_output_channels=3) [view source on GitHub]","text":"

Convert a grayscale image to a multi-channel image.

This function takes a 2D grayscale image or a 3D image with a single channel and converts it to a multi-channel image by repeating the grayscale data across the specified number of channels.

Parameters:

Name Type Description grayscale_image np.ndarray

Input grayscale image. Can be 2D (height, width) or 3D (height, width, 1).

num_output_channels int

Number of channels in the output image. Defaults to 3.

Returns:

Type Description np.ndarray

Multi-channel image with shape (height, width, num_channels)

Source code in albumentations/augmentations/functional.py Python
def grayscale_to_multichannel(\n    grayscale_image: np.ndarray,\n    num_output_channels: int = 3,\n) -> np.ndarray:\n    \"\"\"Convert a grayscale image to a multi-channel image.\n\n    This function takes a 2D grayscale image or a 3D image with a single channel\n    and converts it to a multi-channel image by repeating the grayscale data\n    across the specified number of channels.\n\n    Args:\n        grayscale_image (np.ndarray): Input grayscale image. Can be 2D (height, width)\n                                      or 3D (height, width, 1).\n        num_output_channels (int, optional): Number of channels in the output image. Defaults to 3.\n\n    Returns:\n        np.ndarray: Multi-channel image with shape (height, width, num_channels)\n    \"\"\"\n    # If output should be single channel, just squeeze and return\n    if num_output_channels == 1:\n        return grayscale_image\n\n    # For multi-channel output, squeeze and stack\n    squeezed = np.squeeze(grayscale_image)\n\n    return cv2.merge([squeezed] * num_output_channels)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.image_compression","title":"def image_compression (img, quality, image_type) [view source on GitHub]","text":"

Apply compression to image.

Parameters:

Name Type Description img np.ndarray

Input image

quality int

Compression quality (0-100)

image_type Literal['.jpg', '.webp']

Type of compression ('.jpg' or '.webp')

Returns:

Type Description np.ndarray

Compressed image with same number of channels as input

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef image_compression(\n    img: np.ndarray,\n    quality: int,\n    image_type: Literal[\".jpg\", \".webp\"],\n) -> np.ndarray:\n    \"\"\"Apply compression to image.\n\n    Args:\n        img: Input image\n        quality: Compression quality (0-100)\n        image_type: Type of compression ('.jpg' or '.webp')\n\n    Returns:\n        Compressed image with same number of channels as input\n    \"\"\"\n    quality_flag = cv2.IMWRITE_JPEG_QUALITY if image_type == \".jpg\" else cv2.IMWRITE_WEBP_QUALITY\n\n    num_channels = get_num_channels(img)\n\n    if num_channels == 1:\n        # For grayscale, ensure we read back as single channel\n        _, encoded_img = cv2.imencode(image_type, img, (int(quality_flag), quality))\n        decoded = cv2.imdecode(encoded_img, cv2.IMREAD_GRAYSCALE)\n        return decoded[..., np.newaxis]  # Add channel dimension back\n\n    if num_channels == NUM_RGB_CHANNELS:\n        # Standard RGB image\n        _, encoded_img = cv2.imencode(image_type, img, (int(quality_flag), quality))\n        return cv2.imdecode(encoded_img, cv2.IMREAD_UNCHANGED)\n\n    # For 2,4 or more channels, we need to handle alpha/extra channels separately\n    if num_channels == 2:\n        # For 2 channels, pad to 3 channels and take only first 2 after compression\n        padded = np.pad(img, ((0, 0), (0, 0), (0, 1)), mode=\"constant\")\n        _, encoded_bgr = cv2.imencode(image_type, padded, (int(quality_flag), quality))\n        decoded_bgr = cv2.imdecode(encoded_bgr, cv2.IMREAD_UNCHANGED)\n        return decoded_bgr[..., :2]\n\n    # Process first 3 channels together\n    bgr = img[..., :NUM_RGB_CHANNELS]\n    _, encoded_bgr = cv2.imencode(image_type, bgr, (int(quality_flag), quality))\n    decoded_bgr = cv2.imdecode(encoded_bgr, cv2.IMREAD_UNCHANGED)\n\n    if num_channels > NUM_RGB_CHANNELS:\n        # Process additional channels one by one\n        extra_channels = []\n        for i in range(NUM_RGB_CHANNELS, num_channels):\n            channel = img[..., i]\n            _, encoded = cv2.imencode(image_type, channel, (int(quality_flag), quality))\n            decoded = cv2.imdecode(encoded, cv2.IMREAD_GRAYSCALE)\n            if len(decoded.shape) == 2:\n                decoded = decoded[..., np.newaxis]\n            extra_channels.append(decoded)\n\n        # Combine BGR with extra channels\n        return np.dstack([decoded_bgr, *extra_channels])\n\n    return decoded_bgr\n
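A brief usage sketch, assuming the import path above; the quality and image_type values are illustrative.

Python
import numpy as np
from albumentations.augmentations.functional import image_compression

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
compressed = image_compression(image, quality=20, image_type=".jpg")
print(compressed.shape)  # (100, 100, 3) -- same shape, with visible JPEG artifacts at low quality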
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.initialize_grid","title":"def initialize_grid (grid_size, random_generator) [view source on GitHub]","text":"

Initialize grid with random corners.

Source code in albumentations/augmentations/functional.py Python
def initialize_grid(\n    grid_size: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Initialize grid with random corners.\"\"\"\n    pattern = np.zeros((grid_size + 1, grid_size + 1), dtype=np.float32)\n    for corner in [(0, 0), (0, -1), (-1, 0), (-1, -1)]:\n        pattern[corner] = random_generator.random()\n    return pattern\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.iso_noise","title":"def iso_noise (image, color_shift, intensity, random_generator) [view source on GitHub]","text":"

Apply Poisson noise to an image to simulate camera sensor noise.

Parameters:

Name Type Description image np.ndarray

Input image. Currently, only RGB images are supported.

color_shift float

The amount of color shift to apply.

intensity float

Multiplication factor for noise values. Values of ~0.5 produce a noticeable, yet acceptable level of noise.

random_generator np.random.Generator

NumPy random generator used for noise generation.

Returns:

Type Description np.ndarray

The noised image.

Image types: uint8, float32

Number of channels: 3

Source code in albumentations/augmentations/functional.py Python
@float32_io\n@clipped\ndef iso_noise(\n    image: np.ndarray,\n    color_shift: float,\n    intensity: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply poisson noise to an image to simulate camera sensor noise.\n\n    Args:\n        image (np.ndarray): Input image. Currently, only RGB images are supported.\n        color_shift (float): The amount of color shift to apply.\n        intensity (float): Multiplication factor for noise values. Values of ~0.5 produce a noticeable,\n                           yet acceptable level of noise.\n        random_generator (np.random.Generator): If specified, this will be random generator used\n            for noise generation.\n\n    Returns:\n        np.ndarray: The noised image.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    hls = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)\n    _, stddev = cv2.meanStdDev(hls)\n\n    luminance_noise = random_generator.poisson(\n        stddev[1] * intensity,\n        size=hls.shape[:2],\n    )\n    color_noise = random_generator.normal(\n        0,\n        color_shift * intensity,\n        size=hls.shape[:2],\n    )\n\n    hls[..., 0] += color_noise\n    hls[..., 1] = add_array(\n        hls[..., 1],\n        luminance_noise * intensity * (1.0 - hls[..., 1]),\n    )\n\n    noised_hls = cv2.cvtColor(hls, cv2.COLOR_HLS2RGB)\n    return np.clip(noised_hls, 0, 1, out=noised_hls)  # Ensure output is in [0, 1] range\n
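A minimal sketch, assuming the import path above; the function expects an RGB image, and the color_shift/intensity values are illustrative.

Python
import numpy as np
from albumentations.augmentations.functional import iso_noise

rng = np.random.default_rng(0)
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)  # RGB input only
noisy = iso_noise(image, color_shift=0.05, intensity=0.5, random_generator=rng)
print(noisy.shape)  # (64, 64, 3)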
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.move_tone_curve","title":"def move_tone_curve (img, low_y, high_y) [view source on GitHub]","text":"

Rescales the relationship between bright and dark areas of the image by manipulating its tone curve.

Parameters:

Name Type Description img np.ndarray

np.ndarray. Any number of channels

low_y float | np.ndarray

per-channel or single y-position of a Bezier control point used to adjust the tone curve, must be in range [0, 1]

high_y float | np.ndarray

per-channel or single y-position of a Bezier control point used to adjust image tone curve, must be in range [0, 1]

Source code in albumentations/augmentations/functional.py Python
@uint8_io\ndef move_tone_curve(\n    img: np.ndarray,\n    low_y: float | np.ndarray,\n    high_y: float | np.ndarray,\n) -> np.ndarray:\n    \"\"\"Rescales the relationship between bright and dark areas of the image by manipulating its tone curve.\n\n    Args:\n        img: np.ndarray. Any number of channels\n        low_y: per-channel or single y-position of a Bezier control point used\n            to adjust the tone curve, must be in range [0, 1]\n        high_y: per-channel or single y-position of a Bezier control point used\n            to adjust image tone curve, must be in range [0, 1]\n\n    \"\"\"\n    t = np.linspace(0.0, 1.0, 256)\n\n    def evaluate_bez(\n        t: np.ndarray,\n        low_y: float | np.ndarray,\n        high_y: float | np.ndarray,\n    ) -> np.ndarray:\n        one_minus_t = 1 - t\n        return (3 * one_minus_t**2 * t * low_y + 3 * one_minus_t * t**2 * high_y + t**3) * 255\n\n    num_channels = get_num_channels(img)\n\n    if np.isscalar(low_y) and np.isscalar(high_y):\n        lut = clip(np.rint(evaluate_bez(t, low_y, high_y)), np.uint8, inplace=False)\n        return sz_lut(img, lut, inplace=False)\n    if isinstance(low_y, np.ndarray) and isinstance(high_y, np.ndarray):\n        luts = clip(\n            np.rint(evaluate_bez(t[:, np.newaxis], low_y, high_y).T),\n            np.uint8,\n            inplace=False,\n        )\n        return cv2.merge(\n            [sz_lut(img[:, :, i], np.ascontiguousarray(luts[i]), inplace=False) for i in range(num_channels)],\n        )\n\n    raise TypeError(\n        f\"low_y and high_y must both be of type float or np.ndarray. Got {type(low_y)} and {type(high_y)}\",\n    )\n
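A usage sketch covering both the scalar and the per-channel form, assuming the import path above. Note that low_y and high_y must either both be scalars or both be arrays; mixing the two raises a TypeError.

Python
import numpy as np
from albumentations.augmentations.functional import move_tone_curve

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)

# One curve applied to every channel: lift shadows, compress highlights.
out_global = move_tone_curve(image, low_y=0.3, high_y=0.8)

# An independent curve per channel.
out_per_channel = move_tone_curve(
    image,
    low_y=np.array([0.2, 0.3, 0.4]),
    high_y=np.array([0.7, 0.8, 0.9]),
)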
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.posterize","title":"def posterize (img, bits) [view source on GitHub]","text":"

Reduce the number of bits for each color channel by keeping only the highest N bits.

This transform performs bit-depth reduction by masking out lower bits, effectively reducing the number of possible values per channel. This creates a posterization effect where similar colors are merged together.

Parameters:

Name Type Description img np.ndarray

Input image. Can be single or multi-channel.

bits Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]]

Number of high bits to keep. Must be in range [1, 7]. Can be either: - A single value to apply the same bit reduction to all channels - A list of values to apply different bit reduction per channel. Length of list must match number of channels in image.

Returns:

Type Description np.ndarray

Image with reduced bit depth. Has same shape and dtype as input.

Note

  • The transform keeps the N highest bits and sets all other bits to 0
  • For example, if bits=3:
    • Original value: 11010110 (214)
    • Keep 3 bits: 11000000 (192)
  • The number of unique colors per channel will be 2^bits
  • Higher bits values = more colors = more subtle effect
  • Lower bits values = fewer colors = more dramatic posterization

Examples:

Python
>>> import numpy as np\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Same posterization for all channels\n>>> result = posterize(image, bits=3)\n>>> # Different posterization per channel\n>>> result = posterize(image, bits=[3, 4, 5])  # RGB channels\n
Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef posterize(img: np.ndarray, bits: Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]]) -> np.ndarray:\n    \"\"\"Reduce the number of bits for each color channel by keeping only the highest N bits.\n\n    This transform performs bit-depth reduction by masking out lower bits, effectively\n    reducing the number of possible values per channel. This creates a posterization\n    effect where similar colors are merged together.\n\n    Args:\n        img: Input image. Can be single or multi-channel.\n        bits: Number of high bits to keep. Must be in range [1, 7].\n            Can be either:\n            - A single value to apply the same bit reduction to all channels\n            - A list of values to apply different bit reduction per channel.\n              Length of list must match number of channels in image.\n\n    Returns:\n        np.ndarray: Image with reduced bit depth. Has same shape and dtype as input.\n\n    Note:\n        - The transform keeps the N highest bits and sets all other bits to 0\n        - For example, if bits=3:\n            - Original value: 11010110 (214)\n            - Keep 3 bits:   11000000 (192)\n        - The number of unique colors per channel will be 2^bits\n        - Higher bits values = more colors = more subtle effect\n        - Lower bits values = fewer colors = more dramatic posterization\n\n    Examples:\n        >>> import numpy as np\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Same posterization for all channels\n        >>> result = posterize(image, bits=3)\n        >>> # Different posterization per channel\n        >>> result = posterize(image, bits=[3, 4, 5])  # RGB channels\n    \"\"\"\n    bits_array = np.uint8(bits)\n\n    if not bits_array.shape or len(bits_array) == 1:\n        lut = np.arange(0, 256, dtype=np.uint8)\n        mask = ~np.uint8(2 ** (8 - bits_array) - 1)\n        lut &= mask\n\n        return sz_lut(img, lut, inplace=False)\n\n    result_img = np.empty_like(img)\n    for i, channel_bits in enumerate(bits_array):\n        lut = np.arange(0, 256, dtype=np.uint8)\n        mask = ~np.uint8(2 ** (8 - channel_bits) - 1)\n        lut &= mask\n\n        result_img[..., i] = sz_lut(img[..., i], lut, inplace=True)\n\n    return result_img\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.prepare_illumination_input","title":"def prepare_illumination_input (img) [view source on GitHub]","text":"

Prepare image for illumination effect.

Parameters:

Name Type Description img np.ndarray

Input image

Returns:

Type Description tuple of
  • float32 image
  • height
  • width
Source code in albumentations/augmentations/functional.py Python
def prepare_illumination_input(img: np.ndarray) -> tuple[np.ndarray, int, int]:\n    \"\"\"Prepare image for illumination effect.\n\n    Args:\n        img: Input image\n\n    Returns:\n        tuple of:\n        - float32 image\n        - height\n        - width\n    \"\"\"\n    result = img.astype(np.float32)\n    height, width = img.shape[:2]\n    return result, height, width\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.random_offset","title":"def random_offset (current_size, total_size, roughness, random_generator) [view source on GitHub]","text":"

Calculate random offset based on current grid size.

Source code in albumentations/augmentations/functional.py Python
def random_offset(\n    current_size: int,\n    total_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Calculate random offset based on current grid size.\"\"\"\n    return (random_generator.random() - 0.5) * (current_size / total_size) ** (roughness / 2)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_beta","title":"def sample_beta (size, params, random_generator) [view source on GitHub]","text":"

Sample from Beta distribution.

The Beta distribution is bounded by [0, 1] and then scaled and shifted to [-scale, scale]. Alpha and beta parameters control the shape of the distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_beta(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Beta distribution.\n\n    The Beta distribution is bounded by [0, 1] and then scaled and shifted to [-scale, scale].\n    Alpha and beta parameters control the shape of the distribution.\n    \"\"\"\n    alpha = random_generator.uniform(*params[\"alpha_range\"])\n    beta = random_generator.uniform(*params[\"beta_range\"])\n    scale = random_generator.uniform(*params[\"scale_range\"])\n\n    # Sample from Beta[0,1] and transform to [-scale,scale]\n    samples = random_generator.beta(alpha, beta, size=size)\n    return (2 * samples - 1) * scale\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_gaussian","title":"def sample_gaussian (size, params, random_generator) [view source on GitHub]","text":"

Sample from Gaussian distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_gaussian(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Gaussian distribution.\"\"\"\n    mean = random_generator.uniform(*params[\"mean_range\"])\n    std = random_generator.uniform(*params[\"std_range\"])\n    return random_generator.normal(mean, std, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_laplace","title":"def sample_laplace (size, params, random_generator) [view source on GitHub]","text":"

Sample from Laplace distribution.

The Laplace distribution is also known as the double exponential distribution. It has heavier tails than the Gaussian distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_laplace(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from Laplace distribution.\n\n    The Laplace distribution is also known as the double exponential distribution.\n    It has heavier tails than the Gaussian distribution.\n    \"\"\"\n    loc = random_generator.uniform(*params[\"mean_range\"])\n    scale = random_generator.uniform(*params[\"scale_range\"])\n    return random_generator.laplace(loc=loc, scale=scale, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_noise","title":"def sample_noise (noise_type, size, params, max_value, random_generator) [view source on GitHub]","text":"

Sample from specific noise distribution.

Source code in albumentations/augmentations/functional.py Python
def sample_noise(\n    noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"],\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    max_value: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Sample from specific noise distribution.\"\"\"\n    if noise_type == \"uniform\":\n        return sample_uniform(size, params, random_generator) * max_value\n    if noise_type == \"gaussian\":\n        return sample_gaussian(size, params, random_generator) * max_value\n    if noise_type == \"laplace\":\n        return sample_laplace(size, params, random_generator) * max_value\n    if noise_type == \"beta\":\n        return sample_beta(size, params, random_generator) * max_value\n\n    raise ValueError(f\"Unknown noise type: {noise_type}\")\n
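A small dispatch example, assuming the import path above; the params keys follow the sample_uniform helper documented below.

Python
import numpy as np
from albumentations.augmentations.functional import sample_noise

rng = np.random.default_rng(0)
# Spatial uniform noise in [-0.1, 0.1], scaled for a float32 image (max_value=1.0).
noise = sample_noise(
    "uniform",
    size=(32, 32),
    params={"ranges": [(-0.1, 0.1)]},
    max_value=1.0,
    random_generator=rng,
)
print(noise.shape)  # (32, 32)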
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sample_uniform","title":"def sample_uniform (size, params, random_generator) [view source on GitHub]","text":"

Sample from uniform distribution.

Parameters:

Name Type Description size tuple[int, ...]

Output shape. If length is 1, generates constant noise per channel.

params dict[str, Any]

Must contain 'ranges' key with list of (min, max) tuples. If only one range is provided, it will be used for all channels.

random_generator np.random.Generator

NumPy random generator instance

Returns:

Type Description np.ndarray | float

Noise array of specified size. For single-channel constant mode, returns scalar instead of array with shape (1,).

Source code in albumentations/augmentations/functional.py Python
def sample_uniform(\n    size: tuple[int, ...],\n    params: dict[str, Any],\n    random_generator: np.random.Generator,\n) -> np.ndarray | float:\n    \"\"\"Sample from uniform distribution.\n\n    Args:\n        size: Output shape. If length is 1, generates constant noise per channel.\n        params: Must contain 'ranges' key with list of (min, max) tuples.\n            If only one range is provided, it will be used for all channels.\n        random_generator: NumPy random generator instance\n\n    Returns:\n        Noise array of specified size. For single-channel constant mode,\n        returns scalar instead of array with shape (1,).\n    \"\"\"\n    if len(size) == 1:  # constant mode\n        ranges = params[\"ranges\"]\n        num_channels = size[0]\n\n        if len(ranges) == 1:\n            ranges = ranges * num_channels\n        elif len(ranges) < num_channels:\n            raise ValueError(\n                f\"Not enough ranges provided. Expected {num_channels}, got {len(ranges)}\",\n            )\n\n        return np.array(\n            [random_generator.uniform(low, high) for low, high in ranges[:num_channels]],\n        )\n\n    # use first range for spatial noise\n    low, high = params[\"ranges\"][0]\n    return random_generator.uniform(low, high, size=size)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.sharpen_gaussian","title":"def sharpen_gaussian (img, alpha, kernel_size, sigma) [view source on GitHub]","text":"

Sharpen image using Gaussian blur.

Source code in albumentations/augmentations/functional.py Python
@clipped\n@preserve_channel_dim\ndef sharpen_gaussian(\n    img: np.ndarray,\n    alpha: float,\n    kernel_size: int,\n    sigma: float,\n) -> np.ndarray:\n    \"\"\"Sharpen image using Gaussian blur.\"\"\"\n    blurred = cv2.GaussianBlur(\n        img,\n        ksize=(kernel_size, kernel_size),\n        sigmaX=sigma,\n        sigmaY=sigma,\n    )\n    # Unsharp mask formula: original + alpha * (original - blurred)\n    # This is equivalent to: original * (1 + alpha) - alpha * blurred\n    return img + alpha * (img - blurred)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.shot_noise","title":"def shot_noise (img, scale, random_generator) [view source on GitHub]","text":"

Apply shot noise to the image by simulating photon counting in linear light space.

This function simulates photon shot noise, which occurs due to the quantum nature of light. The process: 1. Converts image to linear light space (removes gamma correction) 2. Scales pixel values to represent expected photon counts 3. Samples actual photon counts from Poisson distribution 4. Converts back to display space (reapplies gamma)

The simulation is performed in linear light space because photon shot noise is a physical process that occurs before gamma correction is applied by cameras/displays.

Parameters:

Name Type Description img np.ndarray

Input image in range [0, 1]. Can be single or multi-channel.

scale float

Reciprocal of the number of photons (noise intensity). - Larger values = fewer photons = more noise - Smaller values = more photons = less noise For example: - scale = 0.1 simulates ~100 photons per unit intensity - scale = 10.0 simulates ~0.1 photons per unit intensity

random_generator np.random.Generator

NumPy random generator for Poisson sampling

Returns:

Type Description Image with shot noise applied, same shape and range [0, 1] as input. The noise characteristics will follow Poisson statistics in linear space:
  • Variance equals mean in linear space
  • More noise in brighter regions (but less relative noise)
  • Less noise in darker regions (but more relative noise)

Note

  • Uses gamma value of 2.2 for linear/display space conversion
  • Adds small constant (1e-6) to avoid issues with zero values
  • Clips final values to [0, 1] range
  • Operates on the image in-place for memory efficiency
  • Preserves float32 precision throughout calculations

References

  • https://en.wikipedia.org/wiki/Shot_noise
  • https://en.wikipedia.org/wiki/Gamma_correction
Source code in albumentations/augmentations/functional.py Python
@preserve_channel_dim\n@float32_io\ndef shot_noise(\n    img: np.ndarray,\n    scale: float,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply shot noise to the image by simulating photon counting in linear light space.\n\n    This function simulates photon shot noise, which occurs due to the quantum nature of light.\n    The process:\n    1. Converts image to linear light space (removes gamma correction)\n    2. Scales pixel values to represent expected photon counts\n    3. Samples actual photon counts from Poisson distribution\n    4. Converts back to display space (reapplies gamma)\n\n    The simulation is performed in linear light space because photon shot noise is a physical\n    process that occurs before gamma correction is applied by cameras/displays.\n\n    Args:\n        img: Input image in range [0, 1]. Can be single or multi-channel.\n        scale: Reciprocal of the number of photons (noise intensity).\n            - Larger values = fewer photons = more noise\n            - Smaller values = more photons = less noise\n            For example:\n            - scale = 0.1 simulates ~100 photons per unit intensity\n            - scale = 10.0 simulates ~0.1 photons per unit intensity\n        random_generator: NumPy random generator for Poisson sampling\n\n    Returns:\n        Image with shot noise applied, same shape and range [0, 1] as input.\n        The noise characteristics will follow Poisson statistics in linear space:\n        - Variance equals mean in linear space\n        - More noise in brighter regions (but less relative noise)\n        - Less noise in darker regions (but more relative noise)\n\n    Note:\n        - Uses gamma value of 2.2 for linear/display space conversion\n        - Adds small constant (1e-6) to avoid issues with zero values\n        - Clips final values to [0, 1] range\n        - Operates on the image in-place for memory efficiency\n        - Preserves float32 precision throughout calculations\n\n    References:\n        - https://en.wikipedia.org/wiki/Shot_noise\n        - https://en.wikipedia.org/wiki/Gamma_correction\n    \"\"\"\n    # Apply inverse gamma correction to work in linear space\n    img_linear = cv2.pow(img, 2.2)\n\n    # Scale image values and add small constant to avoid zero values\n    scaled_img = (img_linear + scale * 1e-6) / scale\n\n    # Generate Poisson noise\n    noisy_img = multiply_by_constant(\n        random_generator.poisson(scaled_img).astype(np.float32),\n        scale,\n        inplace=True,\n    )\n\n    # Scale back and apply gamma correction\n    return power(np.clip(noisy_img, 0, 1, out=noisy_img), 1 / 2.2)\n
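A minimal sketch, assuming the import path above; the scale values only illustrate the low-noise and high-noise ends of the range.

Python
import numpy as np
from albumentations.augmentations.functional import shot_noise

rng = np.random.default_rng(0)
image = np.random.rand(64, 64, 3).astype(np.float32)  # values in [0, 1]
slightly_noisy = shot_noise(image, scale=0.1, random_generator=rng)  # small scale -> more photons -> less noise
very_noisy = shot_noise(image, scale=10.0, random_generator=rng)     # large scale -> fewer photons -> more noise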
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.slic","title":"def slic (image, n_segments, compactness=10.0, max_iterations=10) [view source on GitHub]","text":"

Simple Linear Iterative Clustering (SLIC) superpixel segmentation using OpenCV and NumPy.

Parameters:

Name Type Description image np.ndarray

Input image (2D or 3D numpy array).

n_segments int

Approximate number of superpixels to generate.

compactness float

Balance between color proximity and space proximity.

max_iterations int

Maximum number of iterations for k-means.

Returns:

Type Description np.ndarray

Segmentation mask where each superpixel has a unique label.

Source code in albumentations/augmentations/functional.py Python
def slic(\n    image: np.ndarray,\n    n_segments: int,\n    compactness: float = 10.0,\n    max_iterations: int = 10,\n) -> np.ndarray:\n    \"\"\"Simple Linear Iterative Clustering (SLIC) superpixel segmentation using OpenCV and NumPy.\n\n    Args:\n        image (np.ndarray): Input image (2D or 3D numpy array).\n        n_segments (int): Approximate number of superpixels to generate.\n        compactness (float): Balance between color proximity and space proximity.\n        max_iterations (int): Maximum number of iterations for k-means.\n\n    Returns:\n        np.ndarray: Segmentation mask where each superpixel has a unique label.\n    \"\"\"\n    if image.ndim == MONO_CHANNEL_DIMENSIONS:\n        image = image[..., np.newaxis]\n\n    height, width = image.shape[:2]\n    num_pixels = height * width\n\n    # Normalize image to [0, 1] range\n    image_normalized = image.astype(np.float32) / np.max(image + 1e-6)\n\n    # Initialize cluster centers\n    grid_step = int((num_pixels / n_segments) ** 0.5)\n    x_range = np.arange(grid_step // 2, width, grid_step)\n    y_range = np.arange(grid_step // 2, height, grid_step)\n    centers = np.array(\n        [(x, y) for y in y_range for x in x_range if x < width and y < height],\n    )\n\n    # Initialize labels and distances\n    labels = -1 * np.ones((height, width), dtype=np.int32)\n    distances = np.full((height, width), np.inf)\n\n    for _ in range(max_iterations):\n        for i, center in enumerate(centers):\n            y, x = int(center[1]), int(center[0])\n\n            # Define the neighborhood\n            y_low, y_high = max(0, y - grid_step), min(height, y + grid_step + 1)\n            x_low, x_high = max(0, x - grid_step), min(width, x + grid_step + 1)\n\n            # Compute distances\n            crop = image_normalized[y_low:y_high, x_low:x_high]\n            color_diff = crop - image_normalized[y, x]\n            color_distance = np.sum(color_diff**2, axis=-1)\n\n            yy, xx = np.ogrid[y_low:y_high, x_low:x_high]\n            spatial_distance = ((yy - y) ** 2 + (xx - x) ** 2) / (grid_step**2)\n\n            distance = color_distance + compactness * spatial_distance\n\n            mask = distance < distances[y_low:y_high, x_low:x_high]\n            distances[y_low:y_high, x_low:x_high][mask] = distance[mask]\n            labels[y_low:y_high, x_low:x_high][mask] = i\n\n        # Update centers\n        for i in range(len(centers)):\n            mask = labels == i\n            if np.any(mask):\n                centers[i] = np.mean(np.argwhere(mask), axis=0)[::-1]\n\n    return labels\n
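A usage sketch, assuming the import path above; n_segments and max_iterations are illustrative.

Python
import numpy as np
from albumentations.augmentations.functional import slic

image = np.random.randint(0, 256, (120, 120, 3), dtype=np.uint8)
labels = slic(image, n_segments=64, compactness=10.0, max_iterations=5)
print(labels.shape)  # (120, 120) -- an integer label map with one id per superpixel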
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.solarize","title":"def solarize (img, threshold) [view source on GitHub]","text":"

Invert all pixel values above a threshold.

Parameters:

Name Type Description img np.ndarray

The image to solarize. Can be uint8 or float32.

threshold float

Normalized threshold value in range [0, 1]. For uint8 images, pixels above threshold * 255 are inverted; for float32 images, pixels above the threshold are inverted.

Returns:

Type Description np.ndarray

Solarized image.

Note

The threshold is normalized to [0, 1] range for both uint8 and float32 images. For uint8 images, the threshold is internally scaled by 255.

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef solarize(img: np.ndarray, threshold: float) -> np.ndarray:\n    \"\"\"Invert all pixel values above a threshold.\n\n    Args:\n        img: The image to solarize. Can be uint8 or float32.\n        threshold: Normalized threshold value in range [0, 1].\n            For uint8 images: pixels above threshold * 255 are inverted\n            For float32 images: pixels above threshold are inverted\n\n    Returns:\n        Solarized image.\n\n    Note:\n        The threshold is normalized to [0, 1] range for both uint8 and float32 images.\n        For uint8 images, the threshold is internally scaled by 255.\n    \"\"\"\n    dtype = img.dtype\n    max_val = MAX_VALUES_BY_DTYPE[dtype]\n\n    if dtype == np.uint8:\n        lut = [(max_val - i if i >= threshold * max_val else i) for i in range(int(max_val) + 1)]\n\n        prev_shape = img.shape\n        img = sz_lut(img, np.array(lut, dtype=dtype), inplace=False)\n\n        return np.expand_dims(img, -1) if len(prev_shape) != img.ndim else img\n\n    img = img.copy()\n\n    cond = img >= threshold\n    img[cond] = max_val - img[cond]\n    return img\n
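A short example of the normalized threshold, assuming the import path above.

Python
import numpy as np
from albumentations.augmentations.functional import solarize

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
# threshold is normalized: 0.5 means uint8 pixels >= 128 are inverted to 255 - value.
solarized = solarize(image, threshold=0.5)

image_float = np.random.rand(64, 64, 3).astype(np.float32)
solarized_float = solarize(image_float, threshold=0.5)  # float pixels >= 0.5 become 1 - value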
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.square_step","title":"def square_step (pattern, y, x, step, grid_size, roughness, random_generator) [view source on GitHub]","text":"

Compute center value during square step.

Source code in albumentations/augmentations/functional.py Python
def square_step(\n    pattern: np.ndarray,\n    y: int,\n    x: int,\n    step: int,\n    grid_size: int,\n    roughness: float,\n    random_generator: np.random.Generator,\n) -> float:\n    \"\"\"Compute center value during square step.\"\"\"\n    corners = [\n        pattern[y, x],  # top-left\n        pattern[y, x + step],  # top-right\n        pattern[y + step, x],  # bottom-left\n        pattern[y + step, x + step],  # bottom-right\n    ]\n    return sum(corners) / 4.0 + random_offset(\n        step,\n        grid_size,\n        roughness,\n        random_generator,\n    )\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_average","title":"def to_gray_average (img) [view source on GitHub]","text":"

Convert an image to grayscale using the average method.

This function computes the arithmetic mean across all channels for each pixel, resulting in a grayscale representation of the image.

Key aspects of this method: 1. It treats all channels equally, regardless of their perceptual importance. 2. Works with any number of channels, making it versatile for various image types. 3. Simple and fast to compute, but may not accurately represent perceived brightness. 4. For RGB images, the formula is: Gray = (R + G + B) / 3

Note: This method may produce different results compared to weighted methods (like RGB weighted average) which account for human perception of color brightness. It may also produce unexpected results for images with alpha channels or non-color data in additional channels.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array. Can be any number of channels.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array. The output data type matches the input data type.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
def to_gray_average(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the average method.\n\n    This function computes the arithmetic mean across all channels for each pixel,\n    resulting in a grayscale representation of the image.\n\n    Key aspects of this method:\n    1. It treats all channels equally, regardless of their perceptual importance.\n    2. Works with any number of channels, making it versatile for various image types.\n    3. Simple and fast to compute, but may not accurately represent perceived brightness.\n    4. For RGB images, the formula is: Gray = (R + G + B) / 3\n\n    Note: This method may produce different results compared to weighted methods\n    (like RGB weighted average) which account for human perception of color brightness.\n    It may also produce unexpected results for images with alpha channels or\n    non-color data in additional channels.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array. Can be any number of channels.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array. The output data type\n                    matches the input data type.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    return np.mean(img, axis=-1).astype(img.dtype)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_desaturation","title":"def to_gray_desaturation (img) [view source on GitHub]","text":"

Convert an image to grayscale using the desaturation method.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef to_gray_desaturation(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the desaturation method.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    float_image = img.astype(np.float32)\n    return (np.max(float_image, axis=-1) + np.min(float_image, axis=-1)) / 2\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_from_lab","title":"def to_gray_from_lab (img) [view source on GitHub]","text":"

Convert an RGB image to grayscale using the L channel from the LAB color space.

This function converts the RGB image to the LAB color space and extracts the L channel. The LAB color space is designed to approximate human vision, where L represents lightness.

Key aspects of this method: 1. The L channel represents the lightness of each pixel, ranging from 0 (black) to 100 (white). 2. It's more perceptually uniform than RGB, meaning equal changes in L values correspond to roughly equal changes in perceived lightness. 3. The L channel is independent of the color information (A and B channels), making it suitable for grayscale conversion.

This method can be particularly useful when you want a grayscale image that closely matches human perception of lightness, potentially preserving more perceived contrast than simple RGB-based methods.

Parameters:

Name Type Description img np.ndarray

Input RGB image as a numpy array.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array, representing the L (lightness) channel. Values are scaled to match the input image's data type range.

Image types: uint8, float32

Number of channels: 3

Source code in albumentations/augmentations/functional.py Python
@uint8_io\n@clipped\ndef to_gray_from_lab(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an RGB image to grayscale using the L channel from the LAB color space.\n\n    This function converts the RGB image to the LAB color space and extracts the L channel.\n    The LAB color space is designed to approximate human vision, where L represents lightness.\n\n    Key aspects of this method:\n    1. The L channel represents the lightness of each pixel, ranging from 0 (black) to 100 (white).\n    2. It's more perceptually uniform than RGB, meaning equal changes in L values correspond to\n       roughly equal changes in perceived lightness.\n    3. The L channel is independent of the color information (A and B channels), making it\n       suitable for grayscale conversion.\n\n    This method can be particularly useful when you want a grayscale image that closely\n    matches human perception of lightness, potentially preserving more perceived contrast\n    than simple RGB-based methods.\n\n    Args:\n        img (np.ndarray): Input RGB image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array, representing the L (lightness) channel.\n                    Values are scaled to match the input image's data type range.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    return cv2.cvtColor(img, cv2.COLOR_RGB2LAB)[..., 0]\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_max","title":"def to_gray_max (img) [view source on GitHub]","text":"

Convert an image to grayscale using the maximum channel value method.

This function takes the maximum value across all channels for each pixel, resulting in a grayscale image that preserves the brightest parts of the original image.

Key aspects of this method: 1. Works with any number of channels, making it versatile for various image types. 2. For 3-channel (e.g., RGB) images, this method is equivalent to extracting the V (Value) channel from the HSV color space. 3. Preserves the brightest parts of the image but may lose some color contrast information. 4. Simple and fast to compute.

Note: - This method tends to produce brighter grayscale images compared to other conversion methods, as it always selects the highest intensity value from the channels. - For RGB images, it may not accurately represent perceived brightness as it doesn't account for human color perception.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array. Can be any number of channels.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array. The output data type matches the input data type.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
def to_gray_max(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using the maximum channel value method.\n\n    This function takes the maximum value across all channels for each pixel,\n    resulting in a grayscale image that preserves the brightest parts of the original image.\n\n    Key aspects of this method:\n    1. Works with any number of channels, making it versatile for various image types.\n    2. For 3-channel (e.g., RGB) images, this method is equivalent to extracting the V (Value)\n       channel from the HSV color space.\n    3. Preserves the brightest parts of the image but may lose some color contrast information.\n    4. Simple and fast to compute.\n\n    Note:\n    - This method tends to produce brighter grayscale images compared to other conversion methods,\n      as it always selects the highest intensity value from the channels.\n    - For RGB images, it may not accurately represent perceived brightness as it doesn't\n      account for human color perception.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array. Can be any number of channels.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array. The output data type\n                    matches the input data type.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    return np.max(img, axis=-1)\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_pca","title":"def to_gray_pca (img) [view source on GitHub]","text":"

Convert an image to grayscale using Principal Component Analysis (PCA).

This function applies PCA to reduce a multi-channel image to a single channel, effectively creating a grayscale representation that captures the maximum variance in the color data.

Parameters:

Name Type Description img np.ndarray

Input image as a numpy array with shape (height, width, channels).

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array with shape (height, width). If input is uint8, output is uint8 in range [0, 255]. If input is float32, output is float32 in range [0, 1].

Note

This method can potentially preserve more information from the original image compared to standard weighted average methods, as it accounts for the correlations between color channels.

Image types: uint8, float32

Number of channels: any

Source code in albumentations/augmentations/functional.py Python
@clipped\ndef to_gray_pca(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an image to grayscale using Principal Component Analysis (PCA).\n\n    This function applies PCA to reduce a multi-channel image to a single channel,\n    effectively creating a grayscale representation that captures the maximum variance\n    in the color data.\n\n    Args:\n        img (np.ndarray): Input image as a numpy array with shape (height, width, channels).\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array with shape (height, width).\n                    If input is uint8, output is uint8 in range [0, 255].\n                    If input is float32, output is float32 in range [0, 1].\n\n    Note:\n        This method can potentially preserve more information from the original image\n        compared to standard weighted average methods, as it accounts for the\n        correlations between color channels.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n    \"\"\"\n    dtype = img.dtype\n    # Reshape the image to a 2D array of pixels\n    pixels = img.reshape(-1, img.shape[2])\n\n    # Perform PCA\n    pca = PCA(n_components=1)\n    pca_result = pca.fit_transform(pixels)\n\n    # Reshape back to image dimensions and scale to 0-255\n    grayscale = pca_result.reshape(img.shape[:2])\n    grayscale = normalize_per_image(grayscale, \"min_max\")\n\n    return from_float(grayscale, target_dtype=dtype) if dtype == np.uint8 else grayscale\n
"},{"location":"api_reference/augmentations/functional/#albumentations.augmentations.functional.to_gray_weighted_average","title":"def to_gray_weighted_average (img) [view source on GitHub]","text":"

Convert an RGB image to grayscale using the weighted average method.

This function uses OpenCV's cvtColor function with COLOR_RGB2GRAY conversion, which applies the following formula: Y = 0.299*R + 0.587*G + 0.114*B

Parameters:

Name Type Description img np.ndarray

Input RGB image as a numpy array.

Returns:

Type Description np.ndarray

Grayscale image as a 2D numpy array.

Image types: uint8, float32

Number of channels: 3

Source code in albumentations/augmentations/functional.py Python
def to_gray_weighted_average(img: np.ndarray) -> np.ndarray:\n    \"\"\"Convert an RGB image to grayscale using the weighted average method.\n\n    This function uses OpenCV's cvtColor function with COLOR_RGB2GRAY conversion,\n    which applies the following formula:\n    Y = 0.299*R + 0.587*G + 0.114*B\n\n    Args:\n        img (np.ndarray): Input RGB image as a numpy array.\n\n    Returns:\n        np.ndarray: Grayscale image as a 2D numpy array.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n    \"\"\"\n    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)\n
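A quick comparison of three of the grayscale conversions documented in this module, assuming the import path above.

Python
import numpy as np
from albumentations.augmentations.functional import (
    to_gray_average,
    to_gray_max,
    to_gray_weighted_average,
)

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
gray_mean = to_gray_average(image)               # (R + G + B) / 3
gray_max = to_gray_max(image)                    # per-pixel maximum, i.e. the HSV V channel
gray_weighted = to_gray_weighted_average(image)  # 0.299*R + 0.587*G + 0.114*B
print(gray_mean.shape, gray_max.shape, gray_weighted.shape)  # all (64, 64)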
"},{"location":"api_reference/augmentations/geometric/","title":"Geometric augmentations (augmentations.geometric)","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional","title":"functional","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.adjust_padding_by_position","title":"def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position.

Source code in albumentations/augmentations/geometric/functional.py Python
def adjust_padding_by_position(\n    h_top: int,\n    h_bottom: int,\n    w_left: int,\n    w_right: int,\n    position: PositionType,\n    py_random: np.random.RandomState,\n) -> tuple[int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position.\"\"\"\n    if position == \"center\":\n        return h_top, h_bottom, w_left, w_right\n\n    if position == \"top_left\":\n        return 0, h_top + h_bottom, 0, w_left + w_right\n\n    if position == \"top_right\":\n        return 0, h_top + h_bottom, w_left + w_right, 0\n\n    if position == \"bottom_left\":\n        return h_top + h_bottom, 0, 0, w_left + w_right\n\n    if position == \"bottom_right\":\n        return h_top + h_bottom, 0, w_left + w_right, 0\n\n    if position == \"random\":\n        h_pad = h_top + h_bottom\n        w_pad = w_left + w_right\n        h_top = py_random.randint(0, h_pad)\n        h_bottom = h_pad - h_top\n        w_left = py_random.randint(0, w_pad)\n        w_right = w_pad - w_left\n        return h_top, h_bottom, w_left, w_right\n\n    raise ValueError(f\"Unknown position: {position}\")\n
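A quick illustration of how the position argument redistributes padding, assuming the function is importable from albumentations.augmentations.geometric.functional as the source path suggests; py_random is only consulted for position="random", so None is passed here.

Python
from albumentations.augmentations.geometric.functional import adjust_padding_by_position

# 10 px of vertical and 6 px of horizontal padding, pushed to the top-left corner:
print(adjust_padding_by_position(5, 5, 3, 3, position="top_left", py_random=None))
# -> (0, 10, 0, 6): all padding moves to the bottom and right edges

print(adjust_padding_by_position(5, 5, 3, 3, position="center", py_random=None))
# -> (5, 5, 3, 3): unchanged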
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.almost_equal_intervals","title":"def almost_equal_intervals (n, parts) [view source on GitHub]","text":"

Generates an array of nearly equal integer intervals that sum up to n.

This function divides the number n into the requested number of nearly equal parts. It ensures that the sum of all parts equals n and that the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

Name Type Description n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

Type Description np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)\narray([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n>>> almost_equal_intervals(16, 4)\narray([4, 4, 4, 4])  # Splits 16 into four equal parts\n
Source code in albumentations/augmentations/geometric/functional.py Python
def almost_equal_intervals(n: int, parts: int) -> np.ndarray:\n    \"\"\"Generates an array of nearly equal integer intervals that sum up to `n`.\n\n    This function divides the number `n` into `parts` nearly equal parts. It ensures that\n    the sum of all parts equals `n`, and the difference between any two parts is at most one.\n    This is useful for distributing a total amount into nearly equal discrete parts.\n\n    Args:\n        n (int): The total value to be split.\n        parts (int): The number of parts to split into.\n\n    Returns:\n        np.ndarray: An array of integers where each integer represents the size of a part.\n\n    Example:\n        >>> almost_equal_intervals(20, 3)\n        array([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n        >>> almost_equal_intervals(16, 4)\n        array([4, 4, 4, 4])  # Splits 16 into four equal parts\n    \"\"\"\n    part_size, remainder = divmod(n, parts)\n    # Create an array with the base part size and adjust the first `remainder` parts by adding 1\n    return np.array(\n        [part_size + 1 if i < remainder else part_size for i in range(parts)],\n    )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.apply_affine_to_points","title":"def apply_affine_to_points (points, matrix) [view source on GitHub]","text":"

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

Name Type Description points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

Type Description np.ndarray

Transformed points with shape (N, 2).

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"points\")\ndef apply_affine_to_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply affine transformation to a set of points.\n\n    This function handles potential division by zero by replacing zero values\n    in the homogeneous coordinate with a small epsilon value.\n\n    Args:\n        points (np.ndarray): Array of points with shape (N, 2).\n        matrix (np.ndarray): 3x3 affine transformation matrix.\n\n    Returns:\n        np.ndarray: Transformed points with shape (N, 2).\n    \"\"\"\n    homogeneous_points = np.column_stack([points, np.ones(points.shape[0])])\n    transformed_points = homogeneous_points @ matrix.T\n\n    # Handle potential division by zero\n    epsilon = np.finfo(transformed_points.dtype).eps\n    transformed_points[:, 2] = np.where(\n        np.abs(transformed_points[:, 2]) < epsilon,\n        np.sign(transformed_points[:, 2]) * epsilon,\n        transformed_points[:, 2],\n    )\n\n    return transformed_points[:, :2] / transformed_points[:, 2:]\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine","title":"def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:
  1. Calculates the necessary padding to avoid information loss
  2. Applies padding to the bounding boxes
  3. Adjusts the transformation matrix to account for padding
  4. Applies the affine transformation
  5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

Name Type Description bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

Type Description np.ndarray

Transformed and normalized bounding boxes

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine(\n    bboxes: np.ndarray,\n    matrix: np.ndarray,\n    rotate_method: Literal[\"largest_box\", \"ellipse\"],\n    image_shape: tuple[int, int],\n    border_mode: int,\n    output_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes.\n\n    For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:\n    1. Calculates necessary padding to avoid information loss\n    2. Applies padding to the bounding boxes\n    3. Adjusts the transformation matrix to account for padding\n    4. Applies the affine transformation\n    5. Validates the transformed bounding boxes\n\n    For other border modes, it directly applies the affine transformation without padding.\n\n    Args:\n        bboxes (np.ndarray): Input bounding boxes\n        matrix (np.ndarray): Affine transformation matrix\n        rotate_method (str): Method for rotating bounding boxes ('largest_box' or 'ellipse')\n        image_shape (Sequence[int]): Shape of the input image\n        border_mode (int): OpenCV border mode\n        output_shape (Sequence[int]): Shape of the output image\n\n    Returns:\n        np.ndarray: Transformed and normalized bounding boxes\n    \"\"\"\n    if is_identity_matrix(matrix):\n        return bboxes\n\n    bboxes = denormalize_bboxes(bboxes, image_shape)\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        bboxes = generate_reflected_bboxes(\n            bboxes,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Apply affine transform\n    if rotate_method == \"largest_box\":\n        transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n    elif rotate_method == \"ellipse\":\n        transformed_bboxes = bboxes_affine_ellipse(bboxes, matrix)\n    else:\n        raise ValueError(f\"Method {rotate_method} is not a valid rotation method.\")\n\n    # Validate and normalize bboxes\n    validated_bboxes = validate_bboxes(transformed_bboxes, output_shape)\n\n    return normalize_bboxes(validated_bboxes, output_shape)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine_ellipse","title":"def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_ellipse(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes using an ellipse approximation method.\n\n    This function transforms bounding boxes by approximating each box with an ellipse,\n    transforming points along the ellipse's circumference, and then computing the\n    new bounding box that encloses the transformed ellipse.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The ellipse approximation method can provide a tighter bounding box compared to the\n          largest box method, especially for rotations.\n        - 360 points are used to approximate each ellipse, which provides a good balance between\n          accuracy and computational efficiency.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method may be more suitable for objects that are roughly elliptical in shape.\n    \"\"\"\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n    bbox_width = (x_max - x_min) / 2\n    bbox_height = (y_max - y_min) / 2\n    center_x = x_min + bbox_width\n    center_y = y_min + bbox_height\n\n    angles = np.arange(0, 360, dtype=np.float32)\n    cos_angles = np.cos(np.radians(angles))\n    sin_angles = np.sin(np.radians(angles))\n\n    # Generate points for all ellipses at once\n    x = bbox_width[:, np.newaxis] * sin_angles + center_x[:, np.newaxis]\n    y = bbox_height[:, np.newaxis] * cos_angles + center_y[:, np.newaxis]\n    points = np.stack([x, y], axis=-1).reshape(-1, 2)\n\n    # Transform all points at once using the helper function\n    transformed_points = apply_affine_to_points(points, matrix)\n\n    transformed_points = transformed_points.reshape(len(bboxes), -1, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_points[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_points[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_points[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_points[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_affine_largest_box","title":"def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called \"largest box\" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n>>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n>>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n>>> print(transformed_bboxes)\n[[ 25.  25.  45.  45.   1.]\n [ 65.  65.  85.  85.   2.]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_largest_box(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes and return the largest enclosing boxes.\n\n    This function transforms each corner of every bounding box using the given affine transformation\n    matrix, then computes the new bounding boxes that fully enclose the transformed corners.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The resulting bounding boxes are the smallest axis-aligned boxes that completely\n          enclose the transformed original boxes. They may be larger than the minimal possible\n          bounding box if the original box becomes rotated.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method is called \"largest box\" because it returns the largest axis-aligned box\n          that encloses all corners of the transformed bounding box.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n        >>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n        >>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n        >>> print(transformed_bboxes)\n        [[ 25.  25.  45.  45.   1.]\n         [ 65.  65.  85.  85.   2.]]\n    \"\"\"\n    # Extract corners of all bboxes\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    corners = (\n        np.array([[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]).transpose(2, 0, 1).reshape(-1, 2)\n    )\n\n    # Transform all corners at once\n    transformed_corners = apply_affine_to_points(corners, matrix).reshape(-1, 4, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_corners[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_corners[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_corners[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_corners[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_d4","title":"def bboxes_d4 (bboxes, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a bounding box.

The function transforms a bounding box according to the specified group member from the D_4 group. These transformations include rotations and reflections, specified to work on an image's bounding box given its dimensions.

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • np.ndarray: The transformed bounding boxes, with the same shape as the input.
  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bboxes_d4(np.array([[0.1, 0.2, 0.6, 0.7]]), 'r90') rotates the bounding box 90 degrees counterclockwise in normalized coordinates.
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_d4(\n    bboxes: np.ndarray,\n    group_member: D4Type,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a bounding box.\n\n    The function transforms a bounding box according to the specified group member from the `D_4` group.\n    These transformations include rotations and reflections, specified to work on an image's bounding box given\n    its dimensions.\n\n    Parameters:\n    -  bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n    - group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.\n\n    Returns:\n    - BoxInternalType: The transformed bounding box.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Applying a 90-degree rotation:\n      `bbox_d4((10, 20, 110, 120), 'r90')`\n      This would rotate the bounding box 90 degrees within a 100x100 image.\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: bboxes_rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: bboxes_rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: bboxes_rot90(x, 3),  # Rotate 270 degrees\n        \"v\": lambda x: bboxes_vflip(x),  # Vertical flip\n        \"hvt\": lambda x: bboxes_transpose(\n            bboxes_rot90(x, 2),\n        ),  # Reflect over anti-diagonal\n        \"h\": lambda x: bboxes_hflip(x),  # Horizontal flip\n        \"t\": lambda x: bboxes_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](bboxes)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_grid_shuffle","title":"def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

Type Description np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n>>> tiles = np.array([\n...     [0, 0, 50, 50],    # top-left tile\n...     [0, 50, 50, 100],  # top-right tile\n...     [50, 0, 100, 50],  # bottom-left tile\n...     [50, 50, 100, 100] # bottom-right tile\n... ])\n>>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n>>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n>>> # Result may contain multiple boxes if the original box was split\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_grid_shuffle(\n    bboxes: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int],\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Apply grid shuffle transformation to bounding boxes.\n\n    This function transforms bounding boxes according to a grid shuffle operation. It handles cases\n    where bounding boxes may be split into multiple components after shuffling and applies\n    filtering based on minimum area and visibility requirements.\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4+) where N is the number of boxes.\n               Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents\n               optional additional fields (e.g., class_id, score).\n        tiles: Array of tile coordinates with shape (M, 4) where M is the number of tiles.\n               Each tile is in format [start_y, start_x, end_y, end_x].\n        mapping: List of indices defining how tiles should be rearranged. Each index i in the list\n                contains the index of the tile that should be moved to position i.\n        image_shape: Shape of the image as (height, width).\n        min_area: Minimum area threshold in pixels. If a component's area after shuffling is\n                 smaller than this value, it will be filtered out. If None, no area filtering\n                 is applied.\n        min_visibility: Minimum visibility ratio threshold in range [0, 1]. Calculated as\n                       (component_area / original_area). If a component's visibility is lower\n                       than this value, it will be filtered out. If None, no visibility\n                       filtering is applied.\n\n    Returns:\n        np.ndarray: Array of transformed bounding boxes with shape (K, 4+) where K is the\n                   number of valid components after shuffling and filtering. The format of\n                   each box matches the input format, preserving any additional fields.\n                   If no valid components remain after filtering, returns an empty array\n                   with shape (0, C) where C matches the input column count.\n\n    Note:\n        - The function converts bboxes to masks before applying the transformation to handle\n          cases where boxes may be split into multiple components.\n        - After shuffling, each component is validated against min_area and min_visibility\n          requirements independently.\n        - Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and\n          copied to all components derived from the same original bbox.\n        - Empty input arrays are handled gracefully and return empty arrays of the\n          appropriate shape.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n        >>> tiles = np.array([\n        ...     [0, 0, 50, 50],    # top-left tile\n        ...     [0, 50, 50, 100],  # top-right tile\n        ...     [50, 0, 100, 50],  # bottom-left tile\n        ...     [50, 50, 100, 100] # bottom-right tile\n        ... 
])\n        >>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n        >>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n        >>> # Result may contain multiple boxes if the original box was split\n    \"\"\"\n    # Convert bboxes to masks\n    masks = masks_from_bboxes(bboxes, image_shape)\n\n    # Apply grid shuffle to each mask and handle split components\n    all_component_masks = []\n    extra_bbox_data = []  # Store additional bbox data for each component\n\n    for idx, mask in enumerate(masks):\n        original_area = np.sum(mask)  # Get original mask area\n\n        # Shuffle the mask\n        shuffled_mask = swap_tiles_on_image(mask, tiles, mapping)\n\n        # Find connected components\n        num_components, components = cv2.connectedComponents(\n            shuffled_mask.astype(np.uint8),\n        )\n\n        # For each component, create a separate binary mask\n        for comp_idx in range(1, num_components):  # Skip background (0)\n            component_mask = (components == comp_idx).astype(np.uint8)\n\n            # Calculate area and visibility ratio\n            component_area = np.sum(component_mask)\n            # Check if component meets minimum requirements\n            if is_valid_component(\n                component_area,\n                original_area,\n                min_area,\n                min_visibility,\n            ):\n                all_component_masks.append(component_mask)\n                # Append additional bbox data for this component\n                if bboxes.shape[1] > NUM_BBOXES_COLUMNS_IN_ALBUMENTATIONS:\n                    extra_bbox_data.append(bboxes[idx, 4:])\n\n    # Convert all component masks to bboxes\n    if all_component_masks:\n        all_component_masks = np.array(all_component_masks)\n        shuffled_bboxes = bboxes_from_masks(all_component_masks)\n\n        # Add back additional bbox data if present\n        if extra_bbox_data:\n            extra_bbox_data = np.array(extra_bbox_data)\n            return np.column_stack([shuffled_bboxes, extra_bbox_data])\n    else:\n        # Handle case where no valid components were found\n        return np.zeros((0, bboxes.shape[1]), dtype=bboxes.dtype)\n\n    return shuffled_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_hflip","title":"def bboxes_hflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_hflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally around the y-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of horizontally flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 0] = 1 - bboxes[:, 2]  # new x_min = 1 - x_max\n    flipped_bboxes[:, 2] = 1 - bboxes[:, 0]  # new x_max = 1 - x_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_rot90","title":"def bboxes_rot90 (bboxes, factor) [view source on GitHub]","text":"

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor Literal[0, 1, 2, 3]

Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.

Returns:

Type Description np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_rot90(bboxes: np.ndarray, factor: Literal[0, 1, 2, 3]) -> np.ndarray:\n    \"\"\"Rotates bounding boxes by 90 degrees CCW (see np.rot90)\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n        factor: Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.\n\n    Returns:\n        np.ndarray: A numpy array of rotated bounding boxes with the same shape as input.\n    \"\"\"\n    if factor == 0:\n        return bboxes\n\n    rotated_bboxes = bboxes.copy()\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    if factor == 1:\n        rotated_bboxes[:, 0] = y_min\n        rotated_bboxes[:, 1] = 1 - x_max\n        rotated_bboxes[:, 2] = y_max\n        rotated_bboxes[:, 3] = 1 - x_min\n    elif factor == ROT90_180_FACTOR:\n        rotated_bboxes[:, 0] = 1 - x_max\n        rotated_bboxes[:, 1] = 1 - y_max\n        rotated_bboxes[:, 2] = 1 - x_min\n        rotated_bboxes[:, 3] = 1 - y_min\n    elif factor == ROT90_270_FACTOR:\n        rotated_bboxes[:, 0] = 1 - y_max\n        rotated_bboxes[:, 1] = x_min\n        rotated_bboxes[:, 2] = 1 - y_min\n        rotated_bboxes[:, 3] = x_max\n\n    return rotated_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_transpose","title":"def bboxes_transpose (bboxes) [view source on GitHub]","text":"

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_transpose(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Transpose bounding boxes by swapping x and y coordinates.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of transposed bounding boxes with the same shape as input.\n    \"\"\"\n    transposed_bboxes = bboxes.copy()\n    transposed_bboxes[:, [0, 1, 2, 3]] = bboxes[:, [1, 0, 3, 2]]\n\n    return transposed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.bboxes_vflip","title":"def bboxes_vflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_vflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes vertically around the x-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of vertically flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 1] = 1 - bboxes[:, 3]  # new y_min = 1 - y_max\n    flipped_bboxes[:, 3] = 1 - bboxes[:, 1]  # new y_max = 1 - y_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.calculate_affine_transform_padding","title":"def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]","text":"

Calculate the necessary padding for an affine transformation to avoid empty spaces.

Source code in albumentations/augmentations/geometric/functional.py Python
def calculate_affine_transform_padding(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate the necessary padding for an affine transformation to avoid empty spaces.\"\"\"\n    height, width = image_shape[:2]\n\n    # Check for identity transform\n    if is_identity_matrix(matrix):\n        return (0, 0, 0, 0)\n\n    # Original corners\n    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]])\n\n    # Transform corners\n    transformed_corners = apply_affine_to_points(corners, matrix)\n\n    # Ensure transformed_corners is 2D\n    transformed_corners = transformed_corners.reshape(-1, 2)\n\n    # Find box that includes both original and transformed corners\n    all_corners = np.vstack((corners, transformed_corners))\n    min_x, min_y = all_corners.min(axis=0)\n    max_x, max_y = all_corners.max(axis=0)\n\n    # Compute the inverse transform\n    inverse_matrix = np.linalg.inv(matrix)\n\n    # Apply inverse transform to all corners of the bounding box\n    bbox_corners = np.array(\n        [[min_x, min_y], [max_x, min_y], [max_x, max_y], [min_x, max_y]],\n    )\n    inverse_corners = apply_affine_to_points(bbox_corners, inverse_matrix).reshape(\n        -1,\n        2,\n    )\n\n    min_x, min_y = inverse_corners.min(axis=0)\n    max_x, max_y = inverse_corners.max(axis=0)\n\n    pad_left = max(0, math.ceil(0 - min_x))\n    pad_right = max(0, math.ceil(max_x - width))\n    pad_top = max(0, math.ceil(0 - min_y))\n    pad_bottom = max(0, math.ceil(max_y - height))\n\n    return pad_left, pad_right, pad_top, pad_bottom\n
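A hedged sketch: shifting the content 10 pixels to the right requires 10 pixels of left padding to avoid exposing empty space (the return order is pad_left, pad_right, pad_top, pad_bottom):

Python
>>> import numpy as np\n>>> matrix = np.array([[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # translate 10 px along x\n>>> calculate_affine_transform_padding(matrix, (100, 200))  # image_shape is (height, width)\n(10, 0, 0, 0)\n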
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.center","title":"def center (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image. Used for images, masks, and keypoints.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates if image. Used by images, masks and keypoints.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2 - 0.5, height / 2 - 0.5\n
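For example, for a (height, width) of (100, 200) the pixel-grid center is:

Python
>>> center((100, 200))\n(99.5, 49.5)\n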
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.center_bbox","title":"def center_bbox (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates for of image for bounding boxes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2, height / 2\n
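For the same (100, 200) image, the bounding-box center uses the half-extent convention without the 0.5 offset:

Python
>>> center_bbox((100, 200))\n(100.0, 50.0)\n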
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.compute_tps_weights","title":"def compute_tps_weights (src_points, dst_points) [view source on GitHub]","text":"

Compute Thin Plate Spline weights.

Parameters:

Name Type Description src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into: 1. Nonlinear part (controlled by kernel weights) 2. Affine part (global scaling, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_tps_weights(\n    src_points: np.ndarray,\n    dst_points: np.ndarray,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute Thin Plate Spline weights.\n\n    Args:\n        src_points: Source control points with shape (num_points, 2)\n        dst_points: Destination control points with shape (num_points, 2)\n\n    Returns:\n        tuple of:\n        - nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)\n        - affine_weights: Weights for affine transformation (3, 2)\n            [constant term, x scale/shear, y scale/shear]\n\n    Note:\n        The TPS interpolation is decomposed into:\n        1. Nonlinear part (controlled by kernel weights)\n        2. Affine part (global scaling, rotation, translation)\n    \"\"\"\n    num_points = src_points.shape[0]\n\n    # Compute pairwise distances\n    distances = np.linalg.norm(src_points[:, None] - src_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    # Add small epsilon to avoid log(0)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Construct affine terms matrix [1, x, y]\n    affine_terms = np.ones((num_points, 3))\n    affine_terms[:, 1:] = src_points\n\n    # Build system matrix\n    system_matrix = np.zeros((num_points + 3, num_points + 3))\n    system_matrix[:num_points, :num_points] = kernel_matrix\n    system_matrix[:num_points, num_points:] = affine_terms\n    system_matrix[num_points:, :num_points] = affine_terms.T\n\n    # Right-hand side of the system\n    target_coords = np.zeros((num_points + 3, 2))\n    target_coords[:num_points] = dst_points\n\n    # Solve the system for both x and y coordinates\n    all_weights = np.linalg.solve(system_matrix, target_coords)\n\n    # Split weights into nonlinear and affine components\n    nonlinear_weights = all_weights[:num_points]\n    affine_weights = all_weights[num_points:]\n\n    return nonlinear_weights, affine_weights\n
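A shape-only sketch with four control points on a unit square shifted by a constant offset (only the returned shapes are checked here):

Python
>>> import numpy as np\n>>> src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])\n>>> dst = src + 0.1  # uniform shift of all control points\n>>> nonlinear, affine = compute_tps_weights(src, dst)\n>>> nonlinear.shape, affine.shape\n((4, 2), (3, 2))\n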
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.compute_transformed_image_bounds","title":"def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]","text":"

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing: - min_coords: An array with the minimum x and y coordinates. - max_coords: An array with the maximum x and y coordinates.

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_transformed_image_bounds(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute the bounds of an image after applying an affine transformation.\n\n    Args:\n        matrix (np.ndarray): The 3x3 affine transformation matrix.\n        image_shape (Tuple[int, int]): The shape of the image as (height, width).\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing:\n            - min_coords: An array with the minimum x and y coordinates.\n            - max_coords: An array with the maximum x and y coordinates.\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Define the corners of the image\n    corners = np.array([[0, 0, 1], [width, 0, 1], [width, height, 1], [0, height, 1]])\n\n    # Transform the corners\n    transformed_corners = corners @ matrix.T\n    transformed_corners = transformed_corners[:, :2] / transformed_corners[:, 2:]\n\n    # Calculate the bounding box of the transformed corners\n    min_coords = np.floor(transformed_corners.min(axis=0)).astype(int)\n    max_coords = np.ceil(transformed_corners.max(axis=0)).astype(int)\n\n    return min_coords, max_coords\n
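An illustrative example with a uniform 2x scaling matrix on a (100, 200) image:

Python
>>> import numpy as np\n>>> matrix = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]])\n>>> min_coords, max_coords = compute_transformed_image_bounds(matrix, (100, 200))\n>>> min_coords.tolist(), max_coords.tolist()\n([0, 0], [400, 200])\n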
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_affine_transformation_matrix","title":"def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]","text":"

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description np.ndarray

The resulting 3x3 affine transformation matrix.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_affine_transformation_matrix(\n    translate: XYInt,\n    shear: XYFloat,\n    scale: XYFloat,\n    rotate: float,\n    shift: tuple[float, float],\n) -> np.ndarray:\n    \"\"\"Create an affine transformation matrix combining translation, shear, scale, and rotation.\n\n    Args:\n        translate (dict[str, float]): Translation in x and y directions.\n        shear (dict[str, float]): Shear in x and y directions (in degrees).\n        scale (dict[str, float]): Scale factors for x and y directions.\n        rotate (float): Rotation angle in degrees.\n        shift (tuple[float, float]): Shift to apply before and after transformations.\n\n    Returns:\n        np.ndarray: The resulting 3x3 affine transformation matrix.\n    \"\"\"\n    # Convert angles to radians\n    rotate_rad = np.deg2rad(rotate % 360)\n\n    shear_x_rad = np.deg2rad(shear[\"x\"])\n    shear_y_rad = np.deg2rad(shear[\"y\"])\n\n    # Create individual transformation matrices\n    # 1. Shift to top-left\n    m_shift_topleft = np.array([[1, 0, -shift[0]], [0, 1, -shift[1]], [0, 0, 1]])\n\n    # 2. Scale\n    m_scale = np.array([[scale[\"x\"], 0, 0], [0, scale[\"y\"], 0], [0, 0, 1]])\n\n    # 3. Rotation\n    m_rotate = np.array(\n        [\n            [np.cos(rotate_rad), np.sin(rotate_rad), 0],\n            [-np.sin(rotate_rad), np.cos(rotate_rad), 0],\n            [0, 0, 1],\n        ],\n    )\n\n    # 4. Shear\n    m_shear = np.array(\n        [[1, np.tan(shear_x_rad), 0], [np.tan(shear_y_rad), 1, 0], [0, 0, 1]],\n    )\n\n    # 5. Translation\n    m_translate = np.array([[1, 0, translate[\"x\"]], [0, 1, translate[\"y\"]], [0, 0, 1]])\n\n    # 6. Shift back to center\n    m_shift_center = np.array([[1, 0, shift[0]], [0, 1, shift[1]], [0, 0, 1]])\n\n    # Combine all transformations\n    # The order is important: transformations are applied from right to left\n    m = m_shift_center @ m_translate @ m_shear @ m_rotate @ m_scale @ m_shift_topleft\n\n    # Ensure the last row is exactly [0, 0, 1]\n    m[2] = [0, 0, 1]\n\n    return m\n
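A minimal sketch: with zero rotation, zero shear, unit scale, and no shift, the result reduces to a plain translation matrix:

Python
>>> create_affine_transformation_matrix({\"x\": 10, \"y\": 20}, {\"x\": 0, \"y\": 0}, {\"x\": 1, \"y\": 1}, 0, (0, 0)).tolist()\n[[1.0, 0.0, 10.0], [0.0, 1.0, 20.0], [0.0, 0.0, 1.0]]\n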
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_piecewise_affine_maps","title":"def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]","text":"

Create maps for piecewise affine transformation using OpenCV's remap function.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_piecewise_affine_maps(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    scale: float,\n    absolute_scale: bool,\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray | None, np.ndarray | None]:\n    \"\"\"Create maps for piecewise affine transformation using OpenCV's remap function.\"\"\"\n    height, width = image_shape[:2]\n    nb_rows, nb_cols = grid\n\n    # Input validation\n    if height <= 0 or width <= 0 or nb_rows <= 0 or nb_cols <= 0:\n        raise ValueError(\"Dimensions must be positive\")\n    if scale <= 0:\n        return None, None\n\n    # Create source points grid\n    y = np.linspace(0, height - 1, nb_rows, dtype=np.float32)\n    x = np.linspace(0, width - 1, nb_cols, dtype=np.float32)\n    xx_src, yy_src = np.meshgrid(x, y)\n\n    # Initialize destination maps at full resolution\n    map_x = np.zeros((height, width), dtype=np.float32)\n    map_y = np.zeros((height, width), dtype=np.float32)\n\n    # Generate jitter for control points\n    jitter_scale = scale / 3 if absolute_scale else scale * min(width, height) / 3\n\n    jitter = random_generator.normal(0, jitter_scale, (nb_rows, nb_cols, 2)).astype(\n        np.float32,\n    )\n\n    # Create control points with jitter\n    control_points = np.zeros((nb_rows * nb_cols, 4), dtype=np.float32)\n    for i in range(nb_rows):\n        for j in range(nb_cols):\n            idx = i * nb_cols + j\n            # Source points\n            control_points[idx, 0] = xx_src[i, j]\n            control_points[idx, 1] = yy_src[i, j]\n            # Destination points with jitter\n            control_points[idx, 2] = np.clip(\n                xx_src[i, j] + jitter[i, j, 1],\n                0,\n                width - 1,\n            )\n            control_points[idx, 3] = np.clip(\n                yy_src[i, j] + jitter[i, j, 0],\n                0,\n                height - 1,\n            )\n\n    # Create full resolution maps\n    for i in range(height):\n        for j in range(width):\n            # Find nearest control points and interpolate\n            dx = j - control_points[:, 0]\n            dy = i - control_points[:, 1]\n            dist = dx * dx + dy * dy\n            weights = 1 / (dist + 1e-8)\n            weights = weights / np.sum(weights)\n\n            map_x[i, j] = np.sum(weights * control_points[:, 2])\n            map_y[i, j] = np.sum(weights * control_points[:, 3])\n\n    # Ensure output is within bounds\n    map_x = np.clip(map_x, 0, width - 1, out=map_x)\n    map_y = np.clip(map_y, 0, height - 1, out=map_y)\n\n    return map_x, map_y\n
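A shape-only sketch on a small image (a seeded generator is used purely for reproducibility):

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> map_x, map_y = create_piecewise_affine_maps((20, 30), (3, 3), scale=0.05, absolute_scale=False, random_generator=rng)\n>>> map_x.shape, map_y.shape\n((20, 30), (20, 30))\n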
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.create_shape_groups","title":"def create_shape_groups (tiles) [view source on GitHub]","text":"

Groups tiles by their shape and stores the indices for each shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:\n    \"\"\"Groups tiles by their shape and stores the indices for each shape.\"\"\"\n    shape_groups = defaultdict(list)\n    for index, (start_y, start_x, end_y, end_x) in enumerate(tiles):\n        shape = (end_y - start_y, end_x - start_x)\n        shape_groups[shape].append(index)\n    return shape_groups\n
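An illustrative grouping of three tiles, two of which share the same (height, width):

Python
>>> import numpy as np\n>>> tiles = np.array([[0, 0, 50, 50], [0, 50, 50, 100], [50, 0, 100, 100]])\n>>> groups = create_shape_groups(tiles)\n>>> groups[(50, 50)], groups[(50, 100)]\n([0, 1], [2])\n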
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.d4","title":"def d4 (img, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
  • 'e': Identity (no transformation).
  • 'r90': Rotate 90 degrees counterclockwise.
  • 'r180': Rotate 180 degrees.
  • 'r270': Rotate 270 degrees counterclockwise.
  • 'v': Vertical flip.
  • 'hvt': Transpose over second diagonal
  • 'h': Horizontal flip.
  • 't': Transpose (reflect over the main diagonal).
  • np.ndarray: The transformed image array.
  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to an image array.\n\n    This function manipulates an image using transformations such as rotations and flips,\n    corresponding to the `D_4` dihedral group symmetry operations.\n    Each transformation is identified by a unique group member code.\n\n    Parameters:\n    - img (np.ndarray): The input image array to transform.\n    - group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:\n      - 'e': Identity (no transformation).\n      - 'r90': Rotate 90 degrees counterclockwise.\n      - 'r180': Rotate 180 degrees.\n      - 'r270': Rotate 270 degrees counterclockwise.\n      - 'v': Vertical flip.\n      - 'hvt': Transpose over second diagonal\n      - 'h': Horizontal flip.\n      - 't': Transpose (reflect over the main diagonal).\n\n    Returns:\n    - np.ndarray: The transformed image array.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Rotating an image by 90 degrees:\n      `transformed_image = d4(original_image, 'r90')`\n    - Applying a horizontal flip to an image:\n      `transformed_image = d4(original_image, 'h')`\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: rot90(x, 3),  # Rotate 270 degrees\n        \"v\": vflip,  # Vertical flip\n        \"hvt\": lambda x: transpose(rot90(x, 2)),  # Reflect over anti-diagonal\n        \"h\": hflip,  # Horizontal flip\n        \"t\": transpose,  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](img)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.distort_image","title":"def distort_image (image, generated_mesh, interpolation) [view source on GitHub]","text":"

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

Name Type Description image np.ndarray

The input image to be distorted. Can be a 2D grayscale image or a 3D color image.

generated_mesh np.ndarray

A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.

interpolation int

Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

Type Description np.ndarray

The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n>>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n>>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n>>> distorted.shape\n(100, 100, 3)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@preserve_channel_dim\ndef distort_image(\n    image: np.ndarray,\n    generated_mesh: np.ndarray,\n    interpolation: int,\n) -> np.ndarray:\n    \"\"\"Apply perspective distortion to an image based on a generated mesh.\n\n    This function applies a perspective transformation to each cell of the image defined by the\n    generated mesh. The distortion is applied using OpenCV's perspective transformation and\n    blending techniques.\n\n    Args:\n        image (np.ndarray): The input image to be distorted. Can be a 2D grayscale image or a\n                            3D color image.\n        generated_mesh (np.ndarray): A 2D array where each row represents a quadrilateral cell\n                                    as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4].\n                                    The first four values define the source rectangle, and the last eight values\n                                    define the destination quadrilateral.\n        interpolation (int): Interpolation method to be used in the perspective transformation.\n                             Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).\n\n    Returns:\n        np.ndarray: The distorted image with the same shape and dtype as the input image.\n\n    Note:\n        - The function preserves the channel dimension of the input image.\n        - Each cell of the generated mesh is transformed independently and then blended into the output image.\n        - The distortion is applied using perspective transformation, which allows for more complex\n          distortions compared to affine transformations.\n\n    Example:\n        >>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n        >>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n        >>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n        >>> distorted.shape\n        (100, 100, 3)\n    \"\"\"\n    distorted_image = np.zeros_like(image)\n\n    for mesh in generated_mesh:\n        # Extract source rectangle and destination quadrilateral\n        x1, y1, x2, y2 = mesh[:4]  # Source rectangle\n        dst_quad = mesh[4:].reshape(4, 2)  # Destination quadrilateral\n\n        # Convert source rectangle to quadrilateral\n        src_quad = np.array(\n            [\n                [x1, y1],  # Top-left\n                [x2, y1],  # Top-right\n                [x2, y2],  # Bottom-right\n                [x1, y2],  # Bottom-left\n            ],\n            dtype=np.float32,\n        )\n\n        # Calculate Perspective transformation matrix\n        perspective_mat = cv2.getPerspectiveTransform(src_quad, dst_quad)\n\n        # Apply Perspective transformation\n        warped = cv2.warpPerspective(\n            image,\n            perspective_mat,\n            (image.shape[1], image.shape[0]),\n            flags=interpolation,\n        )\n\n        # Create mask for the transformed region\n        mask = np.zeros(image.shape[:2], dtype=np.uint8)\n        cv2.fillConvexPoly(mask, np.int32(dst_quad), 255)\n\n        # Copy only the warped quadrilateral area to the output image\n        distorted_image = cv2.copyTo(warped, mask, distorted_image)\n\n    return distorted_image\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.find_keypoint","title":"def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]","text":"

Determine if a valid keypoint can be found at the given position.

Source code in albumentations/augmentations/geometric/functional.py Python
def find_keypoint(\n    position: tuple[int, int],\n    distance_map: np.ndarray,\n    threshold: float | None,\n    inverted: bool,\n) -> tuple[float, float] | None:\n    \"\"\"Determine if a valid keypoint can be found at the given position.\"\"\"\n    y, x = position\n    value = distance_map[y, x]\n    if not inverted and threshold is not None and value >= threshold:\n        return None\n    if inverted and threshold is not None and value <= threshold:\n        return None\n    return float(x), float(y)\n
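A minimal usage sketch (not part of the original docstring); the import path follows the source location above, and the distance-map values are arbitrary. Python

import numpy as np
from albumentations.augmentations.geometric.functional import find_keypoint

distance_map = np.full((5, 5), 10.0)
distance_map[2, 3] = 1.0  # smallest distance, i.e. the most likely keypoint location

# Regular (non-inverted) map: values below the threshold are accepted.
print(find_keypoint((2, 3), distance_map, threshold=5.0, inverted=False))  # (3.0, 2.0)
print(find_keypoint((0, 0), distance_map, threshold=5.0, inverted=False))  # None (10.0 >= threshold)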
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.flip_bboxes","title":"def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]","text":"

Flip bounding boxes horizontally and/or vertically.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].

flip_horizontal bool

Whether to flip horizontally.

flip_vertical bool

Whether to flip vertically.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Flipped bounding boxes.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef flip_bboxes(\n    bboxes: np.ndarray,\n    flip_horizontal: bool = False,\n    flip_vertical: bool = False,\n    image_shape: tuple[int, int] = (0, 0),\n) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally and/or vertically.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where each row is\n            [x_min, y_min, x_max, y_max, ...].\n        flip_horizontal (bool): Whether to flip horizontally.\n        flip_vertical (bool): Whether to flip vertically.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Flipped bounding boxes.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    flipped_bboxes = bboxes.copy()\n    if flip_horizontal:\n        flipped_bboxes[:, [0, 2]] = cols - flipped_bboxes[:, [2, 0]]\n    if flip_vertical:\n        flipped_bboxes[:, [1, 3]] = rows - flipped_bboxes[:, [3, 1]]\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.from_distance_maps","title":"def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]","text":"

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

Name Type Description distance_maps np.ndarray

A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.

inverted bool

If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).

if_not_found_coords Sequence[int] | dict[str, Any] | None

Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: - None: Drop keypoints that are not found. - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints. - Dict with 'x' and 'y' keys: Use these values for not found keypoints. Defaults to None.

threshold float | None

A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

Type Description np.ndarray

A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

Type Description ValueError

If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n>>> inverted = True\n>>> if_not_found_coords = [0, 0]\n>>> threshold = 0.5\n>>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n>>> print(keypoints.shape)\n(3, 2)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def from_distance_maps(\n    distance_maps: np.ndarray,\n    inverted: bool,\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None = None,\n    threshold: float | None = None,\n) -> np.ndarray:\n    \"\"\"Convert distance maps back to keypoints coordinates.\n\n    This function is the inverse of `to_distance_maps`. It takes distance maps generated for a set of keypoints\n    and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps,\n    and can handle cases where keypoints are not found or fall outside a specified threshold.\n\n    Args:\n        distance_maps (np.ndarray): A 3D numpy array of shape (height, width, nb_keypoints) containing\n            distance maps for each keypoint. Each channel represents the distance map for one keypoint.\n        inverted (bool): If True, treats the distance maps as inverted (where higher values indicate\n            closer proximity to keypoints). If False, treats them as regular distance maps (where lower\n            values indicate closer proximity).\n        if_not_found_coords (Sequence[int] | dict[str, Any] | None, optional): Coordinates to use for\n            keypoints that are not found or fall outside the threshold. Can be:\n            - None: Drop keypoints that are not found.\n            - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints.\n            - Dict with 'x' and 'y' keys: Use these values for not found keypoints.\n            Defaults to None.\n        threshold (float | None, optional): A threshold value to determine valid keypoints. For inverted\n            maps, values >= threshold are considered valid. For regular maps, values <= threshold are\n            considered valid. If None, all keypoints are considered valid. Defaults to None.\n\n    Returns:\n        np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates\n        of the reconstructed keypoints. 
If `drop_if_not_found` is True (derived from if_not_found_coords),\n        the output may have fewer rows than input keypoints.\n\n    Raises:\n        ValueError: If the input `distance_maps` is not a 3D array.\n\n    Notes:\n        - The function uses vectorized operations for improved performance, especially with large numbers of keypoints.\n        - When `threshold` is None, all keypoints are considered valid, and `if_not_found_coords` is not used.\n        - The function assumes that the input distance maps are properly normalized and scaled according to the\n          original image dimensions.\n\n    Example:\n        >>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n        >>> inverted = True\n        >>> if_not_found_coords = [0, 0]\n        >>> threshold = 0.5\n        >>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n        >>> print(keypoints.shape)\n        (3, 2)\n    \"\"\"\n    if distance_maps.ndim != NUM_MULTI_CHANNEL_DIMENSIONS:\n        msg = f\"Expected three-dimensional input, got {distance_maps.ndim} dimensions and shape {distance_maps.shape}.\"\n        raise ValueError(msg)\n    height, width, nb_keypoints = distance_maps.shape\n\n    drop_if_not_found, if_not_found_x, if_not_found_y = validate_if_not_found_coords(\n        if_not_found_coords,\n    )\n\n    # Find the indices of max/min values for all keypoints at once\n    if inverted:\n        hitidx_flat = np.argmax(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n    else:\n        hitidx_flat = np.argmin(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n\n    # Convert flat indices to 2D coordinates\n    hitidx_y, hitidx_x = np.unravel_index(hitidx_flat, (height, width))\n\n    # Create keypoints array\n    keypoints = np.column_stack((hitidx_x, hitidx_y)).astype(float)\n\n    if threshold is not None:\n        # Check threshold condition\n        if inverted:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] >= threshold\n        else:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] <= threshold\n\n        if not drop_if_not_found:\n            # Replace invalid keypoints with if_not_found_coords\n            keypoints[~valid_mask] = [if_not_found_x, if_not_found_y]\n        else:\n            # Keep only valid keypoints\n            return keypoints[valid_mask]\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_displacement_fields","title":"def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]","text":"

Generate displacement fields for elastic transform.

Parameters:

Name Type Description image_shape tuple[int, int]

Shape of the image (height, width)

alpha float

Scaling factor for displacement

sigma float

Standard deviation for Gaussian blur

same_dxdy bool

Whether to use same displacement field for both directions

kernel_size tuple[int, int]

Size of Gaussian blur kernel

random_generator np.random.Generator

NumPy random number generator

noise_distribution Literal['gaussian', 'uniform']

Type of noise distribution to use (\"gaussian\" or \"uniform\")

Returns:

Type Description tuple

(dx, dy) displacement fields

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_displacement_fields(\n    image_shape: tuple[int, int],\n    alpha: float,\n    sigma: float,\n    same_dxdy: bool,\n    kernel_size: tuple[int, int],\n    random_generator: np.random.Generator,\n    noise_distribution: Literal[\"gaussian\", \"uniform\"],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate displacement fields for elastic transform.\n\n    Args:\n        image_shape: Shape of the image (height, width)\n        alpha: Scaling factor for displacement\n        sigma: Standard deviation for Gaussian blur\n        same_dxdy: Whether to use same displacement field for both directions\n        kernel_size: Size of Gaussian blur kernel\n        random_generator: NumPy random number generator\n        noise_distribution: Type of noise distribution to use (\"gaussian\" or \"uniform\")\n\n    Returns:\n        tuple: (dx, dy) displacement fields\n    \"\"\"\n\n    def generate_noise_field() -> np.ndarray:\n        # Generate noise based on distribution type\n        if noise_distribution == \"gaussian\":\n            field = random_generator.standard_normal(size=image_shape[:2])\n        else:  # uniform\n            field = random_generator.uniform(low=-1, high=1, size=image_shape[:2])\n\n        # Common operations for both distributions\n        field = field.astype(np.float32)\n        cv2.GaussianBlur(field, kernel_size, sigma, dst=field)\n        return field * alpha\n\n    # Generate first displacement field\n    dx = generate_noise_field()\n\n    # Generate or copy second displacement field\n    dy = dx if same_dxdy else generate_noise_field()\n\n    return dx, dy\n
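A minimal sketch of calling this helper directly; the parameter values below are arbitrary and only illustrate the expected types. Python

import numpy as np
from albumentations.augmentations.geometric.functional import generate_displacement_fields

dx, dy = generate_displacement_fields(
    image_shape=(100, 100),
    alpha=50.0,                  # displacement strength
    sigma=5.0,                   # Gaussian blur standard deviation
    same_dxdy=False,
    kernel_size=(17, 17),        # odd kernel size, as required by cv2.GaussianBlur
    random_generator=np.random.default_rng(0),
    noise_distribution="gaussian",
)
print(dx.shape, dy.shape)  # (100, 100) (100, 100)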
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_distorted_grid_polygons","title":"def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]","text":"

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

Name Type Description dimensions np.ndarray

A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.

magnitude int

Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].

random_generator np.random.Generator

A random number generator.

Returns:

Type Description np.ndarray

A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square in the diagram below represents a cell, and the X marks indicate the coordinates where displacement occurs:
    +--+--+--+--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--+--+--+--+
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n...                        [[0, 50, 50, 100], [50, 50, 100, 100]]])\n>>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10, random_generator=np.random.default_rng())\n>>> distorted.shape\n(4, 8)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_distorted_grid_polygons(\n    dimensions: np.ndarray,\n    magnitude: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate distorted grid polygons based on input dimensions and magnitude.\n\n    This function creates a grid of polygons and applies random distortions to the internal vertices,\n    while keeping the boundary vertices fixed. The distortion is applied consistently across shared\n    vertices to avoid gaps or overlaps in the resulting grid.\n\n    Args:\n        dimensions (np.ndarray): A 3D array of shape (grid_height, grid_width, 4) where each element\n                                 is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.\n        magnitude (int): Maximum pixel-wise displacement for distortion. The actual displacement\n                         will be randomly chosen in the range [-magnitude, magnitude].\n        random_generator (np.random.Generator): A random number generator.\n\n    Returns:\n        np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon\n                    as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.\n\n    Note:\n        - Only internal grid points are distorted; boundary points remain fixed.\n        - The function ensures consistent distortion across shared vertices of adjacent cells.\n        - The distortion is applied to the following points of each internal cell:\n            * Bottom-right of the cell above and to the left\n            * Bottom-left of the cell above\n            * Top-right of the cell to the left\n            * Top-left of the current cell\n        - Each square represents a cell, and the X marks indicate the coordinates where displacement occurs.\n            +--+--+--+--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--+--+--+--+\n        - For each X, the coordinates of the left, right, top, and bottom edges\n          in the four adjacent cells are displaced.\n\n    Example:\n        >>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n        ...                        
[[0, 50, 50, 100], [50, 50, 100, 100]]])\n        >>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n        >>> distorted.shape\n        (4, 8)\n    \"\"\"\n    grid_height, grid_width = dimensions.shape[:2]\n    total_cells = grid_height * grid_width\n\n    # Initialize polygons\n    polygons = np.zeros((total_cells, 8), dtype=np.float32)\n    polygons[:, 0:2] = dimensions.reshape(-1, 4)[:, [0, 1]]  # x1, y1\n    polygons[:, 2:4] = dimensions.reshape(-1, 4)[:, [2, 1]]  # x2, y1\n    polygons[:, 4:6] = dimensions.reshape(-1, 4)[:, [2, 3]]  # x2, y2\n    polygons[:, 6:8] = dimensions.reshape(-1, 4)[:, [0, 3]]  # x1, y2\n\n    # Generate displacements for internal grid points only\n    internal_points_height, internal_points_width = grid_height - 1, grid_width - 1\n    displacements = random_generator.integers(\n        -magnitude,\n        magnitude + 1,\n        size=(internal_points_height, internal_points_width, 2),\n    ).astype(np.float32)\n\n    # Apply displacements to internal polygon vertices\n    for i in range(1, grid_height):\n        for j in range(1, grid_width):\n            dx, dy = displacements[i - 1, j - 1]\n\n            # Bottom-right of cell (i-1, j-1)\n            polygons[(i - 1) * grid_width + (j - 1), 4:6] += [dx, dy]\n\n            # Bottom-left of cell (i-1, j)\n            polygons[(i - 1) * grid_width + j, 6:8] += [dx, dy]\n\n            # Top-right of cell (i, j-1)\n            polygons[i * grid_width + (j - 1), 2:4] += [dx, dy]\n\n            # Top-left of cell (i, j)\n            polygons[i * grid_width + j, 0:2] += [dx, dy]\n\n    return polygons\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_grid","title":"def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]","text":"

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

steps_x list[float]

List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.

steps_y list[float]

List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.

num_steps int

The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing two 2D numpy arrays: - map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid. - map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)\n>>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n>>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n>>> num_steps = 5\n>>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n>>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_grid(\n    image_shape: tuple[int, int],\n    steps_x: list[float],\n    steps_y: list[float],\n    num_steps: int,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate a distorted grid for image transformation based on given step sizes.\n\n    This function creates two 2D arrays (map_x and map_y) that represent a distorted version\n    of the original image grid. These arrays can be used with OpenCV's remap function to\n    apply grid distortion to an image.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        steps_x (list[float]): List of step sizes for the x-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the x direction.\n        steps_y (list[float]): List of step sizes for the y-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the y direction.\n        num_steps (int): The number of steps to divide each axis into. This determines\n            the granularity of the distortion grid.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays:\n            - map_x: A 2D array of float32 values representing the x-coordinates\n              of the distorted grid.\n            - map_y: A 2D array of float32 values representing the y-coordinates\n              of the distorted grid.\n\n    Note:\n        - The function generates a grid where each cell can be distorted independently.\n        - The distortion is controlled by the steps_x and steps_y parameters, which\n          determine how much each grid line is shifted.\n        - The resulting map_x and map_y can be used directly with cv2.remap() to\n          apply the distortion to an image.\n        - The distortion is applied smoothly across each grid cell using linear\n          interpolation.\n\n    Example:\n        >>> image_shape = (100, 100)\n        >>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n        >>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n        >>> num_steps = 5\n        >>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n        >>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n    \"\"\"\n    height, width = image_shape[:2]\n    x_step = width // num_steps\n    xx = np.zeros(width, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_x):\n        x = idx * x_step\n        start = int(x)\n        end = min(int(x) + x_step, width)\n        cur = prev + x_step * step\n        xx[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    y_step = height // num_steps\n    yy = np.zeros(height, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_y):\n        y = idx * y_step\n        start = int(y)\n        end = min(int(y) + y_step, height)\n        cur = prev + y_step * step\n        yy[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    return np.meshgrid(xx, yy)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_reflected_bboxes","title":"def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

Name Type Description bboxes np.ndarray

Original bounding boxes.

grid_dims dict[str, tuple[int, int]]

Grid dimensions and original position.

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted bounding boxes for the entire grid.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_bboxes(\n    bboxes: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected bounding boxes for the entire reflection grid.\n\n    Args:\n        bboxes (np.ndarray): Original bounding boxes.\n        grid_dims (dict[str, tuple[int, int]]): Grid dimensions and original position.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of bboxes\n    bboxes_hflipped = flip_bboxes(bboxes, flip_horizontal=True, image_shape=image_shape)\n    bboxes_vflipped = flip_bboxes(bboxes, flip_vertical=True, image_shape=image_shape)\n    bboxes_hvflipped = flip_bboxes(\n        bboxes,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [\n            original_col * cols,\n            original_row * rows,\n            original_col * cols,\n            original_row * rows,\n        ],\n    )\n    bboxes = shift_bboxes(bboxes, shift_vector)\n    bboxes_hflipped = shift_bboxes(bboxes_hflipped, shift_vector)\n    bboxes_vflipped = shift_bboxes(bboxes_vflipped, shift_vector)\n    bboxes_hvflipped = shift_bboxes(bboxes_hvflipped, shift_vector)\n\n    new_bboxes = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of bboxes to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes\n            elif (grid_row - original_row) % 2 == 0:\n                current_bboxes = bboxes_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes_vflipped\n            else:\n                current_bboxes = bboxes_hvflipped\n\n            # Shift to the current grid cell\n            cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                ],\n            )\n            shifted_bboxes = shift_bboxes(current_bboxes, cell_shift)\n\n            new_bboxes.append(shifted_bboxes)\n\n    result = np.vstack(new_bboxes)\n\n    return shift_bboxes(result, -shift_vector) if center_in_origin else result\n
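A usage sketch (illustrative only); the grid_dims dictionary mimics the structure returned by get_pad_grid_dimensions, documented further down this page. Python

import numpy as np
from albumentations.augmentations.geometric.functional import generate_reflected_bboxes

bboxes = np.array([[10.0, 20.0, 40.0, 60.0]])
grid_dims = {"grid_shape": (3, 3), "original_position": (1, 1)}

reflected = generate_reflected_bboxes(bboxes, grid_dims, image_shape=(100, 100))
print(reflected.shape)  # (9, 4): one copy of each bbox per grid cell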
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_reflected_keypoints","title":"def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

Name Type Description keypoints np.ndarray

Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).

grid_dims dict[str, tuple[int, int]]

A dictionary containing grid dimensions and original position. It should have the following keys: - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols) - \"original_position\": tuple[int, int] representing (original_row, original_col)

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_keypoints(\n    keypoints: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected keypoints for the entire reflection grid.\n\n    This function creates a grid of keypoints by reflecting and shifting the original keypoints.\n    It handles both centered and non-centered grids based on the `center_in_origin` parameter.\n\n    Args:\n        keypoints (np.ndarray): Original keypoints array of shape (N, 4+), where N is the number of keypoints,\n                                and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).\n        grid_dims (dict[str, tuple[int, int]]): A dictionary containing grid dimensions and original position.\n            It should have the following keys:\n            - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols)\n            - \"original_position\": tuple[int, int] representing (original_row, original_col)\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool, optional): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is\n                    (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.\n\n    Note:\n        - The function handles keypoint flipping and shifting to create a grid of reflected keypoints.\n        - It preserves the angle and scale information of the keypoints during transformations.\n        - The resulting grid can be either centered at the origin or positioned based on the original grid.\n    \"\"\"\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of keypoints\n    keypoints_hflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        image_shape=image_shape,\n    )\n    keypoints_vflipped = flip_keypoints(\n        keypoints,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n    keypoints_hvflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    rows, cols = image_shape[:2]\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [original_col * cols, original_row * rows, 0, 0, 0],\n    )  # Only shift x and y\n    keypoints = shift_keypoints(keypoints, shift_vector)\n    keypoints_hflipped = shift_keypoints(keypoints_hflipped, shift_vector)\n    keypoints_vflipped = shift_keypoints(keypoints_vflipped, shift_vector)\n    keypoints_hvflipped = shift_keypoints(keypoints_hvflipped, shift_vector)\n\n    new_keypoints = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of keypoints to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints\n            elif (grid_row - original_row) % 2 == 0:\n                current_keypoints = keypoints_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints_vflipped\n            else:\n                current_keypoints = keypoints_hvflipped\n\n            # Shift to the current grid cell\n      
      cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    0,\n                    0,\n                    0,\n                ],\n            )\n            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)\n\n            new_keypoints.append(shifted_keypoints)\n\n    result = np.vstack(new_keypoints)\n\n    return shift_keypoints(result, -shift_vector) if center_in_origin else result\n
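A usage sketch (illustrative only), assuming the five-column (x, y, z, angle, scale) keypoint layout used by the other keypoint helpers on this page. Python

import numpy as np
from albumentations.augmentations.geometric.functional import generate_reflected_keypoints

keypoints = np.array([[30.0, 40.0, 0.0, 0.0, 1.0]], dtype=np.float32)  # (x, y, z, angle, scale)
grid_dims = {"grid_shape": (3, 3), "original_position": (1, 1)}

reflected = generate_reflected_keypoints(keypoints, grid_dims, image_shape=(100, 100))
print(reflected.shape)  # (9, 5): one copy of each keypoint per grid cell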
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.generate_shuffled_splits","title":"def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]","text":"

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

Name Type Description size int

Total size of the dimension (height or width).

divisions int

Number of divisions (rows or columns).

random_generator np.random.Generator

The random generator used to shuffle the splits.

Returns:

Type Description np.ndarray

Cumulative edges of the shuffled intervals.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_shuffled_splits(\n    size: int,\n    divisions: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate shuffled splits for a given dimension size and number of divisions.\n\n    Args:\n        size (int): Total size of the dimension (height or width).\n        divisions (int): Number of divisions (rows or columns).\n        random_generator (np.random.Generator | None): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: Cumulative edges of the shuffled intervals.\n    \"\"\"\n    intervals = almost_equal_intervals(size, divisions)\n    random_generator.shuffle(intervals)\n    return np.insert(np.cumsum(intervals), 0, 0)\n
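A small usage sketch (illustrative only). With a size that divides evenly, all intervals are equal, so the shuffled edges come out the same regardless of the shuffle. Python

import numpy as np
from albumentations.augmentations.geometric.functional import generate_shuffled_splits

edges = generate_shuffled_splits(size=100, divisions=4, random_generator=np.random.default_rng(0))
print(edges)  # [  0  25  50  75 100] -- cumulative edges, starting at 0 and ending at size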
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_camera_matrix_distortion_maps","title":"def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using camera matrix model.

Parameters:

Name Type Description image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py Python
def get_camera_matrix_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using camera matrix model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n    camera_matrix = np.array(\n        [[width, 0, center_xy[0]], [0, height, center_xy[1]], [0, 0, 1]],\n        dtype=np.float32,\n    )\n    distortion = np.array([k, k, 0, 0, 0], dtype=np.float32)\n    return cv2.initUndistortRectifyMap(\n        camera_matrix,\n        distortion,\n        None,\n        None,\n        (width, height),\n        cv2.CV_32FC1,\n    )\n
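A usage sketch (illustrative only; the distortion coefficient and center are arbitrary), showing the maps being fed to cv2.remap. Python

import cv2
import numpy as np
from albumentations.augmentations.geometric.functional import get_camera_matrix_distortion_maps

image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
map_x, map_y = get_camera_matrix_distortion_maps(
    image_shape=image.shape[:2],
    k=0.1,
    center_xy=(50.0, 50.0),
)
distorted = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)
print(distorted.shape)  # (100, 100, 3)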
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_dimension_padding","title":"def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]","text":"

Calculate padding for a single dimension.

Parameters:

Name Type Description current_size int

Current size of the dimension

min_size int | None

Minimum size requirement, if any

divisor int | None

Divisor for padding to make size divisible, if any

Returns:

Type Description tuple[int, int]

(pad_before, pad_after)

Source code in albumentations/augmentations/geometric/functional.py Python
def get_dimension_padding(\n    current_size: int,\n    min_size: int | None,\n    divisor: int | None,\n) -> tuple[int, int]:\n    \"\"\"Calculate padding for a single dimension.\n\n    Args:\n        current_size: Current size of the dimension\n        min_size: Minimum size requirement, if any\n        divisor: Divisor for padding to make size divisible, if any\n\n    Returns:\n        tuple[int, int]: (pad_before, pad_after)\n    \"\"\"\n    if min_size is not None:\n        if current_size < min_size:\n            pad_before = int((min_size - current_size) / 2.0)\n            pad_after = min_size - current_size - pad_before\n            return pad_before, pad_after\n    elif divisor is not None:\n        remainder = current_size % divisor\n        if remainder > 0:\n            total_pad = divisor - remainder\n            pad_before = total_pad // 2\n            pad_after = total_pad - pad_before\n            return pad_before, pad_after\n\n    return 0, 0\n
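Two worked examples (illustrative only), one per branch of the function. Python

from albumentations.augmentations.geometric.functional import get_dimension_padding

# Pad a 95-pixel dimension up to a minimum size of 100: 2 before, 3 after.
print(get_dimension_padding(95, min_size=100, divisor=None))  # (2, 3)

# Pad a 95-pixel dimension to the next multiple of 32 (i.e. 96): 0 before, 1 after.
print(get_dimension_padding(95, min_size=None, divisor=32))   # (0, 1)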
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_fisheye_distortion_maps","title":"def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using fisheye model.

Parameters:

Name Type Description image_shape tuple[int, int]

Image shape

k float

Distortion coefficient

center_xy tuple[float, float]

Center of distortion

Returns:

Type Description tuple of
  • map_x: Horizontal displacement map
  • map_y: Vertical displacement map
Source code in albumentations/augmentations/geometric/functional.py Python
def get_fisheye_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using fisheye model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n\n    center_x, center_y = center_xy\n\n    # Create coordinate grid\n    y, x = np.mgrid[:height, :width].astype(np.float32)\n\n    x = x - center_x\n    y = y - center_y\n\n    # Calculate polar coordinates\n    r = np.sqrt(x * x + y * y)\n    theta = np.arctan2(y, x)\n\n    # Normalize radius by the maximum possible radius to keep distortion in check\n    max_radius = math.sqrt(max(center_x, width - center_x) ** 2 + max(center_y, height - center_y) ** 2)\n    r_norm = r / max_radius\n\n    # Apply fisheye distortion to normalized radius\n    r_dist = r * (1 + k * r_norm * r_norm)\n\n    # Convert back to cartesian coordinates\n    map_x = r_dist * np.cos(theta) + center_x\n    map_y = r_dist * np.sin(theta) + center_y\n\n    return map_x, map_y\n
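A usage sketch (illustrative only; k and the center are arbitrary). The maps are cast to float32 before cv2.remap, which only accepts single-precision maps. Python

import cv2
import numpy as np
from albumentations.augmentations.geometric.functional import get_fisheye_distortion_maps

image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)
map_x, map_y = get_fisheye_distortion_maps(
    image_shape=image.shape[:2],
    k=0.3,
    center_xy=(50.0, 50.0),
)
distorted = cv2.remap(image, map_x.astype(np.float32), map_y.astype(np.float32), cv2.INTER_LINEAR)
print(distorted.shape)  # (100, 100, 3)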
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_pad_grid_dimensions","title":"def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]","text":"

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

Name Type Description pad_top int

Number of pixels to pad above the image.

pad_bottom int

Number of pixels to pad below the image.

pad_left int

Number of pixels to pad to the left of the image.

pad_right int

Number of pixels to pad to the right of the image.

image_shape tuple[int, int]

Shape of the original image as (height, width).

Returns:

Type Description dict[str, tuple[int, int]]

A dictionary containing: - 'grid_shape': A tuple (grid_rows, grid_cols) where: - grid_rows (int): Number of times the image needs to be repeated vertically. - grid_cols (int): Number of times the image needs to be repeated horizontally. - 'original_position': A tuple (original_row, original_col) where: - original_row (int): Row index of the original image in the grid. - original_col (int): Column index of the original image in the grid.

Source code in albumentations/augmentations/geometric/functional.py Python
def get_pad_grid_dimensions(\n    pad_top: int,\n    pad_bottom: int,\n    pad_left: int,\n    pad_right: int,\n    image_shape: tuple[int, int],\n) -> dict[str, tuple[int, int]]:\n    \"\"\"Calculate the dimensions of the grid needed for reflection padding and the position of the original image.\n\n    Args:\n        pad_top (int): Number of pixels to pad above the image.\n        pad_bottom (int): Number of pixels to pad below the image.\n        pad_left (int): Number of pixels to pad to the left of the image.\n        pad_right (int): Number of pixels to pad to the right of the image.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n\n    Returns:\n        dict[str, tuple[int, int]]: A dictionary containing:\n            - 'grid_shape': A tuple (grid_rows, grid_cols) where:\n                - grid_rows (int): Number of times the image needs to be repeated vertically.\n                - grid_cols (int): Number of times the image needs to be repeated horizontally.\n            - 'original_position': A tuple (original_row, original_col) where:\n                - original_row (int): Row index of the original image in the grid.\n                - original_col (int): Column index of the original image in the grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    grid_rows = 1 + math.ceil(pad_top / rows) + math.ceil(pad_bottom / rows)\n    grid_cols = 1 + math.ceil(pad_left / cols) + math.ceil(pad_right / cols)\n    original_row = math.ceil(pad_top / rows)\n    original_col = math.ceil(pad_left / cols)\n\n    return {\n        \"grid_shape\": (grid_rows, grid_cols),\n        \"original_position\": (original_row, original_col),\n    }\n
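A worked example (illustrative only): padding a 100x100 image by 150 px on top and 50 px on each side needs two extra rows of reflections above and one extra column on each side. Python

from albumentations.augmentations.geometric.functional import get_pad_grid_dimensions

print(get_pad_grid_dimensions(pad_top=150, pad_bottom=0, pad_left=50, pad_right=50, image_shape=(100, 100)))
# {'grid_shape': (3, 3), 'original_position': (2, 1)}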
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.get_padding_params","title":"def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]","text":"

Calculate padding parameters based on target dimensions.

Parameters:

Name Type Description image_shape tuple[int, int]

(height, width) of the image

min_height int | None

Minimum height requirement, if any

min_width int | None

Minimum width requirement, if any

pad_height_divisor int | None

Divisor for height padding, if any

pad_width_divisor int | None

Divisor for width padding, if any

Returns:

Type Description tuple[int, int, int, int]

(pad_top, pad_bottom, pad_left, pad_right)

Source code in albumentations/augmentations/geometric/functional.py Python
def get_padding_params(\n    image_shape: tuple[int, int],\n    min_height: int | None,\n    min_width: int | None,\n    pad_height_divisor: int | None,\n    pad_width_divisor: int | None,\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate padding parameters based on target dimensions.\n\n    Args:\n        image_shape: (height, width) of the image\n        min_height: Minimum height requirement, if any\n        min_width: Minimum width requirement, if any\n        pad_height_divisor: Divisor for height padding, if any\n        pad_width_divisor: Divisor for width padding, if any\n\n    Returns:\n        tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    h_pad_top, h_pad_bottom = get_dimension_padding(\n        rows,\n        min_height,\n        pad_height_divisor,\n    )\n    w_pad_left, w_pad_right = get_dimension_padding(cols, min_width, pad_width_divisor)\n\n    return h_pad_top, h_pad_bottom, w_pad_left, w_pad_right\n
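A worked example (illustrative only): the height is padded to a minimum of 100 and the width to the next multiple of 32. Python

from albumentations.augmentations.geometric.functional import get_padding_params

print(get_padding_params(
    image_shape=(95, 33),
    min_height=100,
    min_width=None,
    pad_height_divisor=None,
    pad_width_divisor=32,
))
# (2, 3, 15, 16)  -> (pad_top, pad_bottom, pad_left, pad_right)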
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.is_identity_matrix","title":"def is_identity_matrix (matrix) [view source on GitHub]","text":"

Check if the given matrix is an identity matrix.

Parameters:

Name Type Description matrix np.ndarray

A 3x3 affine transformation matrix.

Returns:

Type Description bool

True if the matrix is an identity matrix, False otherwise.

Source code in albumentations/augmentations/geometric/functional.py Python
def is_identity_matrix(matrix: np.ndarray) -> bool:\n    \"\"\"Check if the given matrix is an identity matrix.\n\n    Args:\n        matrix (np.ndarray): A 3x3 affine transformation matrix.\n\n    Returns:\n        bool: True if the matrix is an identity matrix, False otherwise.\n    \"\"\"\n    return np.allclose(matrix, np.eye(3, dtype=matrix.dtype))\n
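A quick sketch (illustrative only). Python

import numpy as np
from albumentations.augmentations.geometric.functional import is_identity_matrix

print(is_identity_matrix(np.eye(3)))                                       # True
print(is_identity_matrix(np.array([[1.0, 0, 5], [0, 1, 0], [0, 0, 1]])))   # False (contains a translation)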
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.is_valid_component","title":"def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]","text":"

Validate if a component meets the minimum requirements.

Source code in albumentations/augmentations/geometric/functional.py Python
def is_valid_component(\n    component_area: float,\n    original_area: float,\n    min_area: float | None,\n    min_visibility: float | None,\n) -> bool:\n    \"\"\"Validate if a component meets the minimum requirements.\"\"\"\n    visibility = component_area / original_area\n    return (min_area is None or component_area >= min_area) and (min_visibility is None or visibility >= min_visibility)\n
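A quick sketch (illustrative only): a component covering 50 of the original 100 pixels has visibility 0.5. Python

from albumentations.augmentations.geometric.functional import is_valid_component

print(is_valid_component(50.0, 100.0, min_area=30.0, min_visibility=0.4))  # True
print(is_valid_component(50.0, 100.0, min_area=60.0, min_visibility=0.4))  # False (area below min_area)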
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_affine","title":"def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]","text":"

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 5+) where N is the number of keypoints. Each keypoint is represented as [x, y, z, angle, scale, ...], matching the column indices (angle at index 3, scale at index 4) used in the implementation below.

matrix np.ndarray

The 2x3 or 3x3 affine transformation matrix.

image_shape tuple[int, int]

Shape of the image (height, width).

scale dict[str, float]

Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.

border_mode int

Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

Type Description np.ndarray

Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2\u03c0] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 0, 1]])  # (x, y, z, angle, scale)\n>>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n>>> scale = {'x': 1.5, 'y': 1.2}\n>>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_affine(\n    keypoints: np.ndarray,\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n    scale: XYFloat,\n    border_mode: int,\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to keypoints.\n\n    This function transforms keypoints using the given affine transformation matrix.\n    It handles reflection padding if necessary, updates coordinates, angles, and scales.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints.\n                                Each keypoint is represented as [x, y, angle, scale, ...].\n        matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        scale (dict[str, float]): Dictionary containing scale factors for x and y directions.\n                                  Expected keys are 'x' and 'y'.\n        border_mode (int): Border mode for handling keypoints near image edges.\n                            Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.\n\n    Returns:\n        np.ndarray: Transformed keypoints array with the same shape as input.\n\n    Notes:\n        - The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.\n        - Coordinates (x, y) are transformed using the affine matrix.\n        - Angles are adjusted based on the rotation component of the affine transformation.\n        - Scales are multiplied by the maximum of x and y scale factors.\n        - The @angle_2pi_range decorator ensures angles remain in the [0, 2\u03c0] range.\n\n    Example:\n        >>> keypoints = np.array([[100, 100, 0, 1]])\n        >>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n        >>> scale = {'x': 1.5, 'y': 1.2}\n        >>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    if is_identity_matrix(matrix):\n        return keypoints\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        keypoints = generate_reflected_keypoints(\n            keypoints,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Extract x, y coordinates (z is preserved)\n    xy = keypoints[:, :2]\n\n    # Ensure matrix is 2x3\n    if matrix.shape == (3, 3):\n        matrix = matrix[:2]\n\n    # Transform x, y coordinates\n    xy_transformed = cv2.transform(xy.reshape(-1, 1, 2), matrix).squeeze()\n\n    # Calculate angle adjustment\n    angle_adjustment = rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=False)\n\n    # Update angles (now at index 3)\n    keypoints[:, 3] = keypoints[:, 3] + angle_adjustment\n\n    # Update scales (now at index 4)\n    max_scale = max(scale[\"x\"], scale[\"y\"])\n    keypoints[:, 4] *= max_scale\n\n    # Update x, y coordinates and preserve z\n    keypoints[:, :2] = xy_transformed\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_d4","title":"def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, z, angle, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • image_shape (tuple[int, int]): The shape of the image.
  • params (Any): Not used.
  • Returns np.ndarray: The transformed keypoints, with the same shape as the input.
  • Raises ValueError: If an invalid group member is specified, i.e. the requested transformation does not exist.

Examples:

  • Rotating keypoints by 90 degrees in a 100x100 image: keypoints_d4(keypoints, 'r90', (100, 100)). A keypoint at (x=50, y=30) moves to (30, 49) under the counter-clockwise rotation implemented by keypoints_rot90 (see the usage sketch after the source listing below).
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_d4(\n    keypoints: np.ndarray,\n    group_member: D4Type,\n    image_shape: tuple[int, int],\n    **params: Any,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a keypoint.\n\n    This function adjusts a keypoint's coordinates according to the specified `D_4` group transformation,\n    which includes rotations and reflections suitable for image processing tasks. These transformations account\n    for the dimensions of the image to ensure the keypoint remains within its boundaries.\n\n    Parameters:\n    - keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n    -group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.\n    - image_shape (tuple[int, int]): The shape of the image.\n    - params (Any): Not used\n\n    Returns:\n    - KeypointInternalType: The transformed keypoint.\n\n    Raises:\n    - ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.\n\n    Examples:\n    - Rotating a keypoint by 90 degrees in a 100x100 image:\n      `keypoint_d4((50, 30), 'r90', 100, 100)`\n      This would move the keypoint from (50, 30) to (70, 50) assuming standard coordinate transformations.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: keypoints_rot90(x, 1, image_shape),  # Rotate 90 degrees\n        \"r180\": lambda x: keypoints_rot90(x, 2, image_shape),  # Rotate 180 degrees\n        \"r270\": lambda x: keypoints_rot90(x, 3, image_shape),  # Rotate 270 degrees\n        \"v\": lambda x: keypoints_vflip(x, rows),  # Vertical flip\n        \"hvt\": lambda x: keypoints_transpose(\n            keypoints_rot90(x, 2, image_shape),\n        ),  # Reflect over anti diagonal\n        \"h\": lambda x: keypoints_hflip(x, cols),  # Horizontal flip\n        \"t\": lambda x: keypoints_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](keypoints)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_hflip","title":"def keypoints_hflip (keypoints, cols) [view source on GitHub]","text":"

Flip keypoints horizontally around the y-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, z, angle, ...); the angle is stored in column 3, as in the implementation below.

cols int

Image width.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_hflip(keypoints: np.ndarray, cols: int) -> np.ndarray:\n    \"\"\"Flip keypoints horizontally around the y-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        cols: Image width.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip x-coordinates\n    flipped_keypoints[:, 0] = (cols - 1) - keypoints[:, 0]\n\n    # Adjust angles\n    flipped_keypoints[:, 3] = np.pi - keypoints[:, 3]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_rot90","title":"def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]","text":"

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).

factor int

The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].

image_shape tuple[int, int]

The shape of the image (height, width).

Returns:

Type Description np.ndarray

The rotated keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_rot90(\n    keypoints: np.ndarray,\n    factor: Literal[0, 1, 2, 3],\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        factor (int): The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n\n    Returns:\n        np.ndarray: The rotated keypoints with the same shape as the input.\n    \"\"\"\n    if factor == 0:\n        return keypoints\n\n    height, width = image_shape[:2]\n    rotated_keypoints = keypoints.copy().astype(np.float32)\n\n    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 3]\n\n    if factor == 1:\n        rotated_keypoints[:, 0] = y\n        rotated_keypoints[:, 1] = width - 1 - x\n        rotated_keypoints[:, 3] = angle - np.pi / 2\n    elif factor == ROT90_180_FACTOR:\n        rotated_keypoints[:, 0] = width - 1 - x\n        rotated_keypoints[:, 1] = height - 1 - y\n        rotated_keypoints[:, 3] = angle - np.pi\n    elif factor == ROT90_270_FACTOR:\n        rotated_keypoints[:, 0] = height - 1 - y\n        rotated_keypoints[:, 1] = x\n        rotated_keypoints[:, 3] = angle + np.pi / 2\n\n    return rotated_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_scale","title":"def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]","text":"

Scales keypoints by scale_x and scale_y.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of keypoints with shape (N, 5+) in the format (x, y, z, angle, scale, ...).

scale_x float

Scale coefficient x-axis.

scale_y float

Scale coefficient y-axis.

Returns:

Type Description np.ndarray

A numpy array of scaled keypoints with the same shape as input. X and Y coordinates are scaled by their respective scale factors, Z coordinate remains unchanged, and the keypoint scale is multiplied by max(scale_x, scale_y).

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_scale(\n    keypoints: np.ndarray,\n    scale_x: float,\n    scale_y: float,\n) -> np.ndarray:\n    \"\"\"Scales keypoints by scale_x and scale_y.\n\n    Args:\n        keypoints: A numpy array of keypoints with shape (N, 5+) in the format\n                  (x, y, z, angle, scale, ...).\n        scale_x: Scale coefficient x-axis.\n        scale_y: Scale coefficient y-axis.\n\n    Returns:\n        A numpy array of scaled keypoints with the same shape as input.\n        X and Y coordinates are scaled by their respective scale factors,\n        Z coordinate remains unchanged, and the keypoint scale is multiplied\n        by max(scale_x, scale_y).\n    \"\"\"\n    # Extract x, y, z, angle, and scale\n    x, y, z, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n        keypoints[:, 4],\n    )\n\n    # Scale x and y\n    x_scaled = x * scale_x\n    y_scaled = y * scale_y\n\n    # Scale the keypoint scale by the maximum of scale_x and scale_y\n    scale_scaled = scale * max(scale_x, scale_y)\n\n    # Create the output array\n    scaled_keypoints = np.column_stack([x_scaled, y_scaled, z, angle, scale_scaled])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],\n        )\n\n    return scaled_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_transpose","title":"def keypoints_transpose (keypoints) [view source on GitHub]","text":"

Transposes keypoints along the main diagonal.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

Returns:

Type Description np.ndarray

An array of transposed keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_transpose(keypoints: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes keypoints along the main diagonal.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n\n    Returns:\n        np.ndarray: An array of transposed keypoints with the same shape as the input.\n    \"\"\"\n    transposed_keypoints = keypoints.copy()\n\n    # Swap x and y coordinates\n    transposed_keypoints[:, [0, 1]] = keypoints[:, [1, 0]]\n\n    # Adjust angles to reflect the coordinate swap\n    angles = keypoints[:, 3]\n    transposed_keypoints[:, 3] = np.where(\n        angles <= np.pi,\n        np.pi / 2 - angles,\n        3 * np.pi / 2 - angles,\n    )\n\n    return transposed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.keypoints_vflip","title":"def keypoints_vflip (keypoints, rows) [view source on GitHub]","text":"

Flip keypoints vertically around the x-axis.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

rows int

Image height.

Returns:

Type Description np.ndarray

An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_vflip(keypoints: np.ndarray, rows: int) -> np.ndarray:\n    \"\"\"Flip keypoints vertically around the x-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        rows: Image height.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip y-coordinates\n    flipped_keypoints[:, 1] = (rows - 1) - keypoints[:, 1]\n\n    # Negate angles\n    flipped_keypoints[:, 3] = -keypoints[:, 3]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.perspective_bboxes","title":"def perspective_bboxes (bboxes, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Applies perspective transformation to bounding boxes.

This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged.

image_shape tuple[int, int]

The shape of the image (height, width).

matrix np.ndarray

The perspective transformation matrix.

max_width int

The maximum width of the output image.

max_height int

The maximum height of the output image.

keep_size bool

If True, maintains the original image size after transformation.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as input. The first 4 columns contain the transformed coordinates, and any additional columns are preserved from the input.

Note

  • This function modifies only the coordinate columns (first 4) of the input bounding boxes.
  • Any additional attributes (columns beyond the first 4) are kept unchanged.
  • The function handles denormalization and renormalization of coordinates internally.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n>>> image_shape = (100, 100)\n>>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n>>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef perspective_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Applies perspective transformation to bounding boxes.\n\n    This function transforms bounding boxes using the given perspective transformation matrix.\n    It handles bounding boxes with additional attributes beyond the standard coordinates.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (num_bboxes, 4+).\n                             Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n                             Additional columns beyond the first 4 are preserved unchanged.\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        matrix (np.ndarray): The perspective transformation matrix.\n        max_width (int): The maximum width of the output image.\n        max_height (int): The maximum height of the output image.\n        keep_size (bool): If True, maintains the original image size after transformation.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as input.\n                    The first 4 columns contain the transformed coordinates, and any\n                    additional columns are preserved from the input.\n\n    Note:\n        - This function modifies only the coordinate columns (first 4) of the input bounding boxes.\n        - Any additional attributes (columns beyond the first 4) are kept unchanged.\n        - The function handles denormalization and renormalization of coordinates internally.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n        >>> image_shape = (100, 100)\n        >>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n        >>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n    \"\"\"\n    height, width = image_shape[:2]\n    transformed_bboxes = bboxes.copy()\n    denormalized_coords = denormalize_bboxes(bboxes[:, :4], image_shape)\n\n    x_min, y_min, x_max, y_max = denormalized_coords.T\n    points = np.array(\n        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],\n    ).transpose(2, 0, 1)\n    points_reshaped = points.reshape(-1, 1, 2)\n\n    transformed_points = cv2.perspectiveTransform(\n        points_reshaped.astype(np.float32),\n        matrix,\n    )\n    transformed_points = transformed_points.reshape(-1, 4, 2)\n\n    new_coords = np.array(\n        [[np.min(box[:, 0]), np.min(box[:, 1]), np.max(box[:, 0]), np.max(box[:, 1])] for box in transformed_points],\n    )\n\n    if keep_size:\n        scale_x, scale_y = width / max_width, height / max_height\n        new_coords[:, [0, 2]] *= scale_x\n        new_coords[:, [1, 3]] *= scale_y\n        output_shape = image_shape\n    else:\n        output_shape = (max_height, max_width)\n\n    normalized_coords = normalize_bboxes(new_coords, output_shape)\n    transformed_bboxes[:, :4] = normalized_coords\n\n    return transformed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.perspective_keypoints","title":"def perspective_keypoints (keypoints, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Apply perspective transformation to keypoints.

Parameters:

Name Type Description keypoints np.ndarray

Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].

image_shape tuple[int, int]

Original image shape (height, width).

matrix np.ndarray

3x3 perspective transformation matrix.

max_width int

Maximum width after transformation.

max_height int

Maximum height after transformation.

keep_size bool

Whether to keep original size.

Returns:

Type Description np.ndarray

Transformed keypoints array with same shape as input. Z coordinate remains unchanged through the transformation.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef perspective_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Apply perspective transformation to keypoints.\n\n    Args:\n        keypoints: Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].\n        image_shape: Original image shape (height, width).\n        matrix: 3x3 perspective transformation matrix.\n        max_width: Maximum width after transformation.\n        max_height: Maximum height after transformation.\n        keep_size: Whether to keep original size.\n\n    Returns:\n        Transformed keypoints array with same shape as input.\n        Z coordinate remains unchanged through the transformation.\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    height, width = image_shape[:2]\n\n    x, y, z, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n        keypoints[:, 4],\n    )\n\n    # Reshape keypoints for perspective transform\n    keypoint_vector = np.column_stack((x, y)).astype(np.float32).reshape(-1, 1, 2)\n\n    # Apply perspective transform\n    transformed_points = cv2.perspectiveTransform(keypoint_vector, matrix).squeeze()\n\n    # Unsqueeze if we have a single keypoint\n    if transformed_points.ndim == 1:\n        transformed_points = transformed_points[np.newaxis, :]\n\n    x, y = transformed_points[:, 0], transformed_points[:, 1]\n\n    # Update angles\n    angle += rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=True)\n\n    # Calculate scale factors\n    scale_x = np.sign(matrix[0, 0]) * np.sqrt(matrix[0, 0] ** 2 + matrix[0, 1] ** 2)\n    scale_y = np.sign(matrix[1, 1]) * np.sqrt(matrix[1, 0] ** 2 + matrix[1, 1] ** 2)\n    scale *= max(scale_x, scale_y)\n\n    if keep_size:\n        scale_x = width / max_width\n        scale_y = height / max_height\n        x *= scale_x\n        y *= scale_y\n        scale *= max(scale_x, scale_y)\n\n    # Create the output array with unchanged z coordinate\n    transformed_keypoints = np.column_stack([x, y, z, angle, scale])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [\n                transformed_keypoints,\n                keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:],\n            ],\n        )\n\n    return transformed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.rotation2d_matrix_to_euler_angles","title":"def rotation2d_matrix_to_euler_angles (matrix, y_up) [view source on GitHub]","text":"

Convert a 2D rotation matrix to the corresponding Euler angle (in radians). matrix (np.ndarray): Rotation matrix. y_up (bool): whether the Y axis points up (True) or down (False).

Source code in albumentations/augmentations/geometric/functional.py Python
def rotation2d_matrix_to_euler_angles(matrix: np.ndarray, y_up: bool) -> float:\n    \"\"\"Convert a 2D rotation matrix to the corresponding Euler angle (in radians).\n\n    Args:\n        matrix (np.ndarray): Rotation matrix.\n        y_up (bool): Whether the Y axis points up (True) or down (False).\n\n    Returns:\n        float: Rotation angle in radians.\n    \"\"\"\n    if y_up:\n        return np.arctan2(matrix[1, 0], matrix[0, 0])\n    return np.arctan2(-matrix[1, 0], matrix[0, 0])\n
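A quick sketch: for a standard counter-clockwise rotation matrix, the function recovers the rotation angle via arctan2 (import path assumed as above):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import rotation2d_matrix_to_euler_angles\n>>> theta = np.pi / 6\n>>> matrix = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])\n>>> rotation2d_matrix_to_euler_angles(matrix, y_up=True)  # ~0.5236, i.e. pi / 6\n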
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.shift_bboxes","title":"def shift_bboxes (bboxes, shift_vector) [view source on GitHub]","text":"

Shift bounding boxes by a given vector.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].

shift_vector np.ndarray

Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y].

Returns:

Type Description np.ndarray

Shifted bounding boxes with the same shape as input.

Source code in albumentations/augmentations/geometric/functional.py Python
def shift_bboxes(bboxes: np.ndarray, shift_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Shift bounding boxes by a given vector.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where n is the number of bboxes\n                             and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].\n        shift_vector (np.ndarray): Vector to shift the bounding boxes by, with shape (4,) for\n                                   [shift_x, shift_y, shift_x, shift_y].\n\n    Returns:\n        np.ndarray: Shifted bounding boxes with the same shape as input.\n    \"\"\"\n    # Create a copy of the input array to avoid modifying it in-place\n    shifted_bboxes = bboxes.copy()\n\n    # Add the shift vector to the first 4 columns\n    shifted_bboxes[:, :4] += shift_vector\n\n    return shifted_bboxes\n
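A minimal sketch (import path assumed as above); the shift vector is given in pixels and repeated for the min and max corners:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import shift_bboxes\n>>> bboxes = np.array([[10.0, 20.0, 30.0, 40.0, 1.0]])  # extra columns (e.g. class id) are preserved\n>>> shift_bboxes(bboxes, np.array([5.0, -5.0, 5.0, -5.0]))\n# -> [[15., 15., 35., 35., 1.]]\n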
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.shuffle_tiles_within_shape_groups","title":"def shuffle_tiles_within_shape_groups (shape_groups, random_generator) [view source on GitHub]","text":"

Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.

Parameters:

Name Type Description shape_groups dict[tuple[int, int], list[int]]

Groups of tile indices categorized by shape.

random_generator np.random.Generator

The random generator to use for shuffling the indices. If None, a new random generator will be used.

Returns:

Type Description list[int]

A list where each index is mapped to the new index of the tile after shuffling.

Source code in albumentations/augmentations/geometric/functional.py Python
def shuffle_tiles_within_shape_groups(\n    shape_groups: dict[tuple[int, int], list[int]],\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Shuffles indices within each group of similar shapes and creates a list where each\n    index points to the index of the tile it should be mapped to.\n\n    Args:\n        shape_groups (dict[tuple[int, int], list[int]]): Groups of tile indices categorized by shape.\n        random_generator (np.random.Generator): The random generator to use for shuffling the indices.\n            If None, a new random generator will be used.\n\n    Returns:\n        list[int]: A list where each index is mapped to the new index of the tile after shuffling.\n    \"\"\"\n    # Initialize the output list with the same size as the total number of tiles, filled with -1\n    num_tiles = sum(len(indices) for indices in shape_groups.values())\n    mapping = [-1] * num_tiles\n\n    # Prepare the random number generator\n\n    for indices in shape_groups.values():\n        shuffled_indices = indices.copy()\n        random_generator.shuffle(shuffled_indices)\n\n        for old, new in zip(indices, shuffled_indices):\n            mapping[old] = new\n\n    return mapping\n
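A minimal sketch (import path assumed as above); tiles are only permuted with other tiles of the same shape:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import shuffle_tiles_within_shape_groups\n>>> shape_groups = {(50, 50): [0, 1, 2], (50, 28): [3]}\n>>> mapping = shuffle_tiles_within_shape_groups(shape_groups, np.random.default_rng(0))\n# A permutation of [0, 1, 2, 3] in which indices 0-2 are shuffled among themselves and 3 maps to itself\n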
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.split_uniform_grid","title":"def split_uniform_grid (image_shape, grid, random_generator) [view source on GitHub]","text":"

Splits an image shape into a uniform grid specified by the grid dimensions.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns).

random_generator np.random.Generator

The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

Type Description np.ndarray

An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).

Note

The function uses generate_shuffled_splits to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.

Source code in albumentations/augmentations/geometric/functional.py Python
def split_uniform_grid(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Splits an image shape into a uniform grid specified by the grid dimensions.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns).\n        random_generator (np.random.Generator): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).\n\n    Note:\n        The function uses `generate_shuffled_splits` to generate the splits for the height and width of the image.\n        The splits are then used to calculate the coordinates of the tiles.\n    \"\"\"\n    n_rows, n_cols = grid\n\n    height_splits = generate_shuffled_splits(\n        image_shape[0],\n        grid[0],\n        random_generator=random_generator,\n    )\n    width_splits = generate_shuffled_splits(\n        image_shape[1],\n        grid[1],\n        random_generator=random_generator,\n    )\n\n    # Calculate tiles coordinates\n    tiles = [\n        (height_splits[i], width_splits[j], height_splits[i + 1], width_splits[j + 1])\n        for i in range(n_rows)\n        for j in range(n_cols)\n    ]\n\n    return np.array(tiles, dtype=np.int16)\n
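A minimal sketch (import path assumed as above):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import split_uniform_grid\n>>> tiles = split_uniform_grid((100, 200), (2, 2), np.random.default_rng(0))\n>>> tiles.shape  # (4, 4): 2 * 2 tiles, each row is (start_y, start_x, end_y, end_x)\n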
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.swap_tiles_on_image","title":"def swap_tiles_on_image (image, tiles, mapping=None) [view source on GitHub]","text":"

Swap tiles on the image according to the provided mapping.

Parameters:

Name Type Description image np.ndarray

Input image.

tiles np.ndarray

Array of tiles with each tile as [start_y, start_x, end_y, end_x].

mapping list[int] | None

list of new tile indices.

Returns:

Type Description np.ndarray

Output image with tiles swapped according to the random shuffle.

Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_image(\n    image: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int] | None = None,\n) -> np.ndarray:\n    \"\"\"Swap tiles on the image according to the new format.\n\n    Args:\n        image: Input image.\n        tiles: Array of tiles with each tile as [start_y, start_x, end_y, end_x].\n        mapping: list of new tile indices.\n\n    Returns:\n        np.ndarray: Output image with tiles swapped according to the random shuffle.\n    \"\"\"\n    # If no tiles are provided, return a copy of the original image\n    if tiles.size == 0 or mapping is None:\n        return image.copy()\n\n    # Create a copy of the image to retain original for reference\n    new_image = np.empty_like(image)\n    for num, new_index in enumerate(mapping):\n        start_y, start_x, end_y, end_x = tiles[new_index]\n        start_y_orig, start_x_orig, end_y_orig, end_x_orig = tiles[num]\n        # Assign the corresponding tile from the original image to the new image\n        new_image[start_y:end_y, start_x:end_x] = image[\n            start_y_orig:end_y_orig,\n            start_x_orig:end_x_orig,\n        ]\n\n    return new_image\n
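A minimal sketch (import path assumed as above); the content of tile i is written to the position of tile mapping[i]:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import swap_tiles_on_image\n>>> image = np.arange(16, dtype=np.uint8).reshape(4, 4)\n>>> tiles = np.array([[0, 0, 2, 2], [0, 2, 2, 4], [2, 0, 4, 2], [2, 2, 4, 4]])  # four 2x2 tiles\n>>> swapped = swap_tiles_on_image(image, tiles, mapping=[1, 0, 3, 2])\n# The two top tiles trade places, as do the two bottom tiles\n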
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.swap_tiles_on_keypoints","title":"def swap_tiles_on_keypoints (keypoints, tiles, mapping) [view source on GitHub]","text":"

Swap the positions of keypoints based on a tile mapping.

This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.

Parameters:

Name Type Description keypoints np.ndarray

A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

tiles np.ndarray

A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.

mapping np.ndarray

A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with.

Returns:

Type Description np.ndarray

A 2D numpy array of the same shape as the input keypoints, containing the new positions of the keypoints after the tile swap.

Exceptions:

Type Description RuntimeWarning

If any keypoint is not found within any tile.

Notes

  • Keypoints that do not fall within any tile will remain unchanged.
  • The function assumes that the tiles do not overlap and cover the entire image space.
Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_keypoints(\n    keypoints: np.ndarray,\n    tiles: np.ndarray,\n    mapping: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Swap the positions of keypoints based on a tile mapping.\n\n    This function takes a set of keypoints and repositions them according to a mapping of tile swaps.\n    Keypoints are moved from their original tiles to new positions in the swapped tiles.\n\n    Args:\n        keypoints (np.ndarray): A 2D numpy array of shape (N, 2) where N is the number of keypoints.\n                                Each row represents a keypoint's (x, y) coordinates.\n        tiles (np.ndarray): A 2D numpy array of shape (M, 4) where M is the number of tiles.\n                            Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.\n        mapping (np.ndarray): A 1D numpy array of shape (M,) where M is the number of tiles.\n                              Each element i contains the index of the tile that tile i should be swapped with.\n\n    Returns:\n        np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions\n                    of the keypoints after the tile swap.\n\n    Raises:\n        RuntimeWarning: If any keypoint is not found within any tile.\n\n    Notes:\n        - Keypoints that do not fall within any tile will remain unchanged.\n        - The function assumes that the tiles do not overlap and cover the entire image space.\n    \"\"\"\n    if not keypoints.size:\n        return keypoints\n\n    # Broadcast keypoints and tiles for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    start_y, start_x, end_y, end_x = tiles.T  # Each shape: (num_tiles,)\n\n    # Check if each keypoint is inside each tile\n    in_tile = (kp_y >= start_y) & (kp_y < end_y) & (kp_x >= start_x) & (kp_x < end_x)\n\n    # Find which tile each keypoint belongs to\n    tile_indices = np.argmax(in_tile, axis=1)\n\n    # Check if any keypoint is not in any tile\n    not_in_any_tile = ~np.any(in_tile, axis=1)\n    if np.any(not_in_any_tile):\n        warn(\n            \"Some keypoints are not in any tile. They will be returned unchanged. This is unexpected and should be \"\n            \"investigated.\",\n            RuntimeWarning,\n            stacklevel=2,\n        )\n\n    # Get the new tile indices\n    new_tile_indices = np.array(mapping)[tile_indices]\n\n    # Calculate the offsets\n    old_start_x = tiles[tile_indices, 1]\n    old_start_y = tiles[tile_indices, 0]\n    new_start_x = tiles[new_tile_indices, 1]\n    new_start_y = tiles[new_tile_indices, 0]\n\n    # Apply the transformation\n    new_keypoints = keypoints.copy()\n    new_keypoints[:, 0] = (keypoints[:, 0] - old_start_x) + new_start_x\n    new_keypoints[:, 1] = (keypoints[:, 1] - old_start_y) + new_start_y\n\n    # Keep original coordinates for keypoints not in any tile\n    new_keypoints[not_in_any_tile] = keypoints[not_in_any_tile]\n\n    return new_keypoints\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.to_distance_maps","title":"def to_distance_maps (keypoints, image_shape, inverted=False) [view source on GitHub]","text":"

Generate a (H,W,N) array of distance maps for N keypoints.

The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.

This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.

image_shape tuple[int, int]

tuple[int, int] shape of the image (height, width)

inverted bool

If True, inverted distance maps are returned where each distance value d is replaced by d/(d+1), i.e. the distance maps have values in the range (0.0, 1.0] with 1.0 denoting exactly the position of the respective keypoint.

Returns:

Type Description np.ndarray

A float32 array of shape (H, W, N) containing N distance maps for N keypoints. Each location (y, x, n) in the array denotes the euclidean distance at (y, x) to the n-th keypoint. If inverted is True, the distance d is replaced by d/(d+1). The height and width of the array match the height and width in image_shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def to_distance_maps(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    inverted: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate a ``(H,W,N)`` array of distance maps for ``N`` keypoints.\n\n    The ``n``-th distance map contains at every location ``(y, x)`` the\n    euclidean distance to the ``n``-th keypoint.\n\n    This function can be used as a helper when augmenting keypoints with a\n    method that only supports the augmentation of images.\n\n    Args:\n        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.\n                   Each row represents a keypoint's (x, y) coordinates.\n        image_shape: tuple[int, int] shape of the image (height, width)\n        inverted (bool): If ``True``, inverted distance maps are returned where each\n            distance value d is replaced by ``d/(d+1)``, i.e. the distance\n            maps have values in the range ``(0.0, 1.0]`` with ``1.0`` denoting\n            exactly the position of the respective keypoint.\n\n    Returns:\n        np.ndarray: A ``float32`` array of shape (H, W, N) containing ``N`` distance maps for ``N``\n            keypoints. Each location ``(y, x, n)`` in the array denotes the\n            euclidean distance at ``(y, x)`` to the ``n``-th keypoint.\n            If `inverted` is ``True``, the distance ``d`` is replaced\n            by ``d/(d+1)``. The height and width of the array match the\n            height and width in ``image_shape``.\n    \"\"\"\n    height, width = image_shape[:2]\n    if len(keypoints) == 0:\n        return np.zeros((height, width, 0), dtype=np.float32)\n\n    # Create coordinate grids\n    yy, xx = np.mgrid[:height, :width]\n\n    # Convert keypoints to numpy array\n    keypoints_array = np.array(keypoints)\n\n    # Compute distances for all keypoints at once\n    distances = np.sqrt(\n        (xx[..., np.newaxis] - keypoints_array[:, 0]) ** 2 + (yy[..., np.newaxis] - keypoints_array[:, 1]) ** 2,\n    )\n\n    if inverted:\n        return (1 / (distances + 1)).astype(np.float32)\n    return distances.astype(np.float32)\n
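A minimal sketch (import path assumed as above):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import to_distance_maps\n>>> keypoints = np.array([[1.0, 2.0], [3.0, 0.0]])\n>>> dist = to_distance_maps(keypoints, (5, 5), inverted=True)\n>>> dist.shape  # (5, 5, 2): one inverted distance map per keypoint\n>>> dist[2, 1, 0]  # 1.0, since (x=1, y=2) is exactly the first keypoint\n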
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.tps_transform","title":"def tps_transform (target_points, control_points, nonlinear_weights, affine_weights) [view source on GitHub]","text":"

Apply Thin Plate Spline transformation to points.

Parameters:

Name Type Description target_points np.ndarray

Points to transform with shape (num_targets, 2)

control_points np.ndarray

Original control points with shape (num_controls, 2)

nonlinear_weights np.ndarray

TPS kernel weights with shape (num_controls, 2)

affine_weights np.ndarray

Affine transformation weights with shape (3, 2)

Returns:

Type Description np.ndarray

Transformed points with shape (num_targets, 2)

Note

The transformation combines:

  1. Nonlinear warping based on distances to control points
  2. Global affine transformation (scale, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def tps_transform(\n    target_points: np.ndarray,\n    control_points: np.ndarray,\n    nonlinear_weights: np.ndarray,\n    affine_weights: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply Thin Plate Spline transformation to points.\n\n    Args:\n        target_points: Points to transform with shape (num_targets, 2)\n        control_points: Original control points with shape (num_controls, 2)\n        nonlinear_weights: TPS kernel weights with shape (num_controls, 2)\n        affine_weights: Affine transformation weights with shape (3, 2)\n\n    Returns:\n        Transformed points with shape (num_targets, 2)\n\n    Note:\n        The transformation combines:\n        1. Nonlinear warping based on distances to control points\n        2. Global affine transformation (scale, rotation, translation)\n    \"\"\"\n    # Compute all pairwise distances at once: (num_targets, num_controls)\n    distances = np.linalg.norm(target_points[:, None] - control_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Prepare affine terms [1, x, y] for each point\n    affine_terms = np.c_[np.ones(len(target_points)), target_points]\n\n    # Combine nonlinear and affine transformations\n    return kernel_matrix @ nonlinear_weights + affine_terms @ affine_weights\n
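As a sanity-check sketch (import path assumed as above): with zero nonlinear weights and affine weights that encode the identity, the points come back unchanged:

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import tps_transform\n>>> points = np.array([[0.0, 0.0], [10.0, 5.0]])\n>>> controls = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])\n>>> nonlinear = np.zeros((3, 2))\n>>> affine = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # [1, x, y] @ affine == (x, y)\n>>> tps_transform(points, controls, nonlinear, affine)\n# -> [[0., 0.], [10., 5.]]\n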
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.transpose","title":"def transpose (img) [view source on GitHub]","text":"

Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.

Parameters:

Name Type Description img np.ndarray

Input array.

Returns:

Type Description np.ndarray

Transposed array.

Source code in albumentations/augmentations/geometric/functional.py Python
def transpose(img: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes the first two dimensions of an array of any dimensionality.\n    Retains the order of any additional dimensions.\n\n    Args:\n        img (np.ndarray): Input array.\n\n    Returns:\n        np.ndarray: Transposed array.\n    \"\"\"\n    # Generate the new axes order\n    new_axes = list(range(img.ndim))\n    new_axes[0], new_axes[1] = 1, 0  # Swap the first two dimensions\n\n    # Transpose the array using the new axes order\n    return img.transpose(new_axes)\n
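A minimal sketch (import path assumed as above):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import transpose\n>>> img = np.zeros((100, 200, 3), dtype=np.uint8)\n>>> transpose(img).shape  # (200, 100, 3): height and width swapped, channels stay last\n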
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_bboxes","title":"def validate_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Validate bounding boxes and remove invalid ones.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid bounding boxes, potentially with fewer boxes than the input.

Examples:

Python
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n>>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n>>> print(valid_bboxes)\n[[10 20 30 40]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
def validate_bboxes(bboxes: np.ndarray, image_shape: Sequence[int]) -> np.ndarray:\n    \"\"\"Validate bounding boxes and remove invalid ones.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.\n\n    Example:\n        >>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n        >>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n        >>> print(valid_bboxes)\n        [[10 20 30 40]]\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    valid_indices = (x_max > 0) & (y_max > 0) & (x_min < cols) & (y_min < rows)\n\n    return bboxes[valid_indices]\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_if_not_found_coords","title":"def validate_if_not_found_coords (if_not_found_coords) [view source on GitHub]","text":"

Validate and process if_not_found_coords parameter.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_if_not_found_coords(\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None,\n) -> tuple[bool, float, float]:\n    \"\"\"Validate and process `if_not_found_coords` parameter.\"\"\"\n    if if_not_found_coords is None:\n        return True, -1, -1\n    if isinstance(if_not_found_coords, (tuple, list)):\n        if len(if_not_found_coords) != PAIR:\n            msg = \"Expected tuple/list 'if_not_found_coords' to contain exactly two entries.\"\n            raise ValueError(msg)\n        return False, if_not_found_coords[0], if_not_found_coords[1]\n    if isinstance(if_not_found_coords, dict):\n        return False, if_not_found_coords[\"x\"], if_not_found_coords[\"y\"]\n\n    msg = \"Expected if_not_found_coords to be None, tuple, list, or dict.\"\n    raise ValueError(msg)\n
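A minimal sketch of the three accepted input forms (import path assumed as above):

Python
>>> from albumentations.augmentations.geometric.functional import validate_if_not_found_coords\n>>> validate_if_not_found_coords(None)              # (True, -1, -1)\n>>> validate_if_not_found_coords([10, 20])          # (False, 10, 20)\n>>> validate_if_not_found_coords({'x': 5, 'y': 7})  # (False, 5, 7)\n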
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.functional.validate_keypoints","title":"def validate_keypoints (keypoints, image_shape) [view source on GitHub]","text":"

Validate keypoints and remove those that fall outside the image boundaries.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid keypoints that fall within the image boundaries.

Note

This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Validate keypoints and remove those that fall outside the image boundaries.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, M) where N is the number of keypoints\n                                and M >= 2. The first two columns represent x and y coordinates.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid keypoints that fall within the image boundaries.\n\n    Note:\n        This function only checks the x and y coordinates (first two columns) of the keypoints.\n        Any additional columns (e.g., angle, scale) are preserved for valid keypoints.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x, y = keypoints[:, 0], keypoints[:, 1]\n\n    valid_indices = (x >= 0) & (x < cols) & (y >= 0) & (y < rows)\n\n    return keypoints[valid_indices]\n
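A minimal sketch (import path assumed as above):

Python
>>> import numpy as np\n>>> from albumentations.augmentations.geometric.functional import validate_keypoints\n>>> keypoints = np.array([[10, 20, 0, 0, 1], [150, 50, 0, 0, 1], [-5, 5, 0, 0, 1]])\n>>> validate_keypoints(keypoints, (100, 100))\n# Only the first row survives: x=150 is outside width 100 and x=-5 is negative\n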
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize","title":"resize","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.LongestMaxSize","title":"class LongestMaxSize [view source on GitHub]","text":"

Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

Name Type Description max_size int, Sequence[int]

Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.

max_size_hw tuple[int | None, int | None]

Maximum (height, width) constraints. Supports:

  • (height, width): Both dimensions must fit within these bounds
  • (height, None): Only height is constrained, width scales proportionally
  • (None, width): Only width is constrained, height scales proportionally

If specified, max_size must be None. Default: None.

interpolation OpenCV flag

interpolation method. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the longest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / max(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = min(H_target/H, W_target/W)\n       This ensures both dimensions fit within the specified bounds.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> # Using max_size\n>>> transform1 = A.LongestMaxSize(max_size=1024)\n>>> # Input image (1500, 800) -> Output (1024, 546)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>> # Input (800, 1500) -> Output (546, 1024)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>>\n>>> # Common use case with padding\n>>> transform4 = A.Compose([\n...     A.LongestMaxSize(max_size=1024),\n...     A.PadIfNeeded(min_height=1024, min_width=1024),\n... ])\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class LongestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints,\n        keeping the aspect ratio.\n\n    Args:\n        max_size (int, Sequence[int], optional): Maximum size of the longest side after the transformation.\n            When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must fit within these bounds\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): interpolation method. Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the longest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / max(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = min(H_target/H, W_target/W)\n               This ensures both dimensions fit within the specified bounds.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> # Using max_size\n        >>> transform1 = A.LongestMaxSize(max_size=1024)\n        >>> # Input image (1500, 800) -> Output (1024, 546)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>> # Input (800, 1500) -> Output (546, 1024)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>>\n        >>> # Common use case with padding\n        >>> transform4 = A.Compose([\n        ...     A.LongestMaxSize(max_size=1024),\n        ...     
A.PadIfNeeded(min_height=1024, min_width=1024),\n        ... ])\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / max(img_h, img_w)\n        elif self.max_size_hw is not None:\n            # We know max_size_hw is not None here due to model validator\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on longest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = min(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.MaxSizeTransform","title":"class MaxSizeTransform (max_size=1024, max_size_hw=None, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that resize based on maximum size constraints.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class MaxSizeTransform(DualTransform):\n    \"\"\"Base class for transforms that resize based on maximum size constraints.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_size: int | list[int] | None\n        max_size_hw: tuple[int | None, int | None] | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def validate_size_parameters(self) -> Self:\n            if self.max_size is None and self.max_size_hw is None:\n                raise ValueError(\"Either max_size or max_size_hw must be specified\")\n            if self.max_size is not None and self.max_size_hw is not None:\n                raise ValueError(\"Only one of max_size or max_size_hw should be specified\")\n            return self\n\n    def __init__(\n        self,\n        max_size: int | Sequence[int] | None = 1024,\n        max_size_hw: tuple[int | None, int | None] | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_size = max_size\n        self.max_size_hw = max_size_hw\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = img.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(img, (new_height, new_width), interpolation=self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = mask.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(mask, (new_height, new_width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(images, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volume, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volumes, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, 
*args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(masks3d, *args, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_size\", \"max_size_hw\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.RandomScale","title":"class RandomScale (scale_limit=(-0.1, 0.1), interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly resize the input. Output image size is different from the input image size.

Parameters:

Name Type Description scale_limit float or tuple[float, float]

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The output image size is different from the input image size.
  • A single scale factor is sampled per call and applied to both width and height, preserving the aspect ratio.
  • Bounding box coordinates are scaled accordingly.
  • Keypoint coordinates are scaled accordingly.

Mathematical formulation: Let (W, H) be the original image dimensions and (W', H') be the output dimensions. The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]]. Then, W' = W * s and H' = H * s.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n>>> result = transform(image=image)\n>>> scaled_image = result['image']\n# scaled_image will have dimensions in the range [90, 110] x [90, 110]\n# (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class RandomScale(DualTransform):\n    \"\"\"Randomly resize the input. Output image size is different from the input image size.\n\n    Args:\n        scale_limit (float or tuple[float, float]): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The output image size is different from the input image size.\n        - Scale factor is sampled independently per image side (width and height).\n        - Bounding box coordinates are scaled accordingly.\n        - Keypoint coordinates are scaled accordingly.\n\n    Mathematical formulation:\n        Let (W, H) be the original image dimensions and (W', H') be the output dimensions.\n        The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]].\n        Then, W' = W * s and H' = H * s.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> scaled_image = result['image']\n        # scaled_image will have dimensions in the range [90, 110] x [90, 110]\n        # (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_limit: ScaleFloatType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(cls, v: ScaleFloatType) -> tuple[float, float]:\n            return to_tuple(v, bias=1.0)\n\n    def __init__(\n        self,\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def get_params(self) -> dict[str, float]:\n        return {\"scale\": self.py_random.uniform(*self.scale_limit)}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(img, scale, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n  
      scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(mask, scale, self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"interpolation\": self.interpolation,\n            \"mask_interpolation\": self.mask_interpolation,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.Resize","title":"class Resize (height, width, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Resize the input to the given height and width.

Parameters:

  • height (int): desired height of the output.
  • width (int): desired width of the output.
  • interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
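
For orientation, here is a minimal usage sketch (illustrative, not part of the original reference) that applies Resize jointly to an image and its mask; the 256x256 target size is an arbitrary choice.

Python

import numpy as np
import albumentations as A

# Dummy 100x150 RGB image with a matching single-channel mask.
image = np.random.randint(0, 256, (100, 150, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (100, 150), dtype=np.uint8)

# Resize both targets to 256x256; the mask uses nearest-neighbor
# interpolation by default, so label values are preserved.
transform = A.Resize(height=256, width=256, p=1.0)
result = transform(image=image, mask=mask)

assert result['image'].shape == (256, 256, 3)
assert result['mask'].shape == (256, 256)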

Source code in albumentations/augmentations/geometric/resize.py Python
class Resize(DualTransform):\n    \"\"\"Resize the input to the given height and width.\n\n    Args:\n        height (int): desired height of the output.\n        width (int): desired width of the output.\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: int = Field(ge=1)\n        width: int = Field(ge=1)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(img, (self.height, self.width), interpolation=self.interpolation)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(mask, (self.height, self.width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        height, width = params[\"shape\"][:2]\n        scale_x = self.width / width\n        scale_y = self.height / height\n        return fgeometric.keypoints_scale(keypoints, scale_x, scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.resize.SmallestMaxSize","title":"class SmallestMaxSize [view source on GitHub]","text":"

Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

  • max_size (int, list of int): Maximum size of smallest side of the image after the transformation. When using a list, max size will be randomly selected from the values in the list. Default: 1024.
  • max_size_hw (tuple[int | None, int | None]): Maximum (height, width) constraints. Supports:
      - (height, width): Both dimensions must be at least these values
      - (height, None): Only height is constrained, width scales proportionally
      - (None, width): Only width is constrained, height scales proportionally
    If specified, max_size must be None. Default: None.
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the smallest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / min(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = max(H_target/H, W_target/W)\n       This ensures both dimensions are at least as large as specified.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n
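
The scale computation above can be made concrete with a short plain-Python sketch (an illustration of the formulas, not the library's internal code):

Python

def smallest_max_size_scale(height, width, max_size=None, max_size_hw=None):
    # Mirror of the formulas above: exactly one of max_size / max_size_hw is expected.
    if max_size is not None:
        return max_size / min(height, width)
    max_h, max_w = max_size_hw
    if max_h is not None and max_w is not None:
        return max(max_h / height, max_w / width)
    if max_h is not None:
        return max_h / height
    return max_w / width

# Input (H, W) = (100, 150) with max_size=120 -> scale 1.2 -> output (120, 180).
print(smallest_max_size_scale(100, 150, max_size=120))           # 1.2
# Input (80, 160) with max_size_hw=(100, 200) -> scale 1.25 -> output (100, 200).
print(smallest_max_size_scale(80, 160, max_size_hw=(100, 200)))  # 1.25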

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> # Using max_size\n>>> transform1 = A.SmallestMaxSize(max_size=120)\n>>> # Input image (100, 150) -> Output (120, 180)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n>>> # Input (80, 160) -> Output (100, 200)\n>>> # Input (160, 80) -> Output (400, 200)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n>>> # Input (80, 160) -> Output (100, 200)\n

Source code in albumentations/augmentations/geometric/resize.py Python
class SmallestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints,\n    keeping the aspect ratio.\n\n    Args:\n        max_size (int, list of int, optional): Maximum size of smallest side of the image after the transformation.\n            When using a list, max size will be randomly selected from the values in the list. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must be at least these values\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the smallest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be larger than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / min(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = max(H_target/H, W_target/W)\n               This ensures both dimensions are at least as large as specified.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. 
The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Using max_size\n        >>> transform1 = A.SmallestMaxSize(max_size=120)\n        >>> # Input image (100, 150) -> Output (120, 180)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n        >>> # Input (80, 160) -> Output (100, 200)\n        >>> # Input (160, 80) -> Output (400, 200)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n        >>> # Input (80, 160) -> Output (100, 200)\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / min(img_h, img_w)\n        elif self.max_size_hw is not None:\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on smallest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = max(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate","title":"rotate","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.RandomRotate90","title":"class RandomRotate90 [view source on GitHub]","text":"

Randomly rotate the input by 90 degrees zero or more times.

Parameters:

  • p (float): probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
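
A minimal usage sketch (illustrative, not from the original page): the same rotation factor is applied to every target passed in the call, so image and mask stay aligned.

Python

import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 150, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (100, 150), dtype=np.uint8)

# With p=1.0 a factor in {0, 1, 2, 3} is always sampled and applied consistently.
transform = A.RandomRotate90(p=1.0)
result = transform(image=image, mask=mask)

# Spatial dimensions are either kept (factor 0 or 2) or swapped (factor 1 or 3).
assert result['image'].shape[:2] in {(100, 150), (150, 100)}
assert result['mask'].shape == result['image'].shape[:2]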

Source code in albumentations/augmentations/geometric/rotate.py Python
class RandomRotate90(DualTransform):\n    \"\"\"Randomly rotate the input by 90 degrees zero or more times.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, factor: Literal[0, 1, 2, 3], **params: Any) -> np.ndarray:\n        return fgeometric.rot90(img, factor)\n\n    def get_params(self) -> dict[str, int]:\n        # Random int in the range [0, 3]\n        return {\"factor\": self.py_random.randint(0, 3)}\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        factor: Literal[0, 1, 2, 3],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_rot90(bboxes, factor)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        factor: Literal[0, 1, 2, 3],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_rot90(keypoints, factor, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.Rotate","title":"class Rotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

  • limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
  • fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
  • rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'
  • crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ from the input. Default: False
  • mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation angle is randomly selected for each execution within the range specified by 'limit'.
  • When 'crop_border' is False, the output image will have the same size as the input, potentially introducing black triangles in the corners.
  • When 'crop_border' is True, the output image is cropped to remove black triangles, which may result in a smaller image.
  • Bounding boxes are rotated and may change size or shape.
  • Keypoints are rotated around the center of the image.

Mathematical Details:

1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.
2. The image is rotated around its center by \u03b8 degrees.
3. The rotation matrix R is:
   R = [cos(\u03b8)  -sin(\u03b8)]
       [sin(\u03b8)   cos(\u03b8)]
4. Each point (x, y) in the image is transformed to (x', y') by:
   [x']   [cos(\u03b8)  -sin(\u03b8)] [x - cx]   [cx]
   [y'] = [sin(\u03b8)   cos(\u03b8)] [y - cy] + [cy]
   where (cx, cy) is the center of the image.
5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.
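
To make step 4 concrete, here is a small NumPy sketch (an illustration of the formula above, not library code) that rotates a single point around the image center:

Python

import numpy as np

def rotate_point(x, y, angle_deg, image_shape):
    # Rotate (x, y) by angle_deg degrees around the image center (cx, cy),
    # following step 4 of the mathematical details above.
    height, width = image_shape[:2]
    cx, cy = width / 2, height / 2
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    x_new, y_new = rot @ np.array([x - cx, y - cy]) + np.array([cx, cy])
    return x_new, y_new

# Rotating the point (75, 50) by 90 degrees in a 100x100 image lands
# (up to floating point) on (50, 75).
print(rotate_point(75, 50, 90, (100, 100)))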

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Rotate(limit=45, p=1.0)\n>>> result = transform(image=image)\n>>> rotated_image = result['image']\n# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n

Source code in albumentations/augmentations/geometric/rotate.py Python
class Rotate(DualTransform):\n    \"\"\"Rotate the input by an angle selected randomly from the uniform distribution.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.\n            Default: 'largest_box'\n        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ\n            from the input. Default: False\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.\n        - When 'crop_border' is False, the output image will have the same size as the input, potentially\n          introducing black triangles in the corners.\n        - When 'crop_border' is True, the output image is cropped to remove black triangles, which may result\n          in a smaller image.\n        - Bounding boxes are rotated and may change size or shape.\n        - Keypoints are rotated around the center of the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [cos(\u03b8)  -sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [sin(\u03b8)   cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n        5. 
If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Rotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        crop_border: bool\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_value(self) -> Self:\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        crop_border: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.crop_border = crop_border\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            img,\n            matrix,\n            self.interpolation,\n            self.fill,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            mask,\n            matrix,\n            self.mask_interpolation,\n            self.fill_mask,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        x_min: 
int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_out = fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            image_shape,\n            self.border_mode,\n            image_shape,\n        )\n        if self.crop_border:\n            return fcrops.crop_bboxes_by_coords(\n                bboxes_out,\n                (x_min, y_min, x_max, y_max),\n                image_shape,\n            )\n        return bboxes_out\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints_out = fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"][:2],\n            scale={\"x\": 1, \"y\": 1},\n            border_mode=self.border_mode,\n        )\n        if self.crop_border:\n            return fcrops.crop_keypoints_by_coords(\n                keypoints_out,\n                (x_min, y_min, x_max, y_max),\n            )\n        return keypoints_out\n\n    @staticmethod\n    def _rotated_rect_with_max_area(\n        height: int,\n        width: int,\n        angle: float,\n    ) -> dict[str, int]:\n        \"\"\"Given a rectangle of size wxh that has been rotated by 'angle' (in\n        degrees), computes the width and height of the largest possible\n        axis-aligned rectangle (maximal area) within the rotated rectangle.\n\n        Reference:\n            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders\n        \"\"\"\n        angle = math.radians(angle)\n        width_is_longer = width >= height\n        side_long, side_short = (width, height) if width_is_longer else (height, width)\n\n        # since the solutions for angle, -angle and 180-angle are all the same,\n        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:\n        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))\n        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:\n            # half constrained case: two crop corners touch the longer side,\n            # the other two corners are on the mid-line parallel to the longer line\n            x = 0.5 * side_short\n            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)\n        else:\n            # fully constrained case: crop touches all 4 sides\n            cos_2a = cos_a * cos_a - sin_a * sin_a\n            wr, hr = (\n                (width * cos_a - height * sin_a) / cos_2a,\n                (height * cos_a - width * sin_a) / cos_2a,\n            )\n\n        return {\n            \"x_min\": max(0, int(width / 2 - wr / 2)),\n            \"x_max\": min(width, int(width / 2 + wr / 2)),\n            \"y_min\": max(0, int(height / 2 - hr / 2)),\n            \"y_max\": min(height, int(height / 2 + hr / 2)),\n        }\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        angle = self.py_random.uniform(*self.limit)\n\n        if self.crop_border:\n            height, width = params[\"shape\"][:2]\n            out_params = self._rotated_rect_with_max_area(height, width, angle)\n        else:\n            
out_params = {\"x_min\": -1, \"x_max\": -1, \"y_min\": -1, \"y_max\": -1}\n\n        center = fgeometric.center(params[\"shape\"][:2])\n        bbox_center = fgeometric.center_bbox(params[\"shape\"][:2])\n\n        translate: fgeometric.XYInt = {\"x\": 0, \"y\": 0}\n        shear: fgeometric.XYFloat = {\"x\": 0, \"y\": 0}\n        scale: fgeometric.XYFloat = {\"x\": 1, \"y\": 1}\n        rotate = angle\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            center,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_center,\n        )\n        out_params[\"matrix\"] = matrix\n        out_params[\"bbox_matrix\"] = bbox_matrix\n\n        return out_params\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"crop_border\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.RotateInitSchema","title":"class RotateInitSchema ","text":"

Source code in albumentations/augmentations/geometric/rotate.py Python
class RotateInitSchema(BaseTransformInitSchema):\n    limit: SymmetricRangeType\n\n    interpolation: InterpolationType\n    mask_interpolation: InterpolationType\n\n    border_mode: BorderModeType\n\n    fill: ColorType | None\n    fill_mask: ColorType | None\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.rotate.SafeRotate","title":"class SafeRotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.

Parameters:

  • limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
  • fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.
  • rotate_method (Literal[\"largest_box\", \"ellipse\"]): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'
  • mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation is performed around the center of the image.
  • After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
  • The output image will always have the same dimensions as the input image.
  • Bounding boxes and keypoints are transformed along with the image.

Mathematical Details:

1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.
2. The image is rotated around its center by \u03b8 degrees.
3. The rotation matrix R is:
   R = [cos(\u03b8)  -sin(\u03b8)]
       [sin(\u03b8)   cos(\u03b8)]
4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
   s = min(width / (width * |cos(\u03b8)| + height * |sin(\u03b8)|),
           height / (width * |sin(\u03b8)| + height * |cos(\u03b8)|))
5. The combined transformation matrix T is:
   T = [s*cos(\u03b8)  -s*sin(\u03b8)  tx]
       [s*sin(\u03b8)   s*cos(\u03b8)  ty]
   where tx and ty are translation factors to keep the image centered.
6. Each point (x, y) in the image is transformed to (x', y') by:
   [x']   [s*cos(\u03b8)   s*sin(\u03b8)] [x - cx]   [cx]
   [y'] = [-s*sin(\u03b8)  s*cos(\u03b8)] [y - cy] + [cy]
   where (cx, cy) is the center of the image.
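
A short sketch (illustrative only) of the scaling factor from step 4, showing why a 45-degree safe rotation of a square image shrinks it by roughly a factor of 0.707:

Python

import math

def safe_rotate_scale(height, width, angle_deg):
    # Scaling factor that keeps the rotated image inside the original frame,
    # as in step 4 of the mathematical details above.
    cos_a = abs(math.cos(math.radians(angle_deg)))
    sin_a = abs(math.sin(math.radians(angle_deg)))
    return min(width / (width * cos_a + height * sin_a),
               height / (width * sin_a + height * cos_a))

print(safe_rotate_scale(100, 100, 45))  # ~0.7071: the rotated square is shrunk to fit
print(safe_rotate_scale(100, 100, 0))   # 1.0: no rotation, no shrinking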

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.SafeRotate(limit=45, p=1.0)\n>>> result = transform(image=image)\n>>> rotated_image = result['image']\n# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n# scaled to fit within the original 100x100 frame\n

Source code in albumentations/augmentations/geometric/rotate.py Python
class SafeRotate(Affine):\n    \"\"\"Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.\n\n    This transformation ensures that the entire rotated image fits within the original frame by scaling it\n    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the\n    rotation and scaling process.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied\n            for masks.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): Method to rotate bounding boxes.\n            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation is performed around the center of the image.\n        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.\n        - The output image will always have the same dimensions as the input image.\n        - Bounding boxes and keypoints are transformed along with the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:\n           s = min(width / (width * |cos(\u03b8)| + height * |sin(\u03b8)|),\n                   height / (width * |sin(\u03b8)| + height * |cos(\u03b8)|))\n        5. The combined transformation matrix T is:\n           T = [s*cos(\u03b8)  -s*sin(\u03b8)  tx]\n               [s*sin(\u03b8)   s*cos(\u03b8)  ty]\n           where tx and ty are translation factors to keep the image centered.\n        6. 
Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [s*cos(\u03b8)   s*sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [-s*sin(\u03b8)  s*cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.SafeRotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n        # scaled to fit within the original 100x100 frame\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            rotate=limit,\n            interpolation=interpolation,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            rotate_method=rotate_method,\n            fit_output=True,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.mask_interpolation = mask_interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"mask_interpolation\",\n        )\n\n    def _create_safe_rotate_matrix(\n        self,\n        angle: float,\n        center: tuple[float, float],\n        image_shape: tuple[int, int],\n    ) -> tuple[np.ndarray, dict[str, float]]:\n        height, width = image_shape[:2]\n        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)\n\n        # Calculate new image size\n        abs_cos = abs(rotation_mat[0, 0])\n        abs_sin = abs(rotation_mat[0, 1])\n        new_w = int(height * abs_sin + width * abs_cos)\n        new_h = int(height * abs_cos + width * abs_sin)\n\n        # Adjust the rotation matrix to take into account the new size\n        rotation_mat[0, 2] += new_w / 2 - center[0]\n        rotation_mat[1, 2] += new_h / 2 - center[1]\n\n        # Calculate scaling factors\n        scale_x = width / new_w\n        scale_y = height / new_h\n\n        # Create scaling matrix\n        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])\n\n        # Combine rotation and scaling\n        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])\n\n        return matrix, {\"x\": scale_x, \"y\": scale_y}\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        angle = self.py_random.uniform(*self.limit)\n\n        # Calculate centers for image and bbox\n        image_center = fgeometric.center(image_shape)\n        bbox_center = fgeometric.center_bbox(image_shape)\n\n        # Create matrices for image and bbox\n        matrix, scale = self._create_safe_rotate_matrix(\n            angle,\n            image_center,\n            image_shape,\n        )\n        bbox_matrix, _ = self._create_safe_rotate_matrix(\n            angle,\n            bbox_center,\n            image_shape,\n        )\n\n        return {\n            \"rotate\": angle,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": image_shape,\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms","title":"transforms","text":""},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Affine","title":"class Affine (scale=1, translate_percent=None, translate_px=None, rotate=0, shear=0, interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, fit_output=False, keep_ratio=False, rotate_method='largest_box', balanced_scale=False, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Augmentation to apply affine transformations to images.

Affine transformations involve:

- Translation (\"move\" image on the x-/y-axis)\n- Rotation\n- Scaling (\"zoom\" in/out)\n- Shear (move one side of the image, turning a square into a trapezoid)\n

All such transformations can create \"new\" pixels in the image without a defined content, e.g. if the image is translated to the left, pixels are created on the right. A method has to be defined to deal with these pixel values. The parameters fill and fill_mask of this class deal with this.

Some transformations involve interpolations between several pixels of the input image to generate output pixel values. The parameters interpolation and mask_interpolation deal with the method of interpolation used for this.
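
Before the parameter list, a minimal usage sketch (not from the original page; the concrete ranges are arbitrary) combining scaling, translation, rotation and shear in a single Affine call:

Python

import cv2
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 150, 3), dtype=np.uint8)

# scale and shear take per-axis dictionaries; rotate and translate_percent are
# sampled uniformly from the given ranges. Newly created pixels are filled
# with 0 because border_mode is cv2.BORDER_CONSTANT.
transform = A.Affine(
    scale={'x': (0.9, 1.1), 'y': (0.9, 1.1)},
    translate_percent=(-0.05, 0.05),
    rotate=(-15, 15),
    shear={'x': (-10, 10), 'y': 0},
    border_mode=cv2.BORDER_CONSTANT,
    fill=0,
    p=1.0,
)
augmented = transform(image=image)['image']
assert augmented.shape == image.shape  # fit_output=False keeps the input size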

Parameters:

  • scale (number, tuple of number or dict): Scaling factor to use, where 1.0 denotes \"no change\" and 0.5 is zoomed out to 50 percent of the original size.
      - If a single number, then that value will be used for all images.
      - If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. The same range will be used for both x- and y-axis. To keep the aspect ratio, set keep_ratio=True, then the same value will be used for both x- and y-axis.
      - If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes. Note that when keep_ratio=True, the x- and y-axis ranges should be the same.
  • translate_percent (None, number, tuple of number or dict): Translation as a fraction of the image height/width (x-translation, y-translation), where 0 denotes \"no change\" and 0.5 denotes \"half of the axis size\".
      - If None, then equivalent to 0.0 unless translate_px has a value other than None.
      - If a single number, then that value will be used for all images.
      - If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. That sampled fraction value will be used identically for both x- and y-axis.
      - If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis.
  • translate_px (None, int, tuple of int or dict): Translation in pixels.
      - If None, then equivalent to 0 unless translate_percent has a value other than None.
      - If a single int, then that value will be used for all images.
      - If a tuple (a, b), then a value will be uniformly sampled per image from the discrete interval [a..b]. That number will be used identically for both x- and y-axis.
      - If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis.
  • rotate (number or tuple of number): Rotation in degrees (NOT radians), i.e. expected value range is around [-360, 360]. Rotation happens around the center of the image, not the top left corner as in some other frameworks.
      - If a number, then that value will be used for all images.
      - If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b] and used as the rotation value.
  • shear (number, tuple of number or dict): Shear in degrees (NOT radians), i.e. expected value range is around [-360, 360], with reasonable values being in the range of [-45, 45].
      - If a number, then that value will be used for all images as the shear on the x-axis (no shear on the y-axis will be done).
      - If a tuple (a, b), then two values will be uniformly sampled per image from the interval [a, b] and be used as the x- and y-shear values.
      - If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows setting different values for the two axes, and sampling will then happen independently per axis.
  • interpolation (int): OpenCV interpolation flag.
  • mask_interpolation (int): OpenCV interpolation flag.
  • fill (ColorType): The constant value to use when filling in newly created pixels. (E.g. translating by 1px to the right will create a new 1px-wide column of pixels on the left of the image). The value is only used when border_mode is cv2.BORDER_CONSTANT. The expected value range is [0, 255] for uint8 images.
  • fill_mask (ColorType): Same as fill but only for masks.
  • border_mode (int): OpenCV border flag.
  • fit_output (bool): If True, the image plane size and position will be adjusted to tightly capture the whole image after affine transformation (translate_percent and translate_px are ignored). Otherwise (False), parts of the transformed image may end up outside the image plane. Fitting the output shape can be useful to avoid corners of the image being outside the image plane after applying rotations. Default: False
  • keep_ratio (bool): When True, the original aspect ratio will be kept when the random scale is applied. Default: False.
  • rotate_method (Literal[\"largest_box\", \"ellipse\"]): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"
  • balanced_scale (bool): When True, scaling factors are chosen to be either entirely below or above 1, ensuring balanced scaling. Default: False. Without it, scaling tends to lean towards upscaling: for example, to zoom in and out by 2x we may pick the interval [0.5, 2]; since [0.5, 1] is three times smaller than [1, 2], values above 1 are picked three times more often when sampled directly from [0.5, 2]. With balanced_scale, half the time the scaling factor is picked from below 1 (zooming out) and the other half from above 1 (zooming in), making zooming in and out more balanced (see the sketch after this parameter list).
  • p (float): probability of applying the transform. Default: 0.5.
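
A minimal sketch of the balanced_scale sampling described above (plain Python, independent of the library's internal code): half of the draws come from the sub-interval below 1 and half from the sub-interval above 1.

Python

import random

def sample_balanced_scale(low, high, rng=random):
    # Assumes low < 1 < high: pick the zoom-out or zoom-in side with equal
    # probability, then sample uniformly inside the chosen sub-interval.
    interval = rng.choice([(low, 1.0), (1.0, high)])
    return rng.uniform(*interval)

# With the interval [0.5, 2], direct uniform sampling would favour values > 1
# (that side is three times wider); balanced sampling picks each side half the time.
samples = [sample_balanced_scale(0.5, 2.0) for _ in range(10_000)]
print(sum(s > 1 for s in samples) / len(samples))  # close to 0.5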

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

[1] https://arxiv.org/abs/2109.13488

Source code in albumentations/augmentations/geometric/transforms.py Python
class Affine(DualTransform):\n    \"\"\"Augmentation to apply affine transformations to images.\n\n    Affine transformations involve:\n\n        - Translation (\"move\" image on the x-/y-axis)\n        - Rotation\n        - Scaling (\"zoom\" in/out)\n        - Shear (move one side of the image, turning a square into a trapezoid)\n\n    All such transformations can create \"new\" pixels in the image without a defined content, e.g.\n    if the image is translated to the left, pixels are created on the right.\n    A method has to be defined to deal with these pixel values.\n    The parameters `fill` and `fill_mask` of this class deal with this.\n\n    Some transformations involve interpolations between several pixels\n    of the input image to generate output pixel values. The parameters `interpolation` and\n    `mask_interpolation` deals with the method of interpolation used for this.\n\n    Args:\n        scale (number, tuple of number or dict): Scaling factor to use, where ``1.0`` denotes \"no change\" and\n            ``0.5`` is zoomed out to ``50`` percent of the original size.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That the same range will be used for both x- and y-axis. To keep the aspect ratio, set\n                  ``keep_ratio=True``, then the same value will be used for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes. Note that when\n                  the ``keep_ratio=True``, the x- and y-axis ranges should be the same.\n        translate_percent (None, number, tuple of number or dict): Translation as a fraction of the image height/width\n            (x-translation, y-translation), where ``0`` denotes \"no change\"\n            and ``0.5`` denotes \"half of the axis size\".\n                * If ``None`` then equivalent to ``0.0`` unless `translate_px` has a value other than ``None``.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That sampled fraction value will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        translate_px (None, int, tuple of int or dict): Translation in pixels.\n                * If ``None`` then equivalent to ``0`` unless `translate_percent` has a value other than ``None``.\n                * If a single int, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from\n                  the discrete interval ``[a..b]``. 
That number will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        rotate (number or tuple of number): Rotation in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``. Rotation happens around the *center* of the image,\n            not the top left corner as in some other frameworks.\n                * If a number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``\n                  and used as the rotation value.\n        shear (number, tuple of number or dict): Shear in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``, with reasonable values being in the range of ``[-45, 45]``.\n                * If a number, then that value will be used for all images as\n                  the shear on the x-axis (no shear on the y-axis will be done).\n                * If a tuple ``(a, b)``, then two value will be uniformly sampled per image\n                  from the interval ``[a, b]`` and be used as the x- and y-shear value.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        interpolation (int): OpenCV interpolation flag.\n        mask_interpolation (int): OpenCV interpolation flag.\n        fill (ColorType): The constant value to use when filling in newly created pixels.\n            (E.g. translating by 1px to the right will create a new 1px-wide column of pixels\n            on the left of the image).\n            The value is only used when `mode=constant`. The expected value range is ``[0, 255]`` for ``uint8`` images.\n        fill_mask (ColorType): Same as fill but only for masks.\n        border_mode (int): OpenCV border flag.\n        fit_output (bool): If True, the image plane size and position will be adjusted to tightly capture\n            the whole image after affine transformation (`translate_percent` and `translate_px` are ignored).\n            Otherwise (``False``),  parts of the transformed image may end up outside the image plane.\n            Fitting the output shape can be useful to avoid corners of the image being outside the image plane\n            after applying rotations. Default: False\n        keep_ratio (bool): When True, the original aspect ratio will be kept when the random scale is applied.\n            Default: False.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): rotation method used for the bounding boxes.\n            Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"\n        balanced_scale (bool): When True, scaling factors are chosen to be either entirely below or above 1,\n            ensuring balanced scaling. 
Default: False.\n\n            This is important because without it, scaling tends to lean towards upscaling. For example, if we want\n            the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is\n            three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly\n            from [0.5, 2]. With `balanced_scale`, the  function ensures that half the time, the scaling\n            factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in).\n            This makes the zooming in and out process more balanced.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        [1] https://arxiv.org/abs/2109.13488\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: ScaleFloatType | fgeometric.XYFloatScale\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None\n        rotate: ScaleFloatType\n        shear: ScaleFloatType | fgeometric.XYFloatScale\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        cval: ColorType | None\n        cval_mask: ColorType | None\n        mode: BorderModeType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        fit_output: bool\n        keep_ratio: bool\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        balanced_scale: bool\n\n        @field_validator(\"shear\", \"scale\")\n        @classmethod\n        def process_shear(\n            cls,\n            value: ScaleFloatType | fgeometric.XYFloatScale,\n            info: ValidationInfo,\n        ) -> fgeometric.XYFloatDict:\n            return cast(\n                fgeometric.XYFloatDict,\n                cls._handle_dict_arg(value, info.field_name),\n            )\n\n        @field_validator(\"rotate\")\n        @classmethod\n        def process_rotate(\n            cls,\n            value: ScaleFloatType,\n        ) -> tuple[float, float]:\n            return to_tuple(value, value)\n\n        @model_validator(mode=\"after\")\n        def handle_translate(self) -> Self:\n            if self.translate_percent is None and self.translate_px is None:\n                self.translate_px = 0\n\n            if self.translate_percent is not None and self.translate_px is not None:\n                msg = \"Expected either translate_percent or translate_px to be provided, but both were provided.\"\n                raise ValueError(msg)\n\n            if self.translate_percent is not None:\n                self.translate_percent = self._handle_dict_arg(\n                    self.translate_percent,\n                    \"translate_percent\",\n                    default=0.0,\n                )  # type: ignore[assignment]\n\n            if self.translate_px is not None:\n                self.translate_px = self._handle_dict_arg(\n                    self.translate_px,\n                    \"translate_px\",\n                    default=0,\n                )  # type: ignore[assignment]\n\n            return self\n\n        @staticmethod\n        def _handle_dict_arg(\n            val: ScaleType | fgeometric.XYFloatScale | fgeometric.XYIntScale,\n            name: str | None,\n            
default: float = 1.0,\n        ) -> dict[str, Any]:\n            if isinstance(val, dict):\n                if \"x\" not in val and \"y\" not in val:\n                    raise ValueError(\n                        f'Expected {name} dictionary to contain at least key \"x\" or key \"y\". Found neither of them.',\n                    )\n                x = val.get(\"x\", default)\n                y = val.get(\"y\", default)\n                return {\"x\": to_tuple(x, x), \"y\": to_tuple(y, y)}  # type: ignore[arg-type]\n            return {\"x\": to_tuple(val, val), \"y\": to_tuple(val, val)}\n\n        @model_validator(mode=\"after\")\n        def validate_fill_types(self) -> Self:\n            if self.cval is not None:\n                self.fill = self.cval\n                warn(\"cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n            if self.cval_mask is not None:\n                self.fill_mask = self.cval_mask\n                warn(\"cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n            if self.mode is not None:\n                self.border_mode = self.mode\n                warn(\"mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType | fgeometric.XYFloatScale = 1,\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None = None,\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None = None,\n        rotate: ScaleFloatType = 0,\n        shear: ScaleFloatType | fgeometric.XYFloatScale = 0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: ColorType | None = None,\n        cval_mask: ColorType | None = None,\n        mode: int | None = None,\n        fit_output: bool = False,\n        keep_ratio: bool = False,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        balanced_scale: bool = False,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n        self.scale = cast(fgeometric.XYFloatDict, scale)\n        self.translate_percent = cast(fgeometric.XYFloatDict, translate_percent)\n        self.translate_px = cast(fgeometric.XYIntDict, translate_px)\n        self.rotate = cast(tuple[float, float], rotate)\n        self.fit_output = fit_output\n        self.shear = cast(fgeometric.XYFloatDict, shear)\n        self.keep_ratio = keep_ratio\n        self.rotate_method = rotate_method\n        self.balanced_scale = balanced_scale\n\n        if self.keep_ratio and self.scale[\"x\"] != self.scale[\"y\"]:\n            raise ValueError(\n                f\"When keep_ratio is True, the x and y scale range should be identical. 
got {self.scale}\",\n            )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"fill\",\n            \"border_mode\",\n            \"scale\",\n            \"translate_percent\",\n            \"translate_px\",\n            \"rotate\",\n            \"fit_output\",\n            \"shear\",\n            \"fill_mask\",\n            \"keep_ratio\",\n            \"rotate_method\",\n            \"balanced_scale\",\n        )\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            img,\n            matrix,\n            interpolation=self.interpolation,\n            fill=self.fill,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            mask,\n            matrix,\n            interpolation=self.mask_interpolation,\n            fill=self.fill_mask,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            params[\"shape\"][:2],\n            self.border_mode,\n            output_shape,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        scale: fgeometric.XYFloat,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"],\n            scale,\n            self.border_mode,\n        )\n\n    @staticmethod\n    def get_scale(\n        scale: fgeometric.XYFloatDict,\n        keep_ratio: bool,\n        balanced_scale: bool,\n        random_state: random.Random,\n    ) -> fgeometric.XYFloat:\n        result_scale = {}\n        for key, value in scale.items():\n            if isinstance(value, (int, float)):\n                result_scale[key] = float(value)\n            elif isinstance(value, tuple):\n                if balanced_scale:\n                    lower_interval = (value[0], 1.0) if value[0] < 1 else None\n                    upper_interval = (1.0, value[1]) if value[1] > 1 else None\n\n                    if lower_interval is not None and upper_interval is not None:\n                        selected_interval = random_state.choice(\n                            [lower_interval, upper_interval],\n                        )\n                    elif lower_interval is not None:\n                        selected_interval = lower_interval\n                    elif upper_interval is not None:\n                        selected_interval = upper_interval\n                    else:\n                        result_scale[key] = 1.0\n                        continue\n\n                    result_scale[key] = random_state.uniform(*selected_interval)\n                else:\n                    result_scale[key] = 
random_state.uniform(*value)\n            else:\n                raise TypeError(\n                    f\"Invalid scale value for key {key}: {value}. Expected a float or a tuple of two floats.\",\n                )\n\n        if keep_ratio:\n            result_scale[\"y\"] = result_scale[\"x\"]\n\n        return cast(fgeometric.XYFloat, result_scale)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        translate = self._get_translate_params(image_shape)\n        shear = self._get_shear_params()\n        scale = self.get_scale(\n            self.scale,\n            self.keep_ratio,\n            self.balanced_scale,\n            self.py_random,\n        )\n        rotate = self.py_random.uniform(*self.rotate)\n\n        image_shift = fgeometric.center(image_shape)\n        bbox_shift = fgeometric.center_bbox(image_shape)\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            image_shift,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_shift,\n        )\n\n        if self.fit_output:\n            matrix, output_shape = fgeometric.compute_affine_warp_output_shape(\n                matrix,\n                image_shape,\n            )\n            bbox_matrix, _ = fgeometric.compute_affine_warp_output_shape(\n                bbox_matrix,\n                image_shape,\n            )\n        else:\n            output_shape = image_shape\n\n        return {\n            \"rotate\": rotate,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": output_shape,\n        }\n\n    def _get_translate_params(self, image_shape: tuple[int, int]) -> fgeometric.XYInt:\n        height, width = image_shape[:2]\n        if self.translate_px is not None:\n            return {\n                \"x\": self.py_random.randint(*self.translate_px[\"x\"]),\n                \"y\": self.py_random.randint(*self.translate_px[\"y\"]),\n            }\n        if self.translate_percent is not None:\n            translate = {key: self.py_random.uniform(*value) for key, value in self.translate_percent.items()}\n            return cast(\n                fgeometric.XYInt,\n                {\"x\": int(translate[\"x\"] * width), \"y\": int(translate[\"y\"] * height)},\n            )\n        return cast(fgeometric.XYInt, {\"x\": 0, \"y\": 0})\n\n    def _get_shear_params(self) -> fgeometric.XYFloat:\n        return {\n            \"x\": -self.py_random.uniform(*self.shear[\"x\"]),\n            \"y\": -self.py_random.uniform(*self.shear[\"y\"]),\n        }\n
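
The balanced_scale behavior documented above can be illustrated with a short, standalone sketch; the (0.5, 2.0) range and the helper name below are assumptions for demonstration, not the library's internal implementation.

Python
import random

def sample_balanced_scale(low: float, high: float, rng: random.Random) -> float:
    # Pick the zoom-out interval [low, 1] or the zoom-in interval [1, high]
    # with equal probability, then sample uniformly inside the chosen interval.
    lower = (low, 1.0) if low < 1 else None
    upper = (1.0, high) if high > 1 else None
    if lower and upper:
        interval = rng.choice([lower, upper])
    else:
        interval = lower or upper or (1.0, 1.0)
    return rng.uniform(*interval)

rng = random.Random(0)
samples = [sample_balanced_scale(0.5, 2.0, rng) for _ in range(10_000)]
print(sum(s < 1 for s in samples) / len(samples))  # close to 0.5: zoom-out as often as zoom-in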
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.BaseDistortion","title":"class BaseDistortion (interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Base class for distortion-based transformations.

This class provides a foundation for implementing various types of image distortions, such as optical distortions, grid distortions, and elastic transformations. It handles the common operations of applying distortions to images, masks, bounding boxes, and keypoints.

Parameters:

  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR, cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR.
  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This is an abstract base class and should not be used directly.
  • Subclasses should implement the get_params_dependent_on_data method to generate the distortion maps (map_x and map_y).
  • The distortion is applied consistently across all targets (image, mask, bboxes, keypoints) to maintain coherence in the augmented data.

Example of a subclass:

    class CustomDistortion(BaseDistortion):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # Add custom parameters here

        def get_params_dependent_on_data(self, params, data):
            # Generate and return map_x and map_y based on the distortion logic
            return {"map_x": map_x, "map_y": map_y}

        def get_transform_init_args_names(self):
            return super().get_transform_init_args_names() + ("custom_param1", "custom_param2")


Source code in albumentations/augmentations/geometric/transforms.py Python
class BaseDistortion(DualTransform):\n    \"\"\"Base class for distortion-based transformations.\n\n    This class provides a foundation for implementing various types of image distortions,\n    such as optical distortions, grid distortions, and elastic transformations. It handles\n    the common operations of applying distortions to images, masks, bounding boxes, and keypoints.\n\n    Args:\n        interpolation (int): Interpolation method to be used for image transformation.\n            Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR,\n            cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This is an abstract base class and should not be used directly.\n        - Subclasses should implement the `get_params_dependent_on_data` method to generate\n          the distortion maps (map_x and map_y).\n        - The distortion is applied consistently across all targets (image, mask, bboxes, keypoints)\n          to maintain coherence in the augmented data.\n\n    Example of a subclass:\n        class CustomDistortion(BaseDistortion):\n            def __init__(self, *args, **kwargs):\n                super().__init__(*args, **kwargs)\n                # Add custom parameters here\n\n            def get_params_dependent_on_data(self, params, data):\n                # Generate and return map_x and map_y based on the distortion logic\n                return {\"map_x\": map_x, \"map_y\": map_y}\n\n            def get_transform_init_args_names(self):\n                return super().get_transform_init_args_names() + (\"custom_param1\", \"custom_param2\")\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            img,\n            map_x,\n            map_y,\n            self.interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            mask,\n            map_x,\n            map_y,\n            self.mask_interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n      
  bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        bboxes_returned = fgeometric.remap_bboxes(\n            bboxes_denorm,\n            map_x,\n            map_y,\n            image_shape,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap_keypoints(keypoints, map_x, map_y, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.D4","title":"class D4 (p=1, always_apply=None) [view source on GitHub]","text":"

Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.

The D4 group transformations include:
  • 'e' (identity): No transformation is applied.
  • 'r90' (rotation by 90 degrees counterclockwise)
  • 'r180' (rotation by 180 degrees)
  • 'r270' (rotation by 270 degrees counterclockwise)
  • 'v' (reflection across the vertical midline)
  • 'hvt' (reflection across the anti-diagonal)
  • 'h' (reflection across the horizontal midline)
  • 't' (reflection across the main diagonal)

Even if the probability (p) of applying the transform is set to 1, the identity transformation 'e' may still occur, which means the input will remain unchanged in one out of eight cases.
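
As a standalone illustration of what the eight group elements listed above do, here is a small NumPy sketch; the mapping from names to array operations follows the descriptions in that list and is not the library's implementation.

Python
import numpy as np

# Each D4 element expressed as an operation on a 2D array, following the list above.
ops = {
    "e":    lambda a: a,                 # identity
    "r90":  lambda a: np.rot90(a, 1),    # rotate 90 degrees counterclockwise
    "r180": lambda a: np.rot90(a, 2),    # rotate 180 degrees
    "r270": lambda a: np.rot90(a, 3),    # rotate 270 degrees counterclockwise
    "v":    lambda a: np.fliplr(a),      # reflect across the vertical midline
    "h":    lambda a: np.flipud(a),      # reflect across the horizontal midline
    "t":    lambda a: a.T,               # reflect across the main diagonal
    "hvt":  lambda a: np.rot90(a, 2).T,  # reflect across the anti-diagonal
}

a = np.arange(9).reshape(3, 3)
for name, op in ops.items():
    print(name, op(a).tolist())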

Parameters:

  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
  • The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
  • When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
  • This transform preserves the aspect ratio and size of the input.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.D4(p=1.0),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed['image']\n# The resulting image will be one of the 8 possible D4 transformations of the input\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class D4(DualTransform):\n    \"\"\"Applies one of the eight possible D4 dihedral group transformations to a square-shaped input,\n    maintaining the square shape. These transformations correspond to the symmetries of a square,\n    including rotations and reflections.\n\n    The D4 group transformations include:\n    - 'e' (identity): No transformation is applied.\n    - 'r90' (rotation by 90 degrees counterclockwise)\n    - 'r180' (rotation by 180 degrees)\n    - 'r270' (rotation by 270 degrees counterclockwise)\n    - 'v' (reflection across the vertical midline)\n    - 'hvt' (reflection across the anti-diagonal)\n    - 'h' (reflection across the horizontal midline)\n    - 't' (reflection across the main diagonal)\n\n    Even if the probability (`p`) of applying the transform is set to 1, the identity transformation\n    'e' may still occur, which means the input will remain unchanged in one out of eight cases.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for augmenting data that does not have a clear orientation,\n          such as top-view satellite or drone imagery, or certain types of medical images.\n        - The input image should be square-shaped for optimal results. Non-square inputs may lead to\n          unexpected behavior or distortions.\n        - When applied to bounding boxes or keypoints, their coordinates will be adjusted according\n          to the selected transformation.\n        - This transform preserves the aspect ratio and size of the input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.D4(p=1.0),\n        ... ])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed['image']\n        # The resulting image will be one of the 8 possible D4 transformations of the input\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.d4(img, group_element)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_d4(bboxes, group_element)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_d4(keypoints, group_element, params[\"shape\"])\n\n    def get_params(self) -> dict[str, D4Type]:\n        return {\n            \"group_element\": self.random_generator.choice(d4_group_elements),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ElasticTransform","title":"class ElasticTransform (alpha=1, sigma=50, interpolation=1, border_mode=4, value=None, mask_value=None, approximate=False, same_dxdy=False, mask_interpolation=0, noise_distribution='gaussian', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformation to images, masks, bounding boxes, and keypoints.

This transformation introduces random elastic distortions to the input data. It's particularly useful for data augmentation in training deep learning models, especially for tasks like image segmentation or object detection where you want to maintain the relative positions of features while introducing realistic deformations.

The transform works by generating random displacement fields and applying them to the input. These fields are smoothed using a Gaussian filter to create more natural-looking distortions.
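
A minimal sketch of that displacement-field idea, built from cv2.GaussianBlur and cv2.remap; the field size, alpha, and sigma below are assumed values, and this is illustrative rather than the transform's internal code.

Python
import cv2
import numpy as np

rng = np.random.default_rng(0)
height, width = 100, 100
alpha, sigma = 50.0, 10.0  # assumed values for illustration

# Random fields in [-1, 1], smoothed with a Gaussian and scaled by alpha.
dx = cv2.GaussianBlur(rng.uniform(-1, 1, (height, width)).astype(np.float32), (0, 0), sigma) * alpha
dy = cv2.GaussianBlur(rng.uniform(-1, 1, (height, width)).astype(np.float32), (0, 0), sigma) * alpha

# Shift each output pixel by (dx, dy) when sampling from the input.
x, y = np.meshgrid(np.arange(width), np.arange(height))
map_x = (x + dx).astype(np.float32)
map_y = (y + dy).astype(np.float32)

image = rng.integers(0, 256, (height, width, 3), dtype=np.uint8)
warped = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101)
print(warped.shape)  # same spatial size as the input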

Parameters:

  • alpha (float): Scaling factor for the random displacement fields. Higher values result in more pronounced distortions. Default: 1.0.
  • sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement fields. Higher values result in smoother, more global distortions. Default: 50.0.
  • interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR.
  • approximate (bool): Whether to use an approximate version of the elastic transform. If True, uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially less accurate for large sigma values. Default: False.
  • same_dxdy (bool): Whether to use the same random displacement field for both x and y directions. Can speed up the transform at the cost of less diverse distortions. Default: False.
  • mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • noise_distribution (Literal["gaussian", "uniform"]): Distribution used to generate the displacement fields. "gaussian" generates fields using a normal distribution (more natural deformations); "uniform" generates fields using a uniform distribution (more mechanical deformations). Default: "gaussian".
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints) by using the same displacement fields for all.
  • The 'approximate' parameter determines whether to use a precise or approximate method for generating displacement fields. The approximate method can be faster but may be less accurate for large sigma values.
  • Bounding boxes that end up outside the image after transformation will be removed.
  • Keypoints that end up outside the image after transformation will be removed.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class ElasticTransform(BaseDistortion):\n    \"\"\"Apply elastic deformation to images, masks, bounding boxes, and keypoints.\n\n    This transformation introduces random elastic distortions to the input data. It's particularly\n    useful for data augmentation in training deep learning models, especially for tasks like\n    image segmentation or object detection where you want to maintain the relative positions of\n    features while introducing realistic deformations.\n\n    The transform works by generating random displacement fields and applying them to the input.\n    These fields are smoothed using a Gaussian filter to create more natural-looking distortions.\n\n    Args:\n        alpha (float): Scaling factor for the random displacement fields. Higher values result in\n            more pronounced distortions. Default: 1.0\n        sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement\n            fields. Higher values result in smoother, more global distortions. Default: 50.0\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        approximate (bool): Whether to use an approximate version of the elastic transform. If True,\n            uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially\n            less accurate for large sigma values. Default: False\n        same_dxdy (bool): Whether to use the same random displacement field for both x and y\n            directions. Can speed up the transform at the cost of less diverse distortions. Default: False\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        noise_distribution (Literal[\"gaussian\", \"uniform\"]): Distribution used to generate the displacement fields.\n            \"gaussian\" generates fields using normal distribution (more natural deformations).\n            \"uniform\" generates fields using uniform distribution (more mechanical deformations).\n            Default: \"gaussian\".\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints)\n          by using the same displacement fields for all.\n        - The 'approximate' parameter determines whether to use a precise or approximate method for\n          generating displacement fields. The approximate method can be faster but may be less\n          accurate for large sigma values.\n        - Bounding boxes that end up outside the image after transformation will be removed.\n        - Keypoints that end up outside the image after transformation will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        alpha: Annotated[float, Field(ge=0)]\n        sigma: Annotated[float, Field(ge=1)]\n        approximate: bool\n        same_dxdy: bool\n        noise_distribution: Literal[\"gaussian\", \"uniform\"]\n        border_mode: BorderModeType = Field(deprecated=\"Deprecated\")\n        value: ColorType | None = Field(deprecated=\"Deprecated\")\n        mask_value: ColorType | None = Field(deprecated=\"Deprecated\")\n\n    def __init__(\n        self,\n        alpha: float = 1,\n        sigma: float = 50,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        approximate: bool = False,\n        same_dxdy: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        noise_distribution: Literal[\"gaussian\", \"uniform\"] = \"gaussian\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.alpha = alpha\n        self.sigma = sigma\n        self.approximate = approximate\n        self.same_dxdy = same_dxdy\n        self.noise_distribution = noise_distribution\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        kernel_size = (0, 0) if self.approximate else (17, 17)\n\n        # Generate displacement fields\n        dx, dy = fgeometric.generate_displacement_fields(\n            (height, width),\n            self.alpha,\n            self.sigma,\n            same_dxdy=self.same_dxdy,\n            kernel_size=kernel_size,\n            random_generator=self.random_generator,\n            noise_distribution=self.noise_distribution,\n        )\n\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        map_x = np.float32(x + dx)\n        map_y = np.float32(y + dy)\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"alpha\",\n            \"sigma\",\n            \"approximate\",\n            \"same_dxdy\",\n            \"noise_distribution\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.GridDistortion","title":"class GridDistortion (num_steps=5, distort_limit=(-0.3, 0.3), interpolation=1, border_mode=4, value=None, mask_value=None, normalized=True, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply grid distortion to images, masks, bounding boxes, and keypoints.

This transformation divides the image into a grid and randomly distorts each cell, creating localized warping effects. It's particularly useful for data augmentation in tasks like medical image analysis, OCR, and other domains where local geometric variations are meaningful.

Parameters:

  • num_steps (int): Number of grid cells on each side of the image. Higher values create more granular distortions. Must be at least 1. Default: 5.
  • distort_limit (float or tuple[float, float]): Range of distortion. If a single float is provided, the range will be (-distort_limit, distort_limit). Higher values create stronger distortions. Should be in the range of -1 to 1. Default: (-0.3, 0.3).
  • interpolation (int): OpenCV interpolation method used for image transformation. Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.
  • normalized (bool): If True, ensures that the distortion does not move pixels outside the image boundaries. This can result in less extreme distortions but guarantees that no information is lost. Default: True.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The same distortion is applied to all targets (image, mask, bboxes, keypoints) to maintain consistency.
  • When normalized=True, the distortion is adjusted to ensure all pixels remain within the image boundaries.
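
The per-cell warping comes from sampling one multiplier around 1.0 for each grid boundary along each axis; below is a standalone sketch of that sampling, separate from the usage example that follows (illustrative only, not the transform's internals).

Python
import random

rng = random.Random(0)
num_steps = 5
distort_limit = (-0.3, 0.3)

# One multiplier per grid boundary along each axis; values near 1 keep cells close
# to their original size, values further from 1 stretch or squeeze them.
steps_x = [1 + rng.uniform(*distort_limit) for _ in range(num_steps + 1)]
steps_y = [1 + rng.uniform(*distort_limit) for _ in range(num_steps + 1)]
print([round(s, 2) for s in steps_x])
print([round(s, 2) for s in steps_y])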

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridDistortion(BaseDistortion):\n    \"\"\"Apply grid distortion to images, masks, bounding boxes, and keypoints.\n\n    This transformation divides the image into a grid and randomly distorts each cell,\n    creating localized warping effects. It's particularly useful for data augmentation\n    in tasks like medical image analysis, OCR, and other domains where local geometric\n    variations are meaningful.\n\n    Args:\n        num_steps (int): Number of grid cells on each side of the image. Higher values\n            create more granular distortions. Must be at least 1. Default: 5.\n        distort_limit (float or tuple[float, float]): Range of distortion. If a single float\n            is provided, the range will be (-distort_limit, distort_limit). Higher values\n            create stronger distortions. Should be in the range of -1 to 1.\n            Default: (-0.3, 0.3).\n        interpolation (int): OpenCV interpolation method used for image transformation.\n            Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.\n        normalized (bool): If True, ensures that the distortion does not move pixels\n            outside the image boundaries. This can result in less extreme distortions\n            but guarantees that no information is lost. Default: True.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The same distortion is applied to all targets (image, mask, bboxes, keypoints)\n          to maintain consistency.\n        - When normalized=True, the distortion is adjusted to ensure all pixels remain\n          within the image boundaries.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        num_steps: Annotated[int, Field(ge=1)]\n        distort_limit: SymmetricRangeType\n        normalized: bool\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int = Field(deprecated=\"Deprecated. 
Does not have any effect.\")\n\n        @field_validator(\"distort_limit\")\n        @classmethod\n        def check_limits(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = -1, 1\n            result = to_tuple(v)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        num_steps: int = 5,\n        distort_limit: ScaleFloatType = (-0.3, 0.3),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        normalized: bool = True,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.num_steps = num_steps\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.normalized = normalized\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        steps_x = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n        steps_y = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n\n        if self.normalized:\n            normalized_params = fgeometric.normalize_grid_distortion_steps(\n                image_shape,\n                self.num_steps,\n                steps_x,\n                steps_y,\n            )\n            steps_x, steps_y = (\n                normalized_params[\"steps_x\"],\n                normalized_params[\"steps_y\"],\n            )\n\n        map_x, map_y = fgeometric.generate_grid(\n            image_shape,\n            steps_x,\n            steps_y,\n            self.num_steps,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"num_steps\",\n            \"distort_limit\",\n            \"normalized\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.GridElasticDeform","title":"class GridElasticDeform (num_grid_xy, magnitude, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.

This transformation overlays a grid on the input and applies random displacements to the grid points, resulting in local elastic distortions. The granularity and intensity of the distortions can be controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.

Parameters:

  • num_grid_xy (tuple[int, int]): Number of grid cells along the width and height. Specified as (grid_width, grid_height). Each value must be greater than 1.
  • magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.
  • interpolation (int): Interpolation method to be used for the image transformation. Default: cv2.INTER_LINEAR.
  • mask_interpolation (int): Interpolation method to be used for mask transformation. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n>>> result = transform(image=image, mask=mask)\n>>> transformed_image, transformed_mask = result['image'], result['mask']\n

Note

This transformation is particularly useful for data augmentation in medical imaging and other domains where elastic deformations can simulate realistic variations.


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridElasticDeform(DualTransform):\n    \"\"\"Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.\n\n    This transformation overlays a grid on the input and applies random displacements to the grid points,\n    resulting in local elastic distortions. The granularity and intensity of the distortions can be\n    controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.\n\n\n    Args:\n        num_grid_xy (tuple[int, int]): Number of grid cells along the width and height.\n            Specified as (grid_width, grid_height). Each value must be greater than 1.\n        magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.\n        interpolation (int): Interpolation method to be used for the image transformation.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Interpolation method to be used for mask transformation.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = result['image'], result['mask']\n\n    Note:\n        This transformation is particularly useful for data augmentation in medical imaging\n        and other domains where elastic deformations can simulate realistic variations.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        num_grid_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        magnitude: int = Field(gt=0)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        num_grid_xy: tuple[int, int],\n        magnitude: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_grid_xy = num_grid_xy\n        self.magnitude = magnitude\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    @staticmethod\n    def generate_mesh(polygons: np.ndarray, dimensions: np.ndarray) -> np.ndarray:\n        return np.hstack((dimensions.reshape(-1, 4), polygons))\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        # Replace calculate_grid_dimensions with split_uniform_grid\n        tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.num_grid_xy,\n            self.random_generator,\n        )\n\n        # Convert tiles to the format expected by generate_distorted_grid_polygons\n        dimensions = np.array(\n            [\n                [\n                    tile[1],\n                    tile[0],\n                    tile[3],\n                    tile[2],\n                ]  # Reorder to [x_min, y_min, x_max, y_max]\n                for tile in tiles\n            ],\n        ).reshape(\n            self.num_grid_xy[::-1] + (4,),\n        )  # Reshape to (grid_height, grid_width, 4)\n\n 
       polygons = fgeometric.generate_distorted_grid_polygons(\n            dimensions,\n            self.magnitude,\n            self.random_generator,\n        )\n\n        generated_mesh = self.generate_mesh(polygons, dimensions)\n\n        return {\"generated_mesh\": generated_mesh}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(img, generated_mesh, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(mask, generated_mesh, self.mask_interpolation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        bboxes_denorm = denormalize_bboxes(bboxes, params[\"shape\"][:2])\n        return normalize_bboxes(\n            fgeometric.bbox_distort_image(\n                bboxes_denorm,\n                generated_mesh,\n                params[\"shape\"][:2],\n            ),\n            params[\"shape\"][:2],\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image_keypoints(\n            keypoints,\n            generated_mesh,\n            params[\"shape\"][:2],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_grid_xy\", \"magnitude\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.HorizontalFlip","title":"class HorizontalFlip [view source on GitHub]","text":"

Flip the input horizontally around the y-axis.

Parameters:

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
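
A minimal usage sketch (illustrative values only):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
transform = A.Compose([A.HorizontalFlip(p=1.0)])
flipped = transform(image=image)["image"]
assert np.array_equal(flipped, image[:, ::-1, :])  # columns are mirrored around the y-axis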


Source code in albumentations/augmentations/geometric/transforms.py Python
class HorizontalFlip(DualTransform):\n    \"\"\"Flip the input horizontally around the y-axis.\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return hflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_hflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_hflip(keypoints, params[\"shape\"][1])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.OpticalDistortion","title":"class OpticalDistortion (distort_limit=(-0.05, 0.05), shift_limit=None, interpolation=1, border_mode=None, value=None, mask_value=None, mask_interpolation=0, mode='camera', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply optical distortion to images, masks, bounding boxes, and keypoints.

Supports two distortion models:

  1. Camera matrix model (original): uses OpenCV's camera calibration model with k1=k2=k distortion coefficients.
  2. Fisheye model: direct radial distortion, r_dist = r * (1 + gamma * r²).
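
A small numeric illustration of the fisheye radial mapping above, with an assumed gamma; this is not the transform's internal code.

Python
import numpy as np

gamma = 0.3                      # assumed distortion coefficient
r = np.linspace(0.0, 1.0, 5)     # normalized distances from the distortion center
r_dist = r * (1 + gamma * r**2)  # r_dist = r * (1 + gamma * r^2)
print(np.round(r_dist, 3))       # points farther from the center are pushed out more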

Parameters:

  • distort_limit (float | tuple[float, float]): Range of distortion coefficient. For the camera model the recommended range is (-0.05, 0.05); for the fisheye model it is (-0.3, 0.3). Default: (-0.05, 0.05).
  • mode (Literal['camera', 'fisheye']): Distortion model to use: 'camera' for the original camera matrix model, 'fisheye' for the fisheye lens model. Default: 'camera'.
  • interpolation (OpenCV flag): Interpolation method used for image transformation. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.
  • The distortion coefficient (k) is randomly sampled from the distort_limit range.
  • The image center is shifted by dx and dy, randomly sampled from the shift_limit range.
  • Bounding boxes and keypoints are transformed along with the image to maintain consistency.
  • Fisheye model directly applies radial distortion
  • Both models use shift_limit to control distortion center

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class OpticalDistortion(BaseDistortion):\n    \"\"\"Apply optical distortion to images, masks, bounding boxes, and keypoints.\n\n    Supports two distortion models:\n    1. Camera matrix model (original):\n       Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients\n\n    2. Fisheye model:\n       Direct radial distortion: r_dist = r * (1 + gamma * r\u00b2)\n\n    Args:\n        distort_limit (float | tuple[float, float]): Range of distortion coefficient.\n            For camera model: recommended range (-0.05, 0.05)\n            For fisheye model: recommended range (-0.3, 0.3)\n            Default: (-0.05, 0.05)\n\n        mode (Literal['camera', 'fisheye']): Distortion model to use:\n            - 'camera': Original camera matrix model\n            - 'fisheye': Fisheye lens model\n            Default: 'camera'\n\n        interpolation (OpenCV flag): Interpolation method used for image transformation.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC,\n            cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.\n        - The distortion coefficient (k) is randomly sampled from the distort_limit range.\n        - The image center is shifted by dx and dy, randomly sampled from the shift_limit range.\n        - Bounding boxes and keypoints are transformed along with the image to maintain consistency.\n        - Fisheye model directly applies radial distortion\n        - Both models use shift_limit to control distortion center\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        distort_limit: SymmetricRangeType\n        mode: Literal[\"camera\", \"fisheye\"]\n        shift_limit: SymmetricRangeType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int | None = Field(\n            deprecated=\"Deprecated. 
Does not have any effect.\",\n        )\n\n    def __init__(\n        self,\n        distort_limit: ScaleFloatType = (-0.05, 0.05),\n        shift_limit: ScaleFloatType | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int | None = None,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        mode: Literal[\"camera\", \"fisheye\"] = \"camera\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.mode = mode\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        height, width = image_shape\n\n        # Get distortion coefficient\n        k = self.py_random.uniform(*self.distort_limit)\n\n        # Calculate center shift\n        center_xy = fgeometric.center(image_shape)\n\n        # Get distortion maps based on mode\n        if self.mode == \"camera\":\n            map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n        else:  # fisheye\n            map_x, map_y = fgeometric.get_fisheye_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"distort_limit\",\n            \"mode\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Pad","title":"class Pad (padding=0, fill=0, fill_mask=0, border_mode=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of an image by specified number of pixels.

Parameters:

  • padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be: an int to pad all sides by this value; a tuple[int, int] as (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y; or a tuple[int, int, int, int] as (left, top, right, bottom) for specific padding per side.
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.
  • fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT.
  • border_mode (OpenCV flag): OpenCV border mode.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

References

  • https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html
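
This entry has no usage snippet, so below is a minimal, hedged sketch assuming the same Compose API used by the other examples on this page; the shape noted in the comment follows from the (pad_x, pad_y) semantics described above.

Python
>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Pad left/right by 10 px and top/bottom by 20 px with constant (black) borders.
>>> transform = A.Compose([
...     A.Pad(padding=(10, 20), fill=0, border_mode=cv2.BORDER_CONSTANT, p=1.0),
... ])
>>> padded = transform(image=image)['image']
>>> # Under the semantics above the result should be 140 x 120 (H x W): 100 + 2*20 by 100 + 2*10.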

Source code in albumentations/augmentations/geometric/transforms.py Python
class Pad(DualTransform):\n    \"\"\"Pad the sides of an image by specified number of pixels.\n\n    Args:\n        padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y\n            * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT\n        fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT\n        border_mode (OpenCV flag): OpenCV border mode\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    References:\n        - https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        padding: int | tuple[int, int] | tuple[int, int, int, int]\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int] | tuple[int, int, int, int] = 0,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            img,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            mask,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill_mask,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_np = denormalize_bboxes(bboxes, params[\"shape\"])\n\n        result = fgeometric.pad_bboxes(\n            bboxes_np,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=image_shape,\n        )\n\n        rows, cols = params[\"shape\"][:2]\n        return normalize_bboxes(\n            result,\n            (rows + pad_top + pad_bottom, cols + pad_left + pad_right),\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n 
       **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_keypoints(\n            keypoints,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=params[\"shape\"][:2],\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if isinstance(self.padding, Real):\n            pad_top = pad_bottom = pad_left = pad_right = self.padding\n        elif isinstance(self.padding, (tuple, list)):\n            if len(self.padding) == NUM_PADS_XY:\n                pad_left = pad_right = self.padding[0]\n                pad_top = pad_bottom = self.padding[1]\n            elif len(self.padding) == NUM_PADS_ALL_SIDES:\n                pad_left, pad_top, pad_right, pad_bottom = self.padding  # type: ignore[misc]\n            else:\n                raise TypeError(\n                    \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n                )\n        else:\n            raise TypeError(\n                \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n            )\n\n        return {\n            \"pad_top\": pad_top,\n            \"pad_bottom\": pad_bottom,\n            \"pad_left\": pad_left,\n            \"pad_right\": pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"padding\",\n            \"fill\",\n            \"fill_mask\",\n            \"border_mode\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.PadIfNeeded","title":"class PadIfNeeded (min_height=1024, min_width=1024, pad_height_divisor=None, pad_width_divisor=None, position='center', border_mode=4, value=None, mask_value=None, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of an image if the image dimensions are less than the specified minimum dimensions. If the pad_height_divisor or pad_width_divisor is specified, the function additionally ensures that the image dimensions are divisible by these values.

Parameters:

min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided.

min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided.

pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.

pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.

position (Literal[\"center\", \"top_left\", \"top_right\", \"bottom_left\", \"bottom_right\", \"random\"]): Position where the image is to be placed after padding. Default is 'center'.

border_mode (int): Specifies the border mode to use if padding is required. The default is cv2.BORDER_REFLECT_101.

fill (ColorType | None): Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default is 0.

fill_mask (ColorType | None): Similar to fill but used for padding masks. Default is 0.

p (float): Probability of applying the transform. Default is 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • Either min_height or pad_height_divisor must be set, but not both.
  • Either min_width or pad_width_divisor must be set, but not both.
  • If border_mode is set to cv2.BORDER_CONSTANT, fill must be provided.
  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).
  • For bounding boxes, the coordinates will be adjusted to account for the padding.
  • For keypoints, their positions will be shifted according to the padding.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> padded_image = transformed['image']\n>>> padded_mask = transformed['mask']\n>>> adjusted_bboxes = transformed['bboxes']\n>>> adjusted_keypoints = transformed['keypoints']\n
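
The example above covers the min_height/min_width mode. As a complement, here is a small, hedged sketch of the divisor mode described in the parameters, assuming the same API; the exact amount of padding depends on the input size.

Python
>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (250, 310, 3), dtype=np.uint8)
>>> # Pad so that height and width become divisible by 32 (common for CNN backbones).
>>> # Note: min_height/min_width default to 1024, so they must be set to None here,
>>> # because only one of each min_*/pad_*_divisor pair may be provided.
>>> transform = A.Compose([
...     A.PadIfNeeded(
...         min_height=None,
...         min_width=None,
...         pad_height_divisor=32,
...         pad_width_divisor=32,
...         border_mode=cv2.BORDER_CONSTANT,
...         fill=0,
...     ),
... ])
>>> padded = transform(image=image)['image']
>>> assert padded.shape[0] % 32 == 0 and padded.shape[1] % 32 == 0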

Source code in albumentations/augmentations/geometric/transforms.py Python
class PadIfNeeded(Pad):\n    \"\"\"Pads the sides of an image if the image dimensions are less than the specified minimum dimensions.\n    If the `pad_height_divisor` or `pad_width_divisor` is specified, the function additionally ensures\n    that the image dimensions are divisible by these values.\n\n    Args:\n        min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value.\n            If not specified, pad_height_divisor must be provided.\n        min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value.\n            If not specified, pad_width_divisor must be provided.\n        pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value.\n            If not specified, min_height must be provided.\n        pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value.\n            If not specified, min_width must be provided.\n        position (Literal[\"center\", \"top_left\", \"top_right\", \"bottom_left\", \"bottom_right\", \"random\"]):\n            Position where the image is to be placed after padding. Default is 'center'.\n        border_mode (int): Specifies the border mode to use if padding is required.\n            The default is `cv2.BORDER_REFLECT_101`.\n        fill (ColorType | None): Value to fill the border pixels if the border mode is `cv2.BORDER_CONSTANT`.\n            Default is None.\n        fill_mask (ColorType | None): Similar to `fill` but used for padding masks. Default is None.\n        p (float): Probability of applying the transform. Default is 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Either `min_height` or `pad_height_divisor` must be set, but not both.\n        - Either `min_width` or `pad_width_divisor` must be set, but not both.\n        - If `border_mode` is set to `cv2.BORDER_CONSTANT`, `value` must be provided.\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).\n        - For bounding boxes, the coordinates will be adjusted to account for the padding.\n        - For keypoints, their positions will be shifted according to the padding.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> padded_image = transformed['image']\n        >>> padded_mask = transformed['mask']\n        >>> adjusted_bboxes = transformed['bboxes']\n        >>> adjusted_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        min_height: int | None = Field(ge=1)\n        min_width: int | None = Field(ge=1)\n        pad_height_divisor: int | None = Field(ge=1)\n        pad_width_divisor: int | None = Field(ge=1)\n        position: PositionType\n        border_mode: BorderModeType\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        @model_validator(mode=\"after\")\n        def validate_divisibility(self) -> Self:\n            if (self.min_height is None) == (self.pad_height_divisor is None):\n                msg = \"Only one of 'min_height' and 'pad_height_divisor' parameters must be set\"\n                raise ValueError(msg)\n            if (self.min_width is None) == (self.pad_width_divisor is None):\n                msg = \"Only one of 'min_width' and 'pad_width_divisor' parameters must be set\"\n                raise ValueError(msg)\n\n            if self.border_mode == cv2.BORDER_CONSTANT and self.fill is None:\n                msg = \"If 'border_mode' is set to 'BORDER_CONSTANT', 'fill' must be provided.\"\n                raise ValueError(msg)\n\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n\n            return self\n\n    def __init__(\n        self,\n        min_height: int | None = 1024,\n        min_width: int | None = 1024,\n        pad_height_divisor: int | None = None,\n        pad_width_divisor: int | None = None,\n        position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        # Initialize with dummy padding that will be calculated later\n        super().__init__(\n            padding=0,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            p=p,\n        )\n        self.min_height = min_height\n        self.min_width = min_width\n        self.pad_height_divisor = pad_height_divisor\n        self.pad_width_divisor = pad_width_divisor\n        self.position = position\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=params[\"shape\"][:2],\n            min_height=self.min_height,\n            min_width=self.min_width,\n            pad_height_divisor=self.pad_height_divisor,\n            pad_width_divisor=self.pad_width_divisor,\n        )\n\n        h_pad_top, h_pad_bottom, 
w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_height\",\n            \"min_width\",\n            \"pad_height_divisor\",\n            \"pad_width_divisor\",\n            \"position\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Perspective","title":"class Perspective (scale=(0.05, 0.1), keep_size=True, pad_mode=None, pad_val=None, mask_pad_val=None, fit_output=False, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random four point perspective transformation to the input.

Parameters:

scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Default: (0.05, 0.1).

keep_size (bool): Whether to resize image back to its original size after applying the perspective transform. If set to False, the resulting images may end up having different shapes. Default: True.

border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fit_output (bool): If True, the image plane size and position will be adjusted to still capture the whole image after perspective transformation. This is followed by image resizing if keep_size is set to True. If False, parts of the transformed image may be outside of the image plane. This setting should not be set to True when using large scale values as it could lead to very large images. Default: False.

interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR

mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

This transformation creates a perspective effect by randomly moving the four corners of the image. The amount of movement is controlled by the 'scale' parameter.

When 'keep_size' is True, the output image will have the same size as the input image, which may cause some parts of the transformed image to be cut off or padded.

When 'fit_output' is True, the transformation ensures that the entire transformed image is visible, which may result in a larger output image if keep_size is False.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n... ])\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n
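
The snippet above keeps the original size. As a hedged illustration of the fit_output note, the following sketch (assuming the behavior described above) lets the output canvas grow so the whole warped image stays visible:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # fit_output=True expands the image plane; keep_size=False keeps that larger canvas.
>>> transform = A.Compose([
...     A.Perspective(scale=(0.05, 0.1), fit_output=True, keep_size=False, p=1.0),
... ])
>>> warped = transform(image=image)['image']
>>> # The warped image may be larger than 100x100 here, as described in the note above.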

Source code in albumentations/augmentations/geometric/transforms.py Python
class Perspective(DualTransform):\n    \"\"\"Apply random four point perspective transformation to the input.\n\n    Args:\n        scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Default: (0.05, 0.1).\n        keep_size (bool): Whether to resize image back to its original size after applying the perspective transform.\n            If set to False, the resulting images may end up having different shapes.\n            Default: True.\n        border_mode (OpenCV flag): OpenCV border mode used for padding.\n            Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n        fill_mask (ColorType): Padding value for mask if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fit_output (bool): If True, the image plane size and position will be adjusted to still capture\n            the whole image after perspective transformation. This is followed by image resizing if keep_size is set\n            to True. If False, parts of the transformed image may be outside of the image plane.\n            This setting should not be set to True when using large scale values as it could lead to very large images.\n            Default: False.\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transformation creates a perspective effect by randomly moving the four corners of the image.\n        The amount of movement is controlled by the 'scale' parameter.\n\n        When 'keep_size' is True, the output image will have the same size as the input image,\n        which may cause some parts of the transformed image to be cut off or padded.\n\n        When 'fit_output' is True, the transformation ensures that the entire transformed image is visible,\n        which may result in a larger output image if keep_size is False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n        ... 
])\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        keep_size: bool\n        pad_mode: BorderModeType | None\n        pad_val: ColorType | None\n        mask_pad_val: ColorType | None\n        fit_output: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def validate_deprecated_fields(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_val is not None:\n                warn(\"pad_val is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_val\n            if self.mask_pad_val is not None:\n                warn(\"mask_pad_val is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_pad_val\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.05, 0.1),\n        keep_size: bool = True,\n        pad_mode: int | None = None,\n        pad_val: ColorType | None = None,\n        mask_pad_val: ColorType | None = None,\n        fit_output: bool = False,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply=always_apply)\n        self.scale = cast(tuple[float, float], scale)\n        self.keep_size = keep_size\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.fit_output = fit_output\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            img,\n            matrix,\n            max_width,\n            max_height,\n            self.fill,\n            self.border_mode,\n            self.keep_size,\n            self.interpolation,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            mask,\n            matrix,\n            max_width,\n            max_height,\n            self.fill_mask,\n            self.border_mode,\n            self.keep_size,\n            self.mask_interpolation,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        matrix_bbox: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_bboxes(\n            bboxes,\n            params[\"shape\"],\n            matrix_bbox,\n            max_width,\n            max_height,\n            
self.keep_size,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_keypoints(\n            keypoints,\n            params[\"shape\"],\n            matrix,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        scale = self.py_random.uniform(*self.scale)\n\n        points = fgeometric.generate_perspective_points(\n            image_shape,\n            scale,\n            self.random_generator,\n        )\n        points = fgeometric.order_points(points)\n\n        matrix, max_width, max_height = fgeometric.compute_perspective_params(\n            points,\n            image_shape,\n        )\n\n        if self.fit_output:\n            matrix, max_width, max_height = fgeometric.expand_transform(\n                matrix,\n                image_shape,\n            )\n\n        return {\n            \"matrix\": matrix,\n            \"max_height\": max_height,\n            \"max_width\": max_width,\n            \"matrix_bbox\": matrix,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"keep_size\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"fit_output\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.PiecewiseAffine","title":"class PiecewiseAffine (scale=(0.03, 0.05), nb_rows=(4, 4), nb_cols=(4, 4), interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, absolute_scale=False, p=0.5, always_apply=None, keypoints_threshold=0.01) [view source on GitHub]","text":"

Apply piecewise affine transformations to the input image.

This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points around via affine transformations. This leads to local distortions in the image.

Parameters:

scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Recommended values are in the range (0.01, 0.05) for small distortions, and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).

nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of rows. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of columns. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute pixel value. If set to False, it will be treated as a fraction of the image height and width. Default: False.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This augmentation is very slow. Consider using ElasticTransform instead, which is at least 10x faster.
  • The augmentation may not always produce visible effects, especially with small scale values.
  • For keypoints and bounding boxes, the transformation might move them outside the image boundaries. In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed[\"image\"]\n

Source code in albumentations/augmentations/geometric/transforms.py Python
class PiecewiseAffine(BaseDistortion):\n    \"\"\"Apply piecewise affine transformations to the input image.\n\n    This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points\n    around via affine transformations. This leads to local distortions in the image.\n\n    Args:\n        scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Recommended values are in the range (0.01, 0.05) for small distortions,\n            and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).\n        nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of rows.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of columns.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute\n            pixel value. If set to False, it will be treated as a fraction of the image height and width.\n            Default: False.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This augmentation is very slow. Consider using `ElasticTransform` instead, which is at least 10x faster.\n        - The augmentation may not always produce visible effects, especially with small scale values.\n        - For keypoints and bounding boxes, the transformation might move them outside the image boundaries.\n          In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed[\"image\"]\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        nb_rows: ScaleIntType\n        nb_cols: ScaleIntType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        cval: int | None = Field(deprecated=\"Deprecated. Does not have any effect.\")\n        cval_mask: int | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = Field(\n            deprecated=\"Deprecated. Does not have any effects.\",\n        )\n\n        absolute_scale: bool\n        keypoints_threshold: float = Field(\n            deprecated=\"This parameter is not used anymore\",\n        )\n\n        @field_validator(\"nb_rows\", \"nb_cols\")\n        @classmethod\n        def process_range(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 2, BIG_INTEGER\n            result = to_tuple(value, value)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.03, 0.05),\n        nb_rows: ScaleIntType = (4, 4),\n        nb_cols: ScaleIntType = (4, 4),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: int | None = None,\n        cval_mask: int | None = None,\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = None,\n        absolute_scale: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n        keypoints_threshold: float = 0.01,\n    ):\n        super().__init__(\n            p=p,\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n        )\n\n        warn(\n            \"This augmenter is very slow. Try to use ``ElasticTransform`` instead, which is at least 10x faster.\",\n            stacklevel=2,\n        )\n\n        self.scale = cast(tuple[float, float], scale)\n        self.nb_rows = cast(tuple[int, int], nb_rows)\n        self.nb_cols = cast(tuple[int, int], nb_cols)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.absolute_scale = absolute_scale\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"nb_rows\",\n            \"nb_cols\",\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"absolute_scale\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        nb_rows = np.clip(self.py_random.randint(*self.nb_rows), 2, None)\n        nb_cols = np.clip(self.py_random.randint(*self.nb_cols), 2, None)\n        scale = self.py_random.uniform(*self.scale)\n\n        map_x, map_y = fgeometric.create_piecewise_affine_maps(\n            image_shape=image_shape,\n            grid=(nb_rows, nb_cols),\n            scale=scale,\n            absolute_scale=self.absolute_scale,\n            random_generator=self.random_generator,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.RandomGridShuffle","title":"class RandomGridShuffle (grid=(3, 3), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shuffles the grid's cells on an image, mask, or keypoints, effectively rearranging patches within the image. This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.

Parameters:

grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly. For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled. Default: (3, 3)

p (float): Probability that the transform will be applied. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This transform maintains consistency across all targets. If applied to an image and its corresponding mask or keypoints, the same shuffling will be applied to all.
  • The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2)) for the transform to have any effect.
  • Keypoints are moved along with their corresponding grid cell.
  • This transform could be useful when only micro features are important for the model, and memorizing the global structure could be harmful. For example:
  • Identifying the type of cell phone used to take a picture based on micro artifacts generated by phone post-processing algorithms, rather than the semantic features of the photo. See more at https://ieeexplore.ieee.org/abstract/document/8622031
  • Identifying stress, glucose, hydration levels based on skin images.

Mathematical Formulation:
  1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.
  2. A random permutation P of integers from 0 to (m*n - 1) is generated.
  3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.
  4. The cells are then rearranged according to the permutation P.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4]\n... ])\n>>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image might look like this (one possible outcome):\n# [[4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1]]\n
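
To illustrate the note about consistency across targets, here is a small, hedged sketch that shuffles an image together with its mask, assuming the standard Compose targets shown elsewhere on this page:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (6, 6, 3), dtype=np.uint8)
>>> mask = np.arange(36, dtype=np.uint8).reshape(6, 6)
>>> transform = A.Compose([A.RandomGridShuffle(grid=(2, 2), p=1.0)])
>>> out = transform(image=image, mask=mask)
>>> shuffled_image, shuffled_mask = out['image'], out['mask']
>>> # The same cell permutation is applied to both targets, so image/mask alignment is preserved.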

Source code in albumentations/augmentations/geometric/transforms.py Python
class RandomGridShuffle(DualTransform):\n    \"\"\"Randomly shuffles the grid's cells on an image, mask, or keypoints,\n    effectively rearranging patches within the image.\n    This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.\n\n    Args:\n        grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly.\n            For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled.\n            Default: (3, 3)\n        p (float): Probability that the transform will be applied. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform maintains consistency across all targets. If applied to an image and its corresponding\n          mask or keypoints, the same shuffling will be applied to all.\n        - The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2))\n          for the transform to have any effect.\n        - Keypoints are moved along with their corresponding grid cell.\n        - This transform could be useful when only micro features are important for the model, and memorizing\n          the global structure could be harmful. For example:\n          - Identifying the type of cell phone used to take a picture based on micro artifacts generated by\n            phone post-processing algorithms, rather than the semantic features of the photo.\n            See more at https://ieeexplore.ieee.org/abstract/document/8622031\n          - Identifying stress, glucose, hydration levels based on skin images.\n\n    Mathematical Formulation:\n        1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.\n        2. A random permutation P of integers from 0 to (m*n - 1) is generated.\n        3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.\n        4. The cells are then rearranged according to the permutation P.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4]\n        ... 
])\n        >>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image might look like this (one possible outcome):\n        # [[4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1]]\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        grid: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        grid: tuple[int, int] = (3, 3),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.grid = grid\n\n    def apply(\n        self,\n        img: np.ndarray,\n        tiles: np.ndarray,\n        mapping: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_image(img, tiles, mapping)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n        bboxes_returned = fgeometric.bboxes_grid_shuffle(\n            bboxes_denorm,\n            tiles,\n            mapping,\n            image_shape,\n            min_area=processor.params.min_area,\n            min_visibility=processor.params.min_visibility,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_keypoints(keypoints, tiles, mapping)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        original_tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.grid,\n            self.random_generator,\n        )\n        shape_groups = fgeometric.create_shape_groups(original_tiles)\n        mapping = fgeometric.shuffle_tiles_within_shape_groups(\n            shape_groups,\n            self.random_generator,\n        )\n\n        return {\"tiles\": original_tiles, \"mapping\": mapping}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"grid\",)\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ShiftScaleRotate","title":"class ShiftScaleRotate (shift_limit=(-0.0625, 0.0625), scale_limit=(-0.1, 0.1), rotate_limit=(-45, 45), interpolation=1, border_mode=4, value=None, mask_value=None, shift_limit_x=None, shift_limit_y=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:

shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).

scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

shift_limit_x ((float, float) or float): shift factor range for width. If it is set then this value instead of shift_limit will be used for shifting width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

shift_limit_y ((float, float) or float): shift factor range for height. If it is set then this value instead of shift_limit will be used for shifting height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

rotate_method (str): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\". Default: \"largest_box\"

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32
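
This entry has no usage snippet; the sketch below is a hedged example assuming the parameters documented above. Note that the source code below emits a deprecation warning recommending the Affine transform instead.

Python
>>> import cv2
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.ShiftScaleRotate(
...         shift_limit=0.0625,
...         scale_limit=0.1,
...         rotate_limit=45,
...         border_mode=cv2.BORDER_REFLECT_101,
...         p=0.5,
...     ),
... ])
>>> augmented = transform(image=image)['image']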

Source code in albumentations/augmentations/geometric/transforms.py Python
class ShiftScaleRotate(Affine):\n    \"\"\"Randomly apply affine transforms: translate, scale and rotate the input.\n\n    Args:\n        shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit\n            is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and\n            upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).\n        scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the\n            range will be (-rotate_limit, rotate_limit). Default: (-45, 45).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        shift_limit_x ((float, float) or float): shift factor range for width. If it is set then this value\n            instead of shift_limit will be used for shifting width.  If shift_limit_x is a single float value,\n            the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in\n            the range [-1, 1]. Default: None.\n        shift_limit_y ((float, float) or float): shift factor range for height. If it is set then this value\n            instead of shift_limit will be used for shifting height.  If shift_limit_y is a single float value,\n            the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie\n            in the range [-, 1]. Default: None.\n        rotate_method (str): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\".\n            Default: \"largest_box\"\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        shift_limit: SymmetricRangeType\n        scale_limit: SymmetricRangeType\n        rotate_limit: SymmetricRangeType\n        interpolation: InterpolationType\n        border_mode: BorderModeType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType = 0\n        fill_mask: ColorType = 0\n\n        shift_limit_x: ScaleFloatType | None\n        shift_limit_y: ScaleFloatType | None\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def check_shift_limit(self) -> Self:\n            bounds = -1, 1\n            self.shift_limit_x = to_tuple(\n                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_x, *bounds, \"shift_limit_x\")\n            self.shift_limit_y = to_tuple(\n                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_y, *bounds, \"shift_limit_y\")\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> ScaleFloatType:\n            bounds = 0, float(\"inf\")\n            result = to_tuple(value, bias=1.0)\n            check_range(result, *bounds, str(info.field_name))\n            return result\n\n    def __init__(\n        self,\n        shift_limit: ScaleFloatType = (-0.0625, 0.0625),\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        rotate_limit: ScaleFloatType = (-45, 45),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        shift_limit_x: ScaleFloatType | None = None,\n        shift_limit_y: ScaleFloatType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: InterpolationType = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        shift_limit_x = cast(tuple[float, float], shift_limit_x)\n        shift_limit_y = cast(tuple[float, float], shift_limit_y)\n        super().__init__(\n            scale=scale_limit,\n            translate_percent={\"x\": shift_limit_x, \"y\": shift_limit_y},\n            rotate=rotate_limit,\n            shear=(0, 0),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            fit_output=False,\n            keep_ratio=False,\n            rotate_method=rotate_method,\n      
      always_apply=always_apply,\n            p=p,\n        )\n        warn(\n            \"ShiftScaleRotate is deprecated. Please use Affine transform instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        self.shift_limit_x = shift_limit_x\n        self.shift_limit_y = shift_limit_y\n\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"shift_limit_x\": self.shift_limit_x,\n            \"shift_limit_y\": self.shift_limit_y,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n            \"rotate_limit\": self.rotate_limit,\n            \"interpolation\": self.interpolation,\n            \"border_mode\": self.border_mode,\n            \"fill\": self.fill,\n            \"fill_mask\": self.fill_mask,\n            \"rotate_method\": self.rotate_method,\n            \"mask_interpolation\": self.mask_interpolation,\n        }\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.ThinPlateSpline","title":"class ThinPlateSpline (scale_range=(0.2, 0.4), num_control_points=4, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.

Imagine the image printed on a thin metal plate that can be bent and warped smoothly:

  • Control points act like pins pushing or pulling the plate
  • The plate resists sharp bending, creating smooth deformations
  • The transformation maintains continuity (no tears or folds)
  • Areas between control points are interpolated naturally

The transform works by:

  1. Creating a regular grid of control points (like pins in the plate)
  2. Randomly displacing these points (like pushing/pulling the pins)
  3. Computing a smooth interpolation (like the plate bending)
  4. Applying the resulting deformation to the image

Parameters:

Name Type Description scale_range tuple[float, float]

Range for random displacement of control points. Values should be in [0.0, 1.0]:

  • 0.0: No displacement (identity transform)
  • 0.1: Subtle warping
  • 0.2-0.4: Moderate deformation (recommended range)
  • 0.5+: Strong warping

Default: (0.2, 0.4)

num_control_points int

Number of control points per side. Creates a grid of num_control_points x num_control_points points.

  • 2: Minimal deformation (affine-like)
  • 3-4: Moderate flexibility (recommended)
  • 5+: More local deformation control

Must be >= 2. Default: 4

interpolation int

OpenCV interpolation flag. Used for image sampling. See also: cv2.INTER_* Default: cv2.INTER_LINEAR

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • The transformation preserves smoothness and continuity
  • Stronger scale values may create more extreme deformations
  • Higher number of control points allows more local deformations
  • The same deformation is applied consistently to all targets

Examples:

Python
>>> import albumentations as A\n>>> # Basic usage\n>>> transform = A.ThinPlateSpline()\n>>>\n>>> # Subtle deformation\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.1, 0.2),\n...     num_control_points=3\n... )\n>>>\n>>> # Strong warping with fine control\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.3, 0.5),\n...     num_control_points=5,\n... )\n

References

  • \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\" by F.L. Bookstein https://doi.org/10.1109/34.24792

  • Thin Plate Splines in Computer Vision: https://en.wikipedia.org/wiki/Thin_plate_spline

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline

See Also:

  • ElasticTransform: For a different type of non-rigid deformation
  • GridDistortion: For grid-based warping
  • OpticalDistortion: For lens-like distortions

Source code in albumentations/augmentations/geometric/transforms.py Python
class ThinPlateSpline(BaseDistortion):\n    r\"\"\"Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.\n\n    Imagine the image printed on a thin metal plate that can be bent and warped smoothly:\n    - Control points act like pins pushing or pulling the plate\n    - The plate resists sharp bending, creating smooth deformations\n    - The transformation maintains continuity (no tears or folds)\n    - Areas between control points are interpolated naturally\n\n    The transform works by:\n    1. Creating a regular grid of control points (like pins in the plate)\n    2. Randomly displacing these points (like pushing/pulling the pins)\n    3. Computing a smooth interpolation (like the plate bending)\n    4. Applying the resulting deformation to the image\n\n\n    Args:\n        scale_range (tuple[float, float]): Range for random displacement of control points.\n            Values should be in [0.0, 1.0]:\n            - 0.0: No displacement (identity transform)\n            - 0.1: Subtle warping\n            - 0.2-0.4: Moderate deformation (recommended range)\n            - 0.5+: Strong warping\n            Default: (0.2, 0.4)\n\n        num_control_points (int): Number of control points per side.\n            Creates a grid of num_control_points x num_control_points points.\n            - 2: Minimal deformation (affine-like)\n            - 3-4: Moderate flexibility (recommended)\n            - 5+: More local deformation control\n            Must be >= 2. Default: 4\n\n        interpolation (int): OpenCV interpolation flag. Used for image sampling.\n            See also: cv2.INTER_*\n            Default: cv2.INTER_LINEAR\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transformation preserves smoothness and continuity\n        - Stronger scale values may create more extreme deformations\n        - Higher number of control points allows more local deformations\n        - The same deformation is applied consistently to all targets\n\n    Example:\n        >>> import albumentations as A\n        >>> # Basic usage\n        >>> transform = A.ThinPlateSpline()\n        >>>\n        >>> # Subtle deformation\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.1, 0.2),\n        ...     num_control_points=3\n        ... )\n        >>>\n        >>> # Strong warping with fine control\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.3, 0.5),\n        ...     num_control_points=5,\n        ... )\n\n    References:\n        - \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\"\n          by F.L. 
Bookstein\n          https://doi.org/10.1109/34.24792\n\n        - Thin Plate Splines in Computer Vision:\n          https://en.wikipedia.org/wiki/Thin_plate_spline\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline\n\n    See Also:\n        - ElasticTransform: For different type of non-rigid deformation\n        - GridDistortion: For grid-based warping\n        - OpticalDistortion: For lens-like distortions\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        scale_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        num_control_points: int = Field(ge=2)\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.2, 0.4),\n        num_control_points: int = 4,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale_range = scale_range\n        self.num_control_points = num_control_points\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        # Create regular grid of control points\n        grid_size = self.num_control_points\n        x = np.linspace(0, 1, grid_size)\n        y = np.linspace(0, 1, grid_size)\n        src_points = np.stack(np.meshgrid(x, y), axis=-1).reshape(-1, 2)\n\n        # Add random displacement to destination points\n        scale = self.py_random.uniform(*self.scale_range) / 10\n        dst_points = src_points + self.random_generator.normal(\n            0,\n            scale,\n            src_points.shape,\n        )\n\n        # Compute TPS weights\n        weights, affine = fgeometric.compute_tps_weights(src_points, dst_points)\n\n        # Create grid of points\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        points = np.stack([x.flatten(), y.flatten()], axis=1).astype(np.float32)\n\n        # Transform points\n        transformed = fgeometric.tps_transform(\n            points / [width, height],\n            src_points,\n            weights,\n            affine,\n        )\n        transformed *= [width, height]\n\n        return {\n            \"map_x\": transformed[:, 0].reshape(height, width).astype(np.float32),\n            \"map_y\": transformed[:, 1].reshape(height, width).astype(np.float32),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale_range\",\n            \"num_control_points\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.Transpose","title":"class Transpose [view source on GitHub]","text":"

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max)
  3. For keypoints with coordinates (x, y): new_keypoint = (y, x)

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.Transpose(p=1.0)\n>>> result = transform(image=image)\n>>> transposed_image = result['image']\n>>> print(transposed_image)\n[[[ 1  2  3]\n  [ 7  8  9]]\n [[ 4  5  6]\n  [10 11 12]]]\n# The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n
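The Mathematical Details above also describe how bounding boxes are remapped. A minimal sketch of that behavior (illustrative, not from the original docstring; assumes pascal_voc bbox format, and exact return types may vary between library versions):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 200, 3), dtype=np.uint8)  # (H, W, C) = (100, 200, 3)
>>> transform = A.Compose(
...     [A.Transpose(p=1.0)],
...     bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
... )
>>> result = transform(image=image, bboxes=[(10, 20, 50, 80)], labels=[1])
>>> result["image"].shape   # dimensions are swapped: (200, 100, 3)
>>> result["bboxes"]        # (x_min, y_min, x_max, y_max) -> (y_min, x_min, y_max, x_max), i.e. (20, 10, 80, 50)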

Source code in albumentations/augmentations/geometric/transforms.py Python
class Transpose(DualTransform):\n    \"\"\"Transpose the input by swapping its rows and columns.\n\n    This transform flips the image over its main diagonal, effectively switching its width and height.\n    It's equivalent to a 90-degree rotation followed by a horizontal flip.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The dimensions of the output will be swapped compared to the input. For example,\n          an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).\n        - This transform is its own inverse. Applying it twice will return the original input.\n        - For multi-channel images (like RGB), the channels are preserved in their original order.\n        - Bounding boxes will have their coordinates adjusted to match the new image dimensions.\n        - Keypoints will have their x and y coordinates swapped.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (y_min, x_min, y_max, x_max)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (y, x)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.Transpose(p=1.0)\n        >>> result = transform(image=image)\n        >>> transposed_image = result['image']\n        >>> print(transposed_image)\n        [[[ 1  2  3]\n          [ 7  8  9]]\n         [[ 4  5  6]\n          [10 11 12]]]\n        # The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.transpose(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_transpose(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_transpose(keypoints)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/#albumentations.augmentations.geometric.transforms.VerticalFlip","title":"class VerticalFlip [view source on GitHub]","text":"

Flip the input vertically around the x-axis.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min)
  3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y)

where H is the height of the image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.VerticalFlip(p=1.0)\n>>> result = transform(image=image)\n>>> flipped_image = result['image']\n>>> print(flipped_image)\n[[[ 7  8  9]\n  [10 11 12]]\n [[ 1  2  3]\n  [ 4  5  6]]]\n# The original image is flipped vertically, with rows reversed\n
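A companion sketch for the keypoint formula above (illustrative, not from the original docstring; the library's exact convention at image borders may differ slightly):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((100, 200, 3), dtype=np.uint8)  # H = 100
>>> transform = A.Compose(
...     [A.VerticalFlip(p=1.0)],
...     keypoint_params=A.KeypointParams(format="xy"),
... )
>>> result = transform(image=image, keypoints=[(30, 10)])
>>> result["keypoints"]  # per the formula above, (x, y) -> (x, H - y), i.e. roughly (30, 90)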

Source code in albumentations/augmentations/geometric/transforms.py Python
class VerticalFlip(DualTransform):\n    \"\"\"Flip the input vertically around the x-axis.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform flips the image upside down. The top of the image becomes the bottom and vice versa.\n        - The dimensions of the image remain unchanged.\n        - For multi-channel images (like RGB), each channel is flipped independently.\n        - Bounding boxes are adjusted to match their new positions in the flipped image.\n        - Keypoints are moved to their new positions in the flipped image.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (x_min, H-y_max, x_max, H-y_min)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (x, H-y)\n        where H is the height of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.VerticalFlip(p=1.0)\n        >>> result = transform(image=image)\n        >>> flipped_image = result['image']\n        >>> print(flipped_image)\n        [[[ 7  8  9]\n          [10 11 12]]\n         [[ 1  2  3]\n          [ 4  5  6]]]\n        # The original image is flipped vertically, with rows reversed\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return vflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_vflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_vflip(keypoints, params[\"shape\"][0])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/","title":"Transforms (augmentations.transforms)","text":""},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.AdditiveNoise","title":"class AdditiveNoise (noise_type='uniform', spatial_mode='constant', noise_params=None, approximation=1.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random noise to image channels using various noise distributions.

This transform generates noise using different probability distributions and applies it to image channels. The noise can be generated in three spatial modes and supports multiple noise distributions, each with configurable parameters.

Parameters:

Name Type Description noise_type Literal['uniform', 'gaussian', 'laplace', 'beta']

Type of noise distribution to use. Options:

  • \"uniform\": Uniform distribution, good for simple random perturbations
  • \"gaussian\": Normal distribution, models natural random processes
  • \"laplace\": Similar to Gaussian but with heavier tails, good for outliers
  • \"beta\": Flexible bounded distribution, can be symmetric or skewed

spatial_mode Literal['constant', 'per_pixel', 'shared']

How to generate and apply the noise. Options:

  • \"constant\": One noise value per channel, fastest
  • \"per_pixel\": Independent noise value for each pixel and channel, slowest
  • \"shared\": One noise map shared across all channels, medium speed

approximation float

float in [0, 1], default=1.0. Controls noise generation speed vs quality tradeoff:

  • 1.0: Generate full resolution noise (slowest, highest quality)
  • 0.5: Generate noise at half resolution and upsample
  • 0.25: Generate noise at quarter resolution and upsample

Only affects 'per_pixel' and 'shared' spatial modes.

noise_params dict[str, Any] | None

Parameters for the chosen noise distribution. Must match the noise_type:

uniform:
    ranges (list[tuple[float, float]]): List of (min, max) ranges for each channel. Each range must be in [-1, 1]. If only one range is provided, it will be used for all channels.

    [(-0.2, 0.2)]  # Same range for all channels\n    [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]  # Different ranges for RGB\n

gaussian:
    mean_range (tuple[float, float], default (0.0, 0.0)): Range for sampling mean value, in [-1, 1]
    std_range (tuple[float, float], default (0.1, 0.1)): Range for sampling standard deviation, in [0, 1]

laplace:
    mean_range (tuple[float, float], default (0.0, 0.0)): Range for sampling location parameter, in [-1, 1]
    scale_range (tuple[float, float], default (0.1, 0.1)): Range for sampling scale parameter, in [0, 1]

beta:
    alpha_range (tuple[float, float], default (0.5, 1.5)): Range for sampling the first shape parameter, in (0, inf). Value < 1 = U-shaped, value > 1 = bell-shaped.
    beta_range (tuple[float, float], default (0.5, 1.5)): Range for sampling the second shape parameter, in (0, inf). Value < 1 = U-shaped, value > 1 = bell-shaped.
    scale_range (tuple[float, float], default (0.1, 0.3)): Range for sampling the output scale, in [0, 1]. Smaller scale produces subtler noise.

Note

Performance considerations:

  • \"constant\" mode is fastest as it generates only C values (C = number of channels)
  • \"shared\" mode generates HxW values and reuses them for all channels
  • \"per_pixel\" mode generates HxWxC values, slowest but most flexible

Distribution characteristics:

  • uniform: Equal probability within range, good for simple perturbations
  • gaussian: Bell-shaped, symmetric, good for natural noise
  • laplace: Like gaussian but with heavier tails, good for outliers
  • beta: Very flexible shape, can be uniform, bell-shaped, or U-shaped

Implementation details:

  • All noise is generated in normalized range and scaled by image max value
  • For uint8 images, final noise range is [-255, 255]
  • For float images, final noise range is [-1, 1]

Examples:

Constant RGB shift with different ranges per channel:

Python
>>> transform = AdditiveNoise(\n...     noise_type=\"uniform\",\n...     spatial_mode=\"constant\",\n...     noise_params={\"ranges\": [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]}\n... )\n

Gaussian noise shared across channels:

Python
>>> transform = AdditiveNoise(\n...     noise_type=\"gaussian\",\n...     spatial_mode=\"shared\",\n...     noise_params={\"mean_range\": (0.0, 0.0), \"std_range\": (0.05, 0.15)}\n... )\n
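Heavier-tailed or bounded noise can be configured the same way. An illustrative sketch for beta noise, using the parameter names documented above (the concrete values are chosen only for demonstration):

Python
>>> transform = AdditiveNoise(
...     noise_type="beta",
...     spatial_mode="per_pixel",
...     approximation=0.5,  # generate noise at half resolution, then upsample
...     noise_params={
...         "alpha_range": (0.5, 1.5),
...         "beta_range": (0.5, 1.5),
...         "scale_range": (0.1, 0.3),
...     },
... )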

Source code in albumentations/augmentations/transforms.py Python
class AdditiveNoise(ImageOnlyTransform):\n    \"\"\"Apply random noise to image channels using various noise distributions.\n\n    This transform generates noise using different probability distributions and applies it\n    to image channels. The noise can be generated in three spatial modes and supports\n    multiple noise distributions, each with configurable parameters.\n\n    Args:\n        noise_type: Type of noise distribution to use. Options:\n            - \"uniform\": Uniform distribution, good for simple random perturbations\n            - \"gaussian\": Normal distribution, models natural random processes\n            - \"laplace\": Similar to Gaussian but with heavier tails, good for outliers\n            - \"beta\": Flexible bounded distribution, can be symmetric or skewed\n\n        spatial_mode: How to generate and apply the noise. Options:\n            - \"constant\": One noise value per channel, fastest\n            - \"per_pixel\": Independent noise value for each pixel and channel, slowest\n            - \"shared\": One noise map shared across all channels, medium speed\n\n        approximation: float in [0, 1], default=1.0\n            Controls noise generation speed vs quality tradeoff.\n            - 1.0: Generate full resolution noise (slowest, highest quality)\n            - 0.5: Generate noise at half resolution and upsample\n            - 0.25: Generate noise at quarter resolution and upsample\n            Only affects 'per_pixel' and 'shared' spatial modes.\n\n        noise_params: Parameters for the chosen noise distribution.\n            Must match the noise_type:\n\n            uniform:\n                ranges: list[tuple[float, float]]\n                    List of (min, max) ranges for each channel.\n                    Each range must be in [-1, 1].\n                    If only one range is provided, it will be used for all channels.\n\n                    [(-0.2, 0.2)]  # Same range for all channels\n                    [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]  # Different ranges for RGB\n\n            gaussian:\n                mean_range: tuple[float, float], default (0.0, 0.0)\n                    Range for sampling mean value, in [-1, 1]\n                std_range: tuple[float, float], default (0.1, 0.1)\n                    Range for sampling standard deviation, in [0, 1]\n\n            laplace:\n                mean_range: tuple[float, float], default (0.0, 0.0)\n                    Range for sampling location parameter, in [-1, 1]\n                scale_range: tuple[float, float], default (0.1, 0.1)\n                    Range for sampling scale parameter, in [0, 1]\n\n            beta:\n                alpha_range: tuple[float, float], default (0.5, 1.5)\n                    Value < 1 = U-shaped, Value > 1 = Bell-shaped\n                    Range for sampling first shape parameter, in (0, inf)\n                beta_range: tuple[float, float], default (0.5, 1.5)\n                    Value < 1 = U-shaped, Value > 1 = Bell-shaped\n                    Range for sampling second shape parameter, in (0, inf)\n                scale_range: tuple[float, float], default (0.1, 0.3)\n                    Smaller scale for subtler noise\n                    Range for sampling output scale, in [0, 1]\n\n    Note:\n        Performance considerations:\n            - \"constant\" mode is fastest as it generates only C values (C = number of channels)\n            - \"shared\" mode generates HxW values and reuses them for all channels\n            - \"per_pixel\" mode 
generates HxWxC values, slowest but most flexible\n\n        Distribution characteristics:\n            - uniform: Equal probability within range, good for simple perturbations\n            - gaussian: Bell-shaped, symmetric, good for natural noise\n            - laplace: Like gaussian but with heavier tails, good for outliers\n            - beta: Very flexible shape, can be uniform, bell-shaped, or U-shaped\n\n        Implementation details:\n            - All noise is generated in normalized range and scaled by image max value\n            - For uint8 images, final noise range is [-255, 255]\n            - For float images, final noise range is [-1, 1]\n\n    Examples:\n        Constant RGB shift with different ranges per channel:\n        >>> transform = AdditiveNoise(\n        ...     noise_type=\"uniform\",\n        ...     spatial_mode=\"constant\",\n        ...     noise_params={\"ranges\": [(-0.2, 0.2), (-0.1, 0.1), (-0.1, 0.1)]}\n        ... )\n\n        Gaussian noise shared across channels:\n        >>> transform = AdditiveNoise(\n        ...     noise_type=\"gaussian\",\n        ...     spatial_mode=\"shared\",\n        ...     noise_params={\"mean_range\": (0.0, 0.0), \"std_range\": (0.05, 0.15)}\n        ... )\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"]\n        spatial_mode: Literal[\"constant\", \"per_pixel\", \"shared\"]\n        noise_params: dict[str, Any] | None\n        approximation: float = Field(ge=0.0, le=1.0)\n\n        @model_validator(mode=\"after\")\n        def validate_noise_params(self) -> Self:\n            # Default parameters for each noise type\n            default_params = {\n                \"uniform\": {\n                    \"ranges\": [(-0.1, 0.1)],  # Single channel by default\n                },\n                \"gaussian\": {\"mean_range\": (0.0, 0.0), \"std_range\": (0.05, 0.15)},\n                \"laplace\": {\"mean_range\": (0.0, 0.0), \"scale_range\": (0.05, 0.15)},\n                \"beta\": {\n                    \"alpha_range\": (0.5, 1.5),\n                    \"beta_range\": (0.5, 1.5),\n                    \"scale_range\": (0.1, 0.3),\n                },\n            }\n\n            # Use default params if none provided\n            params_dict = self.noise_params if self.noise_params is not None else default_params[self.noise_type]\n\n            # Convert dict to appropriate NoiseParams object\n            params_class = {\n                \"uniform\": UniformParams,\n                \"gaussian\": GaussianParams,\n                \"laplace\": LaplaceParams,\n                \"beta\": BetaParams,\n            }[self.noise_type]\n\n            # Add noise_type to params if not present\n            params_dict = {**params_dict, \"noise_type\": self.noise_type}  # type: ignore[dict-item]\n            self.noise_params = params_class(**params_dict)\n\n            return self\n\n    def __init__(\n        self,\n        noise_type: Literal[\"uniform\", \"gaussian\", \"laplace\", \"beta\"] = \"uniform\",\n        spatial_mode: Literal[\"constant\", \"per_pixel\", \"shared\"] = \"constant\",\n        noise_params: dict[str, Any] | None = None,\n        approximation: float = 1.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.noise_type = noise_type\n        self.spatial_mode = spatial_mode\n        self.noise_params = noise_params\n        
self.approximation = approximation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        noise_map: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_noise(img, noise_map)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n\n        noise_map = fmain.generate_noise(\n            noise_type=self.noise_type,\n            spatial_mode=self.spatial_mode,\n            shape=image.shape,\n            params=self.noise_params,\n            max_value=max_value,\n            approximation=self.approximation,\n            random_generator=self.random_generator,\n        )\n        return {\"noise_map\": noise_map}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"noise_type\", \"spatial_mode\", \"noise_params\", \"approximation\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.AutoContrast","title":"class AutoContrast (p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random auto contrast to images.

Auto contrast enhances image contrast by stretching the intensity range to use the full range while preserving relative intensities. For each color channel:

  1. Compute histogram
  2. Find cumulative percentiles
  3. Clip and scale intensities to full range

Parameters:

Name Type Description p float

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32
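The transform takes no parameters besides p; a minimal usage sketch (illustrative, not part of the original docstring):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.AutoContrast(p=1.0)
>>> result = transform(image=image)
>>> contrasted_image = result["image"]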

Source code in albumentations/augmentations/transforms.py Python
class AutoContrast(ImageOnlyTransform):\n    \"\"\"Apply random auto contrast to images.\n\n    Auto contrast enhances image contrast by stretching the intensity range\n    to use the full range while preserving relative intensities. For each\n    color channel:\n    1. Compute histogram\n    2. Find cumulative percentiles\n    3. Clip and scale intensities to full range\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.auto_contrast(img)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.BetaParams","title":"class BetaParams ","text":"

Source code in albumentations/augmentations/transforms.py Python
class BetaParams(NoiseParamsBase):\n    noise_type: Literal[\"beta\"] = \"beta\"\n    alpha_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0)),\n    ]\n    beta_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0)),\n    ]\n    scale_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.CLAHE","title":"class CLAHE (clip_limit=4.0, tile_grid_size=(8, 8), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.

CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram equalization, which operates on the entire image, CLAHE operates on small regions (tiles) in the image. This results in a more balanced equalization, preventing over-amplification of contrast in areas with initially low contrast.

Parameters:

Name Type Description clip_limit tuple[float, float] | float

Controls the contrast enhancement limit.

  • If a single float is provided, the range will be (1, clip_limit); see the short example after this parameter list.
  • If a tuple of two floats is provided, it defines the range for random selection.

Higher values allow for more contrast enhancement, but may also increase noise. Default: (1, 4)

tile_grid_size tuple[int, int]

Defines the number of tiles in the row and column directions. Format is (rows, columns). Smaller tile sizes can lead to more localized enhancements, while larger sizes give results closer to global histogram equalization. Default: (8, 8)

p float

Probability of applying the transform. Default: 0.5
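As a short illustration of the single-float form of clip_limit described above (assumed, per that description, to expand to the range (1, clip_limit)):

Python
>>> import albumentations as A
>>> # clip_limit=4.0 is treated as the range (1, 4)
>>> transform = A.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), p=1.0)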

Notes

  • Supports only RGB or grayscale images.
  • For color images, CLAHE is applied to the L channel in the LAB color space.
  • The clip limit determines the maximum slope of the cumulative histogram. A lower clip limit will result in more contrast limiting.
  • Tile grid size affects the adaptiveness of the method. More tiles increase local adaptiveness but can lead to an unnatural look if set too high.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1, 3

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)\n>>> result = transform(image=image)\n>>> clahe_image = result[\"image\"]\n

References

  • https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html
  • Zuiderveld, Karel. \"Contrast Limited Adaptive Histogram Equalization.\" Graphic Gems IV. Academic Press Professional, Inc., 1994.

Source code in albumentations/augmentations/transforms.py Python
class CLAHE(ImageOnlyTransform):\n    \"\"\"Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to the input image.\n\n    CLAHE is an advanced method of improving the contrast in an image. Unlike regular histogram\n    equalization, which operates on the entire image, CLAHE operates on small regions (tiles)\n    in the image. This results in a more balanced equalization, preventing over-amplification\n    of contrast in areas with initially low contrast.\n\n    Args:\n        clip_limit (tuple[float, float] | float): Controls the contrast enhancement limit.\n            - If a single float is provided, the range will be (1, clip_limit).\n            - If a tuple of two floats is provided, it defines the range for random selection.\n            Higher values allow for more contrast enhancement, but may also increase noise.\n            Default: (1, 4)\n\n        tile_grid_size (tuple[int, int]): Defines the number of tiles in the row and column directions.\n            Format is (rows, columns). Smaller tile sizes can lead to more localized enhancements,\n            while larger sizes give results closer to global histogram equalization.\n            Default: (8, 8)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Notes:\n        - Supports only RGB or grayscale images.\n        - For color images, CLAHE is applied to the L channel in the LAB color space.\n        - The clip limit determines the maximum slope of the cumulative histogram. A lower\n          clip limit will result in more contrast limiting.\n        - Tile grid size affects the adaptiveness of the method. More tiles increase local\n          adaptiveness but can lead to an unnatural look if set too high.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1, 3\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.CLAHE(clip_limit=(1, 4), tile_grid_size=(8, 8), p=1.0)\n        >>> result = transform(image=image)\n        >>> clahe_image = result[\"image\"]\n\n    References:\n        - https://docs.opencv.org/master/d5/daf/tutorial_py_histogram_equalization.html\n        - Zuiderveld, Karel. \"Contrast Limited Adaptive Histogram Equalization.\"\n          Graphic Gems IV. 
Academic Press Professional, Inc., 1994.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        clip_limit: OnePlusFloatRangeType\n        tile_grid_size: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    def __init__(\n        self,\n        clip_limit: ScaleFloatType = 4.0,\n        tile_grid_size: tuple[int, int] = (8, 8),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.clip_limit = cast(tuple[float, float], clip_limit)\n        self.tile_grid_size = tile_grid_size\n\n    def apply(self, img: np.ndarray, clip_limit: float, **params: Any) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"CLAHE transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n\n        return fmain.clahe(img, clip_limit, self.tile_grid_size)\n\n    def get_params(self) -> dict[str, float]:\n        return {\"clip_limit\": self.py_random.uniform(*self.clip_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"clip_limit\", \"tile_grid_size\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ChannelShuffle","title":"class ChannelShuffle [view source on GitHub]","text":"

Randomly rearrange channels of the image.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32
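A minimal usage sketch (illustrative, not part of the original docstring):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ChannelShuffle(p=1.0)
>>> result = transform(image=image)
>>> shuffled_image = result["image"]  # same shape, channels in a random order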

Source code in albumentations/augmentations/transforms.py Python
class ChannelShuffle(ImageOnlyTransform):\n    \"\"\"Randomly rearrange channels of the image.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    def apply(\n        self,\n        img: np.ndarray,\n        channels_shuffled: tuple[int, ...],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.channel_shuffle(img, channels_shuffled)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        ch_arr = list(range(params[\"shape\"][2]))\n        self.random_generator.shuffle(ch_arr)\n        return {\"channels_shuffled\": ch_arr}\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ChromaticAberration","title":"class ChromaticAberration (primary_distortion_limit=(-0.02, 0.02), secondary_distortion_limit=(-0.05, 0.05), mode='green_purple', interpolation=1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Add lateral chromatic aberration by distorting the red and blue channels of the input image.

Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point. This transform simulates this effect by applying different radial distortions to the red and blue channels of the image, while leaving the green channel unchanged.

Parameters:

Name Type Description primary_distortion_limit tuple[float, float] | float

Range of the primary radial distortion coefficient. If a single float value is provided, the range will be (-primary_distortion_limit, primary_distortion_limit). This parameter controls the distortion in the center of the image:

  • Positive values result in pincushion distortion (edges bend inward)
  • Negative values result in barrel distortion (edges bend outward)

Default: (-0.02, 0.02).

secondary_distortion_limit tuple[float, float] | float

Range of the secondary radial distortion coefficient. If a single float value is provided, the range will be (-secondary_distortion_limit, secondary_distortion_limit). This parameter controls the distortion in the corners of the image:

  • Positive values enhance pincushion distortion
  • Negative values enhance barrel distortion

Default: (-0.05, 0.05).

mode Literal[\"green_purple\", \"red_blue\", \"random\"]

Type of color fringing to apply. Options are:

  • 'green_purple': Distorts red and blue channels in opposite directions, creating green-purple fringing.
  • 'red_blue': Distorts red and blue channels in the same direction, creating red-blue fringing.
  • 'random': Randomly chooses between 'green_purple' and 'red_blue' modes for each application.

Default: 'green_purple'.

interpolation InterpolationType

Flag specifying the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • This transform only affects RGB images. Grayscale images will raise an error.
  • The strength of the effect depends on both primary and secondary distortion limits.
  • Higher absolute values for distortion limits will result in more pronounced chromatic aberration.
  • The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> import numpy as np\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ChromaticAberration(\n...     primary_distortion_limit=0.05,\n...     secondary_distortion_limit=0.1,\n...     mode='green_purple',\n...     interpolation=cv2.INTER_LINEAR,\n...     p=1.0\n... )\n>>> transformed = transform(image=image)\n>>> aberrated_image = transformed['image']\n

References

  • https://en.wikipedia.org/wiki/Chromatic_aberration
  • https://www.researchgate.net/publication/320691320_Chromatic_Aberration_in_Digital_Images

Source code in albumentations/augmentations/transforms.py Python
class ChromaticAberration(ImageOnlyTransform):\n    \"\"\"Add lateral chromatic aberration by distorting the red and blue channels of the input image.\n\n    Chromatic aberration is an optical effect that occurs when a lens fails to focus all colors to the same point.\n    This transform simulates this effect by applying different radial distortions to the red and blue channels\n    of the image, while leaving the green channel unchanged.\n\n    Args:\n        primary_distortion_limit (tuple[float, float] | float): Range of the primary radial distortion coefficient.\n            If a single float value is provided, the range\n            will be (-primary_distortion_limit, primary_distortion_limit).\n            This parameter controls the distortion in the center of the image:\n            - Positive values result in pincushion distortion (edges bend inward)\n            - Negative values result in barrel distortion (edges bend outward)\n            Default: (-0.02, 0.02).\n\n        secondary_distortion_limit (tuple[float, float] | float): Range of the secondary radial distortion coefficient.\n            If a single float value is provided, the range\n            will be (-secondary_distortion_limit, secondary_distortion_limit).\n            This parameter controls the distortion in the corners of the image:\n            - Positive values enhance pincushion distortion\n            - Negative values enhance barrel distortion\n            Default: (-0.05, 0.05).\n\n        mode (Literal[\"green_purple\", \"red_blue\", \"random\"]): Type of color fringing to apply. Options are:\n            - 'green_purple': Distorts red and blue channels in opposite directions, creating green-purple fringing.\n            - 'red_blue': Distorts red and blue channels in the same direction, creating red-blue fringing.\n            - 'random': Randomly chooses between 'green_purple' and 'red_blue' modes for each application.\n            Default: 'green_purple'.\n\n        interpolation (InterpolationType): Flag specifying the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - This transform only affects RGB images. Grayscale images will raise an error.\n        - The strength of the effect depends on both primary and secondary distortion limits.\n        - Higher absolute values for distortion limits will result in more pronounced chromatic aberration.\n        - The 'green_purple' mode tends to produce more noticeable effects than 'red_blue'.\n\n    Example:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> transform = A.ChromaticAberration(\n        ...     primary_distortion_limit=0.05,\n        ...     secondary_distortion_limit=0.1,\n        ...     mode='green_purple',\n        ...     interpolation=cv2.INTER_LINEAR,\n        ...     p=1.0\n        ... 
)\n        >>> transformed = transform(image=image)\n        >>> aberrated_image = transformed['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Chromatic_aberration\n        - https://www.researchgate.net/publication/320691320_Chromatic_Aberration_in_Digital_Images\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        primary_distortion_limit: SymmetricRangeType\n        secondary_distortion_limit: SymmetricRangeType\n        mode: ChromaticAberrationMode\n        interpolation: InterpolationType\n\n    def __init__(\n        self,\n        primary_distortion_limit: ScaleFloatType = (-0.02, 0.02),\n        secondary_distortion_limit: ScaleFloatType = (-0.05, 0.05),\n        mode: ChromaticAberrationMode = \"green_purple\",\n        interpolation: InterpolationType = cv2.INTER_LINEAR,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.primary_distortion_limit = cast(\n            tuple[float, float],\n            primary_distortion_limit,\n        )\n        self.secondary_distortion_limit = cast(\n            tuple[float, float],\n            secondary_distortion_limit,\n        )\n        self.mode = mode\n        self.interpolation = interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        primary_distortion_red: float,\n        secondary_distortion_red: float,\n        primary_distortion_blue: float,\n        secondary_distortion_blue: float,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.chromatic_aberration(\n            img,\n            primary_distortion_red,\n            secondary_distortion_red,\n            primary_distortion_blue,\n            secondary_distortion_blue,\n            self.interpolation,\n        )\n\n    def get_params(self) -> dict[str, float]:\n        primary_distortion_red = self.py_random.uniform(*self.primary_distortion_limit)\n        secondary_distortion_red = self.py_random.uniform(\n            *self.secondary_distortion_limit,\n        )\n        primary_distortion_blue = self.py_random.uniform(*self.primary_distortion_limit)\n        secondary_distortion_blue = self.py_random.uniform(\n            *self.secondary_distortion_limit,\n        )\n\n        secondary_distortion_red = self._match_sign(\n            primary_distortion_red,\n            secondary_distortion_red,\n        )\n        secondary_distortion_blue = self._match_sign(\n            primary_distortion_blue,\n            secondary_distortion_blue,\n        )\n\n        if self.mode == \"green_purple\":\n            # distortion coefficients of the red and blue channels have the same sign\n            primary_distortion_blue = self._match_sign(\n                primary_distortion_red,\n                primary_distortion_blue,\n            )\n            secondary_distortion_blue = self._match_sign(\n                secondary_distortion_red,\n                secondary_distortion_blue,\n            )\n        if self.mode == \"red_blue\":\n            # distortion coefficients of the red and blue channels have the opposite sign\n            primary_distortion_blue = self._unmatch_sign(\n                primary_distortion_red,\n                primary_distortion_blue,\n            )\n            secondary_distortion_blue = self._unmatch_sign(\n                secondary_distortion_red,\n                secondary_distortion_blue,\n            )\n\n        return {\n            
\"primary_distortion_red\": primary_distortion_red,\n            \"secondary_distortion_red\": secondary_distortion_red,\n            \"primary_distortion_blue\": primary_distortion_blue,\n            \"secondary_distortion_blue\": secondary_distortion_blue,\n        }\n\n    @staticmethod\n    def _match_sign(a: float, b: float) -> float:\n        # Match the sign of b to a\n        if (a < 0 < b) or (a > 0 > b):\n            return -b\n        return b\n\n    @staticmethod\n    def _unmatch_sign(a: float, b: float) -> float:\n        # Unmatch the sign of b to a\n        if (a < 0 and b < 0) or (a > 0 and b > 0):\n            return -b\n        return b\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return (\n            \"primary_distortion_limit\",\n            \"secondary_distortion_limit\",\n            \"mode\",\n            \"interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ColorJitter","title":"class ColorJitter (brightness=(0.8, 1.2), contrast=(0.8, 1.2), saturation=(0.8, 1.2), hue=(-0.5, 0.5), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly changes the brightness, contrast, saturation, and hue of an image.

This transform is similar to torchvision's ColorJitter but with some differences due to the use of OpenCV instead of Pillow. The main differences are:

  1. OpenCV and Pillow use different formulas to convert images to HSV format.
  2. This implementation uses value saturation instead of uint8 overflow as in Pillow.

These differences may result in slightly different output compared to torchvision's ColorJitter.

Parameters:

Name Type Description brightness tuple[float, float] | float

How much to jitter brightness. If float: The brightness factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness]. If tuple: The brightness factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

contrast tuple[float, float] | float

How much to jitter contrast. If float: The contrast factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast]. If tuple: The contrast factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

saturation tuple[float, float] | float

How much to jitter saturation. If float: The saturation factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation]. If tuple: The saturation factor is sampled from the range specified. Should be non-negative numbers. Default: (0.8, 1.2)

hue float or tuple of float (min, max)

How much to jitter hue. If float: The hue factor is chosen uniformly from [-hue, hue]. Should have 0 <= hue <= 0.5. If tuple: The hue factor is sampled from the range specified. Values should be in range [-0.5, 0.5]. Default: (-0.5, 0.5)

p (float): Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: 1, 3

Note

  • The order of application for these color transformations is random for each image.
  • The ranges for brightness, contrast, and saturation are applied as multiplicative factors.
  • The range for hue is applied as an additive factor.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)\n>>> result = transform(image=image)\n>>> jittered_image = result['image']\n
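The same transform with explicit tuple ranges (illustrative sketch; the values match the class defaults except for a narrower hue range):

Python
>>> transform = A.ColorJitter(
...     brightness=(0.8, 1.2),
...     contrast=(0.8, 1.2),
...     saturation=(0.8, 1.2),
...     hue=(-0.1, 0.1),
...     p=1.0,
... )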

References

  • https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ColorJitter
  • https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html

Source code in albumentations/augmentations/transforms.py Python
class ColorJitter(ImageOnlyTransform):\n    \"\"\"Randomly changes the brightness, contrast, saturation, and hue of an image.\n\n    This transform is similar to torchvision's ColorJitter but with some differences due to the use of OpenCV\n    instead of Pillow. The main differences are:\n    1. OpenCV and Pillow use different formulas to convert images to HSV format.\n    2. This implementation uses value saturation instead of uint8 overflow as in Pillow.\n\n    These differences may result in slightly different output compared to torchvision's ColorJitter.\n\n    Args:\n        brightness (tuple[float, float] | float): How much to jitter brightness.\n            If float:\n                The brightness factor is chosen uniformly from [max(0, 1 - brightness), 1 + brightness].\n            If tuple:\n                The brightness factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        contrast (tuple[float, float] | float): How much to jitter contrast.\n            If float:\n                The contrast factor is chosen uniformly from [max(0, 1 - contrast), 1 + contrast].\n            If tuple:\n                The contrast factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        saturation (tuple[float, float] | float): How much to jitter saturation.\n            If float:\n                The saturation factor is chosen uniformly from [max(0, 1 - saturation), 1 + saturation].\n            If tuple:\n                The saturation factor is sampled from the range specified.\n            Should be non-negative numbers.\n            Default: (0.8, 1.2)\n\n        hue (float or tuple of float (min, max)): How much to jitter hue.\n            If float:\n                The hue factor is chosen uniformly from [-hue, hue]. Should have 0 <= hue <= 0.5.\n            If tuple:\n                The hue factor is sampled from the range specified. Values should be in range [-0.5, 0.5].\n            Default: (-0.5, 0.5)\n\n         p (float): Probability of applying the transform. 
Should be in the range [0, 1].\n            Default: 0.5\n\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1, 3\n\n    Note:\n        - The order of application for these color transformations is random for each image.\n        - The ranges for brightness, contrast, and saturation are applied as multiplicative factors.\n        - The range for hue is applied as an additive factor.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> jittered_image = result['image']\n\n    References:\n        - https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.ColorJitter\n        - https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness: ScaleFloatType\n        contrast: ScaleFloatType\n        saturation: ScaleFloatType\n        hue: ScaleFloatType\n\n        @field_validator(\"brightness\", \"contrast\", \"saturation\", \"hue\")\n        @classmethod\n        def check_ranges(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            if info.field_name == \"hue\":\n                bounds = -0.5, 0.5\n                bias = 0\n                clip = False\n            elif info.field_name in [\"brightness\", \"contrast\", \"saturation\"]:\n                bounds = 0, float(\"inf\")\n                bias = 1\n                clip = True\n\n            if isinstance(value, numbers.Number):\n                if value < 0:\n                    raise ValueError(\n                        f\"If {info.field_name} is a single number, it must be non negative.\",\n                    )\n                left = bias - value\n                if clip:\n                    left = max(left, 0)\n                value = (left, bias + value)\n            elif isinstance(value, tuple) and len(value) == PAIR:\n                check_range(value, *bounds, info.field_name)\n\n            return cast(tuple[float, float], value)\n\n    def __init__(\n        self,\n        brightness: ScaleFloatType = (0.8, 1.2),\n        contrast: ScaleFloatType = (0.8, 1.2),\n        saturation: ScaleFloatType = (0.8, 1.2),\n        hue: ScaleFloatType = (-0.5, 0.5),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.brightness = cast(tuple[float, float], brightness)\n        self.contrast = cast(tuple[float, float], contrast)\n        self.saturation = cast(tuple[float, float], saturation)\n        self.hue = cast(tuple[float, float], hue)\n\n        self.transforms = [\n            fmain.adjust_brightness_torchvision,\n            fmain.adjust_contrast_torchvision,\n            fmain.adjust_saturation_torchvision,\n            fmain.adjust_hue_torchvision,\n        ]\n\n    def get_params(self) -> dict[str, Any]:\n        brightness = self.py_random.uniform(*self.brightness)\n        contrast = self.py_random.uniform(*self.contrast)\n        saturation = self.py_random.uniform(*self.saturation)\n        hue = self.py_random.uniform(*self.hue)\n\n        order = [0, 1, 2, 3]\n        
self.random_generator.shuffle(order)\n\n        return {\n            \"brightness\": brightness,\n            \"contrast\": contrast,\n            \"saturation\": saturation,\n            \"hue\": hue,\n            \"order\": order,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        brightness: float,\n        contrast: float,\n        saturation: float,\n        hue: float,\n        order: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"ColorJitter transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n        color_transforms = [brightness, contrast, saturation, hue]\n        for i in order:\n            img = self.transforms[i](img, color_transforms[i])\n        return img\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"brightness\", \"contrast\", \"saturation\", \"hue\"\n
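As a quick check of the factor ranges described in the docstring above, a scalar jitter value expands to an interval around 1 for brightness/contrast/saturation (clipped at 0) and to an additive interval around 0 for hue. The snippet below is a minimal usage sketch with illustrative values, not the library's internal sampling code.

Python
>>> import numpy as np
>>> import albumentations as A
>>> brightness = 0.2
>>> factor_range = (max(0.0, 1 - brightness), 1 + brightness)  # scalar 0.2 -> (0.8, 1.2), multiplicative
>>> hue = 0.1
>>> hue_range = (-hue, hue)                                    # scalar 0.1 -> (-0.1, 0.1), additive
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> jittered = A.ColorJitter(brightness=brightness, hue=hue, p=1.0)(image=image)["image"]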
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Downscale","title":"class Downscale (scale_min=None, scale_max=None, interpolation=None, scale_range=(0.25, 0.25), interpolation_pair={'upscale': 0, 'downscale': 0}, always_apply=None, p=0.5) [view source on GitHub]","text":"

Decrease image quality by downscaling and upscaling back.

This transform simulates the effect of a low-resolution image by first downscaling the image to a lower resolution and then upscaling it back to its original size. This process introduces loss of detail and can be used to simulate low-quality images or to test the robustness of models to different image resolutions.

Parameters:

scale_range (tuple[float, float]): Range for the downscaling factor. Should be two float values between 0 and 1, where the first value is less than or equal to the second. The actual downscaling factor will be randomly chosen from this range for each image. Lower values result in more aggressive downscaling. Default: (0.25, 0.25)

interpolation_pair (InterpolationDict): A dictionary specifying the interpolation methods to use for downscaling and upscaling. Should contain two keys:
  • 'downscale': Interpolation method for downscaling
  • 'upscale': Interpolation method for upscaling
  Values should be OpenCV interpolation flags (e.g., cv2.INTER_NEAREST, cv2.INTER_LINEAR, etc.). Default: {'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST}

p (float): Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Note

  • The actual downscaling factor is randomly chosen for each image from the range specified in scale_range.
  • Using different interpolation methods for downscaling and upscaling can produce various effects. For example, using INTER_NEAREST for both can create a pixelated look, while using INTER_LINEAR or INTER_CUBIC can produce smoother results.
  • This transform can be useful for data augmentation, especially when training models that need to be robust to variations in image quality or resolution.

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> transform = A.Downscale(\n...     scale_range=(0.5, 0.75),\n...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_LINEAR},\n...     p=0.5\n... )\n>>> transformed = transform(image=image)\n>>> downscaled_image = transformed['image']\n
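Building on the note above about interpolation choices, here is a minimal sketch of the pixelated variant, using cv2.INTER_NEAREST for both downscaling and upscaling (parameter values chosen for illustration only):

Python
>>> import numpy as np
>>> import albumentations as A
>>> import cv2
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # NEAREST for both steps keeps hard block edges, giving a pixelated look
>>> pixelate = A.Downscale(
...     scale_range=(0.25, 0.25),
...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST},
...     p=1.0,
... )
>>> pixelated = pixelate(image=image)['image']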

Source code in albumentations/augmentations/transforms.py Python
class Downscale(ImageOnlyTransform):\n    \"\"\"Decrease image quality by downscaling and upscaling back.\n\n    This transform simulates the effect of a low-resolution image by first downscaling\n    the image to a lower resolution and then upscaling it back to its original size.\n    This process introduces loss of detail and can be used to simulate low-quality\n    images or to test the robustness of models to different image resolutions.\n\n    Args:\n        scale_range (tuple[float, float]): Range for the downscaling factor.\n            Should be two float values between 0 and 1, where the first value is less than or equal to the second.\n            The actual downscaling factor will be randomly chosen from this range for each image.\n            Lower values result in more aggressive downscaling.\n            Default: (0.25, 0.25)\n\n        interpolation_pair (InterpolationDict): A dictionary specifying the interpolation methods to use for\n            downscaling and upscaling. Should contain two keys:\n            - 'downscale': Interpolation method for downscaling\n            - 'upscale': Interpolation method for upscaling\n            Values should be OpenCV interpolation flags (e.g., cv2.INTER_NEAREST, cv2.INTER_LINEAR, etc.)\n            Default: {'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_NEAREST}\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual downscaling factor is randomly chosen for each image from the range\n          specified in scale_range.\n        - Using different interpolation methods for downscaling and upscaling can produce\n          various effects. For example, using INTER_NEAREST for both can create a pixelated look,\n          while using INTER_LINEAR or INTER_CUBIC can produce smoother results.\n        - This transform can be useful for data augmentation, especially when training models\n          that need to be robust to variations in image quality or resolution.\n\n    Example:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> transform = A.Downscale(\n        ...     scale_range=(0.5, 0.75),\n        ...     interpolation_pair={'downscale': cv2.INTER_NEAREST, 'upscale': cv2.INTER_LINEAR},\n        ...     p=0.5\n        ... )\n        >>> transformed = transform(image=image)\n        >>> downscaled_image = transformed['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_min: float | None\n        scale_max: float | None\n\n        interpolation: int | Interpolation | InterpolationDict | None = Field(\n            default_factory=lambda: Interpolation(\n                downscale=cv2.INTER_NEAREST,\n                upscale=cv2.INTER_NEAREST,\n            ),\n        )\n        interpolation_pair: InterpolationPydantic\n\n        scale_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_params(self) -> Self:\n            if self.scale_min is not None and self.scale_max is not None:\n                warn(\n                    \"scale_min and scale_max are deprecated. 
Use scale_range instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n                self.scale_range = (self.scale_min, self.scale_max)\n                self.scale_min = None\n                self.scale_max = None\n\n            if self.interpolation is not None:\n                warn(\n                    \"Downscale.interpolation is deprecated. Use Downscale.interpolation_pair instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n                if isinstance(self.interpolation, dict):\n                    self.interpolation_pair = InterpolationPydantic(\n                        **self.interpolation,\n                    )\n                elif isinstance(self.interpolation, int):\n                    self.interpolation_pair = InterpolationPydantic(\n                        upscale=self.interpolation,\n                        downscale=self.interpolation,\n                    )\n                elif isinstance(self.interpolation, Interpolation):\n                    self.interpolation_pair = InterpolationPydantic(\n                        upscale=self.interpolation.upscale,\n                        downscale=self.interpolation.downscale,\n                    )\n                self.interpolation = None\n\n            return self\n\n    def __init__(\n        self,\n        scale_min: float | None = None,\n        scale_max: float | None = None,\n        interpolation: int | Interpolation | InterpolationDict | None = None,\n        scale_range: tuple[float, float] = (0.25, 0.25),\n        interpolation_pair: InterpolationDict = InterpolationDict(\n            {\"upscale\": cv2.INTER_NEAREST, \"downscale\": cv2.INTER_NEAREST},\n        ),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_range = scale_range\n        self.interpolation_pair = interpolation_pair\n\n    def apply(self, img: np.ndarray, scale: float, **params: Any) -> np.ndarray:\n        return fmain.downscale(\n            img,\n            scale=scale,\n            down_interpolation=self.interpolation_pair[\"downscale\"],\n            up_interpolation=self.interpolation_pair[\"upscale\"],\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\"scale\": self.py_random.uniform(*self.scale_range)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"scale_range\", \"interpolation_pair\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Emboss","title":"class Emboss (alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply embossing effect to the input image.

This transform creates an emboss effect by highlighting edges and creating a 3D-like texture in the image. It works by applying a specific convolution kernel to the image that emphasizes differences in adjacent pixel values.

Parameters:

alpha (tuple[float, float]): Range to choose the visibility of the embossed image. At 0, only the original image is visible, at 1.0 only its embossed version is visible. Values should be in the range [0, 1]. Alpha will be randomly selected from this range for each image. Default: (0.2, 0.5)

strength (tuple[float, float]): Range to choose the strength of the embossing effect. Higher values create a more pronounced 3D effect. Values should be non-negative. Strength will be randomly selected from this range for each image. Default: (0.2, 0.7)

p (float): Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Note

  • The emboss effect is created using a 3x3 convolution kernel.
  • The 'alpha' parameter controls the blend between the original image and the embossed version. A higher alpha value will result in a more pronounced emboss effect.
  • The 'strength' parameter affects the intensity of the embossing. Higher strength values will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.
  • This transform can be useful for creating artistic effects or for data augmentation in tasks where edge information is important.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)\n>>> result = transform(image=image)\n>>> embossed_image = result['image']\n
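For intuition about how alpha and strength interact, the sketch below builds a 3x3 emboss kernel of the form described in the note and applies it with cv2.filter2D. It mirrors the blending idea visible in the source further down, but is only an illustrative approximation, not the exact library code path.

Python
>>> import numpy as np
>>> import cv2
>>> alpha, strength = 0.5, 0.5  # illustrative samples from the configured ranges
>>> identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)
>>> effect = np.array(
...     [[-1 - strength, -strength, 0],
...      [-strength, 1, strength],
...      [0, strength, 1 + strength]],
...     dtype=np.float32,
... )
>>> kernel = (1 - alpha) * identity + alpha * effect  # blend original vs embossed response
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> embossed = cv2.filter2D(image, -1, kernel)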

References

  • https://en.wikipedia.org/wiki/Image_embossing
  • https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing

Source code in albumentations/augmentations/transforms.py Python
class Emboss(ImageOnlyTransform):\n    \"\"\"Apply embossing effect to the input image.\n\n    This transform creates an emboss effect by highlighting edges and creating a 3D-like texture\n    in the image. It works by applying a specific convolution kernel to the image that emphasizes\n    differences in adjacent pixel values.\n\n    Args:\n        alpha (tuple[float, float]): Range to choose the visibility of the embossed image.\n            At 0, only the original image is visible, at 1.0 only its embossed version is visible.\n            Values should be in the range [0, 1].\n            Alpha will be randomly selected from this range for each image.\n            Default: (0.2, 0.5)\n\n        strength (tuple[float, float]): Range to choose the strength of the embossing effect.\n            Higher values create a more pronounced 3D effect.\n            Values should be non-negative.\n            Strength will be randomly selected from this range for each image.\n            Default: (0.2, 0.7)\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The emboss effect is created using a 3x3 convolution kernel.\n        - The 'alpha' parameter controls the blend between the original image and the embossed version.\n          A higher alpha value will result in a more pronounced emboss effect.\n        - The 'strength' parameter affects the intensity of the embossing. Higher strength values\n          will create more contrast in the embossed areas, resulting in a stronger 3D-like effect.\n        - This transform can be useful for creating artistic effects or for data augmentation\n          in tasks where edge information is important.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Emboss(alpha=(0.2, 0.5), strength=(0.2, 0.7), p=0.5)\n        >>> result = transform(image=image)\n        >>> embossed_image = result['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Image_embossing\n        - https://www.researchgate.net/publication/303412455_Application_of_Emboss_Filtering_in_Image_Processing\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        strength: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, None))]\n\n    def __init__(\n        self,\n        alpha: tuple[float, float] = (0.2, 0.5),\n        strength: tuple[float, float] = (0.2, 0.7),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n        self.strength = strength\n\n    @staticmethod\n    def __generate_emboss_matrix(\n        alpha_sample: np.ndarray,\n        strength_sample: np.ndarray,\n    ) -> np.ndarray:\n        matrix_nochange = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)\n        matrix_effect = np.array(\n            [\n                [-1 - strength_sample, 0 - strength_sample, 0],\n                [0 - strength_sample, 1, 0 + strength_sample],\n                [0, 0 + strength_sample, 1 + strength_sample],\n            ],\n            dtype=np.float32,\n        )\n        return (1 - alpha_sample) * matrix_nochange + alpha_sample * 
matrix_effect\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        alpha = self.py_random.uniform(*self.alpha)\n        strength = self.py_random.uniform(*self.strength)\n        emboss_matrix = self.__generate_emboss_matrix(\n            alpha_sample=alpha,\n            strength_sample=strength,\n        )\n        return {\"emboss_matrix\": emboss_matrix}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        emboss_matrix: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.convolve(img, emboss_matrix)\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"alpha\", \"strength\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Equalize","title":"class Equalize (mode='cv', by_channels=True, mask=None, mask_params=(), always_apply=None, p=0.5) [view source on GitHub]","text":"

Equalize the image histogram.

This transform applies histogram equalization to the input image. Histogram equalization is a method in image processing of contrast adjustment using the image's histogram.

Parameters:

mode (Literal['cv', 'pil']): Use OpenCV or Pillow equalization method. Default: 'cv'

by_channels (bool): If True, use equalization by channels separately, else convert image to YCbCr representation and use equalization by Y channel. Default: True

mask (np.ndarray, callable): If given, only the pixels selected by the mask are included in the analysis. Can be:
  • A 1-channel or 3-channel numpy array of the same size as the input image.
  • A callable (function) that generates a mask. The function should accept 'image' as its first argument, and can accept additional arguments specified in mask_params.
  Default: None

mask_params (list[str]): Additional parameters to pass to the mask function. These parameters will be taken from the data dict passed to __call__. Default: ()

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Note

  • When mode='cv', OpenCV's equalizeHist() function is used.
  • When mode='pil', Pillow's equalize() function is used.
  • The 'by_channels' parameter determines whether equalization is applied to each color channel independently (True) or to the luminance channel only (False).
  • If a mask is provided as a numpy array, it should have the same height and width as the input image.
  • If a mask is provided as a function, it allows for dynamic mask generation based on the input image and additional parameters. This is useful for scenarios where the mask depends on the image content or external data (e.g., bounding boxes, segmentation masks).

Mask Function: When mask is a callable, it should have the following signature: mask_func(image, *args) -> np.ndarray

  • image: The input image (numpy array)
  • *args: Additional arguments as specified in mask_params

The function should return a numpy array of the same height and width as the input image, where non-zero pixels indicate areas to be equalized.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>>\n>>> # Using a static mask\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> transform = A.Equalize(mask=mask, p=1.0)\n>>> result = transform(image=image)\n>>>\n>>> # Using a dynamic mask function\n>>> def mask_func(image, bboxes):\n...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)\n...     for bbox in bboxes:\n...         x1, y1, x2, y2 = map(int, bbox)\n...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes\n...     return mask\n>>>\n>>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)\n>>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes\n>>> result = transform(image=image, bboxes=bboxes)\n
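The example above focuses on masks; as a complement, here is a minimal sketch of luminance-only equalization, where by_channels=False converts the image to YCbCr and equalizes only the Y channel as described in the note:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Equalize luminance only, leaving the chroma channels untouched
>>> transform = A.Equalize(mode='cv', by_channels=False, p=1.0)
>>> equalized = transform(image=image)['image']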

References

  • OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e
  • Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize
  • Histogram Equalization: https://en.wikipedia.org/wiki/Histogram_equalization

Source code in albumentations/augmentations/transforms.py Python
class Equalize(ImageOnlyTransform):\n    \"\"\"Equalize the image histogram.\n\n    This transform applies histogram equalization to the input image. Histogram equalization\n    is a method in image processing of contrast adjustment using the image's histogram.\n\n    Args:\n        mode (Literal['cv', 'pil']): Use OpenCV or Pillow equalization method.\n            Default: 'cv'\n        by_channels (bool): If True, use equalization by channels separately,\n            else convert image to YCbCr representation and use equalization by `Y` channel.\n            Default: True\n        mask (np.ndarray, callable): If given, only the pixels selected by\n            the mask are included in the analysis. Can be:\n            - A 1-channel or 3-channel numpy array of the same size as the input image.\n            - A callable (function) that generates a mask. The function should accept 'image'\n              as its first argument, and can accept additional arguments specified in mask_params.\n            Default: None\n        mask_params (list[str]): Additional parameters to pass to the mask function.\n            These parameters will be taken from the data dict passed to __call__.\n            Default: ()\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - When mode='cv', OpenCV's equalizeHist() function is used.\n        - When mode='pil', Pillow's equalize() function is used.\n        - The 'by_channels' parameter determines whether equalization is applied to each color channel\n          independently (True) or to the luminance channel only (False).\n        - If a mask is provided as a numpy array, it should have the same height and width as the input image.\n        - If a mask is provided as a function, it allows for dynamic mask generation based on the input image\n          and additional parameters. This is useful for scenarios where the mask depends on the image content\n          or external data (e.g., bounding boxes, segmentation masks).\n\n    Mask Function:\n        When mask is a callable, it should have the following signature:\n        mask_func(image, *args) -> np.ndarray\n\n        - image: The input image (numpy array)\n        - *args: Additional arguments as specified in mask_params\n\n        The function should return a numpy array of the same height and width as the input image,\n        where non-zero pixels indicate areas to be equalized.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>>\n        >>> # Using a static mask\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> transform = A.Equalize(mask=mask, p=1.0)\n        >>> result = transform(image=image)\n        >>>\n        >>> # Using a dynamic mask function\n        >>> def mask_func(image, bboxes):\n        ...     mask = np.ones_like(image[:, :, 0], dtype=np.uint8)\n        ...     for bbox in bboxes:\n        ...         x1, y1, x2, y2 = map(int, bbox)\n        ...         mask[y1:y2, x1:x2] = 0  # Exclude areas inside bounding boxes\n        ...     
return mask\n        >>>\n        >>> transform = A.Equalize(mask=mask_func, mask_params=['bboxes'], p=1.0)\n        >>> bboxes = [(10, 10, 50, 50), (60, 60, 90, 90)]  # Example bounding boxes\n        >>> result = transform(image=image, bboxes=bboxes)\n\n    References:\n        - OpenCV equalizeHist: https://docs.opencv.org/3.4/d6/dc7/group__imgproc__hist.html#ga7e54091f0c937d49bf84152a16f76d6e\n        - Pillow ImageOps.equalize: https://pillow.readthedocs.io/en/stable/reference/ImageOps.html#PIL.ImageOps.equalize\n        - Histogram Equalization: https://en.wikipedia.org/wiki/Histogram_equalization\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: ImageMode\n        by_channels: bool\n        mask: np.ndarray | Callable[..., Any] | None\n        mask_params: Sequence[str]\n\n    def __init__(\n        self,\n        mode: ImageMode = \"cv\",\n        by_channels: bool = True,\n        mask: np.ndarray | Callable[..., Any] | None = None,\n        mask_params: Sequence[str] = (),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.mode = mode\n        self.by_channels = by_channels\n        self.mask = mask\n        self.mask_params = mask_params\n\n    def apply(self, img: np.ndarray, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.equalize(\n            img,\n            mode=self.mode,\n            by_channels=self.by_channels,\n            mask=mask,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if not callable(self.mask):\n            return {\"mask\": self.mask}\n\n        mask_params = {\"image\": data[\"image\"]}\n        for key in self.mask_params:\n            if key not in data:\n                raise KeyError(\n                    f\"Required parameter '{key}' for mask function is missing in data.\",\n                )\n            mask_params[key] = data[key]\n\n        return {\"mask\": self.mask(**mask_params)}\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [*list(self.mask_params)]\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mode\", \"by_channels\", \"mask\", \"mask_params\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.FancyPCA","title":"class FancyPCA (alpha=0.1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Fancy PCA augmentation to the input image.

This augmentation technique applies PCA (Principal Component Analysis) to the image's color channels, then adds multiples of the principal components to the image, with magnitudes proportional to the corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard deviation 'alpha'.

Parameters:

alpha (tuple[float, float] | float): Standard deviation of the Gaussian distribution used to generate random noise for each principal component. If a single float is provided, it will be used for all channels. If a tuple of two floats (min, max) is provided, the standard deviation will be uniformly sampled from this range for each run. Default: 0.1.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: any

Note

  • This augmentation is particularly effective for RGB images but can work with any number of channels.
  • For grayscale images, it applies a simplified version of the augmentation.
  • The transform preserves the mean of the image while adjusting the color/intensity variation.
  • This implementation is based on the paper by Krizhevsky et al. and is similar to the one used in the original AlexNet paper.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.FancyPCA(alpha=0.1, p=1.0)\n>>> result = transform(image=image)\n>>> augmented_image = result[\"image\"]\n
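To make the PCA description above concrete, the numpy sketch below computes the channel covariance of a normalized image, draws a per-component Gaussian alpha, and shifts every pixel along the principal components scaled by their eigenvalues. This is a conceptual illustration of the AlexNet-style color jitter, not the exact Albumentations implementation.

Python
>>> import numpy as np
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> flat = image.reshape(-1, 3).astype(np.float32) / 255.0
>>> cov = np.cov(flat - flat.mean(axis=0), rowvar=False)  # 3x3 channel covariance
>>> eigvals, eigvecs = np.linalg.eigh(cov)                # principal components of the colors
>>> alpha = np.random.normal(0.0, 0.1, size=3)            # one Gaussian draw per component
>>> shift = eigvecs @ (alpha * eigvals)                   # constant RGB offset along the components
>>> augmented = np.clip(flat + shift, 0.0, 1.0).reshape(image.shape)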

References

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
  • https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

Source code in albumentations/augmentations/transforms.py Python
class FancyPCA(ImageOnlyTransform):\n    \"\"\"Apply Fancy PCA augmentation to the input image.\n\n    This augmentation technique applies PCA (Principal Component Analysis) to the image's color channels,\n    then adds multiples of the principal components to the image, with magnitudes proportional to the\n    corresponding eigenvalues times a random variable drawn from a Gaussian with mean 0 and standard\n    deviation 'alpha'.\n\n    Args:\n        alpha (tuple[float, float] | float): Standard deviation of the Gaussian distribution used to generate\n            random noise for each principal component. If a single float is provided, it will be used for\n            all channels. If a tuple of two floats (min, max) is provided, the standard deviation will be\n            uniformly sampled from this range for each run. Default: 0.1.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        any\n\n    Note:\n        - This augmentation is particularly effective for RGB images but can work with any number of channels.\n        - For grayscale images, it applies a simplified version of the augmentation.\n        - The transform preserves the mean of the image while adjusting the color/intensity variation.\n        - This implementation is based on the paper by Krizhevsky et al. and is similar to the one used\n          in the original AlexNet paper.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.FancyPCA(alpha=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> augmented_image = result[\"image\"]\n\n    References:\n        - Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep\n          convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).\n        - https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: float = Field(ge=0)\n\n    def __init__(\n        self,\n        alpha: float = 0.1,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha_vector: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.fancy_pca(img, alpha_vector)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        shape = params[\"shape\"]\n        num_channels = shape[-1] if len(shape) == NUM_MULTI_CHANNEL_DIMENSIONS else 1\n        alpha_vector = self.random_generator.normal(0, self.alpha, num_channels).astype(\n            np.float32,\n        )\n        return {\"alpha_vector\": alpha_vector}\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"alpha\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.FromFloat","title":"class FromFloat (dtype='uint8', max_value=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

Convert an image from floating point representation to the specified data type.

This transform is designed to convert images from a normalized floating-point representation (typically with values in the range [0, 1]) to other data types, scaling the values appropriately.

Parameters:

dtype (str): The desired output data type. Supported types include 'uint8', 'uint16', 'uint32'. Default: 'uint8'.

max_value (float | None): The maximum value for the output dtype. If None, the transform will attempt to infer the maximum value based on the dtype. Default: None.

p (float): Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: float32, float64

Note

  • This is the inverse transform for ToFloat.
  • Input images are expected to be in floating point format with values in the range [0, 1].
  • For integer output types (uint8, uint16, uint32), the function will scale the values to the appropriate range (e.g., 0-255 for uint8).
  • For float output types (float32, float64), the values will remain in the [0, 1] range.
  • The transform uses the from_float function internally, which ensures output values are within the valid range for the specified dtype.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> transform = A.FromFloat(dtype='uint8', max_value=None, p=1.0)\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)  # Float image in [0, 1] range\n>>> result = transform(image=image)\n>>> uint8_image = result['image']\n>>> assert uint8_image.dtype == np.uint8\n>>> assert uint8_image.min() >= 0 and uint8_image.max() <= 255\n
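Since the note describes FromFloat as the inverse of ToFloat, a minimal round-trip sketch (using an explicit max_value of 255 for clarity) looks like this:

Python
>>> import numpy as np
>>> import albumentations as A
>>> uint8_image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> to_float = A.ToFloat(max_value=255, p=1.0)
>>> from_float = A.FromFloat(dtype='uint8', max_value=255, p=1.0)
>>> float_image = to_float(image=uint8_image)['image']   # float32 values in [0, 1]
>>> restored = from_float(image=float_image)['image']    # back to uint8
>>> assert restored.dtype == np.uint8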

Source code in albumentations/augmentations/transforms.py Python
class FromFloat(ImageOnlyTransform):\n    \"\"\"Convert an image from floating point representation to the specified data type.\n\n    This transform is designed to convert images from a normalized floating-point representation\n    (typically with values in the range [0, 1]) to other data types, scaling the values appropriately.\n\n    Args:\n        dtype (str): The desired output data type. Supported types include 'uint8', 'uint16',\n                     'uint32'. Default: 'uint8'.\n        max_value (float | None): The maximum value for the output dtype. If None, the transform\n                                  will attempt to infer the maximum value based on the dtype.\n                                  Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        float32, float64\n\n    Note:\n        - This is the inverse transform for ToFloat.\n        - Input images are expected to be in floating point format with values in the range [0, 1].\n        - For integer output types (uint8, uint16, uint32), the function will scale the values\n          to the appropriate range (e.g., 0-255 for uint8).\n        - For float output types (float32, float64), the values will remain in the [0, 1] range.\n        - The transform uses the `from_float` function internally, which ensures output values\n          are within the valid range for the specified dtype.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> transform = A.FromFloat(dtype='uint8', max_value=None, p=1.0)\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)  # Float image in [0, 1] range\n        >>> result = transform(image=image)\n        >>> uint8_image = result['image']\n        >>> assert uint8_image.dtype == np.uint8\n        >>> assert uint8_image.min() >= 0 and uint8_image.max() <= 255\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        dtype: Literal[\"uint8\", \"uint16\", \"float32\", \"float64\"]\n        max_value: float | None\n\n    def __init__(\n        self,\n        dtype: Literal[\"uint8\", \"uint16\", \"float32\", \"float64\"] = \"uint8\",\n        max_value: float | None = None,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.dtype = np.dtype(dtype)\n        self.max_value = max_value\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return from_float(img, self.dtype, self.max_value)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\"dtype\": self.dtype.name, \"max_value\": self.max_value}\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussNoise","title":"class GaussNoise (var_limit=None, mean=None, std_range=(0.2, 0.44), mean_range=(0.0, 0.0), per_channel=True, noise_scale_factor=1, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Gaussian noise to the input image.

Parameters:

std_range (tuple[float, float]): Range for noise standard deviation as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [0, 1]. Default: (0.2, 0.44).

mean_range (tuple[float, float]): Range for noise mean as a fraction of the maximum value (255 for uint8 images or 1.0 for float images). Values should be in range [-1, 1]. Default: (0.0, 0.0).

var_limit (tuple[float, float] | float): [Deprecated] Variance range for noise. If var_limit is a single float value, the range will be (0, var_limit). Default: (10.0, 50.0).

mean (float): [Deprecated] Mean of the noise. Default: 0.

per_channel (bool): If True, noise will be sampled for each channel independently. Otherwise, the noise will be sampled once for all channels. Default: True.

noise_scale_factor (float): Scaling factor for noise generation. Value should be in the range (0, 1]. When set to 1, noise is sampled for each pixel independently. If less, noise is sampled for a smaller size and resized to fit the shape of the image. Smaller values make the transform faster. Default: 1.0.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The noise parameters (std_range and mean_range) are normalized to [0, 1] range:
      • For uint8 images, they are multiplied by 255
      • For float32 images, they are used directly
  • The behavior differs between old and new parameters:
      • When using var_limit (deprecated): samples variance uniformly and takes sqrt to get std dev
      • When using std_range: samples standard deviation directly (aligned with torchvision/kornia)
  • Setting per_channel=False is faster but applies the same noise to all channels
  • The noise_scale_factor parameter allows for a trade-off between transform speed and noise granularity

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)\n>>>\n>>> # Apply Gaussian noise with normalized std_range\n>>> transform = A.GaussNoise(std_range=(0.1, 0.2), p=1.0)  # 10-20% of max value\n>>> noisy_image = transform(image=image)['image']\n>>>\n>>> # Using deprecated var_limit (will be converted to std_range)\n>>> transform = A.GaussNoise(var_limit=(50.0, 100.0), mean=10, p=1.0)\n>>> noisy_image = transform(image=image)['image']\n
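As a complement to the uint8 example above, and reflecting the note that std_range/mean_range are used directly for float inputs, here is a sketch with a float32 image that also uses noise_scale_factor to trade noise granularity for speed (values chosen for illustration only):

Python
>>> import numpy as np
>>> import albumentations as A
>>> float_image = np.random.rand(224, 224, 3).astype(np.float32)  # values already in [0, 1]
>>> transform = A.GaussNoise(
...     std_range=(0.05, 0.1),     # used as-is for float32 inputs
...     per_channel=False,         # one shared noise map for all channels (faster)
...     noise_scale_factor=0.5,    # coarser noise: sampled at a smaller size, then resized
...     p=1.0,
... )
>>> noisy = transform(image=float_image)['image']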

Source code in albumentations/augmentations/transforms.py Python
class GaussNoise(ImageOnlyTransform):\n    \"\"\"Apply Gaussian noise to the input image.\n\n    Args:\n        std_range (tuple[float, float]): Range for noise standard deviation as a fraction\n            of the maximum value (255 for uint8 images or 1.0 for float images).\n            Values should be in range [0, 1]. Default: (0.2, 0.44).\n        mean_range (tuple[float, float]): Range for noise mean as a fraction\n            of the maximum value (255 for uint8 images or 1.0 for float images).\n            Values should be in range [-1, 1]. Default: (0.0, 0.0).\n        var_limit (tuple[float, float] | float): [Deprecated] Variance range for noise.\n            If var_limit is a single float value, the range will be (0, var_limit).\n            Default: (10.0, 50.0).\n        mean (float): [Deprecated] Mean of the noise. Default: 0.\n        per_channel (bool): If True, noise will be sampled for each channel independently.\n            Otherwise, the noise will be sampled once for all channels. Default: True.\n        noise_scale_factor (float): Scaling factor for noise generation. Value should be in the range (0, 1].\n            When set to 1, noise is sampled for each pixel independently. If less, noise is sampled for a smaller size\n            and resized to fit the shape of the image. Smaller values make the transform faster. Default: 1.0.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The noise parameters (std_range and mean_range) are normalized to [0, 1] range:\n          * For uint8 images, they are multiplied by 255\n          * For float32 images, they are used directly\n        - The behavior differs between old and new parameters:\n          * When using var_limit (deprecated): samples variance uniformly and takes sqrt to get std dev\n          * When using std_range: samples standard deviation directly (aligned with torchvision/kornia)\n        - Setting per_channel=False is faster but applies the same noise to all channels\n        - The noise_scale_factor parameter allows for a trade-off between transform speed and noise granularity\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)\n        >>>\n        >>> # Apply Gaussian noise with normalized std_range\n        >>> transform = A.GaussNoise(std_range=(0.1, 0.2), p=1.0)  # 10-20% of max value\n        >>> noisy_image = transform(image=image)['image']\n        >>>\n        >>> # Using deprecated var_limit (will be converted to std_range)\n        >>> transform = A.GaussNoise(var_limit=(50.0, 100.0), mean=10, p=1.0)\n        >>> noisy_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        var_limit: ScaleFloatType | None\n        mean: float | None\n        std_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        mean_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        per_channel: bool\n        noise_scale_factor: float = Field(gt=0, le=1)\n\n        @model_validator(mode=\"after\")\n        def check_range(self) -> Self:\n            if 
self.var_limit is not None:\n                warnings.warn(\"`var_limit` deprecated. Use `std_range` instead.\", DeprecationWarning, stacklevel=2)\n                self.var_limit = to_tuple(self.var_limit, 0)\n                if self.var_limit[1] > 1:\n                    # Convert legacy uint8 variance to normalized std dev\n                    self.std_range = (math.sqrt(10 / 255), math.sqrt(50 / 255))\n                else:\n                    # Already normalized variance, convert to std dev\n                    self.std_range = (\n                        math.sqrt(self.var_limit[0]),\n                        math.sqrt(self.var_limit[1]),\n                    )\n\n            if self.mean is not None:\n                warn(\"`mean` deprecated. Use `mean_range` instead.\", DeprecationWarning, stacklevel=2)\n                if self.mean >= 1:\n                    # Convert legacy uint8 mean to normalized range\n                    self.mean_range = (self.mean / 255, self.mean / 255)\n                else:\n                    # Already normalized mean\n                    self.mean_range = (self.mean, self.mean)\n\n            return self\n\n    def __init__(\n        self,\n        var_limit: ScaleFloatType | None = None,\n        mean: float | None = None,\n        std_range: tuple[float, float] = (0.2, 0.44),  # sqrt(10 / 255), sqrt(50 / 255)\n        mean_range: tuple[float, float] = (0.0, 0.0),\n        per_channel: bool = True,\n        noise_scale_factor: float = 1,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.std_range = std_range\n        self.mean_range = mean_range\n        self.per_channel = per_channel\n        self.noise_scale_factor = noise_scale_factor\n\n        self.var_limit = var_limit\n\n    def apply(\n        self,\n        img: np.ndarray,\n        noise_map: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_noise(img, noise_map)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, float]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n\n        if self.var_limit is not None:\n            # Legacy behavior: sample variance uniformly then take sqrt\n            var = self.py_random.uniform(self.std_range[0] ** 2, self.std_range[1] ** 2)\n            sigma = math.sqrt(var)\n        else:\n            # New behavior: sample std dev directly (aligned with torchvision/kornia)\n            sigma = self.py_random.uniform(*self.std_range)\n\n        mean = self.py_random.uniform(*self.mean_range)\n\n        noise_map = fmain.generate_noise(\n            noise_type=\"gaussian\",\n            spatial_mode=\"per_pixel\" if self.per_channel else \"shared\",\n            shape=image.shape,\n            params={\"mean_range\": (mean, mean), \"std_range\": (sigma, sigma)},\n            max_value=max_value,\n            approximation=self.noise_scale_factor,\n            random_generator=self.random_generator,\n        )\n\n        return {\"noise_map\": noise_map}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"std_range\", \"mean_range\", \"per_channel\", \"noise_scale_factor\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.GaussianParams","title":"class GaussianParams ","text":"

Source code in albumentations/augmentations/transforms.py Python
class GaussianParams(NoiseParamsBase):\n    noise_type: Literal[\"gaussian\"] = \"gaussian\"\n    mean_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=-1, max_val=1)),\n    ]\n    std_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.HueSaturationValue","title":"class HueSaturationValue (hue_shift_limit=(-20, 20), sat_shift_limit=(-30, 30), val_shift_limit=(-20, 20), always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly change hue, saturation and value of the input image.

This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image. It allows for independent control over each channel, providing a wide range of color and brightness modifications.

Parameters:

hue_shift_limit (float | tuple[float, float]): Range for changing hue. If a single float value is provided, the range will be (-hue_shift_limit, hue_shift_limit). Values should be in the range [-180, 180]. Default: (-20, 20).

sat_shift_limit (float | tuple[float, float]): Range for changing saturation. If a single float value is provided, the range will be (-sat_shift_limit, sat_shift_limit). Values should be in the range [-255, 255]. Default: (-30, 30).

val_shift_limit (float | tuple[float, float]): Range for changing value (brightness). If a single float value is provided, the range will be (-val_shift_limit, val_shift_limit). Values should be in the range [-255, 255]. Default: (-20, 20).

p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The transform first converts the input RGB image to the HSV color space.
  • Each channel (Hue, Saturation, Value) is adjusted independently.
  • Hue is circular, so it wraps around at 180 degrees.
  • For float32 images, the shift values are applied as percentages of the full range.
  • This transform is particularly useful for color augmentation and simulating different lighting conditions.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.HueSaturationValue(\n...     hue_shift_limit=20,\n...     sat_shift_limit=30,\n...     val_shift_limit=20,\n...     p=0.7\n... )\n>>> result = transform(image=image)\n>>> augmented_image = result[\"image\"]\n
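Asymmetric tuples are also accepted for each limit; the sketch below only warms the hue and never desaturates (example values, not recommendations):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.HueSaturationValue(
...     hue_shift_limit=(0, 30),   # shift hue in one direction only (hue wraps at 180)
...     sat_shift_limit=(0, 40),   # only increase saturation
...     val_shift_limit=0,         # leave brightness untouched
...     p=1.0,
... )
>>> augmented = transform(image=image)['image']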

References

  • HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV

Source code in albumentations/augmentations/transforms.py Python
class HueSaturationValue(ImageOnlyTransform):\n    \"\"\"Randomly change hue, saturation and value of the input image.\n\n    This transform adjusts the HSV (Hue, Saturation, Value) channels of an input RGB image.\n    It allows for independent control over each channel, providing a wide range of color\n    and brightness modifications.\n\n    Args:\n        hue_shift_limit (float | tuple[float, float]): Range for changing hue.\n            If a single float value is provided, the range will be (-hue_shift_limit, hue_shift_limit).\n            Values should be in the range [-180, 180]. Default: (-20, 20).\n\n        sat_shift_limit (float | tuple[float, float]): Range for changing saturation.\n            If a single float value is provided, the range will be (-sat_shift_limit, sat_shift_limit).\n            Values should be in the range [-255, 255]. Default: (-30, 30).\n\n        val_shift_limit (float | tuple[float, float]): Range for changing value (brightness).\n            If a single float value is provided, the range will be (-val_shift_limit, val_shift_limit).\n            Values should be in the range [-255, 255]. Default: (-20, 20).\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The transform first converts the input RGB image to the HSV color space.\n        - Each channel (Hue, Saturation, Value) is adjusted independently.\n        - Hue is circular, so it wraps around at 180 degrees.\n        - For float32 images, the shift values are applied as percentages of the full range.\n        - This transform is particularly useful for color augmentation and simulating\n          different lighting conditions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.HueSaturationValue(\n        ...     hue_shift_limit=20,\n        ...     sat_shift_limit=30,\n        ...     val_shift_limit=20,\n        ...     p=0.7\n        ... 
)\n        >>> result = transform(image=image)\n        >>> augmented_image = result[\"image\"]\n\n    References:\n        - HSV color space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        hue_shift_limit: SymmetricRangeType\n        sat_shift_limit: SymmetricRangeType\n        val_shift_limit: SymmetricRangeType\n\n    def __init__(\n        self,\n        hue_shift_limit: ScaleFloatType = (-20, 20),\n        sat_shift_limit: ScaleFloatType = (-30, 30),\n        val_shift_limit: ScaleFloatType = (-20, 20),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.hue_shift_limit = cast(tuple[float, float], hue_shift_limit)\n        self.sat_shift_limit = cast(tuple[float, float], sat_shift_limit)\n        self.val_shift_limit = cast(tuple[float, float], val_shift_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        hue_shift: int,\n        sat_shift: int,\n        val_shift: int,\n        **params: Any,\n    ) -> np.ndarray:\n        if not is_rgb_image(img) and not is_grayscale_image(img):\n            msg = \"HueSaturationValue transformation expects 1-channel or 3-channel images.\"\n            raise TypeError(msg)\n        return fmain.shift_hsv(img, hue_shift, sat_shift, val_shift)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"hue_shift\": self.py_random.uniform(*self.hue_shift_limit),\n            \"sat_shift\": self.py_random.uniform(*self.sat_shift_limit),\n            \"val_shift\": self.py_random.uniform(*self.val_shift_limit),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"hue_shift_limit\", \"sat_shift_limit\", \"val_shift_limit\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ISONoise","title":"class ISONoise (color_shift=(0.01, 0.05), intensity=(0.1, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies camera sensor noise to the input image, simulating high ISO settings.

This transform adds random noise to an image, mimicking the effect of using high ISO settings in digital photography. It simulates two main components of ISO noise:
  1. Color noise: random shifts in color hue
  2. Luminance noise: random variations in pixel intensity

Parameters:

color_shift (tuple[float, float]): Range for changing color hue. Values should be in the range [0, 1], where 1 represents a full 360° hue rotation. Default: (0.01, 0.05)

intensity (tuple[float, float]): Range for the noise intensity. Higher values increase the strength of both color and luminance noise. Default: (0.1, 0.5)

p (float): Probability of applying the transform. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: 3

Note

  • This transform only works with RGB images. It will raise a TypeError if applied to non-RGB images.
  • The color shift is applied in the HSV color space, affecting the hue channel.
  • Luminance noise is added to all channels independently.
  • This transform can be useful for data augmentation in low-light scenarios or when training models to be robust against noisy inputs.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=0.5)\n>>> result = transform(image=image)\n>>> noisy_image = result[\"image\"]\n
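To approximate a much noisier, very-high-ISO capture than the default settings, the ranges can be pushed toward their upper ends; the values below are illustrative only:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> strong_iso = A.ISONoise(color_shift=(0.03, 0.08), intensity=(0.5, 0.9), p=1.0)
>>> very_noisy = strong_iso(image=image)['image']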

References

  • ISO noise in digital photography: https://en.wikipedia.org/wiki/Image_noise#In_digital_cameras

Source code in albumentations/augmentations/transforms.py Python
class ISONoise(ImageOnlyTransform):\n    \"\"\"Applies camera sensor noise to the input image, simulating high ISO settings.\n\n    This transform adds random noise to an image, mimicking the effect of using high ISO settings\n    in digital photography. It simulates two main components of ISO noise:\n    1. Color noise: random shifts in color hue\n    2. Luminance noise: random variations in pixel intensity\n\n    Args:\n        color_shift (tuple[float, float]): Range for changing color hue.\n            Values should be in the range [0, 1], where 1 represents a full 360\u00b0 hue rotation.\n            Default: (0.01, 0.05)\n\n        intensity (tuple[float, float]): Range for the noise intensity.\n            Higher values increase the strength of both color and luminance noise.\n            Default: (0.1, 0.5)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - This transform only works with RGB images. It will raise a TypeError if applied to\n          non-RGB images.\n        - The color shift is applied in the HSV color space, affecting the hue channel.\n        - Luminance noise is added to all channels independently.\n        - This transform can be useful for data augmentation in low-light scenarios or when\n          training models to be robust against noisy inputs.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ISONoise(color_shift=(0.01, 0.05), intensity=(0.1, 0.5), p=0.5)\n        >>> result = transform(image=image)\n        >>> noisy_image = result[\"image\"]\n\n    References:\n        - ISO noise in digital photography:\n          https://en.wikipedia.org/wiki/Image_noise#In_digital_cameras\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        color_shift: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        intensity: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n\n    def __init__(\n        self,\n        color_shift: tuple[float, float] = (0.01, 0.05),\n        intensity: tuple[float, float] = (0.1, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.intensity = intensity\n        self.color_shift = color_shift\n\n    def apply(\n        self,\n        img: np.ndarray,\n        color_shift: float,\n        intensity: float,\n        random_seed: int,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.iso_noise(\n            img,\n            color_shift,\n            intensity,\n            np.random.default_rng(random_seed),\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        random_seed = self.random_generator.integers(0, 2**32 - 1)\n        return {\n            \"color_shift\": self.py_random.uniform(*self.color_shift),\n            \"intensity\": self.py_random.uniform(*self.intensity),\n            \"random_seed\": random_seed,\n        }\n\n    def 
get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"intensity\", \"color_shift\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Illumination","title":"class Illumination (mode='linear', intensity_range=(0.01, 0.2), effect_type='both', angle_range=(0, 360), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply various illumination effects to the image.

This transform simulates different lighting conditions by applying controlled illumination patterns. It can create effects like:

  • Directional lighting (linear mode)
  • Corner shadows/highlights (corner mode)
  • Spotlights or local lighting (gaussian mode)

These effects can be used to:

  • Simulate natural lighting variations
  • Add dramatic lighting effects
  • Create synthetic shadows or highlights
  • Augment training data with different lighting conditions

Parameters:

Name Type Description mode Literal[\"linear\", \"corner\", \"gaussian\"]

Type of illumination pattern: - 'linear': Creates a smooth gradient across the image, simulating directional lighting like sunlight through a window - 'corner': Applies gradient from any corner, simulating light source from a corner - 'gaussian': Creates a circular spotlight effect, simulating local light sources Default: 'linear'

intensity_range tuple[float, float]

Range for effect strength. Values between 0.01 and 0.2: - 0.01-0.05: Subtle lighting changes - 0.05-0.1: Moderate lighting effects - 0.1-0.2: Strong lighting effects Default: (0.01, 0.2)

effect_type str

Type of lighting change: - 'brighten': Only adds light (like a spotlight) - 'darken': Only removes light (like a shadow) - 'both': Randomly chooses between brightening and darkening Default: 'both'

angle_range tuple[float, float]

Range for gradient angle in degrees. Controls direction of linear gradient: - 0\u00b0: Left to right - 90\u00b0: Top to bottom - 180\u00b0: Right to left - 270\u00b0: Bottom to top Only used for 'linear' mode. Default: (0, 360)

center_range tuple[float, float]

Range for spotlight position. Values between 0 and 1 representing relative position: - (0, 0): Top-left corner - (1, 1): Bottom-right corner - (0.5, 0.5): Center of image Only used for 'gaussian' mode. Default: (0.1, 0.9)

sigma_range tuple[float, float]

Range for spotlight size. Values between 0.2 and 1.0: - 0.2: Small, focused spotlight - 0.5: Medium-sized light area - 1.0: Broad, soft lighting Only used for 'gaussian' mode. Default: (0.2, 1.0)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import albumentations as A\n>>> # Simulate sunlight through window\n>>> transform = A.Illumination(\n...     mode='linear',\n...     intensity_range=(0.05, 0.1),\n...     effect_type='brighten',\n...     angle_range=(30, 60)\n... )\n>>>\n>>> # Create dramatic corner shadow\n>>> transform = A.Illumination(\n...     mode='corner',\n...     intensity_range=(0.1, 0.2),\n...     effect_type='darken'\n... )\n>>>\n>>> # Add multiple spotlights\n>>> transform1 = A.Illumination(\n...     mode='gaussian',\n...     intensity_range=(0.05, 0.15),\n...     effect_type='brighten',\n...     center_range=(0.2, 0.4),\n...     sigma_range=(0.2, 0.3)\n... )\n>>> transform2 = A.Illumination(\n...     mode='gaussian',\n...     intensity_range=(0.05, 0.15),\n...     effect_type='darken',\n...     center_range=(0.6, 0.8),\n...     sigma_range=(0.3, 0.5)\n... )\n>>> transforms = A.Compose([transform1, transform2])\n
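
As a usage sketch (assuming a NumPy uint8 image; the variable names below are illustrative and not part of the library), a composed pair of illumination effects like the one above can be applied as an ordinary Albumentations pipeline:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> pipeline = A.Compose([\n...     A.Illumination(mode='gaussian', intensity_range=(0.05, 0.15), effect_type='brighten', p=1.0),\n...     A.Illumination(mode='corner', intensity_range=(0.1, 0.2), effect_type='darken', p=1.0),\n... ])\n>>> lit_image = pipeline(image=image)["image"]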

References

  • Lighting in Computer Vision: https://en.wikipedia.org/wiki/Lighting_in_computer_vision

  • Image-based lighting: https://en.wikipedia.org/wiki/Image-based_lighting

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination

  • Research on lighting augmentation: \"Learning Deep Representations of Fine-grained Visual Descriptions\" https://arxiv.org/abs/1605.05395

  • Photography lighting patterns: https://en.wikipedia.org/wiki/Lighting_pattern

Note

  • The transform preserves image range and dtype
  • Effects are applied multiplicatively to preserve texture
  • Can be combined with other transforms for complex lighting scenarios
  • Useful for training models to be robust to lighting variations


Source code in albumentations/augmentations/transforms.py Python
class Illumination(ImageOnlyTransform):\n    \"\"\"Apply various illumination effects to the image.\n\n    This transform simulates different lighting conditions by applying controlled\n    illumination patterns. It can create effects like:\n    - Directional lighting (linear mode)\n    - Corner shadows/highlights (corner mode)\n    - Spotlights or local lighting (gaussian mode)\n\n    These effects can be used to:\n    - Simulate natural lighting variations\n    - Add dramatic lighting effects\n    - Create synthetic shadows or highlights\n    - Augment training data with different lighting conditions\n\n    Args:\n        mode (Literal[\"linear\", \"corner\", \"gaussian\"]): Type of illumination pattern:\n            - 'linear': Creates a smooth gradient across the image,\n                       simulating directional lighting like sunlight\n                       through a window\n            - 'corner': Applies gradient from any corner,\n                       simulating light source from a corner\n            - 'gaussian': Creates a circular spotlight effect,\n                         simulating local light sources\n            Default: 'linear'\n\n        intensity_range (tuple[float, float]): Range for effect strength.\n            Values between 0.01 and 0.2:\n            - 0.01-0.05: Subtle lighting changes\n            - 0.05-0.1: Moderate lighting effects\n            - 0.1-0.2: Strong lighting effects\n            Default: (0.01, 0.2)\n\n        effect_type (str): Type of lighting change:\n            - 'brighten': Only adds light (like a spotlight)\n            - 'darken': Only removes light (like a shadow)\n            - 'both': Randomly chooses between brightening and darkening\n            Default: 'both'\n\n        angle_range (tuple[float, float]): Range for gradient angle in degrees.\n            Controls direction of linear gradient:\n            - 0\u00b0: Left to right\n            - 90\u00b0: Top to bottom\n            - 180\u00b0: Right to left\n            - 270\u00b0: Bottom to top\n            Only used for 'linear' mode.\n            Default: (0, 360)\n\n        center_range (tuple[float, float]): Range for spotlight position.\n            Values between 0 and 1 representing relative position:\n            - (0, 0): Top-left corner\n            - (1, 1): Bottom-right corner\n            - (0.5, 0.5): Center of image\n            Only used for 'gaussian' mode.\n            Default: (0.1, 0.9)\n\n        sigma_range (tuple[float, float]): Range for spotlight size.\n            Values between 0.2 and 1.0:\n            - 0.2: Small, focused spotlight\n            - 0.5: Medium-sized light area\n            - 1.0: Broad, soft lighting\n            Only used for 'gaussian' mode.\n            Default: (0.2, 1.0)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Examples:\n        >>> import albumentations as A\n        >>> # Simulate sunlight through window\n        >>> transform = A.Illumination(\n        ...     mode='linear',\n        ...     intensity_range=(0.05, 0.1),\n        ...     effect_type='brighten',\n        ...     angle_range=(30, 60)\n        ... )\n        >>>\n        >>> # Create dramatic corner shadow\n        >>> transform = A.Illumination(\n        ...     mode='corner',\n        ...     intensity_range=(0.1, 0.2),\n        ...     effect_type='darken'\n        ... 
)\n        >>>\n        >>> # Add multiple spotlights\n        >>> transform1 = A.Illumination(\n        ...     mode='gaussian',\n        ...     intensity_range=(0.05, 0.15),\n        ...     effect_type='brighten',\n        ...     center_range=(0.2, 0.4),\n        ...     sigma_range=(0.2, 0.3)\n        ... )\n        >>> transform2 = A.Illumination(\n        ...     mode='gaussian',\n        ...     intensity_range=(0.05, 0.15),\n        ...     effect_type='darken',\n        ...     center_range=(0.6, 0.8),\n        ...     sigma_range=(0.3, 0.5)\n        ... )\n        >>> transforms = A.Compose([transform1, transform2])\n\n    References:\n        - Lighting in Computer Vision:\n          https://en.wikipedia.org/wiki/Lighting_in_computer_vision\n\n        - Image-based lighting:\n          https://en.wikipedia.org/wiki/Image-based_lighting\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#randomlinearillumination\n\n        - Research on lighting augmentation:\n          \"Learning Deep Representations of Fine-grained Visual Descriptions\"\n          https://arxiv.org/abs/1605.05395\n\n        - Photography lighting patterns:\n          https://en.wikipedia.org/wiki/Lighting_pattern\n\n    Note:\n        - The transform preserves image range and dtype\n        - Effects are applied multiplicatively to preserve texture\n        - Can be combined with other transforms for complex lighting scenarios\n        - Useful for training models to be robust to lighting variations\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: Literal[\"linear\", \"corner\", \"gaussian\"]\n        intensity_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0.01, 0.2)),\n        ]\n        effect_type: Literal[\"brighten\", \"darken\", \"both\"]\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 360)),\n        ]\n        center_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        sigma_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0.2, 1.0)),\n        ]\n\n    def __init__(\n        self,\n        mode: Literal[\"linear\", \"corner\", \"gaussian\"] = \"linear\",\n        intensity_range: tuple[float, float] = (0.01, 0.2),\n        effect_type: Literal[\"brighten\", \"darken\", \"both\"] = \"both\",\n        angle_range: tuple[float, float] = (0, 360),\n        center_range: tuple[float, float] = (0.1, 0.9),\n        sigma_range: tuple[float, float] = (0.2, 1.0),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(always_apply=always_apply, p=p)\n        self.mode = mode\n        self.intensity_range = intensity_range\n        self.effect_type = effect_type\n        self.angle_range = angle_range\n        self.center_range = center_range\n        self.sigma_range = sigma_range\n\n    def get_params(self) -> dict[str, Any]:\n        intensity = self.py_random.uniform(*self.intensity_range)\n\n        # Determine if brightening or darkening\n        sign = 1  # brighten\n        if self.effect_type == \"both\":\n            sign = 1 if self.py_random.random() > 0.5 else -1\n        elif self.effect_type == \"darken\":\n            sign = -1\n\n        intensity *= sign\n\n        if self.mode == \"linear\":\n            angle = 
self.py_random.uniform(*self.angle_range)\n            return {\n                \"intensity\": intensity,\n                \"angle\": angle,\n            }\n        if self.mode == \"corner\":\n            corner = self.py_random.randint(0, 3)  # Choose random corner\n            return {\n                \"intensity\": intensity,\n                \"corner\": corner,\n            }\n\n        x = self.py_random.uniform(*self.center_range)\n        y = self.py_random.uniform(*self.center_range)\n        sigma = self.py_random.uniform(*self.sigma_range)\n        return {\n            \"intensity\": intensity,\n            \"center\": (x, y),\n            \"sigma\": sigma,\n        }\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if self.mode == \"linear\":\n            return fmain.apply_linear_illumination(\n                img,\n                intensity=params[\"intensity\"],\n                angle=params[\"angle\"],\n            )\n        if self.mode == \"corner\":\n            return fmain.apply_corner_illumination(\n                img,\n                intensity=params[\"intensity\"],\n                corner=params[\"corner\"],\n            )\n\n        return fmain.apply_gaussian_illumination(\n            img,\n            intensity=params[\"intensity\"],\n            center=params[\"center\"],\n            sigma=params[\"sigma\"],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"mode\",\n            \"intensity_range\",\n            \"effect_type\",\n            \"angle_range\",\n            \"center_range\",\n            \"sigma_range\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ImageCompression","title":"class ImageCompression (quality_lower=None, quality_upper=None, compression_type='jpeg', quality_range=(99, 100), always_apply=None, p=0.5) [view source on GitHub]","text":"

Decrease image quality by applying JPEG or WebP compression.

This transform simulates the effect of saving an image with lower quality settings, which can introduce compression artifacts. It's useful for data augmentation and for testing model robustness against varying image qualities.

Parameters:

Name Type Description quality_range tuple[int, int]

Range for the compression quality. The values should be in [1, 100] range, where: - 1 is the lowest quality (maximum compression) - 100 is the highest quality (minimum compression) Default: (99, 100)

compression_type Literal[\"jpeg\", \"webp\"]

Type of compression to apply. - \"jpeg\": JPEG compression - \"webp\": WebP compression Default: \"jpeg\"

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform expects images with 1, 3, or 4 channels.
  • For JPEG compression, alpha channels (4th channel) will be ignored.
  • WebP compression supports transparency (4 channels).
  • The actual file is not saved to disk; the compression is simulated in memory.
  • Lower quality values result in smaller file sizes but may introduce visible artifacts.
  • This transform can be useful for:
  • Data augmentation to improve model robustness
  • Testing how models perform on images of varying quality
  • Simulating images transmitted over low-bandwidth connections

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ImageCompression(quality_range=(50, 90), compression_type=\"jpeg\", p=1.0)\n>>> result = transform(image=image)\n>>> compressed_image = result[\"image\"]\n
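
As a further sketch (assuming a 4-channel RGBA uint8 array; variable names are illustrative), WebP compression can be applied while keeping the alpha channel, per the notes above:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> rgba = np.random.randint(0, 256, (100, 100, 4), dtype=np.uint8)\n>>> transform = A.ImageCompression(quality_range=(40, 60), compression_type=\"webp\", p=1.0)\n>>> compressed_rgba = transform(image=rgba)[\"image\"]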

References

  • JPEG compression: https://en.wikipedia.org/wiki/JPEG
  • WebP compression: https://developers.google.com/speed/webp


Source code in albumentations/augmentations/transforms.py Python
class ImageCompression(ImageOnlyTransform):\n    \"\"\"Decrease image quality by applying JPEG or WebP compression.\n\n    This transform simulates the effect of saving an image with lower quality settings,\n    which can introduce compression artifacts. It's useful for data augmentation and\n    for testing model robustness against varying image qualities.\n\n    Args:\n        quality_range (tuple[int, int]): Range for the compression quality.\n            The values should be in [1, 100] range, where:\n            - 1 is the lowest quality (maximum compression)\n            - 100 is the highest quality (minimum compression)\n            Default: (99, 100)\n\n        compression_type (Literal[\"jpeg\", \"webp\"]): Type of compression to apply.\n            - \"jpeg\": JPEG compression\n            - \"webp\": WebP compression\n            Default: \"jpeg\"\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform expects images with 1, 3, or 4 channels.\n        - For JPEG compression, alpha channels (4th channel) will be ignored.\n        - WebP compression supports transparency (4 channels).\n        - The actual file is not saved to disk; the compression is simulated in memory.\n        - Lower quality values result in smaller file sizes but may introduce visible artifacts.\n        - This transform can be useful for:\n          * Data augmentation to improve model robustness\n          * Testing how models perform on images of varying quality\n          * Simulating images transmitted over low-bandwidth connections\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ImageCompression(quality_range=(50, 90), compression_type=0, p=1.0)\n        >>> result = transform(image=image)\n        >>> compressed_image = result[\"image\"]\n\n    References:\n        - JPEG compression: https://en.wikipedia.org/wiki/JPEG\n        - WebP compression: https://developers.google.com/speed/webp\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        quality_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, 100)),\n            AfterValidator(nondecreasing),\n        ]\n\n        quality_lower: int | None = Field(\n            ge=1,\n            le=100,\n        )\n        quality_upper: int | None = Field(\n            ge=1,\n            le=100,\n        )\n        compression_type: Literal[\"jpeg\", \"webp\"]\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            # Update the quality_range based on the non-None values of quality_lower and quality_upper\n            if self.quality_lower is not None or self.quality_upper is not None:\n                if self.quality_lower is not None:\n                    warn(\n                        \"`quality_lower` is deprecated. Use `quality_range` as tuple\"\n                        \" (quality_lower, quality_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.quality_upper is not None:\n                    warn(\n                        \"`quality_upper` is deprecated. 
Use `quality_range` as tuple\"\n                        \" (quality_lower, quality_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.quality_lower if self.quality_lower is not None else self.quality_range[0]\n                upper = self.quality_upper if self.quality_upper is not None else self.quality_range[1]\n                self.quality_range = (lower, upper)\n                # Clear the deprecated individual quality settings\n                self.quality_lower = None\n                self.quality_upper = None\n\n            # Validate the quality_range\n            if not (1 <= self.quality_range[0] <= MAX_JPEG_QUALITY and 1 <= self.quality_range[1] <= MAX_JPEG_QUALITY):\n                raise ValueError(\n                    f\"Quality range values should be within [1, {MAX_JPEG_QUALITY}] range.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        quality_lower: int | None = None,\n        quality_upper: int | None = None,\n        compression_type: Literal[\"jpeg\", \"webp\"] = \"jpeg\",\n        quality_range: tuple[int, int] = (99, 100),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.quality_range = quality_range\n        self.compression_type = compression_type\n\n    def apply(\n        self,\n        img: np.ndarray,\n        quality: int,\n        image_type: Literal[\".jpg\", \".webp\"],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.image_compression(img, quality, image_type)\n\n    def get_params(self) -> dict[str, int | str]:\n        if self.compression_type == \"jpeg\":\n            image_type = \".jpg\"\n        elif self.compression_type == \"webp\":\n            image_type = \".webp\"\n        else:\n            raise ValueError(f\"Unknown image compression type: {self.compression_type}\")\n\n        return {\n            \"quality\": self.py_random.randint(*self.quality_range),\n            \"image_type\": image_type,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"quality_range\", \"compression_type\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.InterpolationPydantic","title":"class InterpolationPydantic ","text":"


Source code in albumentations/augmentations/transforms.py Python
class InterpolationPydantic(BaseModel):\n    upscale: InterpolationType\n    downscale: InterpolationType\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.InvertImg","title":"class InvertImg [view source on GitHub]","text":"

Invert the input image by subtracting pixel values from the maximum value of the image dtype, i.e., 255 for uint8 and 1.0 for float32.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any
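
A minimal usage sketch (assuming a uint8 image, for which inversion is 255 minus the pixel value, as described above; variable names are illustrative):

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> inverted = A.InvertImg(p=1.0)(image=image)[\"image\"]\n>>> assert np.array_equal(inverted, 255 - image)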


Source code in albumentations/augmentations/transforms.py Python
class InvertImg(ImageOnlyTransform):\n    \"\"\"Invert the input image by subtracting pixel values from max values of the image types,\n    i.e., 255 for uint8 and 1.0 for float32.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    \"\"\"\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.invert(img)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Lambda","title":"class Lambda (image=None, mask=None, keypoints=None, bboxes=None, name=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

A flexible transformation class for applying user-defined transformation functions to each target. Each function's signature must include **kwargs to accept optional arguments such as interpolation method, image size, etc.

Parameters:

Name Type Description image Callable[..., Any] | None

Image transformation function.

mask Callable[..., Any] | None

Mask transformation function.

keypoints Callable[..., Any] | None

Keypoints transformation function.

bboxes Callable[..., Any] | None

BBoxes transformation function.

p float

probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Number of channels: Any
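
A minimal sketch (the helper functions below are hypothetical and only illustrate the required **kwargs signature and the `name` argument used for serialization):

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> def brighten(img, **kwargs):\n...     return np.clip(img.astype(np.int16) + 30, 0, 255).astype(np.uint8)\n>>> def identity_mask(mask, **kwargs):\n...     return mask\n>>> transform = A.Lambda(image=brighten, mask=identity_mask, name=\"brighten\", p=1.0)\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.zeros((100, 100), dtype=np.uint8)\n>>> result = transform(image=image, mask=mask)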


Source code in albumentations/augmentations/transforms.py Python
class Lambda(NoOp):\n    \"\"\"A flexible transformation class for using user-defined transformation functions per targets.\n    Function signature must include **kwargs to accept optional arguments like interpolation method, image size, etc:\n\n    Args:\n        image: Image transformation function.\n        mask: Mask transformation function.\n        keypoints: Keypoints transformation function.\n        bboxes: BBoxes transformation function.\n        p: probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    \"\"\"\n\n    def __init__(\n        self,\n        image: Callable[..., Any] | None = None,\n        mask: Callable[..., Any] | None = None,\n        keypoints: Callable[..., Any] | None = None,\n        bboxes: Callable[..., Any] | None = None,\n        name: str | None = None,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.name = name\n        self.custom_apply_fns = {\n            target_name: fmain.noop for target_name in (\"image\", \"mask\", \"keypoints\", \"bboxes\", \"global_label\")\n        }\n        for target_name, custom_apply_fn in {\n            \"image\": image,\n            \"mask\": mask,\n            \"keypoints\": keypoints,\n            \"bboxes\": bboxes,\n        }.items():\n            if custom_apply_fn is not None:\n                if isinstance(custom_apply_fn, LambdaType) and custom_apply_fn.__name__ == \"<lambda>\":\n                    warnings.warn(\n                        \"Using lambda is incompatible with multiprocessing. \"\n                        \"Consider using regular functions or partial().\",\n                        stacklevel=2,\n                    )\n\n                self.custom_apply_fns[target_name] = custom_apply_fn\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        fn = self.custom_apply_fns[\"image\"]\n        return fn(img, **params)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        fn = self.custom_apply_fns[\"mask\"]\n        return fn(mask, **params)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        is_ndarray = True\n\n        if not isinstance(bboxes, np.ndarray):\n            is_ndarray = False\n            bboxes = np.array(bboxes, dtype=np.float32)\n\n        fn = self.custom_apply_fns[\"bboxes\"]\n        result = fn(bboxes, **params)\n\n        if not is_ndarray:\n            return result.tolist()\n\n        return result\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        is_ndarray = True\n        if not isinstance(keypoints, np.ndarray):\n            is_ndarray = False\n            keypoints = np.array(keypoints, dtype=np.float32)\n\n        fn = self.custom_apply_fns[\"keypoints\"]\n        result = fn(keypoints, **params)\n\n        if not is_ndarray:\n            return result.tolist()\n\n        return result\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a Lambda transform serializable you should provide the `name` argument, \"\n                \"e.g. 
`Lambda(name='my_transform', image=<some func>, ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n\n    def __repr__(self) -> str:\n        state = {\"name\": self.name}\n        state.update(self.custom_apply_fns.items())  # type: ignore[arg-type]\n        state.update(self.get_base_init_args())\n        return f\"{self.__class__.__name__}({format_args(state)})\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.LaplaceParams","title":"class LaplaceParams ","text":"


Source code in albumentations/augmentations/transforms.py Python
class LaplaceParams(NoiseParamsBase):\n    noise_type: Literal[\"laplace\"] = \"laplace\"\n    mean_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=-1, max_val=1)),\n    ]\n    scale_range: Annotated[\n        Sequence[float],\n        AfterValidator(check_range_bounds(min_val=0, max_val=1)),\n    ]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Morphological","title":"class Morphological (scale=(2, 3), operation='dilation', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply a morphological operation (dilation or erosion) to an image, with particular value for enhancing document scans.

Morphological operations modify the structure of the image. Dilation expands the white (foreground) regions in a binary or grayscale image, while erosion shrinks them. These operations are beneficial in document processing, for example:

  • Dilation helps in closing up gaps within text or making thin lines thicker, enhancing legibility for OCR (Optical Character Recognition).
  • Erosion can remove small white noise and detach connected objects, making the structure of larger objects more pronounced.

Parameters:

Name Type Description scale int or tuple/list of int

Specifies the size of the structuring element (kernel) used for the operation. - If an integer is provided, a square kernel of that size will be used. - If a tuple or list is provided, it should contain two integers representing the minimum and maximum sizes for the dilation kernel.

operation Literal[\"erosion\", \"dilation\"]

The morphological operation to apply. Default is 'dilation'.

p float

The probability of applying this transformation. Default is 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

https://github.com/facebookresearch/nougat

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n>>>     A.Morphological(scale=(2, 3), operation='dilation', p=0.5)\n>>> ])\n>>> image = transform(image=image)[\"image\"]\n
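
A further self-contained sketch (assuming a synthetic uint8 image; variable names are illustrative) using the erosion operation mentioned above:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> erode = A.Compose([A.Morphological(scale=(2, 3), operation='erosion', p=1.0)])\n>>> eroded = erode(image=image)[\"image\"]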


Source code in albumentations/augmentations/transforms.py Python
class Morphological(DualTransform):\n    \"\"\"Apply a morphological operation (dilation or erosion) to an image,\n    with particular value for enhancing document scans.\n\n    Morphological operations modify the structure of the image.\n    Dilation expands the white (foreground) regions in a binary or grayscale image, while erosion shrinks them.\n    These operations are beneficial in document processing, for example:\n    - Dilation helps in closing up gaps within text or making thin lines thicker,\n        enhancing legibility for OCR (Optical Character Recognition).\n    - Erosion can remove small white noise and detach connected objects,\n        making the structure of larger objects more pronounced.\n\n    Args:\n        scale (int or tuple/list of int): Specifies the size of the structuring element (kernel) used for the operation.\n            - If an integer is provided, a square kernel of that size will be used.\n            - If a tuple or list is provided, it should contain two integers representing the minimum\n                and maximum sizes for the dilation kernel.\n        operation (Literal[\"erosion\", \"dilation\"]): The morphological operation to apply.\n            Default is 'dilation'.\n        p (float, optional): The probability of applying this transformation. Default is 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://github.com/facebookresearch/nougat\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        >>>     A.Morphological(scale=(2, 3), operation='dilation', p=0.5)\n        >>> ])\n        >>> image = transform(image=image)[\"image\"]\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: OnePlusIntRangeType\n        operation: MorphologyMode\n\n    def __init__(\n        self,\n        scale: ScaleIntType = (2, 3),\n        operation: MorphologyMode = \"dilation\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale = cast(tuple[int, int], scale)\n        self.operation = operation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.morphology(img, kernel, self.operation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"]\n\n        denormalized_boxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fmain.bboxes_morphology(\n            denormalized_boxes,\n            kernel,\n            self.operation,\n            image_shape,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        kernel: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return keypoints\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"kernel\": cv2.getStructuringElement(cv2.MORPH_ELLIPSE, self.scale),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"scale\", \"operation\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.MultiplicativeNoise","title":"class MultiplicativeNoise (multiplier=(0.9, 1.1), per_channel=False, elementwise=False, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply multiplicative noise to the input image.

This transform multiplies each pixel in the image by a random value or array of values, effectively creating a noise pattern that scales with the image intensity.

Parameters:

Name Type Description multiplier tuple[float, float]

The range for the random multiplier. Defines the range from which the multiplier is sampled. Default: (0.9, 1.1)

per_channel bool

If True, use a different random multiplier for each channel. If False, use the same multiplier for all channels. Setting this to False is slightly faster. Default: False

elementwise bool

If True, generates a unique multiplier for each pixel. If False, generates a single multiplier (or one per channel if per_channel=True). Default: False

p float

Probability of applying the transform. Default: 0.5

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • When elementwise=False and per_channel=False, a single multiplier is applied to the entire image.
  • When elementwise=False and per_channel=True, each channel gets a different multiplier.
  • When elementwise=True and per_channel=False, each pixel gets the same multiplier across all channels.
  • When elementwise=True and per_channel=True, each pixel in each channel gets a unique multiplier.
  • Setting per_channel=False is slightly faster, especially for larger images.
  • This transform can be used to simulate various lighting conditions or to create noise that scales with image intensity.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, p=1.0)\n>>> result = transform(image=image)\n>>> noisy_image = result[\"image\"]\n
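
To illustrate the elementwise option described in the notes above, a sketch generating a per-pixel, per-channel multiplier (variable names are illustrative):

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.MultiplicativeNoise(multiplier=(0.8, 1.2), elementwise=True, per_channel=True, p=1.0)\n>>> speckled = transform(image=image)[\"image\"]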

References

  • Multiplicative noise: https://en.wikipedia.org/wiki/Multiplicative_noise


Source code in albumentations/augmentations/transforms.py Python
class MultiplicativeNoise(ImageOnlyTransform):\n    \"\"\"Apply multiplicative noise to the input image.\n\n    This transform multiplies each pixel in the image by a random value or array of values,\n    effectively creating a noise pattern that scales with the image intensity.\n\n    Args:\n        multiplier (tuple[float, float]): The range for the random multiplier.\n            Defines the range from which the multiplier is sampled.\n            Default: (0.9, 1.1)\n\n        per_channel (bool): If True, use a different random multiplier for each channel.\n            If False, use the same multiplier for all channels.\n            Setting this to False is slightly faster.\n            Default: False\n\n        elementwise (bool): If True, generates a unique multiplier for each pixel.\n            If False, generates a single multiplier (or one per channel if per_channel=True).\n            Default: False\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - When elementwise=False and per_channel=False, a single multiplier is applied to the entire image.\n        - When elementwise=False and per_channel=True, each channel gets a different multiplier.\n        - When elementwise=True and per_channel=False, each pixel gets the same multiplier across all channels.\n        - When elementwise=True and per_channel=True, each pixel in each channel gets a unique multiplier.\n        - Setting per_channel=False is slightly faster, especially for larger images.\n        - This transform can be used to simulate various lighting conditions or to create noise that\n          scales with image intensity.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.MultiplicativeNoise(multiplier=(0.9, 1.1), per_channel=True, p=1.0)\n        >>> result = transform(image=image)\n        >>> noisy_image = result[\"image\"]\n\n    References:\n        - Multiplicative noise: https://en.wikipedia.org/wiki/Multiplicative_noise\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        multiplier: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        per_channel: bool\n        elementwise: bool\n\n    def __init__(\n        self,\n        multiplier: ScaleFloatType = (0.9, 1.1),\n        per_channel: bool = False,\n        elementwise: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.multiplier = cast(tuple[float, float], multiplier)\n        self.elementwise = elementwise\n        self.per_channel = per_channel\n\n    def apply(\n        self,\n        img: np.ndarray,\n        multiplier: float | np.ndarray,\n        **kwargs: Any,\n    ) -> np.ndarray:\n        return multiply(img, multiplier)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if self.elementwise:\n            shape = image.shape if self.per_channel else (*image.shape[:2], 1)\n  
      else:\n            shape = (num_channels,) if self.per_channel else (1,)\n\n        multiplier = self.random_generator.uniform(\n            self.multiplier[0],\n            self.multiplier[1],\n            shape,\n        ).astype(np.float32)\n\n        if not self.per_channel and num_channels > 1:\n            # Replicate the multiplier for all channels if not per_channel\n            multiplier = np.repeat(multiplier, num_channels, axis=-1)\n\n        if not self.elementwise and self.per_channel:\n            # Reshape to broadcast correctly when not elementwise but per_channel\n            multiplier = multiplier.reshape(1, 1, -1)\n\n        if multiplier.shape != image.shape:\n            multiplier = multiplier.squeeze()\n\n        return {\"multiplier\": multiplier}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"multiplier\", \"elementwise\", \"per_channel\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.NoiseParamsBase","title":"class NoiseParamsBase ","text":"

Base class for all noise parameter models.


Source code in albumentations/augmentations/transforms.py Python
class NoiseParamsBase(BaseModel):\n    \"\"\"Base class for all noise parameter models.\"\"\"\n\n    model_config = ConfigDict(extra=\"forbid\")\n    noise_type: str\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Normalize","title":"class Normalize (mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, normalization='standard', always_apply=None, p=1.0) [view source on GitHub]","text":"

Applies various normalization techniques to an image. The specific normalization technique can be selected with the normalization parameter.

Standard normalization is applied using the formula: img = (img - mean * max_pixel_value) / (std * max_pixel_value). Other normalization techniques adjust the image based on global or per-channel statistics, or scale pixel values to a specified range.

Parameters:

Name Type Description mean ColorType | None

Mean values for standard normalization. For \"standard\" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).

std ColorType | None

Standard deviation values for standard normalization. For \"standard\" normalization, the default values are the ImageNet standard deviations: (0.229, 0.224, 0.225).

max_pixel_value float | None

Maximum possible pixel value, used for scaling in standard normalization. Defaults to 255.0.

normalization Literal[\"standard\", \"image\", \"image_per_channel\", \"min_max\", \"min_max_per_channel\"]) Specifies the normalization technique to apply. Defaults to \"standard\". - \"standard\"

Applies the formula (img - mean * max_pixel_value) / (std * max_pixel_value). The default mean and std are based on ImageNet. You can use mean and std values of (0.5, 0.5, 0.5) for inception normalization. And mean values of (0, 0, 0) and std values of (1, 1, 1) for YOLO. - \"image\": Normalizes the whole image based on its global mean and standard deviation. - \"image_per_channel\": Normalizes the image per channel based on each channel's mean and standard deviation. - \"min_max\": Scales the image pixel values to a [0, 1] range based on the global minimum and maximum pixel values. - \"min_max_per_channel\": Scales each channel of the image pixel values to a [0, 1] range based on the per-channel minimum and maximum pixel values.

p float

Probability of applying the transform. Defaults to 1.0.

Targets

image

Image types: uint8, float32

Note

  • For \"standard\" normalization, mean, std, and max_pixel_value must be provided.
  • For other normalization types, these parameters are ignored.
  • For inception normalization, use mean values of (0.5, 0.5, 0.5).
  • For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).
  • This transform is often used as a final step in image preprocessing pipelines to prepare images for neural network input.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Standard ImageNet normalization\n>>> transform = A.Normalize(\n...     mean=(0.485, 0.456, 0.406),\n...     std=(0.229, 0.224, 0.225),\n...     max_pixel_value=255.0,\n...     p=1.0\n... )\n>>> normalized_image = transform(image=image)[\"image\"]\n>>>\n>>> # Min-max normalization\n>>> transform_minmax = A.Normalize(normalization=\"min_max\", p=1.0)\n>>> normalized_image_minmax = transform_minmax(image=image)[\"image\"]\n
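
The Inception and YOLO settings mentioned in the notes above correspond to the following configurations (a sketch; variable names are illustrative):

Python
>>> import albumentations as A\n>>> # Inception-style normalization (mean and std of 0.5, per the notes above)\n>>> transform_inception = A.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5), max_pixel_value=255.0, p=1.0)\n>>> # YOLO-style normalization (zero mean, unit std)\n>>> transform_yolo = A.Normalize(mean=(0, 0, 0), std=(1, 1, 1), max_pixel_value=255.0, p=1.0)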

References

  • ImageNet mean and std: https://pytorch.org/vision/stable/models.html
  • Inception preprocessing: https://keras.io/api/applications/inceptionv3/


Source code in albumentations/augmentations/transforms.py Python
class Normalize(ImageOnlyTransform):\n    \"\"\"Applies various normalization techniques to an image. The specific normalization technique can be selected\n        with the `normalization` parameter.\n\n    Standard normalization is applied using the formula:\n        `img = (img - mean * max_pixel_value) / (std * max_pixel_value)`.\n        Other normalization techniques adjust the image based on global or per-channel statistics,\n        or scale pixel values to a specified range.\n\n    Args:\n        mean (ColorType | None): Mean values for standard normalization.\n            For \"standard\" normalization, the default values are ImageNet mean values: (0.485, 0.456, 0.406).\n        std (ColorType | None): Standard deviation values for standard normalization.\n            For \"standard\" normalization, the default values are ImageNet standard deviation :(0.229, 0.224, 0.225).\n        max_pixel_value (float | None): Maximum possible pixel value, used for scaling in standard normalization.\n            Defaults to 255.0.\n        normalization (Literal[\"standard\", \"image\", \"image_per_channel\", \"min_max\", \"min_max_per_channel\"])\n            Specifies the normalization technique to apply. Defaults to \"standard\".\n            - \"standard\": Applies the formula `(img - mean * max_pixel_value) / (std * max_pixel_value)`.\n                The default mean and std are based on ImageNet. You can use mean and std values of (0.5, 0.5, 0.5)\n                for inception normalization. And mean values of (0, 0, 0) and std values of (1, 1, 1) for YOLO.\n            - \"image\": Normalizes the whole image based on its global mean and standard deviation.\n            - \"image_per_channel\": Normalizes the image per channel based on each channel's mean and standard deviation.\n            - \"min_max\": Scales the image pixel values to a [0, 1] range based on the global\n                minimum and maximum pixel values.\n            - \"min_max_per_channel\": Scales each channel of the image pixel values to a [0, 1]\n                range based on the per-channel minimum and maximum pixel values.\n\n        p (float): Probability of applying the transform. Defaults to 1.0.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - For \"standard\" normalization, `mean`, `std`, and `max_pixel_value` must be provided.\n        - For other normalization types, these parameters are ignored.\n        - For inception normalization, use mean values of (0.5, 0.5, 0.5).\n        - For YOLO normalization, use mean values of (0, 0, 0) and std values of (1, 1, 1).\n        - This transform is often used as a final step in image preprocessing pipelines to\n          prepare images for neural network input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Standard ImageNet normalization\n        >>> transform = A.Normalize(\n        ...     mean=(0.485, 0.456, 0.406),\n        ...     std=(0.229, 0.224, 0.225),\n        ...     max_pixel_value=255.0,\n        ...     p=1.0\n        ... 
)\n        >>> normalized_image = transform(image=image)[\"image\"]\n        >>>\n        >>> # Min-max normalization\n        >>> transform_minmax = A.Normalize(normalization=\"min_max\", p=1.0)\n        >>> normalized_image_minmax = transform_minmax(image=image)[\"image\"]\n\n    References:\n        - ImageNet mean and std: https://pytorch.org/vision/stable/models.html\n        - Inception preprocessing: https://keras.io/api/applications/inceptionv3/\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mean: ColorType | None\n        std: ColorType | None\n        max_pixel_value: float | None\n        normalization: Literal[\n            \"standard\",\n            \"image\",\n            \"image_per_channel\",\n            \"min_max\",\n            \"min_max_per_channel\",\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_normalization(self) -> Self:\n            if (\n                self.mean is None\n                or self.std is None\n                or (self.max_pixel_value is None and self.normalization == \"standard\")\n            ):\n                raise ValueError(\n                    \"mean, std, and max_pixel_value must be provided for standard normalization.\",\n                )\n            return self\n\n    def __init__(\n        self,\n        mean: ColorType | None = (0.485, 0.456, 0.406),\n        std: ColorType | None = (0.229, 0.224, 0.225),\n        max_pixel_value: float | None = 255.0,\n        normalization: Literal[\n            \"standard\",\n            \"image\",\n            \"image_per_channel\",\n            \"min_max\",\n            \"min_max_per_channel\",\n        ] = \"standard\",\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.mean = mean\n        self.mean_np = np.array(mean, dtype=np.float32) * max_pixel_value\n        self.std = std\n        self.denominator = np.reciprocal(\n            np.array(std, dtype=np.float32) * max_pixel_value,\n        )\n        self.max_pixel_value = max_pixel_value\n        self.normalization = normalization\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if self.normalization == \"standard\":\n            return normalize(\n                img,\n                self.mean_np,\n                self.denominator,\n            )\n        return normalize_per_image(img, self.normalization)\n\n    @batch_transform(\"channel\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(images, **params)\n\n    @batch_transform(\"channel\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(volume, **params)\n\n    @batch_transform(\"channel\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, **params: Any) -> np.ndarray:\n        return self.apply(volumes, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mean\", \"std\", \"max_pixel_value\", \"normalization\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PixelDropout","title":"class PixelDropout (dropout_prob=0.01, per_channel=False, drop_value=0, mask_drop_value=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Drops random pixels from the image.

This transform randomly sets pixels in the image to a specified value, effectively \"dropping out\" those pixels. It can be applied to both the image and its corresponding mask.

Parameters:

Name Type Description dropout_prob float

Probability of dropping out each pixel. Should be in the range [0, 1]. Default: 0.01

per_channel bool

If True, the dropout mask will be generated independently for each channel. If False, the same dropout mask will be applied to all channels. Default: False

drop_value float | Sequence[float] | None

Value to assign to the dropped pixels. If None, the value will be randomly sampled for each application: - For uint8 images: Random integer in [0, 255] - For float32 images: Random float in [0, 1] If a single number, that value will be used for all dropped pixels. If a sequence, it should contain one value per channel. Default: 0

mask_drop_value float | Sequence[float] | None

Value to assign to dropped pixels in the mask. If None, the mask will remain unchanged. If a single number, that value will be used for all dropped pixels in the mask. If a sequence, it should contain one value per channel of the mask. Note: Only applicable when per_channel=False. Default: None

always_apply bool

If True, the transform will always be applied. Default: False

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • When applied to bounding boxes, this transform may cause some boxes to have zero area if all pixels within the box are dropped. Such boxes will be removed.
  • When applied to keypoints, keypoints that fall on dropped pixels will be removed if the keypoint processor is configured to remove invisible keypoints.
  • The 'per_channel' option is not supported for mask dropout. If you need to drop pixels in a multi-channel mask independently, consider applying this transform multiple times with per_channel=False.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> transform = A.PixelDropout(dropout_prob=0.1, per_channel=True, p=1.0)\n>>> result = transform(image=image, mask=mask)\n>>> dropped_image, dropped_mask = result['image'], result['mask']\n
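
A further sketch showing the mask_drop_value behaviour described above (only applicable with per_channel=False, which is the default; variable names are illustrative):

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> transform = A.PixelDropout(dropout_prob=0.05, drop_value=None, mask_drop_value=0, p=1.0)\n>>> result = transform(image=image, mask=mask)\n>>> dropped_image, dropped_mask = result[\"image\"], result[\"mask\"]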


Source code in albumentations/augmentations/transforms.py Python
class PixelDropout(DualTransform):\n    \"\"\"Drops random pixels from the image.\n\n    This transform randomly sets pixels in the image to a specified value, effectively \"dropping out\" those pixels.\n    It can be applied to both the image and its corresponding mask.\n\n    Args:\n        dropout_prob (float): Probability of dropping out each pixel. Should be in the range [0, 1].\n            Default: 0.01\n\n        per_channel (bool): If True, the dropout mask will be generated independently for each channel.\n            If False, the same dropout mask will be applied to all channels.\n            Default: False\n\n        drop_value (float | Sequence[float] | None): Value to assign to the dropped pixels.\n            If None, the value will be randomly sampled for each application:\n                - For uint8 images: Random integer in [0, 255]\n                - For float32 images: Random float in [0, 1]\n            If a single number, that value will be used for all dropped pixels.\n            If a sequence, it should contain one value per channel.\n            Default: 0\n\n        mask_drop_value (float | Sequence[float] | None): Value to assign to dropped pixels in the mask.\n            If None, the mask will remain unchanged.\n            If a single number, that value will be used for all dropped pixels in the mask.\n            If a sequence, it should contain one value per channel of the mask.\n            Note: Only applicable when per_channel=False.\n            Default: None\n\n        always_apply (bool): If True, the transform will always be applied.\n            Default: False\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - When applied to bounding boxes, this transform may cause some boxes to have zero area\n          if all pixels within the box are dropped. Such boxes will be removed.\n        - When applied to keypoints, keypoints that fall on dropped pixels will be removed if\n          the keypoint processor is configured to remove invisible keypoints.\n        - The 'per_channel' option is not supported for mask dropout. 
If you need to drop pixels\n          in a multi-channel mask independently, consider applying this transform multiple times\n          with per_channel=False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> transform = A.PixelDropout(dropout_prob=0.1, per_channel=True, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> dropped_image, dropped_mask = result['image'], result['mask']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        dropout_prob: ProbabilityType\n        per_channel: bool\n        drop_value: ScaleFloatType | None\n        mask_drop_value: ScaleFloatType | None\n\n        @model_validator(mode=\"after\")\n        def validate_mask_drop_value(self) -> Self:\n            if self.mask_drop_value is not None and self.per_channel:\n                msg = \"PixelDropout supports mask only with per_channel=False.\"\n                raise ValueError(msg)\n            return self\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        dropout_prob: float = 0.01,\n        per_channel: bool = False,\n        drop_value: ScaleFloatType | None = 0,\n        mask_drop_value: ScaleFloatType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.dropout_prob = dropout_prob\n        self.per_channel = per_channel\n        self.drop_value = drop_value\n        self.mask_drop_value = mask_drop_value\n\n    def apply(\n        self,\n        img: np.ndarray,\n        drop_mask: np.ndarray,\n        drop_value: float | Sequence[float],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.pixel_dropout(img, drop_mask, drop_value)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        drop_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        if self.mask_drop_value is None:\n            return mask\n\n        if mask.ndim == MONO_CHANNEL_DIMENSIONS:\n            drop_mask = np.squeeze(drop_mask)\n\n        return fmain.pixel_dropout(mask, drop_mask, self.mask_drop_value)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        drop_mask: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if drop_mask is None or self.per_channel:\n            return bboxes\n\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n\n        image_shape = params[\"shape\"][:2]\n\n        denormalized_bboxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fdropout.mask_dropout_bboxes(\n            denormalized_bboxes,\n            drop_mask,\n            image_shape,\n            processor.params.min_area,\n            processor.params.min_visibility,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        drop_mask: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if drop_mask is None or self.per_channel:\n            return keypoints\n\n        processor = cast(KeypointsProcessor, self.get_processor(\"keypoints\"))\n\n        if processor is None or not processor.params.remove_invisible:\n            return keypoints\n\n        return 
fdropout.mask_dropout_keypoints(keypoints, drop_mask)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        shape = image.shape if self.per_channel else image.shape[:2]\n\n        # Use choice to create boolean matrix, if we will use binomial after that we will need type conversion\n        drop_mask = self.random_generator.choice(\n            [True, False],\n            shape,\n            p=[self.dropout_prob, 1 - self.dropout_prob],\n        )\n\n        drop_value: float | Sequence[float] | np.ndarray\n\n        if drop_mask.ndim != image.ndim:\n            drop_mask = np.expand_dims(drop_mask, -1)\n        if self.drop_value is None:\n            drop_shape = 1 if is_grayscale_image(image) else int(image.shape[-1])\n\n            if image.dtype == np.uint8:\n                drop_value = self.random_generator.integers(\n                    0,\n                    int(MAX_VALUES_BY_DTYPE[image.dtype]),\n                    size=drop_shape,\n                    dtype=image.dtype,\n                )\n            elif image.dtype == np.float32:\n                drop_value = self.random_generator.uniform(\n                    0,\n                    1,\n                    size=drop_shape,\n                ).astype(image.dtype)\n            else:\n                raise ValueError(f\"Unsupported dtype: {image.dtype}\")\n        else:\n            drop_value = self.drop_value\n\n        return {\"drop_mask\": drop_mask, \"drop_value\": drop_value}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return (\"dropout_prob\", \"per_channel\", \"drop_value\", \"mask_drop_value\")\n
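The docstring example above only covers image and mask targets. Below is a minimal, hedged sketch of using PixelDropout together with bounding boxes through the standard A.Compose / A.BboxParams API; the box coordinates, labels, and min_visibility threshold are illustrative only. With per_channel=False a single 2D dropout mask is produced, so boxes whose visible area falls below the threshold after dropout may be removed.

Python
import albumentations as A
import numpy as np

# Illustrative data: one image and two pascal_voc boxes with class labels.
image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
bboxes = [(10, 10, 40, 40), (60, 60, 90, 90)]
labels = [0, 1]

# per_channel=False keeps a single dropout mask, which is what allows
# the bbox processor to shrink or drop boxes via min_visibility.
transform = A.Compose(
    [A.PixelDropout(dropout_prob=0.3, drop_value=0, p=1.0)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"], min_visibility=0.5),
)

result = transform(image=image, bboxes=bboxes, labels=labels)
print(result["image"].shape, len(result["bboxes"]))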
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlanckianJitter","title":"class PlanckianJitter (mode='blackbody', temperature_limit=None, sampling_method='uniform', p=0.5, always_apply=None) [view source on GitHub]","text":"

Applies Planckian Jitter to the input image, simulating color temperature variations in illumination.

This transform adjusts the color of an image to mimic the effect of different color temperatures of light sources, based on Planck's law of black body radiation. It can simulate the appearance of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts.

PlanckianJitter vs. ColorJitter: PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases:

1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world color temperature changes. ColorJitter applies arbitrary color adjustments.
2. Natural effects: This transform produces color shifts that correspond to natural lighting variations, making it ideal for outdoor scene simulation or color constancy problems.
3. Single parameter: Color changes are controlled by a single, physically meaningful parameter (color temperature), unlike ColorJitter's multiple abstract parameters.
4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural light, whereas ColorJitter can make independent channel adjustments.

When to use PlanckianJitter:

  • Simulating different times of day or lighting conditions in outdoor scenes
  • Augmenting data for computer vision tasks that need to be robust to natural lighting changes
  • Preparing synthetic data to better match real-world lighting variations
  • Color constancy research or applications
  • When you need physically plausible color variations rather than arbitrary color changes

The logic behind PlanckianJitter: As the color temperature increases:

1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting.
2. Mid-range temperatures (around 5500K) correspond to daylight.
3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade.

This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.

Parameters:

Name Type Description mode Literal[\"blackbody\", \"cied\"]

The mode of the transformation. - \"blackbody\": Simulates blackbody radiation color changes. - \"cied\": Uses the CIE D illuminant series for color temperature simulation. Default: \"blackbody\"

temperature_limit tuple[int, int] | None

The range of color temperatures (in Kelvin) to sample from.
- For \"blackbody\" mode: Should be within [3000K, 15000K]. Default: (3000, 15000)
- For \"cied\" mode: Should be within [4000K, 15000K]. Default: (4000, 15000)
If None, the default ranges will be used based on the selected mode.
Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.

sampling_method Literal[\"uniform\", \"gaussian\"]

Method to sample the temperature. - \"uniform\": Samples uniformly across the specified range. - \"gaussian\": Samples from a Gaussian distribution centered at 6500K (approximate daylight). Default: \"uniform\"

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The transform preserves the overall brightness of the image while shifting its color.
  • The \"blackbody\" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.
  • The \"cied\" mode is based on standard illuminants and may provide more realistic daylight variations.
  • The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.
  • Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated across channels, maintaining the natural appearance of the scene under different lighting conditions.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.PlanckianJitter(mode=\"blackbody\",\n...                               temperature_limit=(3000, 9000),\n...                               sampling_method=\"uniform\",\n...                               p=1.0)\n>>> result = transform(image=image)\n>>> jittered_image = result[\"image\"]\n
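As a further hedged sketch, the snippet below uses the \"cied\" mode with Gaussian temperature sampling; the temperature range is illustrative and only needs to stay within the documented [4000K, 15000K] bounds and include the daylight white point.

Python
import albumentations as A
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# CIE D illuminant mode with Gaussian sampling centred near daylight.
transform = A.PlanckianJitter(
    mode="cied",
    temperature_limit=(4500, 12000),
    sampling_method="gaussian",
    p=1.0,
)
jittered = transform(image=image)["image"]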

References

  • Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law
  • CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant
  • Color temperature: https://en.wikipedia.org/wiki/Color_temperature
  • Implementation inspired by: https://github.com/TheZino/PlanckianJitter


Source code in albumentations/augmentations/transforms.py Python
class PlanckianJitter(ImageOnlyTransform):\n    \"\"\"Applies Planckian Jitter to the input image, simulating color temperature variations in illumination.\n\n    This transform adjusts the color of an image to mimic the effect of different color temperatures\n    of light sources, based on Planck's law of black body radiation. It can simulate the appearance\n    of an image under various lighting conditions, from warm (reddish) to cool (bluish) color casts.\n\n    PlanckianJitter vs. ColorJitter:\n    PlanckianJitter is fundamentally different from ColorJitter in its approach and use cases:\n    1. Physics-based: PlanckianJitter is grounded in the physics of light, simulating real-world\n       color temperature changes. ColorJitter applies arbitrary color adjustments.\n    2. Natural effects: This transform produces color shifts that correspond to natural lighting\n       variations, making it ideal for outdoor scene simulation or color constancy problems.\n    3. Single parameter: Color changes are controlled by a single, physically meaningful parameter\n       (color temperature), unlike ColorJitter's multiple abstract parameters.\n    4. Correlated changes: Color shifts are correlated across channels in a way that mimics natural\n       light, whereas ColorJitter can make independent channel adjustments.\n\n    When to use PlanckianJitter:\n    - Simulating different times of day or lighting conditions in outdoor scenes\n    - Augmenting data for computer vision tasks that need to be robust to natural lighting changes\n    - Preparing synthetic data to better match real-world lighting variations\n    - Color constancy research or applications\n    - When you need physically plausible color variations rather than arbitrary color changes\n\n    The logic behind PlanckianJitter:\n    As the color temperature increases:\n    1. Lower temperatures (around 3000K) produce warm, reddish tones, simulating sunset or incandescent lighting.\n    2. Mid-range temperatures (around 5500K) correspond to daylight.\n    3. Higher temperatures (above 7000K) result in cool, bluish tones, similar to overcast sky or shade.\n    This progression mimics the natural variation of sunlight throughout the day and in different weather conditions.\n\n    Args:\n        mode (Literal[\"blackbody\", \"cied\"]): The mode of the transformation.\n            - \"blackbody\": Simulates blackbody radiation color changes.\n            - \"cied\": Uses the CIE D illuminant series for color temperature simulation.\n            Default: \"blackbody\"\n\n        temperature_limit (tuple[int, int] | None): The range of color temperatures (in Kelvin) to sample from.\n            - For \"blackbody\" mode: Should be within [3000K, 15000K]. Default: (3000, 15000)\n            - For \"cied\" mode: Should be within [4000K, 15000K]. Default: (4000, 15000)\n            If None, the default ranges will be used based on the selected mode.\n            Higher temperatures produce cooler (bluish) images, lower temperatures produce warmer (reddish) images.\n\n        sampling_method (Literal[\"uniform\", \"gaussian\"]): Method to sample the temperature.\n            - \"uniform\": Samples uniformly across the specified range.\n            - \"gaussian\": Samples from a Gaussian distribution centered at 6500K (approximate daylight).\n            Default: \"uniform\"\n\n        p (float): Probability of applying the transform. 
Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The transform preserves the overall brightness of the image while shifting its color.\n        - The \"blackbody\" mode provides a wider range of color shifts, especially in the lower (warmer) temperatures.\n        - The \"cied\" mode is based on standard illuminants and may provide more realistic daylight variations.\n        - The Gaussian sampling method tends to produce more subtle variations, as it's centered around daylight.\n        - Unlike ColorJitter, this transform ensures that color changes are physically plausible and correlated\n          across channels, maintaining the natural appearance of the scene under different lighting conditions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PlanckianJitter(mode=\"blackbody\",\n        ...                               temperature_range=(3000, 9000),\n        ...                               sampling_method=\"uniform\",\n        ...                               p=1.0)\n        >>> result = transform(image=image)\n        >>> jittered_image = result[\"image\"]\n\n    References:\n        - Planck's law: https://en.wikipedia.org/wiki/Planck%27s_law\n        - CIE Standard Illuminants: https://en.wikipedia.org/wiki/Standard_illuminant\n        - Color temperature: https://en.wikipedia.org/wiki/Color_temperature\n        - Implementation inspired by: https://github.com/TheZino/PlanckianJitter\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mode: Literal[\"blackbody\", \"cied\"]\n        temperature_limit: Annotated[tuple[int, int], AfterValidator(nondecreasing)] | None\n        sampling_method: Literal[\"uniform\", \"gaussian\"]\n\n        @model_validator(mode=\"after\")\n        def validate_temperature(self) -> Self:\n            max_temp = int(PLANKIAN_JITTER_CONST[\"MAX_TEMP\"])\n\n            if self.temperature_limit is None:\n                if self.mode == \"blackbody\":\n                    self.temperature_limit = (\n                        int(PLANKIAN_JITTER_CONST[\"MIN_BLACKBODY_TEMP\"]),\n                        max_temp,\n                    )\n                elif self.mode == \"cied\":\n                    self.temperature_limit = (\n                        int(PLANKIAN_JITTER_CONST[\"MIN_CIED_TEMP\"]),\n                        max_temp,\n                    )\n            else:\n                if self.mode == \"blackbody\" and (\n                    min(self.temperature_limit) < PLANKIAN_JITTER_CONST[\"MIN_BLACKBODY_TEMP\"]\n                    or max(self.temperature_limit) > max_temp\n                ):\n                    raise ValueError(\n                        \"Temperature limits for blackbody should be in [3000, 15000] range\",\n                    )\n                if self.mode == \"cied\" and (\n                    min(self.temperature_limit) < PLANKIAN_JITTER_CONST[\"MIN_CIED_TEMP\"]\n                    or max(self.temperature_limit) > max_temp\n                ):\n                    raise ValueError(\n                        \"Temperature limits for CIED should be in [4000, 15000] range\",\n                    )\n\n                if not self.temperature_limit[0] <= PLANKIAN_JITTER_CONST[\"WHITE_TEMP\"] <= self.temperature_limit[1]:\n                    raise ValueError(\n             
           \"White temperature should be within the temperature limits\",\n                    )\n\n            return self\n\n    def __init__(\n        self,\n        mode: Literal[\"blackbody\", \"cied\"] = \"blackbody\",\n        temperature_limit: tuple[int, int] | None = None,\n        sampling_method: Literal[\"uniform\", \"gaussian\"] = \"uniform\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ) -> None:\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.mode = mode\n        self.temperature_limit = cast(tuple[int, int], temperature_limit)\n        self.sampling_method = sampling_method\n\n    def apply(self, img: np.ndarray, temperature: int, **params: Any) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.planckian_jitter(img, temperature, mode=self.mode)\n\n    def get_params(self) -> dict[str, Any]:\n        sampling_prob_boundary = PLANKIAN_JITTER_CONST[\"SAMPLING_TEMP_PROB\"]\n        sampling_temp_boundary = PLANKIAN_JITTER_CONST[\"WHITE_TEMP\"]\n\n        if self.sampling_method == \"uniform\":\n            # Split into 2 cases to avoid selecting cold temperatures (>6000) too often\n            if self.py_random.random() < sampling_prob_boundary:\n                temperature = self.py_random.uniform(\n                    self.temperature_limit[0],\n                    sampling_temp_boundary,\n                )\n            else:\n                temperature = self.py_random.uniform(\n                    sampling_temp_boundary,\n                    self.temperature_limit[1],\n                )\n        elif self.sampling_method == \"gaussian\":\n            # Sample values from asymmetric gaussian distribution\n            if self.py_random.random() < sampling_prob_boundary:\n                # Left side\n                shift = np.abs(\n                    self.py_random.gauss(\n                        0,\n                        np.abs(sampling_temp_boundary - self.temperature_limit[0]) / 3,\n                    ),\n                )\n                temperature = sampling_temp_boundary - shift\n            else:\n                # Right side\n                shift = np.abs(\n                    self.py_random.gauss(\n                        0,\n                        np.abs(self.temperature_limit[1] - sampling_temp_boundary) / 3,\n                    ),\n                )\n                temperature = sampling_temp_boundary + shift\n        else:\n            raise ValueError(f\"Unknown sampling method: {self.sampling_method}\")\n\n        # Ensure temperature is within the valid range\n        temperature = np.clip(\n            temperature,\n            self.temperature_limit[0],\n            self.temperature_limit[1],\n        )\n\n        return {\"temperature\": int(temperature)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"mode\", \"temperature_limit\", \"sampling_method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlasmaBrightnessContrast","title":"class PlasmaBrightnessContrast (brightness_range=(-0.3, 0.3), contrast_range=(-0.3, 0.3), plasma_size=256, roughness=3.0, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply plasma fractal pattern to modify image brightness and contrast.

This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns that are then used to create spatially-varying brightness and contrast adjustments. The result is a natural-looking, non-uniform modification of the image.

Parameters:

Name Type Description brightness_range float, float

Range for brightness adjustment strength. Values between -1 and 1:
- Positive values increase brightness
- Negative values decrease brightness
- 0 means no brightness change
Default: (-0.3, 0.3)

contrast_range float, float

Range for contrast adjustment strength. Values between -1 and 1:
- Positive values increase contrast
- Negative values decrease contrast
- 0 means no contrast change
Default: (-0.3, 0.3)

plasma_size int

Size of the plasma pattern. Will be rounded up to nearest power of 2. Larger values create more detailed patterns. Default: 256

roughness float

Controls the roughness of the plasma pattern. Higher values create more rough/sharp transitions. Must be greater than 0. Typical values are between 1.0 and 5.0. Default: 3.0

p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Mathematical Formulation:

1. Plasma Pattern Generation:
   The Diamond-Square algorithm generates a pattern P(x,y) ∈ [0,1] by:
   - Starting with random corner values
   - Recursively computing midpoints using: M = (V1 + V2 + V3 + V4)/4 + R(d)
   where V1..V4 are corner values and R(d) is random noise that decreases with distance d according to the roughness parameter.

2. Brightness Adjustment:
   For each pixel (x,y):
   O(x,y) = I(x,y) + b·P(x,y)·max_value
   where:
   - I is the input image
   - b is the brightness factor
   - P is the plasma pattern
   - max_value is the maximum possible pixel value

3. Contrast Adjustment:
   For each pixel (x,y):
   O(x,y) = μ + (I(x,y) - μ)·(1 + c·P(x,y))
   where:
   - μ is the mean pixel value
   - c is the contrast factor
   - P is the plasma pattern
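As a plain NumPy illustration of the formulation above (not the library's internal implementation), the hypothetical helper below applies a precomputed pattern P with the brightness formula followed by the contrast formula; a random array stands in for the actual Diamond-Square plasma pattern.

Python
import numpy as np

def apply_plasma_adjustments(img, pattern, b, c, max_value=255.0):
    """Hypothetical helper mirroring the formulas above (uint8 input assumed)."""
    out = img.astype(np.float32)
    p = pattern[..., None] if out.ndim == 3 else pattern  # broadcast P over channels
    out = out + b * p * max_value                # O = I + b*P*max_value
    mean = out.mean()
    out = mean + (out - mean) * (1.0 + c * p)    # O = mu + (I - mu)*(1 + c*P)
    return np.clip(out, 0, max_value).astype(img.dtype)

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
pattern = np.random.rand(64, 64).astype(np.float32)  # stand-in for a plasma pattern
out = apply_plasma_adjustments(img, pattern, b=0.2, c=0.3)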

Note

  • The plasma pattern creates smooth, organic variations in the adjustments
  • Brightness and contrast modifications are applied sequentially
  • Final values are clipped to valid range [0, max_value]
  • The same plasma pattern is used for both brightness and contrast to maintain coherent spatial variations

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-parameters","title":"Default parameters","text":"Python
>>> transform = A.PlasmaBrightnessContrast(p=1.0)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-adjustments-with-fine-pattern","title":"Custom adjustments with fine pattern","text":"Python
>>> transform = A.PlasmaBrightnessContrast(\n...     brightness_range=(-0.5, 0.5),\n...     contrast_range=(-0.3, 0.3),\n...     plasma_size=512,  # More detailed pattern\n...     roughness=2.5,    # Smoother transitions\n...     p=1.0\n... )\n

References

  • Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\" Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.
  • Miller, \"The Diamond-Square Algorithm: A Detailed Analysis,\" Journal of Computer Graphics Techniques, 2016. Comprehensive analysis of the algorithm and its properties.
  • Ebert et al., \"Texturing & Modeling: A Procedural Approach,\" Chapter 12: Noise, Hypertexture, Antialiasing, and Gesture. Detailed coverage of procedural noise patterns.
  • Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm
  • Plasma effect: https://lodev.org/cgtutor/plasma.html

See Also:
- RandomBrightnessContrast: For uniform brightness/contrast adjustments
- CLAHE: For contrast limited adaptive histogram equalization
- FancyPCA: For color-based contrast enhancement
- HistogramMatching: For reference-based contrast adjustment


Source code in albumentations/augmentations/transforms.py Python
class PlasmaBrightnessContrast(ImageOnlyTransform):\n    \"\"\"Apply plasma fractal pattern to modify image brightness and contrast.\n\n    This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns\n    that are then used to create spatially-varying brightness and contrast adjustments.\n    The result is a natural-looking, non-uniform modification of the image.\n\n    Args:\n        brightness_range ((float, float)): Range for brightness adjustment strength.\n            Values between -1 and 1:\n            - Positive values increase brightness\n            - Negative values decrease brightness\n            - 0 means no brightness change\n            Default: (-0.3, 0.3)\n\n        contrast_range ((float, float)): Range for contrast adjustment strength.\n            Values between -1 and 1:\n            - Positive values increase contrast\n            - Negative values decrease contrast\n            - 0 means no contrast change\n            Default: (-0.3, 0.3)\n\n        plasma_size (int): Size of the plasma pattern. Will be rounded up to nearest power of 2.\n            Larger values create more detailed patterns. Default: 256\n\n        roughness (float): Controls the roughness of the plasma pattern.\n            Higher values create more rough/sharp transitions.\n            Must be greater than 0.\n            Typical values are between 1.0 and 5.0. Default: 3.0\n\n            p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Mathematical Formulation:\n        1. Plasma Pattern Generation:\n           The Diamond-Square algorithm generates a pattern P(x,y) \u2208 [0,1] by:\n           - Starting with random corner values\n           - Recursively computing midpoints using:\n             M = (V1 + V2 + V3 + V4)/4 + R(d)\n           where V1..V4 are corner values and R(d) is random noise that\n           decreases with distance d according to the roughness parameter.\n\n        2. Brightness Adjustment:\n           For each pixel (x,y):\n           O(x,y) = I(x,y) + b\u00b7P(x,y)\u00b7max_value\n           where:\n           - I is the input image\n           - b is the brightness factor\n           - P is the plasma pattern\n           - max_value is the maximum possible pixel value\n\n        3. Contrast Adjustment:\n           For each pixel (x,y):\n           O(x,y) = \u03bc + (I(x,y) - \u03bc)\u00b7(1 + c\u00b7P(x,y))\n           where:\n           - \u03bc is the mean pixel value\n           - c is the contrast factor\n           - P is the plasma pattern\n\n    Note:\n        - The plasma pattern creates smooth, organic variations in the adjustments\n        - Brightness and contrast modifications are applied sequentially\n        - Final values are clipped to valid range [0, max_value]\n        - The same plasma pattern is used for both brightness and contrast\n          to maintain coherent spatial variations\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Default parameters\n        >>> transform = A.PlasmaBrightnessContrast(p=1.0)\n\n        # Custom adjustments with fine pattern\n        >>> transform = A.PlasmaBrightnessContrast(\n        ...     brightness_range=(-0.5, 0.5),\n        ...     contrast_range=(-0.3, 0.3),\n        ...     plasma_size=512,  # More detailed pattern\n        ...     roughness=2.5,    # Smoother transitions\n        ...     
p=1.0\n        ... )\n\n    References:\n        .. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\"\n               Communications of the ACM, 1982.\n               Paper introducing the Diamond-Square algorithm.\n\n        .. [2] Miller, \"The Diamond-Square Algorithm: A Detailed Analysis,\"\n               Journal of Computer Graphics Techniques, 2016.\n               Comprehensive analysis of the algorithm and its properties.\n\n        .. [3] Ebert et al., \"Texturing & Modeling: A Procedural Approach,\"\n               Chapter 12: Noise, Hypertexture, Antialiasing, and Gesture.\n               Detailed coverage of procedural noise patterns.\n\n        .. [4] Diamond-Square algorithm:\n               https://en.wikipedia.org/wiki/Diamond-square_algorithm\n\n        .. [5] Plasma effect:\n               https://lodev.org/cgtutor/plasma.html\n\n    See Also:\n        - RandomBrightnessContrast: For uniform brightness/contrast adjustments\n        - CLAHE: For contrast limited adaptive histogram equalization\n        - FancyPCA: For color-based contrast enhancement\n        - HistogramMatching: For reference-based contrast adjustment\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n        ]\n        contrast_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(-1, 1)),\n        ]\n        plasma_size: int = Field(default=256, gt=0)\n        roughness: float = Field(default=3.0, gt=0)\n\n    def __init__(\n        self,\n        brightness_range: tuple[float, float] = (-0.3, 0.3),\n        contrast_range: tuple[float, float] = (-0.3, 0.3),\n        plasma_size: int = 256,\n        roughness: float = 3.0,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.brightness_range = brightness_range\n        self.contrast_range = contrast_range\n        self.plasma_size = plasma_size\n        self.roughness = roughness\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample adjustment strengths\n        brightness = self.py_random.uniform(*self.brightness_range)\n        contrast = self.py_random.uniform(*self.contrast_range)\n\n        # Generate plasma pattern\n        plasma = fmain.generate_plasma_pattern(\n            target_shape=image.shape[:2],\n            size=self.plasma_size,\n            roughness=self.roughness,\n            random_generator=self.random_generator,\n        )\n\n        return {\n            \"brightness_factor\": brightness,\n            \"contrast_factor\": contrast,\n            \"plasma_pattern\": plasma,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        brightness_factor: float,\n        contrast_factor: float,\n        plasma_pattern: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_plasma_brightness_contrast(\n            img,\n            brightness_factor,\n            contrast_factor,\n            plasma_pattern,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"brightness_range\", \"contrast_range\", \"plasma_size\", \"roughness\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.PlasmaShadow","title":"class PlasmaShadow (shadow_intensity_range=(0.3, 0.7), plasma_size=256, roughness=3.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply plasma-based shadow effect to the image.

Creates organic-looking shadows using plasma fractal noise pattern. The shadow intensity varies smoothly across the image, creating natural-looking darkening effects that can simulate shadows, shading, or lighting variations.

Parameters:

Name Type Description shadow_intensity_range tuple[float, float]

Range for shadow intensity. Values between 0 and 1:
- 0 means no shadow (original image)
- 1 means maximum darkening (black)
- Values in between create partial shadows
Default: (0.3, 0.7)

plasma_size int

Size of the plasma pattern. Will be rounded up to nearest power of 2. Larger values create more detailed shadow patterns:
- Small values (~64): Large, smooth shadow regions
- Medium values (~256): Balanced detail level
- Large values (~512+): Fine shadow details
Default: 256

roughness float

Controls the roughness of the plasma pattern. Higher values create more rough/sharp shadow transitions. Must be greater than 0:
- Low values (~1.0): Very smooth transitions
- Medium values (~3.0): Natural-looking shadows
- High values (~5.0): More dramatic, sharp shadows
Default: 3.0

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • The transform darkens the image using a plasma pattern
  • Works with any number of channels (grayscale, RGB, multispectral)
  • Shadow pattern is generated using Diamond-Square algorithm
  • The same shadow pattern is applied to all channels
  • Final values are clipped to valid range [0, max_value]

Mathematical Formulation:

1. Plasma Pattern Generation:
   The Diamond-Square algorithm generates a pattern P(x,y) ∈ [0,1] with fractal characteristics controlled by the roughness parameter.

2. Shadow Application:
   For each pixel (x,y):
   O(x,y) = I(x,y) * (1 - i·P(x,y))
   where:
   - I is the input image
   - P is the plasma pattern
   - i is the shadow intensity
   - O is the output image
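A plain NumPy sketch of the shadow formula above (not the library's internal implementation); the helper name is hypothetical and a random array stands in for the plasma pattern.

Python
import numpy as np

def apply_shadow(img, pattern, intensity, max_value=255.0):
    """Hypothetical helper mirroring O(x,y) = I(x,y) * (1 - i*P(x,y))."""
    p = pattern[..., None] if img.ndim == 3 else pattern  # same pattern for all channels
    out = img.astype(np.float32) * (1.0 - intensity * p)
    return np.clip(out, 0, max_value).astype(img.dtype)

img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
pattern = np.random.rand(64, 64).astype(np.float32)  # stand-in for a plasma pattern
shadowed = apply_shadow(img, pattern, intensity=0.5)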

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-parameters-for-natural-shadows","title":"Default parameters for natural shadows","text":"Python
>>> transform = A.PlasmaShadow(p=1.0)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--subtle-smooth-shadows","title":"Subtle, smooth shadows","text":"Python
>>> transform = A.PlasmaShadow(\n...     shadow_intensity_range=(0.1, 0.3),\n...     plasma_size=128,\n...     roughness=1.5,\n...     p=1.0\n... )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--dramatic-detailed-shadows","title":"Dramatic, detailed shadows","text":"Python
>>> transform = A.PlasmaShadow(\n...     shadow_intensity_range=(0.5, 0.9),\n...     plasma_size=512,\n...     roughness=4.0,\n...     p=1.0\n... )\n

References

  • Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\" Communications of the ACM, 1982. Paper introducing the Diamond-Square algorithm.
  • Diamond-Square algorithm: https://en.wikipedia.org/wiki/Diamond-square_algorithm

See Also:
- PlasmaBrightnessContrast: For brightness/contrast adjustments using plasma patterns
- RandomShadow: For geometric shadow effects
- RandomToneCurve: For global lighting adjustments


Source code in albumentations/augmentations/transforms.py Python
class PlasmaShadow(ImageOnlyTransform):\n    \"\"\"Apply plasma-based shadow effect to the image.\n\n    Creates organic-looking shadows using plasma fractal noise pattern.\n    The shadow intensity varies smoothly across the image, creating natural-looking\n    darkening effects that can simulate shadows, shading, or lighting variations.\n\n    Args:\n        shadow_intensity_range (tuple[float, float]): Range for shadow intensity.\n            Values between 0 and 1:\n            - 0 means no shadow (original image)\n            - 1 means maximum darkening (black)\n            - Values between create partial shadows\n            Default: (0.3, 0.7)\n\n        plasma_size (int): Size of the plasma pattern. Will be rounded up to nearest power of 2.\n            Larger values create more detailed shadow patterns:\n            - Small values (~64): Large, smooth shadow regions\n            - Medium values (~256): Balanced detail level\n            - Large values (~512+): Fine shadow details\n            Default: 256\n\n        roughness (float): Controls the roughness of the plasma pattern.\n            Higher values create more rough/sharp shadow transitions.\n            Must be greater than 0:\n            - Low values (~1.0): Very smooth transitions\n            - Medium values (~3.0): Natural-looking shadows\n            - High values (~5.0): More dramatic, sharp shadows\n            Default: 3.0\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform darkens the image using a plasma pattern\n        - Works with any number of channels (grayscale, RGB, multispectral)\n        - Shadow pattern is generated using Diamond-Square algorithm\n        - The same shadow pattern is applied to all channels\n        - Final values are clipped to valid range [0, max_value]\n\n    Mathematical Formulation:\n        1. Plasma Pattern Generation:\n           The Diamond-Square algorithm generates a pattern P(x,y) \u2208 [0,1]\n           with fractal characteristics controlled by roughness parameter.\n\n        2. Shadow Application:\n           For each pixel (x,y):\n           O(x,y) = I(x,y) * (1 - i\u00b7P(x,y))\n           where:\n           - I is the input image\n           - P is the plasma pattern\n           - i is the shadow intensity\n           - O is the output image\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Default parameters for natural shadows\n        >>> transform = A.PlasmaShadow(p=1.0)\n\n        # Subtle, smooth shadows\n        >>> transform = A.PlasmaShadow(\n        ...     shadow_intensity=(0.1, 0.3),\n        ...     plasma_size=128,\n        ...     roughness=1.5,\n        ...     p=1.0\n        ... )\n\n        # Dramatic, detailed shadows\n        >>> transform = A.PlasmaShadow(\n        ...     shadow_intensity=(0.5, 0.9),\n        ...     plasma_size=512,\n        ...     roughness=4.0,\n        ...     p=1.0\n        ... )\n\n    References:\n        .. [1] Fournier, Fussell, and Carpenter, \"Computer rendering of stochastic models,\"\n               Communications of the ACM, 1982.\n               Paper introducing the Diamond-Square algorithm.\n\n        .. 
[2] Diamond-Square algorithm:\n               https://en.wikipedia.org/wiki/Diamond-square_algorithm\n\n    See Also:\n        - PlasmaBrightnessContrast: For brightness/contrast adjustments using plasma patterns\n        - RandomShadow: For geometric shadow effects\n        - RandomToneCurve: For global lighting adjustments\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        shadow_intensity_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        plasma_size: int = Field(default=256, gt=0)\n        roughness: float = Field(default=3.0, gt=0)\n\n    def __init__(\n        self,\n        shadow_intensity_range: tuple[float, float] = (0.3, 0.7),\n        plasma_size: int = 256,\n        roughness: float = 3.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.shadow_intensity_range = shadow_intensity_range\n        self.plasma_size = plasma_size\n        self.roughness = roughness\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample shadow intensity\n        intensity = self.py_random.uniform(*self.shadow_intensity_range)\n\n        # Generate plasma pattern\n        plasma = fmain.generate_plasma_pattern(\n            target_shape=image.shape[:2],\n            size=self.plasma_size,\n            roughness=self.roughness,\n            random_generator=self.random_generator,\n        )\n\n        return {\n            \"intensity\": intensity,\n            \"plasma_pattern\": plasma,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        intensity: float,\n        plasma_pattern: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_plasma_shadow(img, intensity, plasma_pattern)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"shadow_intensity_range\", \"plasma_size\", \"roughness\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Posterize","title":"class Posterize (num_bits=4, p=0.5, always_apply=None) [view source on GitHub]","text":"

Reduces the number of bits for each color channel in the image.

This transform applies color posterization, a technique that reduces the number of distinct colors used in an image. It works by lowering the number of bits used to represent each color channel, effectively creating a \"poster-like\" effect with fewer color gradations.

Parameters:

Name Type Description num_bits int | tuple[int, int] | list[int] | list[tuple[int, int]]

Defines the number of bits to keep for each color channel. Can be specified in several ways:
- Single int: Same number of bits for all channels. Range: [1, 7].
- tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7].
- list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits].
- list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)].
Default: 4

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effect becomes more pronounced as the number of bits is reduced.
  • This transform can create interesting artistic effects or be used for image compression simulation.
  • Posterization is particularly useful for:
  • Creating stylized or retro-looking images
  • Reducing the color palette for specific artistic effects
  • Simulating the look of older or lower-quality digital images
  • Data augmentation in scenarios where color depth might vary

Mathematical Background: For an 8-bit color channel, posterization to n bits can be expressed as:

    new_value = (old_value >> (8 - n)) << (8 - n)

This operation keeps the n most significant bits and sets the rest to zero.
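The bit arithmetic above can be checked directly with NumPy; this short sketch is illustrative and independent of the transform itself.

Python
import numpy as np

img = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
n = 3  # keep the 3 most significant bits

posterized = (img >> (8 - n)) << (8 - n)

# Every remaining value is a multiple of 2**(8 - n) = 32,
# so at most 2**n = 8 distinct levels remain per channel.
assert np.all(posterized % (1 << (8 - n)) == 0)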

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--posterize-all-channels-to-3-bits","title":"Posterize all channels to 3 bits","text":"Python
>>> transform = A.Posterize(num_bits=3, p=1.0)\n>>> posterized_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--randomly-posterize-between-2-and-5-bits","title":"Randomly posterize between 2 and 5 bits","text":"Python
>>> transform = A.Posterize(num_bits=(2, 5), p=1.0)\n>>> posterized_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--different-bits-for-each-channel","title":"Different bits for each channel","text":"Python
>>> transform = A.Posterize(num_bits=[3, 5, 2], p=1.0)\n>>> posterized_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--range-of-bits-for-each-channel","title":"Range of bits for each channel","text":"Python
>>> transform = A.Posterize(num_bits=[(1, 3), (3, 5), (2, 4)], p=1.0)\n>>> posterized_image = transform(image=image)[\"image\"]\n

References

  • Color Quantization: https://en.wikipedia.org/wiki/Color_quantization
  • Posterization: https://en.wikipedia.org/wiki/Posterization


Source code in albumentations/augmentations/transforms.py Python
class Posterize(ImageOnlyTransform):\n    \"\"\"Reduces the number of bits for each color channel in the image.\n\n    This transform applies color posterization, a technique that reduces the number of distinct\n    colors used in an image. It works by lowering the number of bits used to represent each\n    color channel, effectively creating a \"poster-like\" effect with fewer color gradations.\n\n    Args:\n        num_bits (int | tuple[int, int] | list[int] | list[tuple[int, int]]):\n            Defines the number of bits to keep for each color channel. Can be specified in several ways:\n            - Single int: Same number of bits for all channels. Range: [1, 7].\n            - tuple of two ints: (min_bits, max_bits) to randomly choose from. Range for each: [1, 7].\n            - list of three ints: Specific number of bits for each channel [r_bits, g_bits, b_bits].\n            - list of three tuples: Ranges for each channel [(r_min, r_max), (g_min, g_max), (b_min, b_max)].\n            Default: 4\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effect becomes more pronounced as the number of bits is reduced.\n        - This transform can create interesting artistic effects or be used for image compression simulation.\n        - Posterization is particularly useful for:\n          * Creating stylized or retro-looking images\n          * Reducing the color palette for specific artistic effects\n          * Simulating the look of older or lower-quality digital images\n          * Data augmentation in scenarios where color depth might vary\n\n    Mathematical Background:\n        For an 8-bit color channel, posterization to n bits can be expressed as:\n        new_value = (old_value >> (8 - n)) << (8 - n)\n        This operation keeps the n most significant bits and sets the rest to zero.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Posterize all channels to 3 bits\n        >>> transform = A.Posterize(num_bits=3, p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Randomly posterize between 2 and 5 bits\n        >>> transform = A.Posterize(num_bits=(2, 5), p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Different bits for each channel\n        >>> transform = A.Posterize(num_bits=[3, 5, 2], p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n        # Range of bits for each channel\n        >>> transform = A.Posterize(num_bits=[(1, 3), (3, 5), (2, 4)], p=1.0)\n        >>> posterized_image = transform(image=image)[\"image\"]\n\n    References:\n        - Color Quantization: https://en.wikipedia.org/wiki/Color_quantization\n        - Posterization: https://en.wikipedia.org/wiki/Posterization\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_bits: int | tuple[int, int] | list[tuple[int, int]]\n\n        @field_validator(\"num_bits\")\n        @classmethod\n        def validate_num_bits(\n            cls,\n            num_bits: Any,\n        ) -> tuple[int, int] | list[tuple[int, int]]:\n            if isinstance(num_bits, int):\n                if num_bits < 1 or num_bits > SEVEN:\n                    raise ValueError(\"num_bits must be in the range [1, 7]\")\n              
  return (num_bits, num_bits)\n            if isinstance(num_bits, Sequence) and len(num_bits) > PAIR:\n                return [to_tuple(i, i) for i in num_bits]\n            return cast(tuple[int, int], to_tuple(num_bits, num_bits))\n\n    def __init__(\n        self,\n        num_bits: int | tuple[int, int] | list[tuple[int, int]] = 4,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_bits = cast(Union[tuple[int, int], list[tuple[int, int]]], num_bits)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        num_bits: Literal[1, 2, 3, 4, 5, 6, 7] | list[Literal[1, 2, 3, 4, 5, 6, 7]],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.posterize(img, num_bits)\n\n    def get_params(self) -> dict[str, Any]:\n        if isinstance(self.num_bits, list):\n            num_bits = [self.py_random.randint(*i) for i in self.num_bits]\n            return {\"num_bits\": num_bits}\n        return {\"num_bits\": self.py_random.randint(*self.num_bits)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"num_bits\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RGBShift","title":"class RGBShift (r_shift_limit=(-20, 20), g_shift_limit=(-20, 20), b_shift_limit=(-20, 20), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shift values for each channel of the input RGB image.

A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels. Each channel (R,G,B) can have its own shift range specified.

Parameters:

Name Type Description r_shift_limit (int, int) or int

Range for shifting the red channel. Options:
- If tuple (min, max): Sample shift value from this range
- If int: Sample shift value from (-r_shift_limit, r_shift_limit)
- For uint8 images: Values represent absolute shifts in [0, 255]
- For float images: Values represent relative shifts in [0, 1]
Default: (-20, 20)

g_shift_limit (int, int) or int

Range for shifting the green channel. Options:
- If tuple (min, max): Sample shift value from this range
- If int: Sample shift value from (-g_shift_limit, g_shift_limit)
- For uint8 images: Values represent absolute shifts in [0, 255]
- For float images: Values represent relative shifts in [0, 1]
Default: (-20, 20)

b_shift_limit (int, int) or int

Range for shifting the blue channel. Options:
- If tuple (min, max): Sample shift value from this range
- If int: Sample shift value from (-b_shift_limit, b_shift_limit)
- For uint8 images: Values represent absolute shifts in [0, 255]
- For float images: Values represent relative shifts in [0, 1]
Default: (-20, 20)

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • Values are shifted independently for each channel
  • For uint8 images:
    • Input ranges like (-20, 20) represent pixel value shifts
    • A shift of 20 means adding 20 to that channel
    • Final values are clipped to [0, 255]
  • For float32 images:
    • Input ranges like (-0.1, 0.1) represent relative shifts
    • A shift of 0.1 means adding 0.1 to that channel
    • Final values are clipped to [0, 1]

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--shift-rgb-channels-of-uint8-image","title":"Shift RGB channels of uint8 image","text":"Python
>>> transform = A.RGBShift(\n...     r_shift_limit=30,  # Will sample red shift from [-30, 30]\n...     g_shift_limit=(-20, 20),  # Will sample green shift from [-20, 20]\n...     b_shift_limit=(-10, 10),  # Will sample blue shift from [-10, 10]\n...     p=1.0\n... )\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> shifted = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--same-effect-using-additivenoise","title":"Same effect using AdditiveNoise","text":"Python
>>> transform = A.AdditiveNoise(\n...     noise_type=\"uniform\",\n...     spatial_mode=\"constant\",  # One value per channel\n...     noise_params={\n...         \"ranges\": [(-30/255, 30/255), (-20/255, 20/255), (-10/255, 10/255)]\n...     },\n...     p=1.0\n... )\n
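A hedged example for float32 input, following the note below that shift limits are interpreted as relative values in [0, 1] for float images; the specific limits are illustrative.

Python
import albumentations as A
import numpy as np

float_image = np.random.rand(100, 100, 3).astype(np.float32)

# Relative shifts of up to ±0.1 per channel for a float32 image.
transform = A.RGBShift(
    r_shift_limit=(-0.1, 0.1),
    g_shift_limit=(-0.1, 0.1),
    b_shift_limit=(-0.1, 0.1),
    p=1.0,
)
shifted = transform(image=float_image)["image"]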

See Also:
- AdditiveNoise: More general noise transform with various options:
  * Different noise distributions (uniform, gaussian, laplace, beta)
  * Spatial modes (constant, per-pixel, shared)
  * Approximation for faster computation
- RandomToneCurve: For non-linear color transformations
- RandomBrightnessContrast: For combined brightness and contrast adjustments
- PlanckianJitter: For color temperature adjustments
- HueSaturationValue: For HSV color space adjustments
- ColorJitter: For combined brightness, contrast, saturation adjustments


Source code in albumentations/augmentations/transforms.py Python
class RGBShift(AdditiveNoise):\n    \"\"\"Randomly shift values for each channel of the input RGB image.\n\n    A specialized version of AdditiveNoise that applies constant uniform shifts to RGB channels.\n    Each channel (R,G,B) can have its own shift range specified.\n\n    Args:\n        r_shift_limit ((int, int) or int): Range for shifting the red channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-r_shift_limit, r_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        g_shift_limit ((int, int) or int): Range for shifting the green channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-g_shift_limit, g_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        b_shift_limit ((int, int) or int): Range for shifting the blue channel. Options:\n            - If tuple (min, max): Sample shift value from this range\n            - If int: Sample shift value from (-b_shift_limit, b_shift_limit)\n            - For uint8 images: Values represent absolute shifts in [0, 255]\n            - For float images: Values represent relative shifts in [0, 1]\n            Default: (-20, 20)\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Values are shifted independently for each channel\n        - For uint8 images:\n            * Input ranges like (-20, 20) represent pixel value shifts\n            * A shift of 20 means adding 20 to that channel\n            * Final values are clipped to [0, 255]\n        - For float32 images:\n            * Input ranges like (-0.1, 0.1) represent relative shifts\n            * A shift of 0.1 means adding 0.1 to that channel\n            * Final values are clipped to [0, 1]\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n\n        # Shift RGB channels of uint8 image\n        >>> transform = A.RGBShift(\n        ...     r_shift_limit=30,  # Will sample red shift from [-30, 30]\n        ...     g_shift_limit=(-20, 20),  # Will sample green shift from [-20, 20]\n        ...     b_shift_limit=(-10, 10),  # Will sample blue shift from [-10, 10]\n        ...     p=1.0\n        ... )\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> shifted = transform(image=image)[\"image\"]\n\n        # Same effect using AdditiveNoise\n        >>> transform = A.AdditiveNoise(\n        ...     noise_type=\"uniform\",\n        ...     spatial_mode=\"constant\",  # One value per channel\n        ...     noise_params={\n        ...         \"ranges\": [(-30/255, 30/255), (-20/255, 20/255), (-10/255, 10/255)]\n        ...     },\n        ...     p=1.0\n        ... 
)\n\n    See Also:\n        - AdditiveNoise: More general noise transform with various options:\n            * Different noise distributions (uniform, gaussian, laplace, beta)\n            * Spatial modes (constant, per-pixel, shared)\n            * Approximation for faster computation\n        - RandomToneCurve: For non-linear color transformations\n        - RandomBrightnessContrast: For combined brightness and contrast adjustments\n        - PlankianJitter: For color temperature adjustments\n        - HueSaturationValue: For HSV color space adjustments\n        - ColorJitter: For combined brightness, contrast, saturation adjustments\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        r_shift_limit: SymmetricRangeType\n        g_shift_limit: SymmetricRangeType\n        b_shift_limit: SymmetricRangeType\n\n    def __init__(\n        self,\n        r_shift_limit: ScaleFloatType = (-20, 20),\n        g_shift_limit: ScaleFloatType = (-20, 20),\n        b_shift_limit: ScaleFloatType = (-20, 20),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        # Convert RGB shift limits to normalized ranges if needed\n        def normalize_range(limit: tuple[float, float]) -> tuple[float, float]:\n            # If any value is > 1, assume uint8 range and normalize\n            if abs(limit[0]) > 1 or abs(limit[1]) > 1:\n                return (limit[0] / 255.0, limit[1] / 255.0)\n            return limit\n\n        ranges = [\n            normalize_range(cast(tuple[float, float], r_shift_limit)),\n            normalize_range(cast(tuple[float, float], g_shift_limit)),\n            normalize_range(cast(tuple[float, float], b_shift_limit)),\n        ]\n\n        # Initialize with fixed noise type and spatial mode\n        super().__init__(\n            noise_type=\"uniform\",\n            spatial_mode=\"constant\",\n            noise_params={\"ranges\": ranges},\n            approximation=1.0,\n            p=p,\n        )\n\n        # Store original limits for get_transform_init_args\n        self.r_shift_limit = cast(tuple[float, float], r_shift_limit)\n        self.g_shift_limit = cast(tuple[float, float], g_shift_limit)\n        self.b_shift_limit = cast(tuple[float, float], b_shift_limit)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"r_shift_limit\", \"g_shift_limit\", \"b_shift_limit\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomBrightnessContrast","title":"class RandomBrightnessContrast (brightness_limit=(-0.2, 0.2), contrast_limit=(-0.2, 0.2), brightness_by_max=True, ensure_safe_range=False, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly changes the brightness and contrast of the input image.

This transform adjusts the brightness and contrast of an image simultaneously, allowing for a wide range of lighting and contrast variations. It's particularly useful for data augmentation in computer vision tasks, helping models become more robust to different lighting conditions.

Parameters:

  • brightness_limit (float | tuple[float, float]): Factor range for changing brightness. If a single float value is provided, the range will be (-brightness_limit, brightness_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum brightness, and -1.0 means minimum brightness. Default: (-0.2, 0.2).
  • contrast_limit (float | tuple[float, float]): Factor range for changing contrast. If a single float value is provided, the range will be (-contrast_limit, contrast_limit). Values should typically be in the range [-1.0, 1.0], where 0 means no change, 1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast. Default: (-0.2, 0.2).
  • brightness_by_max (bool): If True, adjusts brightness by scaling pixel values up to the maximum value of the image's dtype. If False, uses the mean pixel value for adjustment. Default: True.
  • ensure_safe_range (bool): If True, adjusts alpha and beta to prevent overflow/underflow. This ensures output values stay within the valid range for the image dtype without clipping. Default: False.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The order of operation is: contrast adjustment, then brightness adjustment.
  • For uint8 images, the output is clipped to [0, 255] range.
  • For float32 images, the output is clipped to [0, 1] range.
  • The brightness_by_max parameter affects how brightness is adjusted:
      • If True, brightness adjustment is more pronounced and can lead to more saturated results.
      • If False, brightness adjustment is more subtle and preserves the overall lighting better.
  • This transform is useful for:
      • Simulating different lighting conditions
      • Enhancing low-light or overexposed images
      • Data augmentation to improve model robustness

Mathematical Formulation: Let a be the contrast adjustment factor and β be the brightness adjustment factor. For each pixel value x:
1. Contrast adjustment: x' = clip((x - mean) * (1 + a) + mean)
2. Brightness adjustment:
   If brightness_by_max is True:  x'' = clip(x' * (1 + β))
   If brightness_by_max is False: x'' = clip(x' + β * max_value)
Where clip() ensures values stay within the valid range for the image dtype.
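As a rough, standalone sketch of how the two steps combine for a uint8 image: the helper name and the sampled alpha_delta/beta values below are illustrative, and the beta handling mirrors the source listing further down (beta is scaled by the dtype maximum or the image mean and then added), not the library's internal API.

Python
>>> import numpy as np
>>> def brightness_contrast_sketch(img, alpha_delta, beta, brightness_by_max=True):
...     # Work in float to avoid intermediate overflow, then clip back to uint8.
...     x = img.astype(np.float32)
...     mean = x.mean()
...     # 1. Contrast adjustment around the mean: x' = (x - mean) * (1 + a) + mean
...     x = (x - mean) * (1.0 + alpha_delta) + mean
...     # 2. Brightness adjustment: beta is scaled by the dtype max (brightness_by_max=True)
...     #    or by the image mean (brightness_by_max=False), then added.
...     x = x + beta * (255.0 if brightness_by_max else mean)
...     return np.clip(x, 0, 255).astype(np.uint8)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> out = brightness_contrast_sketch(image, alpha_delta=0.2, beta=0.1)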

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomBrightnessContrast(p=1.0)
>>> augmented_image = transform(image=image)["image"]

Custom brightness and contrast limits

Python
>>> transform = A.RandomBrightnessContrast(
...     brightness_limit=0.3,
...     contrast_limit=0.3,
...     p=1.0
... )
>>> augmented_image = transform(image=image)["image"]

Adjust brightness based on mean value

Python
>>> transform = A.RandomBrightnessContrast(
...     brightness_limit=0.2,
...     contrast_limit=0.2,
...     brightness_by_max=False,
...     p=1.0
... )
>>> augmented_image = transform(image=image)["image"]

References

  • Brightness: https://en.wikipedia.org/wiki/Brightness
  • Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)


Source code in albumentations/augmentations/transforms.py Python
class RandomBrightnessContrast(ImageOnlyTransform):\n    \"\"\"Randomly changes the brightness and contrast of the input image.\n\n    This transform adjusts the brightness and contrast of an image simultaneously, allowing for\n    a wide range of lighting and contrast variations. It's particularly useful for data augmentation\n    in computer vision tasks, helping models become more robust to different lighting conditions.\n\n    Args:\n        brightness_limit (float | tuple[float, float]): Factor range for changing brightness.\n            If a single float value is provided, the range will be (-brightness_limit, brightness_limit).\n            Values should typically be in the range [-1.0, 1.0], where 0 means no change,\n            1.0 means maximum brightness, and -1.0 means minimum brightness.\n            Default: (-0.2, 0.2).\n\n        contrast_limit (float | tuple[float, float]): Factor range for changing contrast.\n            If a single float value is provided, the range will be (-contrast_limit, contrast_limit).\n            Values should typically be in the range [-1.0, 1.0], where 0 means no change,\n            1.0 means maximum increase in contrast, and -1.0 means maximum decrease in contrast.\n            Default: (-0.2, 0.2).\n\n        brightness_by_max (bool): If True, adjusts brightness by scaling pixel values up to the\n            maximum value of the image's dtype. If False, uses the mean pixel value for adjustment.\n            Default: True.\n\n        ensure_safe_range (bool): If True, adjusts alpha and beta to prevent overflow/underflow.\n            This ensures output values stay within the valid range for the image dtype without clipping.\n            Default: False.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The order of operation is: contrast adjustment, then brightness adjustment.\n        - For uint8 images, the output is clipped to [0, 255] range.\n        - For float32 images, the output is clipped to [0, 1] range.\n        - The `brightness_by_max` parameter affects how brightness is adjusted:\n          * If True, brightness adjustment is more pronounced and can lead to more saturated results.\n          * If False, brightness adjustment is more subtle and preserves the overall lighting better.\n        - This transform is useful for:\n          * Simulating different lighting conditions\n          * Enhancing low-light or overexposed images\n          * Data augmentation to improve model robustness\n\n    Mathematical Formulation:\n        Let a be the contrast adjustment factor and \u03b2 be the brightness adjustment factor.\n        For each pixel value x:\n        1. Contrast adjustment: x' = clip((x - mean) * (1 + a) + mean)\n        2. 
Brightness adjustment:\n           If brightness_by_max is True:  x'' = clip(x' * (1 + \u03b2))\n           If brightness_by_max is False: x'' = clip(x' + \u03b2 * max_value)\n        Where clip() ensures values stay within the valid range for the image dtype.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomBrightnessContrast(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom brightness and contrast limits\n        >>> transform = A.RandomBrightnessContrast(\n        ...     brightness_limit=0.3,\n        ...     contrast_limit=0.3,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Adjust brightness based on mean value\n        >>> transform = A.RandomBrightnessContrast(\n        ...     brightness_limit=0.2,\n        ...     contrast_limit=0.2,\n        ...     brightness_by_max=False,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Brightness: https://en.wikipedia.org/wiki/Brightness\n        - Contrast: https://en.wikipedia.org/wiki/Contrast_(vision)\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        brightness_limit: SymmetricRangeType\n        contrast_limit: SymmetricRangeType\n        brightness_by_max: bool\n        ensure_safe_range: bool\n\n    def __init__(\n        self,\n        brightness_limit: ScaleFloatType = (-0.2, 0.2),\n        contrast_limit: ScaleFloatType = (-0.2, 0.2),\n        brightness_by_max: bool = True,\n        ensure_safe_range: bool = False,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.brightness_limit = cast(tuple[float, float], brightness_limit)\n        self.contrast_limit = cast(tuple[float, float], contrast_limit)\n        self.brightness_by_max = brightness_by_max\n        self.ensure_safe_range = ensure_safe_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha: float,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return albucore.multiply_add(img, alpha, beta, inplace=False)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, float]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample initial values\n        alpha = 1.0 + self.py_random.uniform(*self.contrast_limit)\n        beta = self.py_random.uniform(*self.brightness_limit)\n\n        max_value = MAX_VALUES_BY_DTYPE[image.dtype]\n        # Scale beta according to brightness_by_max setting\n        beta = beta * max_value if self.brightness_by_max else beta * np.mean(image)\n\n        # Clip values to safe ranges if needed\n        if self.ensure_safe_range:\n            alpha, beta = fmain.get_safe_brightness_contrast_params(\n                alpha,\n                beta,\n                max_value,\n            )\n\n        return {\n            \"alpha\": alpha,\n            \"beta\": beta,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"brightness_limit\",\n            \"contrast_limit\",\n            \"brightness_by_max\",\n            
\"ensure_safe_range\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomFog","title":"class RandomFog (fog_coef_lower=None, fog_coef_upper=None, alpha_coef=0.08, fog_coef_range=(0.3, 1), always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates fog for the image by adding random fog-like artifacts.

This transform creates a fog effect by generating semi-transparent overlays that mimic the visual characteristics of fog. The fog intensity and distribution can be controlled to create various fog-like conditions.

Parameters:

  • fog_coef_range (tuple[float, float]): Range for fog intensity coefficient. Should be in [0, 1] range.
  • alpha_coef (float): Transparency of the fog circles. Should be in [0, 1] range. Default: 0.08.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The fog effect is created by overlaying semi-transparent circles on the image.
  • Higher fog coefficient values result in denser fog effects.
  • The fog is typically denser in the center of the image and gradually decreases towards the edges.
  • This transform is useful for:
      • Simulating various weather conditions in outdoor scenes
      • Data augmentation for improving model robustness to foggy conditions
      • Creating atmospheric effects in image editing

Mathematical Formulation: For each fog particle:
1. A position (x, y) is randomly generated within the image.
2. A circle with random radius is drawn at this position.
3. The circle's alpha (transparency) is determined by the alpha_coef.
4. These circles are overlaid on the original image to create the fog effect.

The final pixel value is calculated as:
output = (1 - alpha) * original_pixel + alpha * fog_color

where alpha is influenced by the fog_coef and alpha_coef parameters.
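A minimal standalone sketch of this blending rule, assuming a constant alpha map and white fog; the helper name and the fixed alpha value are illustrative, not the library's internals.

Python
>>> import numpy as np
>>> def blend_fog(img, alpha, fog_color=255.0):
...     # Per-pixel blend: output = (1 - alpha) * original_pixel + alpha * fog_color
...     out = (1.0 - alpha[..., None]) * img.astype(np.float32) + alpha[..., None] * fog_color
...     return np.clip(out, 0, 255).astype(np.uint8)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> alpha_map = np.full((100, 100), 0.08, dtype=np.float32)  # uniform alpha, cf. alpha_coef
>>> foggy = blend_fog(image, alpha_map)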

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomFog(p=1.0)
>>> foggy_image = transform(image=image)["image"]

Custom fog intensity range

Python
>>> transform = A.RandomFog(fog_coef_range=(0.3, 0.8), p=1.0)
>>> foggy_image = transform(image=image)["image"]

Adjust fog transparency

Python
>>> transform = A.RandomFog(fog_coef_range=(0.2, 0.5), alpha_coef=0.1, p=1.0)
>>> foggy_image = transform(image=image)["image"]

References

  • Fog: https://en.wikipedia.org/wiki/Fog
  • Atmospheric perspective: https://en.wikipedia.org/wiki/Aerial_perspective


Source code in albumentations/augmentations/transforms.py Python
class RandomFog(ImageOnlyTransform):\n    \"\"\"Simulates fog for the image by adding random fog-like artifacts.\n\n    This transform creates a fog effect by generating semi-transparent overlays\n    that mimic the visual characteristics of fog. The fog intensity and distribution\n    can be controlled to create various fog-like conditions.\n\n    Args:\n        fog_coef_range (tuple[float, float]): Range for fog intensity coefficient. Should be in [0, 1] range.\n        alpha_coef (float): Transparency of the fog circles. Should be in [0, 1] range. Default: 0.08.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The fog effect is created by overlaying semi-transparent circles on the image.\n        - Higher fog coefficient values result in denser fog effects.\n        - The fog is typically denser in the center of the image and gradually decreases towards the edges.\n        - This transform is useful for:\n          * Simulating various weather conditions in outdoor scenes\n          * Data augmentation for improving model robustness to foggy conditions\n          * Creating atmospheric effects in image editing\n\n    Mathematical Formulation:\n        For each fog particle:\n        1. A position (x, y) is randomly generated within the image.\n        2. A circle with random radius is drawn at this position.\n        3. The circle's alpha (transparency) is determined by the alpha_coef.\n        4. These circles are overlaid on the original image to create the fog effect.\n\n        The final pixel value is calculated as:\n        output = (1 - alpha) * original_pixel + alpha * fog_color\n\n        where alpha is influenced by the fog_coef and alpha_coef parameters.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomFog(p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n        # Custom fog intensity range\n        >>> transform = A.RandomFog(fog_coef_lower=0.3, fog_coef_upper=0.8, p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n        # Adjust fog transparency\n        >>> transform = A.RandomFog(fog_coef_lower=0.2, fog_coef_upper=0.5, alpha_coef=0.1, p=1.0)\n        >>> foggy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Fog: https://en.wikipedia.org/wiki/Fog\n        - Atmospheric perspective: https://en.wikipedia.org/wiki/Aerial_perspective\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        fog_coef_lower: float | None = Field(\n            ge=0,\n            le=1,\n        )\n        fog_coef_upper: float | None = Field(\n            ge=0,\n            le=1,\n        )\n        fog_coef_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        alpha_coef: float = Field(ge=0, le=1)\n\n        @model_validator(mode=\"after\")\n        def validate_fog_coefficients(self) -> Self:\n            if self.fog_coef_lower is not None:\n                warn(\n                    \"`fog_coef_lower` is deprecated, use `fog_coef_range` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n            if 
self.fog_coef_upper is not None:\n                warn(\n                    \"`fog_coef_upper` is deprecated, use `fog_coef_range` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            lower = self.fog_coef_lower if self.fog_coef_lower is not None else self.fog_coef_range[0]\n            upper = self.fog_coef_upper if self.fog_coef_upper is not None else self.fog_coef_range[1]\n            self.fog_coef_range = (lower, upper)\n\n            self.fog_coef_lower = None\n            self.fog_coef_upper = None\n\n            return self\n\n    def __init__(\n        self,\n        fog_coef_lower: float | None = None,\n        fog_coef_upper: float | None = None,\n        alpha_coef: float = 0.08,\n        fog_coef_range: tuple[float, float] = (0.3, 1),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.fog_coef_range = fog_coef_range\n        self.alpha_coef = alpha_coef\n\n    def apply(\n        self,\n        img: np.ndarray,\n        particle_positions: list[tuple[int, int]],\n        radiuses: list[int],\n        intensity: float,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.add_fog(\n            img,\n            intensity,\n            self.alpha_coef,\n            particle_positions,\n            radiuses,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        # Select a random fog intensity within the specified range\n        intensity = self.py_random.uniform(*self.fog_coef_range)\n\n        image_shape = params[\"shape\"][:2]\n\n        image_height, image_width = image_shape\n\n        # Calculate the size of the fog effect region based on image width and fog intensity\n        fog_region_size = max(1, int(image_width // 3 * intensity))\n\n        particle_positions = []\n\n        # Initialize the central region where fog will be most dense\n        center_x, center_y = (int(x) for x in fgeometric.center(image_shape))\n\n        # Define the initial size of the foggy area\n        current_width = image_width\n        current_height = image_height\n\n        # Define shrink factor for reducing the foggy area each iteration\n        shrink_factor = 0.1\n\n        max_iterations = 10  # Prevent infinite loop\n        iteration = 0\n\n        while current_width > fog_region_size and current_height > fog_region_size and iteration < max_iterations:\n            # Calculate the number of particles for this region\n            area = current_width * current_height\n            particles_in_region = int(\n                area / (fog_region_size * fog_region_size) * intensity * 10,\n            )\n\n            for _ in range(particles_in_region):\n                # Generate random positions within the current region\n                x = self.py_random.randint(\n                    center_x - current_width // 2,\n                    center_x + current_width // 2,\n                )\n                y = self.py_random.randint(\n                    center_y - current_height // 2,\n                    center_y + current_height // 2,\n                )\n                particle_positions.append((x, y))\n\n            # Shrink the region for the next iteration\n            current_width = int(current_width * (1 - shrink_factor))\n            current_height = 
int(current_height * (1 - shrink_factor))\n\n            iteration += 1\n\n        radiuses = fmain.get_fog_particle_radiuses(\n            image_shape,\n            len(particle_positions),\n            intensity,\n            self.random_generator,\n        )\n\n        return {\n            \"particle_positions\": particle_positions,\n            \"intensity\": intensity,\n            \"radiuses\": radiuses,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"fog_coef_range\", \"alpha_coef\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomGamma","title":"class RandomGamma (gamma_limit=(80, 120), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies random gamma correction to the input image.

Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance or tristimulus values in imaging systems. This transform can adjust the brightness of an image while preserving the relative differences between darker and lighter areas, making it useful for simulating different lighting conditions or correcting for display characteristics.

Parameters:

  • gamma_limit (float | tuple[float, float]): If gamma_limit is a single float value, the range will be (1, gamma_limit). If it's a tuple of two floats, they will serve as the lower and upper bounds for gamma adjustment. Values are in terms of percentage change, e.g., (80, 120) means the gamma will be between 80% and 120% of the original. Default: (80, 120).
  • eps: A small value added to the gamma to avoid division by zero or log of zero errors. Default: 1e-7.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • The gamma correction is applied using the formula: output = input^gamma
  • Gamma values > 1 will make the image darker, while values < 1 will make it brighter
  • This transform is particularly useful for:
      • Simulating different lighting conditions
      • Correcting for non-linear display characteristics
      • Enhancing contrast in certain regions of the image
      • Data augmentation in computer vision tasks

Mathematical Formulation: Let I be the input image and G (gamma) be the correction factor. The gamma correction is applied as follows:
1. Normalize the image to [0, 1] range: I_norm = I / 255 (for uint8 images)
2. Apply gamma correction: I_corrected = I_norm ^ G
3. Scale back to original range: output = I_corrected * 255 (for uint8 images)

The actual gamma value used is calculated as:
G = random_value / 100, where random_value is sampled from the gamma_limit range.
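A minimal sketch of this pipeline for a uint8 image, with the gamma sampled as a percentage of the gamma_limit range as in the get_params method of the source listing below; the helper name is illustrative, not the library's function.

Python
>>> import numpy as np
>>> def gamma_correct(img, gamma):
...     # Normalize to [0, 1], apply the power-law curve, scale back to uint8.
...     x = img.astype(np.float32) / 255.0
...     return np.clip((x ** gamma) * 255.0, 0, 255).astype(np.uint8)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> gamma = np.random.uniform(80, 120) / 100.0  # percentage sampled from gamma_limit
>>> out = gamma_correct(image, gamma)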

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomGamma(p=1.0)
>>> augmented_image = transform(image=image)["image"]

Custom gamma range

Python
>>> transform = A.RandomGamma(gamma_limit=(50, 150), p=1.0)
>>> augmented_image = transform(image=image)["image"]

Applying with other transforms

Python
>>> transform = A.Compose([
...     A.RandomGamma(gamma_limit=(80, 120), p=0.5),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> augmented_image = transform(image=image)["image"]

References

  • Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction
  • Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm


Source code in albumentations/augmentations/transforms.py Python
class RandomGamma(ImageOnlyTransform):\n    \"\"\"Applies random gamma correction to the input image.\n\n    Gamma correction, or simply gamma, is a nonlinear operation used to encode and decode luminance\n    or tristimulus values in imaging systems. This transform can adjust the brightness of an image\n    while preserving the relative differences between darker and lighter areas, making it useful\n    for simulating different lighting conditions or correcting for display characteristics.\n\n    Args:\n        gamma_limit (float | tuple[float, float]): If gamma_limit is a single float value, the range\n            will be (1, gamma_limit). If it's a tuple of two floats, they will serve as\n            the lower and upper bounds for gamma adjustment. Values are in terms of percentage change,\n            e.g., (80, 120) means the gamma will be between 80% and 120% of the original.\n            Default: (80, 120).\n        eps: A small value added to the gamma to avoid division by zero or log of zero errors.\n            Default: 1e-7.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The gamma correction is applied using the formula: output = input^gamma\n        - Gamma values > 1 will make the image darker, while values < 1 will make it brighter\n        - This transform is particularly useful for:\n          * Simulating different lighting conditions\n          * Correcting for non-linear display characteristics\n          * Enhancing contrast in certain regions of the image\n          * Data augmentation in computer vision tasks\n\n    Mathematical Formulation:\n        Let I be the input image and G (gamma) be the correction factor.\n        The gamma correction is applied as follows:\n        1. Normalize the image to [0, 1] range: I_norm = I / 255 (for uint8 images)\n        2. Apply gamma correction: I_corrected = I_norm ^ (1 / G)\n        3. Scale back to original range: output = I_corrected * 255 (for uint8 images)\n\n        The actual gamma value used is calculated as:\n        G = 1 + (random_value / 100), where random_value is sampled from gamma_limit range.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomGamma(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom gamma range\n        >>> transform = A.RandomGamma(gamma_limit=(50, 150), p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Applying with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomGamma(gamma_limit=(80, 120), p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... 
])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction\n        - Power law (Gamma) encoding: https://www.cambridgeincolour.com/tutorials/gamma-correction.htm\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        gamma_limit: OnePlusFloatRangeType\n\n    def __init__(\n        self,\n        gamma_limit: ScaleFloatType = (80, 120),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.gamma_limit = cast(tuple[float, float], gamma_limit)\n\n    def apply(self, img: np.ndarray, gamma: float, **params: Any) -> np.ndarray:\n        return fmain.gamma_transform(img, gamma=gamma)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"gamma\": self.py_random.uniform(self.gamma_limit[0], self.gamma_limit[1]) / 100.0,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"gamma_limit\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomGravel","title":"class RandomGravel (gravel_roi=(0.1, 0.4, 0.9, 0.9), number_of_patches=2, always_apply=None, p=0.5) [view source on GitHub]","text":"

Adds gravel-like artifacts to the input image.

This transform simulates the appearance of gravel or small stones scattered across specific regions of an image. It's particularly useful for augmenting datasets of road or terrain images, adding realistic texture variations.

Parameters:

  • gravel_roi (tuple[float, float, float, float]): Region of interest where gravel will be added, specified as (x_min, y_min, x_max, y_max) in relative coordinates [0, 1]. Default: (0.1, 0.4, 0.9, 0.9).
  • number_of_patches (int): Number of gravel patch regions to generate within the ROI. Each patch will contain multiple gravel particles. Default: 2.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The gravel effect is created by modifying the saturation channel in the HLS color space.
  • Gravel particles are distributed within randomly generated patches inside the specified ROI.
  • This transform is particularly useful for:
      • Augmenting datasets for road condition analysis
      • Simulating variations in terrain for computer vision tasks
      • Adding realistic texture to synthetic images of outdoor scenes

Mathematical Formulation: For each gravel patch:
1. A rectangular region is randomly generated within the specified ROI.
2. Within this region, multiple gravel particles are placed.
3. For each particle:
   - Random (x, y) coordinates are generated within the patch.
   - A random radius (r) between 1 and 3 pixels is assigned.
   - A random saturation value (sat) between 0 and 255 is assigned.
4. The saturation channel of the image is modified for each particle:
   image_hls[y-r:y+r, x-r:x+r, 1] = sat
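A standalone sketch of the per-patch particle bookkeeping, mirroring the [min_y, max_y, min_x, max_x, saturation] rows assembled in the source listing below; the function name, fixed patch, and density heuristic are illustrative assumptions.

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> def gravel_particles(patch, height, width):
...     # patch = (x, y, patch_width, patch_height); one row per particle:
...     # [min_y, max_y, min_x, max_x, saturation]
...     px, py, pw, ph = patch
...     num = (pw * ph) // 100  # density heuristic, cf. the source listing
...     infos = []
...     for _ in range(num):
...         x = rng.integers(px, px + pw)
...         y = rng.integers(py, py + ph)
...         r = rng.integers(1, 4)        # radius in [1, 3]
...         sat = rng.integers(0, 256)    # saturation in [0, 255]
...         infos.append([max(y - r, 0), min(y + r, height - 1),
...                       max(x - r, 0), min(x + r, width - 1), sat])
...     return np.array(infos, dtype=np.int64)
>>> particles = gravel_particles((20, 40, 30, 20), height=100, width=100)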

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomGravel(p=1.0)
>>> augmented_image = transform(image=image)["image"]

Custom ROI and number of patches

Python
>>> transform = A.RandomGravel(
...     gravel_roi=(0.2, 0.2, 0.8, 0.8),
...     number_of_patches=5,
...     p=1.0
... )
>>> augmented_image = transform(image=image)["image"]

Combining with other transforms

Python
>>> transform = A.Compose([
...     A.RandomGravel(p=0.7),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> augmented_image = transform(image=image)["image"]

References

  • Road surface textures: https://en.wikipedia.org/wiki/Road_surface
  • HLS color space: https://en.wikipedia.org/wiki/HSL_and_HSV


Source code in albumentations/augmentations/transforms.py Python
class RandomGravel(ImageOnlyTransform):\n    \"\"\"Adds gravel-like artifacts to the input image.\n\n    This transform simulates the appearance of gravel or small stones scattered across\n    specific regions of an image. It's particularly useful for augmenting datasets of\n    road or terrain images, adding realistic texture variations.\n\n    Args:\n        gravel_roi (tuple[float, float, float, float]): Region of interest where gravel\n            will be added, specified as (x_min, y_min, x_max, y_max) in relative coordinates\n            [0, 1]. Default: (0.1, 0.4, 0.9, 0.9).\n        number_of_patches (int): Number of gravel patch regions to generate within the ROI.\n            Each patch will contain multiple gravel particles. Default: 2.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The gravel effect is created by modifying the saturation channel in the HLS color space.\n        - Gravel particles are distributed within randomly generated patches inside the specified ROI.\n        - This transform is particularly useful for:\n          * Augmenting datasets for road condition analysis\n          * Simulating variations in terrain for computer vision tasks\n          * Adding realistic texture to synthetic images of outdoor scenes\n\n    Mathematical Formulation:\n        For each gravel patch:\n        1. A rectangular region is randomly generated within the specified ROI.\n        2. Within this region, multiple gravel particles are placed.\n        3. For each particle:\n           - Random (x, y) coordinates are generated within the patch.\n           - A random radius (r) between 1 and 3 pixels is assigned.\n           - A random saturation value (sat) between 0 and 255 is assigned.\n        4. The saturation channel of the image is modified for each particle:\n           image_hls[y-r:y+r, x-r:x+r, 1] = sat\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomGravel(p=1.0)\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Custom ROI and number of patches\n        >>> transform = A.RandomGravel(\n        ...     gravel_roi=(0.2, 0.2, 0.8, 0.8),\n        ...     number_of_patches=5,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n        # Combining with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomGravel(p=0.7),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... ])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Road surface textures: https://en.wikipedia.org/wiki/Road_surface\n        - HLS color space: https://en.wikipedia.org/wiki/HSL_and_HSV\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        gravel_roi: tuple[float, float, float, float]\n        number_of_patches: int = Field(ge=1)\n\n        @model_validator(mode=\"after\")\n        def validate_gravel_roi(self) -> Self:\n            gravel_lower_x, gravel_lower_y, gravel_upper_x, gravel_upper_y = self.gravel_roi\n            if not 0 <= gravel_lower_x < gravel_upper_x <= 1 or not 0 <= gravel_lower_y < gravel_upper_y <= 1:\n                raise ValueError(f\"Invalid gravel_roi. 
Got: {self.gravel_roi}.\")\n            return self\n\n    def __init__(\n        self,\n        gravel_roi: tuple[float, float, float, float] = (0.1, 0.4, 0.9, 0.9),\n        number_of_patches: int = 2,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p, always_apply)\n        self.gravel_roi = gravel_roi\n        self.number_of_patches = number_of_patches\n\n    def generate_gravel_patch(\n        self,\n        rectangular_roi: tuple[int, int, int, int],\n    ) -> np.ndarray:\n        x_min, y_min, x_max, y_max = rectangular_roi\n        area = abs((x_max - x_min) * (y_max - y_min))\n        count = area // 10\n        gravels = np.empty([count, 2], dtype=np.int64)\n        gravels[:, 0] = self.random_generator.integers(x_min, x_max, count)\n        gravels[:, 1] = self.random_generator.integers(y_min, y_max, count)\n        return gravels\n\n    def apply(\n        self,\n        img: np.ndarray,\n        gravels_infos: list[Any],\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_gravel(img, gravels_infos)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n\n        # Calculate ROI in pixels\n        x_min, y_min, x_max, y_max = (\n            int(coord * dim) for coord, dim in zip(self.gravel_roi, [width, height, width, height])\n        )\n\n        roi_width = x_max - x_min\n        roi_height = y_max - y_min\n\n        gravels_info = []\n\n        for _ in range(self.number_of_patches):\n            # Generate a random rectangular region within the ROI\n            patch_width = self.py_random.randint(roi_width // 10, roi_width // 5)\n            patch_height = self.py_random.randint(roi_height // 10, roi_height // 5)\n\n            patch_x = self.py_random.randint(x_min, x_max - patch_width)\n            patch_y = self.py_random.randint(y_min, y_max - patch_height)\n\n            # Generate gravel particles within this patch\n            num_particles = (patch_width * patch_height) // 100  # Adjust this divisor to control density\n\n            for _ in range(num_particles):\n                x = self.py_random.randint(patch_x, patch_x + patch_width)\n                y = self.py_random.randint(patch_y, patch_y + patch_height)\n                r = self.py_random.randint(1, 3)\n                sat = self.py_random.randint(0, 255)\n\n                gravels_info.append(\n                    [\n                        max(y - r, 0),  # min_y\n                        min(y + r, height - 1),  # max_y\n                        max(x - r, 0),  # min_x\n                        min(x + r, width - 1),  # max_x\n                        sat,  # saturation\n                    ],\n                )\n\n        return {\"gravels_infos\": np.array(gravels_info, dtype=np.int64)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return \"gravel_roi\", \"number_of_patches\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomRain","title":"class RandomRain (slant_lower=None, slant_upper=None, slant_range=(-10, 10), drop_length=20, drop_width=1, drop_color=(200, 200, 200), blur_value=7, brightness_coefficient=0.7, rain_type='default', always_apply=None, p=0.5) [view source on GitHub]","text":"

Adds rain effects to an image.

This transform simulates rainfall by overlaying semi-transparent streaks onto the image, creating a realistic rain effect. It can be used to augment datasets for computer vision tasks that need to perform well in rainy conditions.

Parameters:

  • slant_range (tuple[int, int]): Range for the rain slant angle in degrees. Negative values slant to the left, positive to the right. Default: (-10, 10).
  • drop_length (int): Length of the rain drops in pixels. Default: 20.
  • drop_width (int): Width of the rain drops in pixels. Default: 1.
  • drop_color (tuple[int, int, int]): Color of the rain drops in RGB format. Default: (200, 200, 200).
  • blur_value (int): Blur value for simulating rain effect. Rainy views are typically blurry. Default: 7.
  • brightness_coefficient (float): Coefficient to adjust the brightness of the image. Rainy scenes are usually darker. Should be in the range (0, 1]. Default: 0.7.
  • rain_type (Literal["drizzle", "heavy", "torrential", "default"]): Type of rain to simulate.
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

  • The rain effect is created by drawing semi-transparent lines on the image.
  • The slant of the rain can be controlled to simulate wind effects.
  • Different rain types (drizzle, heavy, torrential) adjust the density and appearance of the rain.
  • The transform also adjusts image brightness and applies a blur to simulate the visual effects of rain.
  • This transform is particularly useful for:
      • Augmenting datasets for autonomous driving in rainy conditions
      • Testing the robustness of computer vision models to weather effects
      • Creating realistic rainy scenes for image editing or film production

Mathematical Formulation: For each raindrop:
1. Start position (x1, y1) is randomly generated within the image.
2. End position (x2, y2) is calculated based on drop_length and slant:
   x2 = x1 + drop_length * sin(slant)
   y2 = y1 + drop_length * cos(slant)
3. A line is drawn from (x1, y1) to (x2, y2) with the specified drop_color and drop_width.
4. The image is then blurred and its brightness is adjusted.
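A small standalone sketch of the endpoint computation above; the helper name and the degrees-to-radians conversion are assumptions made for illustration, not the library's drawing code.

Python
>>> import math
>>> import random
>>> def raindrop_endpoints(width, height, slant, drop_length, num_drops):
...     # For each drop, pick a start point and derive the end point from the slant:
...     # x2 = x1 + L * sin(slant), y2 = y1 + L * cos(slant)
...     drops = []
...     for _ in range(num_drops):
...         x1 = random.randint(0, max(width - 1, 0))
...         y1 = random.randint(0, max(height - drop_length, 0))
...         x2 = int(x1 + drop_length * math.sin(math.radians(slant)))
...         y2 = int(y1 + drop_length * math.cos(math.radians(slant)))
...         drops.append((x1, y1, x2, y2))
...     return drops
>>> lines = raindrop_endpoints(640, 480, slant=-10, drop_length=20, num_drops=100)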

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomRain(p=1.0)
>>> rainy_image = transform(image=image)["image"]

Custom rain parameters

Python
>>> transform = A.RandomRain(
...     slant_range=(-15, 15),
...     drop_length=30,
...     drop_width=2,
...     drop_color=(180, 180, 180),
...     blur_value=5,
...     brightness_coefficient=0.8,
...     p=1.0
... )
>>> rainy_image = transform(image=image)["image"]

Simulating heavy rain

Python
>>> transform = A.RandomRain(rain_type="heavy", p=1.0)
>>> heavy_rain_image = transform(image=image)["image"]

References

  • Rain visualization techniques: https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-27-real-time-rain-rendering
  • Weather effects in computer vision: https://www.sciencedirect.com/science/article/pii/S1077314220300692


Source code in albumentations/augmentations/transforms.py Python
class RandomRain(ImageOnlyTransform):\n    \"\"\"Adds rain effects to an image.\n\n    This transform simulates rainfall by overlaying semi-transparent streaks onto the image,\n    creating a realistic rain effect. It can be used to augment datasets for computer vision\n    tasks that need to perform well in rainy conditions.\n\n    Args:\n        slant_range (tuple[int, int]): Range for the rain slant angle in degrees.\n            Negative values slant to the left, positive to the right. Default: (-10, 10).\n        drop_length (int): Length of the rain drops in pixels. Default: 20.\n        drop_width (int): Width of the rain drops in pixels. Default: 1.\n        drop_color (tuple[int, int, int]): Color of the rain drops in RGB format. Default: (200, 200, 200).\n        blur_value (int): Blur value for simulating rain effect. Rainy views are typically blurry. Default: 7.\n        brightness_coefficient (float): Coefficient to adjust the brightness of the image.\n            Rainy scenes are usually darker. Should be in the range (0, 1]. Default: 0.7.\n        rain_type (Literal[\"drizzle\", \"heavy\", \"torrential\", \"default\"]): Type of rain to simulate.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        - The rain effect is created by drawing semi-transparent lines on the image.\n        - The slant of the rain can be controlled to simulate wind effects.\n        - Different rain types (drizzle, heavy, torrential) adjust the density and appearance of the rain.\n        - The transform also adjusts image brightness and applies a blur to simulate the visual effects of rain.\n        - This transform is particularly useful for:\n          * Augmenting datasets for autonomous driving in rainy conditions\n          * Testing the robustness of computer vision models to weather effects\n          * Creating realistic rainy scenes for image editing or film production\n\n    Mathematical Formulation:\n        For each raindrop:\n        1. Start position (x1, y1) is randomly generated within the image.\n        2. End position (x2, y2) is calculated based on drop_length and slant:\n           x2 = x1 + drop_length * sin(slant)\n           y2 = y1 + drop_length * cos(slant)\n        3. A line is drawn from (x1, y1) to (x2, y2) with the specified drop_color and drop_width.\n        4. The image is then blurred and its brightness is adjusted.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomRain(p=1.0)\n        >>> rainy_image = transform(image=image)[\"image\"]\n\n        # Custom rain parameters\n        >>> transform = A.RandomRain(\n        ...     slant_range=(-15, 15),\n        ...     drop_length=30,\n        ...     drop_width=2,\n        ...     drop_color=(180, 180, 180),\n        ...     blur_value=5,\n        ...     brightness_coefficient=0.8,\n        ...     p=1.0\n        ... 
)\n        >>> rainy_image = transform(image=image)[\"image\"]\n\n        # Simulating heavy rain\n        >>> transform = A.RandomRain(rain_type=\"heavy\", p=1.0)\n        >>> heavy_rain_image = transform(image=image)[\"image\"]\n\n    References:\n        - Rain visualization techniques: https://developer.nvidia.com/gpugems/gpugems3/part-iv-image-effects/chapter-27-real-time-rain-rendering\n        - Weather effects in computer vision: https://www.sciencedirect.com/science/article/pii/S1077314220300692\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        slant_lower: int | None = Field(default=None)\n        slant_upper: int | None = Field(default=None)\n        slant_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]\n        drop_length: int = Field(ge=1)\n        drop_width: int = Field(ge=1)\n        drop_color: tuple[int, int, int]\n        blur_value: int = Field(ge=1)\n        brightness_coefficient: float = Field(gt=0, le=1)\n        rain_type: RainMode\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            if self.slant_lower is not None or self.slant_upper is not None:\n                if self.slant_lower is not None:\n                    warn(\n                        \"`slant_lower` deprecated. Use `slant_range` as tuple (slant_lower, slant_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.slant_upper is not None:\n                    warn(\n                        \"`slant_upper` deprecated. Use `slant_range` as tuple (slant_lower, slant_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.slant_lower if self.slant_lower is not None else self.slant_range[0]\n                upper = self.slant_upper if self.slant_upper is not None else self.slant_range[1]\n                self.slant_range = (lower, upper)\n                self.slant_lower = None\n                self.slant_upper = None\n\n            # Validate the slant_range\n            if not (-MAX_RAIN_ANGLE <= self.slant_range[0] <= self.slant_range[1] <= MAX_RAIN_ANGLE):\n                raise ValueError(\n                    f\"slant_range values should be increasing within [-{MAX_RAIN_ANGLE}, {MAX_RAIN_ANGLE}] range.\",\n                )\n            return self\n\n    def __init__(\n        self,\n        slant_lower: int | None = None,\n        slant_upper: int | None = None,\n        slant_range: tuple[int, int] = (-10, 10),\n        drop_length: int = 20,\n        drop_width: int = 1,\n        drop_color: tuple[int, int, int] = (200, 200, 200),\n        blur_value: int = 7,\n        brightness_coefficient: float = 0.7,\n        rain_type: RainMode = \"default\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.slant_range = slant_range\n        self.drop_length = drop_length\n        self.drop_width = drop_width\n        self.drop_color = drop_color\n        self.blur_value = blur_value\n        self.brightness_coefficient = brightness_coefficient\n        self.rain_type = rain_type\n\n    def apply(\n        self,\n        img: np.ndarray,\n        slant: int,\n        drop_length: int,\n        rain_drops: list[tuple[int, int]],\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n\n        return 
fmain.add_rain(\n            img,\n            slant,\n            drop_length,\n            self.drop_width,\n            self.drop_color,\n            self.blur_value,\n            self.brightness_coefficient,\n            rain_drops,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        slant = int(self.py_random.uniform(*self.slant_range))\n\n        height, width = params[\"shape\"][:2]\n        area = height * width\n\n        if self.rain_type == \"drizzle\":\n            num_drops = area // 770\n            drop_length = 10\n        elif self.rain_type == \"heavy\":\n            num_drops = width * height // 600\n            drop_length = 30\n        elif self.rain_type == \"torrential\":\n            num_drops = area // 500\n            drop_length = 60\n        else:\n            drop_length = self.drop_length\n            num_drops = area // 600\n\n        rain_drops = []\n\n        for _ in range(num_drops):  # If You want heavy rain, try increasing this\n            x = self.py_random.randint(slant, width) if slant < 0 else self.py_random.randint(0, max(width - slant, 0))\n            y = self.py_random.randint(0, max(height - drop_length, 0))\n\n            rain_drops.append((x, y))\n\n        return {\"drop_length\": drop_length, \"slant\": slant, \"rain_drops\": rain_drops}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"slant_range\",\n            \"drop_length\",\n            \"drop_width\",\n            \"drop_color\",\n            \"blur_value\",\n            \"brightness_coefficient\",\n            \"rain_type\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomShadow","title":"class RandomShadow (shadow_roi=(0, 0.5, 1, 1), num_shadows_limit=(1, 2), num_shadows_lower=None, num_shadows_upper=None, shadow_dimension=5, shadow_intensity_range=(0.5, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates shadows for the image by reducing the brightness of the image in shadow regions.

This transform adds realistic shadow effects to images, which can be useful for augmenting datasets for outdoor scene analysis, autonomous driving, or any computer vision task where shadows may be present.

Parameters:

  • shadow_roi (tuple[float, float, float, float]): Region of the image where shadows will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1]. Default: (0, 0.5, 1, 1).
  • num_shadows_limit (tuple[int, int]): Lower and upper limits for the possible number of shadows. Default: (1, 2).
  • shadow_dimension (int): Number of edges in the shadow polygons. Default: 5.
  • shadow_intensity_range (tuple[float, float]): Range for the shadow intensity. Larger value means darker shadow. Should be two float values between 0 and 1. Default: (0.5, 0.5).
  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • Shadows are created by generating random polygons within the specified ROI and reducing the brightness of the image in these areas.
  • The number of shadows, their shapes, and intensities can be randomized for variety.
  • This transform is particularly useful for:
      • Augmenting datasets for outdoor scene understanding
      • Improving robustness of object detection models to shadowed conditions
      • Simulating different lighting conditions in synthetic datasets

Mathematical Formulation: For each shadow:
1. A polygon with shadow_dimension vertices is generated within the shadow ROI.
2. The shadow intensity a is randomly chosen from shadow_intensity_range.
3. For each pixel (x, y) within the polygon:
   new_pixel_value = original_pixel_value * (1 - a)
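A minimal standalone sketch of this darkening rule, assuming OpenCV's fillPoly is used to rasterize the polygon; the helper and the example polygon are illustrative, not the library's add_shadow implementation.

Python
>>> import cv2
>>> import numpy as np
>>> def darken_polygon(img, vertices, intensity):
...     # Rasterize the shadow polygon, then scale pixels inside it by (1 - intensity).
...     mask = np.zeros(img.shape[:2], dtype=np.uint8)
...     cv2.fillPoly(mask, [vertices.astype(np.int32)], 1)
...     out = img.astype(np.float32)
...     out[mask == 1] *= (1.0 - intensity)
...     return np.clip(out, 0, 255).astype(np.uint8)
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> poly = np.array([[10, 60], [80, 55], [90, 90], [30, 95], [5, 80]])  # 5 vertices, cf. shadow_dimension
>>> shadowed = darken_polygon(image, poly, intensity=0.5)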

Examples:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)

Default usage

Python
>>> transform = A.RandomShadow(p=1.0)
>>> shadowed_image = transform(image=image)["image"]

Custom shadow parameters

Python
>>> transform = A.RandomShadow(
...     shadow_roi=(0.2, 0.2, 0.8, 0.8),
...     num_shadows_limit=(2, 4),
...     shadow_dimension=8,
...     shadow_intensity_range=(0.3, 0.7),
...     p=1.0
... )
>>> shadowed_image = transform(image=image)["image"]

Combining with other transforms

Python
>>> transform = A.Compose([
...     A.RandomShadow(p=0.5),
...     A.RandomBrightnessContrast(p=0.5),
... ])
>>> augmented_image = transform(image=image)["image"]

References

  • Shadow detection and removal: https://www.sciencedirect.com/science/article/pii/S1047320315002035
  • Shadows in computer vision: https://en.wikipedia.org/wiki/Shadow_detection


Source code in albumentations/augmentations/transforms.py Python
class RandomShadow(ImageOnlyTransform):\n    \"\"\"Simulates shadows for the image by reducing the brightness of the image in shadow regions.\n\n    This transform adds realistic shadow effects to images, which can be useful for augmenting\n    datasets for outdoor scene analysis, autonomous driving, or any computer vision task where\n    shadows may be present.\n\n    Args:\n        shadow_roi (tuple[float, float, float, float]): Region of the image where shadows\n            will appear (x_min, y_min, x_max, y_max). All values should be in range [0, 1].\n            Default: (0, 0.5, 1, 1).\n        num_shadows_limit (tuple[int, int]): Lower and upper limits for the possible number of shadows.\n            Default: (1, 2).\n        shadow_dimension (int): Number of edges in the shadow polygons. Default: 5.\n        shadow_intensity_range (tuple[float, float]): Range for the shadow intensity. Larger value\n            means darker shadow. Should be two float values between 0 and 1. Default: (0.5, 0.5).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - Shadows are created by generating random polygons within the specified ROI and\n          reducing the brightness of the image in these areas.\n        - The number of shadows, their shapes, and intensities can be randomized for variety.\n        - This transform is particularly useful for:\n          * Augmenting datasets for outdoor scene understanding\n          * Improving robustness of object detection models to shadowed conditions\n          * Simulating different lighting conditions in synthetic datasets\n\n    Mathematical Formulation:\n        For each shadow:\n        1. A polygon with `shadow_dimension` vertices is generated within the shadow ROI.\n        2. The shadow intensity a is randomly chosen from `shadow_intensity_range`.\n        3. For each pixel (x, y) within the polygon:\n           new_pixel_value = original_pixel_value * (1 - a)\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage\n        >>> transform = A.RandomShadow(p=1.0)\n        >>> shadowed_image = transform(image=image)[\"image\"]\n\n        # Custom shadow parameters\n        >>> transform = A.RandomShadow(\n        ...     shadow_roi=(0.2, 0.2, 0.8, 0.8),\n        ...     num_shadows_limit=(2, 4),\n        ...     shadow_dimension=8,\n        ...     shadow_intensity_range=(0.3, 0.7),\n        ...     p=1.0\n        ... )\n        >>> shadowed_image = transform(image=image)[\"image\"]\n\n        # Combining with other transforms\n        >>> transform = A.Compose([\n        ...     A.RandomShadow(p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.5),\n        ... 
])\n        >>> augmented_image = transform(image=image)[\"image\"]\n\n    References:\n        - Shadow detection and removal: https://www.sciencedirect.com/science/article/pii/S1047320315002035\n        - Shadows in computer vision: https://en.wikipedia.org/wiki/Shadow_detection\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        shadow_roi: tuple[float, float, float, float]\n        num_shadows_limit: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n        num_shadows_lower: int | None\n        num_shadows_upper: int | None\n        shadow_dimension: int = Field(ge=3)\n\n        shadow_intensity_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @model_validator(mode=\"after\")\n        def validate_shadows(self) -> Self:\n            if self.num_shadows_lower is not None:\n                warn(\n                    \"`num_shadows_lower` is deprecated. Use `num_shadows_limit` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.num_shadows_upper is not None:\n                warn(\n                    \"`num_shadows_upper` is deprecated. Use `num_shadows_limit` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.num_shadows_lower is not None or self.num_shadows_upper is not None:\n                num_shadows_lower = (\n                    self.num_shadows_lower if self.num_shadows_lower is not None else self.num_shadows_limit[0]\n                )\n                num_shadows_upper = (\n                    self.num_shadows_upper if self.num_shadows_upper is not None else self.num_shadows_limit[1]\n                )\n\n                self.num_shadows_limit = (num_shadows_lower, num_shadows_upper)\n                self.num_shadows_lower = None\n                self.num_shadows_upper = None\n\n            shadow_lower_x, shadow_lower_y, shadow_upper_x, shadow_upper_y = self.shadow_roi\n\n            if not 0 <= shadow_lower_x <= shadow_upper_x <= 1 or not 0 <= shadow_lower_y <= shadow_upper_y <= 1:\n                raise ValueError(f\"Invalid shadow_roi. Got: {self.shadow_roi}\")\n\n            if isinstance(self.shadow_intensity_range, float):\n                if not (0 <= self.shadow_intensity_range <= 1):\n                    raise ValueError(\n                        f\"shadow_intensity_range value should be within [0, 1] range. \"\n                        f\"Got: {self.shadow_intensity_range}\",\n                    )\n            elif isinstance(self.shadow_intensity_range, tuple):\n                if not (0 <= self.shadow_intensity_range[0] <= self.shadow_intensity_range[1] <= 1):\n                    raise ValueError(\n                        f\"shadow_intensity_range values should be within [0, 1] range and increasing. 
\"\n                        f\"Got: {self.shadow_intensity_range}\",\n                    )\n            else:\n                raise TypeError(\n                    \"shadow_intensity_range should be an float or a tuple of floats.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        shadow_roi: tuple[float, float, float, float] = (0, 0.5, 1, 1),\n        num_shadows_limit: tuple[int, int] = (1, 2),\n        num_shadows_lower: int | None = None,\n        num_shadows_upper: int | None = None,\n        shadow_dimension: int = 5,\n        shadow_intensity_range: tuple[float, float] = (0.5, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.shadow_roi = shadow_roi\n        self.shadow_dimension = shadow_dimension\n        self.num_shadows_limit = num_shadows_limit\n        self.shadow_intensity_range = shadow_intensity_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        vertices_list: list[np.ndarray],\n        intensities: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.add_shadow(img, vertices_list, intensities)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, list[np.ndarray]]:\n        height, width = params[\"shape\"][:2]\n\n        num_shadows = self.py_random.randint(*self.num_shadows_limit)\n\n        x_min, y_min, x_max, y_max = self.shadow_roi\n\n        x_min = int(x_min * width)\n        x_max = int(x_max * width)\n        y_min = int(y_min * height)\n        y_max = int(y_max * height)\n\n        vertices_list = [\n            np.stack(\n                [\n                    self.random_generator.integers(\n                        x_min,\n                        x_max,\n                        size=self.shadow_dimension,\n                    ),\n                    self.random_generator.integers(\n                        y_min,\n                        y_max,\n                        size=self.shadow_dimension,\n                    ),\n                ],\n                axis=1,\n            )\n            for _ in range(num_shadows)\n        ]\n\n        # Sample shadow intensity for each shadow\n        intensities = self.random_generator.uniform(\n            *self.shadow_intensity_range,\n            size=num_shadows,\n        )\n\n        return {\"vertices_list\": vertices_list, \"intensities\": intensities}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"shadow_roi\",\n            \"num_shadows_limit\",\n            \"shadow_dimension\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomSnow","title":"class RandomSnow (snow_point_lower=None, snow_point_upper=None, brightness_coeff=2.5, snow_point_range=(0.1, 0.3), method='bleach', always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies a random snow effect to the input image.

This transform simulates snowfall by either bleaching out some pixel values or adding a snow texture to the image, depending on the chosen method.

Parameters:

  • snow_point_range (tuple[float, float]): Range for the snow point threshold. Both values should be in the (0, 1) range. Default: (0.1, 0.3).

  • brightness_coeff (float): Coefficient applied to increase the brightness of pixels below the snow_point threshold. Larger values lead to more pronounced snow effects. Should be > 0. Default: 2.5.

  • method (Literal[\"bleach\", \"texture\"]): The snow simulation method to use. Options are:
    - \"bleach\": Uses a simple pixel value thresholding technique.
    - \"texture\": Applies a more realistic snow texture overlay.
    Default: \"bleach\".

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • The \"bleach\" method increases the brightness of pixels above a certain threshold, creating a simple snow effect. This method is faster but may look less realistic.
  • The \"texture\" method creates a more realistic snow effect through the following steps:
  • Converts the image to HSV color space for better control over brightness.
  • Increases overall image brightness to simulate the reflective nature of snow.
  • Generates a snow texture using Gaussian noise, which is then smoothed with a Gaussian filter.
  • Applies a depth effect to the snow texture, making it more prominent at the top of the image.
  • Blends the snow texture with the original image using alpha compositing.
  • Adds a slight blue tint to simulate the cool color of snow.
  • Adds random sparkle effects to simulate light reflecting off snow crystals. This method produces a more realistic result but is computationally more expensive.

Mathematical Formulation: For the \"bleach\" method: Let L be the lightness channel in HLS color space. For each pixel (i, j): If L[i, j] > snow_point: L[i, j] = L[i, j] * brightness_coeff

For the \"texture\" method:\n1. Brightness adjustment: V_new = V * (1 + brightness_coeff * snow_point)\n2. Snow texture generation: T = GaussianFilter(GaussianNoise(\u03bc=0.5, sigma=0.3))\n3. Depth effect: D = LinearGradient(1.0 to 0.2)\n4. Final pixel value: P = (1 - alpha) * original_pixel + alpha * (T * D * 255)\n   where alpha is the snow intensity factor derived from snow_point.\n

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-usage-bleach-method","title":"Default usage (bleach method)","text":"Python
>>> transform = A.RandomSnow(p=1.0)\n>>> snowy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--using-texture-method-with-custom-parameters","title":"Using texture method with custom parameters","text":"Python
>>> transform = A.RandomSnow(\n...     snow_point_range=(0.2, 0.4),\n...     brightness_coeff=2.0,\n...     method=\"texture\",\n...     p=1.0\n... )\n>>> snowy_image = transform(image=image)[\"image\"]\n

References

  • Bleach method: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library
  • Texture method: Inspired by computer graphics techniques for snow rendering and atmospheric scattering simulations.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class RandomSnow(ImageOnlyTransform):\n    \"\"\"Applies a random snow effect to the input image.\n\n    This transform simulates snowfall by either bleaching out some pixel values or\n    adding a snow texture to the image, depending on the chosen method.\n\n    Args:\n        snow_point_range (tuple[float, float]): Range for the snow point threshold.\n            Both values should be in the (0, 1) range. Default: (0.1, 0.3).\n        brightness_coeff (float): Coefficient applied to increase the brightness of pixels\n            below the snow_point threshold. Larger values lead to more pronounced snow effects.\n            Should be > 0. Default: 2.5.\n        method (Literal[\"bleach\", \"texture\"]): The snow simulation method to use. Options are:\n            - \"bleach\": Uses a simple pixel value thresholding technique.\n            - \"texture\": Applies a more realistic snow texture overlay.\n            Default: \"texture\".\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The \"bleach\" method increases the brightness of pixels above a certain threshold,\n          creating a simple snow effect. This method is faster but may look less realistic.\n        - The \"texture\" method creates a more realistic snow effect through the following steps:\n          1. Converts the image to HSV color space for better control over brightness.\n          2. Increases overall image brightness to simulate the reflective nature of snow.\n          3. Generates a snow texture using Gaussian noise, which is then smoothed with a Gaussian filter.\n          4. Applies a depth effect to the snow texture, making it more prominent at the top of the image.\n          5. Blends the snow texture with the original image using alpha compositing.\n          6. Adds a slight blue tint to simulate the cool color of snow.\n          7. Adds random sparkle effects to simulate light reflecting off snow crystals.\n          This method produces a more realistic result but is computationally more expensive.\n\n    Mathematical Formulation:\n        For the \"bleach\" method:\n        Let L be the lightness channel in HLS color space.\n        For each pixel (i, j):\n        If L[i, j] > snow_point:\n            L[i, j] = L[i, j] * brightness_coeff\n\n        For the \"texture\" method:\n        1. Brightness adjustment: V_new = V * (1 + brightness_coeff * snow_point)\n        2. Snow texture generation: T = GaussianFilter(GaussianNoise(\u03bc=0.5, sigma=0.3))\n        3. Depth effect: D = LinearGradient(1.0 to 0.2)\n        4. Final pixel value: P = (1 - alpha) * original_pixel + alpha * (T * D * 255)\n           where alpha is the snow intensity factor derived from snow_point.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Default usage (bleach method)\n        >>> transform = A.RandomSnow(p=1.0)\n        >>> snowy_image = transform(image=image)[\"image\"]\n\n        # Using texture method with custom parameters\n        >>> transform = A.RandomSnow(\n        ...     snow_point_range=(0.2, 0.4),\n        ...     brightness_coeff=2.0,\n        ...     method=\"texture\",\n        ...     p=1.0\n        ... 
)\n        >>> snowy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Bleach method: https://github.com/UjjwalSaxena/Automold--Road-Augmentation-Library\n        - Texture method: Inspired by computer graphics techniques for snow rendering\n          and atmospheric scattering simulations.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        snow_point_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        snow_point_lower: float | None = Field(\n            gt=0,\n            lt=1,\n        )\n        snow_point_upper: float | None = Field(\n            gt=0,\n            lt=1,\n        )\n        brightness_coeff: float = Field(gt=0)\n        method: Literal[\"bleach\", \"texture\"]\n\n        @model_validator(mode=\"after\")\n        def validate_ranges(self) -> Self:\n            if self.snow_point_lower is not None or self.snow_point_upper is not None:\n                if self.snow_point_lower is not None:\n                    warn(\n                        \"`snow_point_lower` deprecated. Use `snow_point_range` as tuple\"\n                        \" (snow_point_lower, snow_point_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.snow_point_upper is not None:\n                    warn(\n                        \"`snow_point_upper` deprecated. Use `snow_point_range` as tuple\"\n                        \"(snow_point_lower, snow_point_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.snow_point_lower if self.snow_point_lower is not None else self.snow_point_range[0]\n                upper = self.snow_point_upper if self.snow_point_upper is not None else self.snow_point_range[1]\n                self.snow_point_range = (lower, upper)\n                self.snow_point_lower = None\n                self.snow_point_upper = None\n\n            # Validate the snow_point_range\n            if not (0 < self.snow_point_range[0] <= self.snow_point_range[1] < 1):\n                raise ValueError(\n                    \"snow_point_range values should be increasing within (0, 1) range.\",\n                )\n\n            return self\n\n    def __init__(\n        self,\n        snow_point_lower: float | None = None,\n        snow_point_upper: float | None = None,\n        brightness_coeff: float = 2.5,\n        snow_point_range: tuple[float, float] = (0.1, 0.3),\n        method: Literal[\"bleach\", \"texture\"] = \"bleach\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.snow_point_range = snow_point_range\n        self.brightness_coeff = brightness_coeff\n        self.method = method\n\n    def apply(\n        self,\n        img: np.ndarray,\n        snow_point: float,\n        snow_texture: np.ndarray,\n        sparkle_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n\n        if self.method == \"bleach\":\n            return fmain.add_snow_bleach(img, snow_point, self.brightness_coeff)\n        if self.method == \"texture\":\n            return fmain.add_snow_texture(\n                img,\n                snow_point,\n                self.brightness_coeff,\n                snow_texture,\n       
         sparkle_mask,\n            )\n\n        raise ValueError(f\"Unknown snow method: {self.method}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray | None]:\n        image_shape = params[\"shape\"][:2]\n        result = {\n            \"snow_point\": self.py_random.uniform(*self.snow_point_range),\n            \"snow_texture\": None,\n            \"sparkle_mask\": None,\n        }\n\n        if self.method == \"texture\":\n            snow_texture, sparkle_mask = fmain.generate_snow_textures(\n                img_shape=image_shape,\n                random_generator=self.random_generator,\n            )\n            result[\"snow_texture\"] = snow_texture\n            result[\"sparkle_mask\"] = sparkle_mask\n\n        return result\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"snow_point_range\", \"brightness_coeff\", \"method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomSunFlare","title":"class RandomSunFlare (flare_roi=(0, 0, 1, 0.5), angle_lower=None, angle_upper=None, num_flare_circles_lower=None, num_flare_circles_upper=None, src_radius=400, src_color=(255, 255, 255), angle_range=(0, 1), num_flare_circles_range=(6, 10), method='overlay', always_apply=None, p=0.5) [view source on GitHub]","text":"

Simulates a sun flare effect on the image by adding circles of light.

This transform creates a sun flare effect by overlaying multiple semi-transparent circles of varying sizes and intensities along a line originating from a \"sun\" point. It offers two methods: a simple overlay technique and a more complex physics-based approach.

Parameters:

  • flare_roi (tuple[float, float, float, float]): Region of interest where the sun flare can appear. Values are in the range [0, 1] and represent (x_min, y_min, x_max, y_max) in relative coordinates. Default: (0, 0, 1, 0.5).

  • angle_range (tuple[float, float]): Range for the flare direction angle. Values should be in the range [0, 1], where 0 represents 0 radians and 1 represents 2π radians. Default: (0, 1).

  • num_flare_circles_range (tuple[int, int]): Range for the number of flare circles to generate. Default: (6, 10).

  • src_radius (int): Radius of the sun circle in pixels. Default: 400.

  • src_color (tuple[int, int, int]): Color of the sun in RGB format. Default: (255, 255, 255).

  • method (Literal[\"overlay\", \"physics_based\"]): Method to use for generating the sun flare. \"overlay\" uses a simple alpha blending technique, while \"physics_based\" simulates more realistic optical phenomena. Default: \"overlay\".

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: 3

Note

The transform offers two methods for generating sun flares:

  1. Overlay Method (\"overlay\"):
  2. Creates a simple sun flare effect using basic alpha blending.
  3. Steps: a. Generate the main sun circle with a radial gradient. b. Create smaller flare circles along the flare line. c. Blend these elements with the original image using alpha compositing.
  4. Characteristics:

    • Faster computation
    • Less realistic appearance
    • Suitable for basic augmentation or when performance is a priority
  5. Physics-based Method (\"physics_based\"):

  6. Simulates more realistic optical phenomena observed in actual lens flares.
  7. Steps: a. Create a separate flare layer for complex manipulations. b. Add the main sun circle and diffraction spikes to simulate light diffraction. c. Generate and add multiple flare circles with varying properties. d. Apply Gaussian blur to create a soft, glowing effect. e. Create and apply a radial gradient mask for natural fading from the center. f. Simulate chromatic aberration by applying different blurs to color channels. g. Blend the flare with the original image using screen blending mode.
  8. Characteristics:
    • More computationally intensive
    • Produces more realistic and visually appealing results
    • Includes effects like diffraction spikes and chromatic aberration
    • Suitable for high-quality augmentation or realistic image synthesis

Mathematical Formulation: For both methods:

  1. Sun position (x_s, y_s) is randomly chosen within the specified ROI.
  2. Flare angle θ is randomly chosen from the angle_range.
  3. For each flare circle i:
    • Position (x_i, y_i) = (x_s + t_i * cos(θ), y_s + t_i * sin(θ)), where t_i is a random distance along the flare line.
    • Radius r_i is randomly chosen, with larger circles closer to the sun.
    • Alpha (transparency) alpha_i is randomly chosen in the range [0.05, 0.2].
    • Color (R_i, G_i, B_i) is randomly chosen close to src_color.

Overlay method blending:\nnew_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n\nPhysics-based method blending:\nnew_pixel = 255 - ((255 - original_pixel) * (255 - flare_pixel) / 255)\n\n4. Each flare circle is blended with the image using alpha compositing:\n   new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n
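As a worked illustration of the two blending equations above, here is a sketch on synthetic arrays; the variable names original, flare, and the fixed alpha value are assumptions, not library code:

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(0)\n>>> original = rng.integers(0, 256, (100, 100, 3)).astype(np.float32)\n>>> flare = rng.integers(0, 256, (100, 100, 3)).astype(np.float32)\n>>> alpha = 0.1  # per-circle transparency sampled from [0.05, 0.2]\n>>> # Overlay method: plain alpha compositing of the flare color\n>>> overlay_blend = (1 - alpha) * original + alpha * flare\n>>> # Physics-based method: screen blending of the flare layer\n>>> screen_blend = 255 - (255 - original) * (255 - flare) / 255\n>>> result = np.clip(screen_blend, 0, 255).astype(np.uint8)\n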

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-sun-flare-overlay-method","title":"Default sun flare (overlay method)","text":"Python
>>> transform = A.RandomSunFlare(p=1.0)\n>>> flared_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--physics-based-sun-flare-with-custom-parameters","title":"Physics-based sun flare with custom parameters","text":""},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--default-sun-flare","title":"Default sun flare","text":"Python
>>> transform = A.RandomSunFlare(p=1.0)\n>>> flared_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--custom-sun-flare-parameters","title":"Custom sun flare parameters","text":"Python
>>> transform = A.RandomSunFlare(\n...     flare_roi=(0.1, 0, 0.9, 0.3),\n...     angle_range=(0.25, 0.75),\n...     num_flare_circles_range=(5, 15),\n...     src_radius=200,\n...     src_color=(255, 200, 100),\n...     method=\"physics_based\",\n...     p=1.0\n... )\n>>> flared_image = transform(image=image)[\"image\"]\n

References

  • Lens flare: https://en.wikipedia.org/wiki/Lens_flare
  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Diffraction: https://en.wikipedia.org/wiki/Diffraction
  • Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration
  • Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class RandomSunFlare(ImageOnlyTransform):\n    \"\"\"Simulates a sun flare effect on the image by adding circles of light.\n\n    This transform creates a sun flare effect by overlaying multiple semi-transparent\n    circles of varying sizes and intensities along a line originating from a \"sun\" point.\n    It offers two methods: a simple overlay technique and a more complex physics-based approach.\n\n    Args:\n        flare_roi (tuple[float, float, float, float]): Region of interest where the sun flare\n            can appear. Values are in the range [0, 1] and represent (x_min, y_min, x_max, y_max)\n            in relative coordinates. Default: (0, 0, 1, 0.5).\n        angle_range (tuple[float, float]): Range of angles (in radians) for the flare direction.\n            Values should be in the range [0, 1], where 0 represents 0 radians and 1 represents 2\u03c0 radians.\n            Default: (0, 1).\n        num_flare_circles_range (tuple[int, int]): Range for the number of flare circles to generate.\n            Default: (6, 10).\n        src_radius (int): Radius of the sun circle in pixels. Default: 400.\n        src_color (tuple[int, int, int]): Color of the sun in RGB format. Default: (255, 255, 255).\n        method (Literal[\"overlay\", \"physics_based\"]): Method to use for generating the sun flare.\n            \"overlay\" uses a simple alpha blending technique, while \"physics_based\" simulates\n            more realistic optical phenomena. Default: \"physics_based\".\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        3\n\n    Note:\n        The transform offers two methods for generating sun flares:\n\n        1. Overlay Method (\"overlay\"):\n           - Creates a simple sun flare effect using basic alpha blending.\n           - Steps:\n             a. Generate the main sun circle with a radial gradient.\n             b. Create smaller flare circles along the flare line.\n             c. Blend these elements with the original image using alpha compositing.\n           - Characteristics:\n             * Faster computation\n             * Less realistic appearance\n             * Suitable for basic augmentation or when performance is a priority\n\n        2. Physics-based Method (\"physics_based\"):\n           - Simulates more realistic optical phenomena observed in actual lens flares.\n           - Steps:\n             a. Create a separate flare layer for complex manipulations.\n             b. Add the main sun circle and diffraction spikes to simulate light diffraction.\n             c. Generate and add multiple flare circles with varying properties.\n             d. Apply Gaussian blur to create a soft, glowing effect.\n             e. Create and apply a radial gradient mask for natural fading from the center.\n             f. Simulate chromatic aberration by applying different blurs to color channels.\n             g. Blend the flare with the original image using screen blending mode.\n           - Characteristics:\n             * More computationally intensive\n             * Produces more realistic and visually appealing results\n             * Includes effects like diffraction spikes and chromatic aberration\n             * Suitable for high-quality augmentation or realistic image synthesis\n\n    Mathematical Formulation:\n        For both methods:\n        1. 
Sun position (x_s, y_s) is randomly chosen within the specified ROI.\n        2. Flare angle \u03b8 is randomly chosen from the angle_range.\n        3. For each flare circle i:\n           - Position (x_i, y_i) = (x_s + t_i * cos(\u03b8), y_s + t_i * sin(\u03b8))\n             where t_i is a random distance along the flare line.\n           - Radius r_i is randomly chosen, with larger circles closer to the sun.\n           - Alpha (transparency) alpha_i is randomly chosen in the range [0.05, 0.2].\n           - Color (R_i, G_i, B_i) is randomly chosen close to src_color.\n\n        Overlay method blending:\n        new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n\n        Physics-based method blending:\n        new_pixel = 255 - ((255 - original_pixel) * (255 - flare_pixel) / 255)\n\n        4. Each flare circle is blended with the image using alpha compositing:\n           new_pixel = (1 - alpha_i) * original_pixel + alpha_i * flare_color_i\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [1000, 1000, 3], dtype=np.uint8)\n\n        # Default sun flare (overlay method)\n        >>> transform = A.RandomSunFlare(p=1.0)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n        # Physics-based sun flare with custom parameters\n\n        # Default sun flare\n        >>> transform = A.RandomSunFlare(p=1.0)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n        # Custom sun flare parameters\n\n        >>> transform = A.RandomSunFlare(\n        ...     flare_roi=(0.1, 0, 0.9, 0.3),\n        ...     angle_range=(0.25, 0.75),\n        ...     num_flare_circles_range=(5, 15),\n        ...     src_radius=200,\n        ...     src_color=(255, 200, 100),\n        ...     method=\"physics_based\",\n        ...     p=1.0\n        ... 
)\n        >>> flared_image = transform(image=image)[\"image\"]\n\n    References:\n        - Lens flare: https://en.wikipedia.org/wiki/Lens_flare\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Diffraction: https://en.wikipedia.org/wiki/Diffraction\n        - Chromatic aberration: https://en.wikipedia.org/wiki/Chromatic_aberration\n        - Screen blending: https://en.wikipedia.org/wiki/Blend_modes#Screen\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        flare_roi: tuple[float, float, float, float]\n        angle_lower: float | None = Field(ge=0, le=1)\n        angle_upper: float | None = Field(ge=0, le=1)\n\n        num_flare_circles_lower: int | None = Field(\n            ge=0,\n        )\n        num_flare_circles_upper: int | None = Field(\n            gt=0,\n        )\n        src_radius: int = Field(gt=1)\n        src_color: tuple[int, ...]\n\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        num_flare_circles_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n        method: Literal[\"overlay\", \"physics_based\"]\n\n        @model_validator(mode=\"after\")\n        def validate_parameters(self) -> Self:\n            (\n                flare_center_lower_x,\n                flare_center_lower_y,\n                flare_center_upper_x,\n                flare_center_upper_y,\n            ) = self.flare_roi\n            if (\n                not 0 <= flare_center_lower_x < flare_center_upper_x <= 1\n                or not 0 <= flare_center_lower_y < flare_center_upper_y <= 1\n            ):\n                raise ValueError(f\"Invalid flare_roi. Got: {self.flare_roi}\")\n\n            if self.angle_lower is not None or self.angle_upper is not None:\n                if self.angle_lower is not None:\n                    warn(\n                        \"`angle_lower` deprecated. Use `angle_range` as tuple (angle_lower, angle_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.angle_upper is not None:\n                    warn(\n                        \"`angle_upper` deprecated. Use `angle_range` as tuple(angle_lower, angle_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = self.angle_lower if self.angle_lower is not None else self.angle_range[0]\n                upper = self.angle_upper if self.angle_upper is not None else self.angle_range[1]\n                self.angle_range = (lower, upper)\n\n            if self.num_flare_circles_lower is not None or self.num_flare_circles_upper is not None:\n                if self.num_flare_circles_lower is not None:\n                    warn(\n                        \"`num_flare_circles_lower` deprecated. Use `num_flare_circles_range` as tuple\"\n                        \" (num_flare_circles_lower, num_flare_circles_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                if self.num_flare_circles_upper is not None:\n                    warn(\n                        \"`num_flare_circles_upper` deprecated. 
Use `num_flare_circles_range` as tuple\"\n                        \" (num_flare_circles_lower, num_flare_circles_upper) instead.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                lower = (\n                    self.num_flare_circles_lower\n                    if self.num_flare_circles_lower is not None\n                    else self.num_flare_circles_range[0]\n                )\n                upper = (\n                    self.num_flare_circles_upper\n                    if self.num_flare_circles_upper is not None\n                    else self.num_flare_circles_range[1]\n                )\n                self.num_flare_circles_range = (lower, upper)\n\n            return self\n\n    def __init__(\n        self,\n        flare_roi: tuple[float, float, float, float] = (0, 0, 1, 0.5),\n        angle_lower: float | None = None,\n        angle_upper: float | None = None,\n        num_flare_circles_lower: int | None = None,\n        num_flare_circles_upper: int | None = None,\n        src_radius: int = 400,\n        src_color: tuple[int, ...] = (255, 255, 255),\n        angle_range: tuple[float, float] = (0, 1),\n        num_flare_circles_range: tuple[int, int] = (6, 10),\n        method: Literal[\"overlay\", \"physics_based\"] = \"overlay\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.angle_range = angle_range\n        self.num_flare_circles_range = num_flare_circles_range\n\n        self.src_radius = src_radius\n        self.src_color = src_color\n        self.flare_roi = flare_roi\n        self.method = method\n\n    def apply(\n        self,\n        img: np.ndarray,\n        flare_center: tuple[float, float],\n        circles: list[Any],\n        **params: Any,\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        if self.method == \"overlay\":\n            return fmain.add_sun_flare_overlay(\n                img,\n                flare_center,\n                self.src_radius,\n                self.src_color,\n                circles,\n            )\n        if self.method == \"physics_based\":\n            return fmain.add_sun_flare_physics_based(\n                img,\n                flare_center,\n                self.src_radius,\n                self.src_color,\n                circles,\n            )\n\n        raise ValueError(f\"Invalid method: {self.method}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        diagonal = math.sqrt(height**2 + width**2)\n\n        angle = 2 * math.pi * self.py_random.uniform(*self.angle_range)\n\n        # Calculate flare center in pixel coordinates\n        x_min, y_min, x_max, y_max = self.flare_roi\n        flare_center_x = int(width * self.py_random.uniform(x_min, x_max))\n        flare_center_y = int(height * self.py_random.uniform(y_min, y_max))\n\n        num_circles = self.py_random.randint(*self.num_flare_circles_range)\n\n        # Calculate parameters relative to image size\n        step_size = max(1, int(diagonal * 0.01))  # 1% of diagonal, minimum 1 pixel\n        max_radius = max(2, int(height * 0.01))  # 1% of height, minimum 2 pixels\n        color_range = int(max(self.src_color) * 0.2)  # 20% of max color value\n\n        def line(t: float) -> tuple[float, float]:\n            return (\n        
        flare_center_x + t * math.cos(angle),\n                flare_center_y + t * math.sin(angle),\n            )\n\n        # Generate points along the flare line\n        t_range = range(-flare_center_x, width - flare_center_x, step_size)\n        points = [line(t) for t in t_range]\n\n        circles = []\n        for _ in range(num_circles):\n            alpha = self.py_random.uniform(0.05, 0.2)\n            point = self.py_random.choice(points)\n            rad = self.py_random.randint(1, max_radius)\n\n            # Generate colors relative to src_color\n            colors = [self.py_random.randint(max(c - color_range, 0), c) for c in self.src_color]\n\n            circles.append(\n                (\n                    alpha,\n                    (int(point[0]), int(point[1])),\n                    pow(rad, 3),\n                    tuple(colors),\n                ),\n            )\n\n        return {\n            \"circles\": circles,\n            \"flare_center\": (flare_center_x, flare_center_y),\n        }\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"flare_roi\": self.flare_roi,\n            \"angle_range\": self.angle_range,\n            \"num_flare_circles_range\": self.num_flare_circles_range,\n            \"src_radius\": self.src_radius,\n            \"src_color\": self.src_color,\n        }\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RandomToneCurve","title":"class RandomToneCurve (scale=0.1, per_channel=False, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve.

This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast in a non-linear manner. It can be applied to the entire image or to each channel separately.

Parameters:

  • scale (float): Standard deviation of the normal distribution used to sample random distances to move two control points that modify the image's curve. Values should be in range [0, 1]. Higher values will result in more dramatic changes to the image. Default: 0.1

  • per_channel (bool): If True, the tone curve will be applied to each channel of the input image separately, which can lead to color distortion. If False, the same curve is applied to all channels, preserving the original color relationships. Default: False

  • p (float): Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.
  • The S-curve is defined by moving two control points of a quadratic Bézier curve.
  • When per_channel is False, the same curve is applied to all channels, maintaining color balance.
  • When per_channel is True, different curves are applied to each channel, which can create color shifts.
  • This transform can be used to adjust image contrast and brightness in a more natural way than linear transforms.
  • The effect can range from subtle contrast adjustments to more dramatic \"vintage\" or \"faded\" looks.

Mathematical Formulation:

  1. Two control points are randomly moved from their default positions (0.25, 0.25) and (0.75, 0.75).
  2. The new positions are sampled from a normal distribution N(μ, σ²), where μ is the original position and σ is the scale parameter.
  3. These points, along with fixed points at (0, 0) and (1, 1), define a quadratic Bézier curve.
  4. The curve is applied as a lookup table to the image intensities: new_intensity = curve(original_intensity)
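As a rough sketch of how such a curve can be built and applied as a lookup table (assumptions: the helper name tone_curve_lut, the 256-entry LUT, and evaluating the curve in Bernstein form with normalized intensity as the parameter; this mirrors the formulation above, not necessarily the library's exact implementation):

Python
>>> import numpy as np\n>>> def tone_curve_lut(low_y, high_y):\n...     # Bernstein form over the y-values 0, low_y, high_y, 1,\n...     # using normalized intensity as the curve parameter t (an approximation).\n...     t = np.linspace(0.0, 1.0, 256)\n...     y = 3 * (1 - t) ** 2 * t * low_y + 3 * (1 - t) * t ** 2 * high_y + t ** 3\n...     return np.clip(y * 255, 0, 255).astype(np.uint8)\n>>> rng = np.random.default_rng(0)\n>>> low_y = np.clip(rng.normal(loc=0.25, scale=0.1), 0, 1)\n>>> high_y = np.clip(rng.normal(loc=0.75, scale=0.1), 0, 1)\n>>> lut = tone_curve_lut(low_y, high_y)\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> curved = lut[image]  # apply the lookup table to uint8 intensities\n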

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-a-random-tone-curve-to-all-channels-together","title":"Apply a random tone curve to all channels together","text":"Python
>>> transform = A.RandomToneCurve(scale=0.1, per_channel=False, p=1.0)\n>>> augmented_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-random-tone-curves-to-each-channel-separately","title":"Apply random tone curves to each channel separately","text":"Python
>>> transform = A.RandomToneCurve(scale=0.2, per_channel=True, p=1.0)\n>>> augmented_image = transform(image=image)['image']\n

References

  • \"What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance\" by Mahmoud Afifi and Michael S. Brown, ICCV 2019.
  • Bézier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves
  • Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class RandomToneCurve(ImageOnlyTransform):\n    \"\"\"Randomly change the relationship between bright and dark areas of the image by manipulating its tone curve.\n\n    This transform applies a random S-curve to the image's tone curve, adjusting the brightness and contrast\n    in a non-linear manner. It can be applied to the entire image or to each channel separately.\n\n    Args:\n        scale (float): Standard deviation of the normal distribution used to sample random distances\n            to move two control points that modify the image's curve. Values should be in range [0, 1].\n            Higher values will result in more dramatic changes to the image. Default: 0.1\n        per_channel (bool): If True, the tone curve will be applied to each channel of the input image separately,\n            which can lead to color distortion. If False, the same curve is applied to all channels,\n            preserving the original color relationships. Default: False\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform modifies the image's histogram by applying a smooth, S-shaped curve to it.\n        - The S-curve is defined by moving two control points of a quadratic B\u00e9zier curve.\n        - When per_channel is False, the same curve is applied to all channels, maintaining color balance.\n        - When per_channel is True, different curves are applied to each channel, which can create color shifts.\n        - This transform can be used to adjust image contrast and brightness in a more natural way than linear\n            transforms.\n        - The effect can range from subtle contrast adjustments to more dramatic \"vintage\" or \"faded\" looks.\n\n    Mathematical Formulation:\n        1. Two control points are randomly moved from their default positions (0.25, 0.25) and (0.75, 0.75).\n        2. The new positions are sampled from a normal distribution: N(\u03bc, \u03c3\u00b2), where \u03bc is the original position\n        and alpha is the scale parameter.\n        3. These points, along with fixed points at (0, 0) and (1, 1), define a quadratic B\u00e9zier curve.\n        4. The curve is applied as a lookup table to the image intensities:\n           new_intensity = curve(original_intensity)\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply a random tone curve to all channels together\n        >>> transform = A.RandomToneCurve(scale=0.1, per_channel=False, p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n        # Apply random tone curves to each channel separately\n        >>> transform = A.RandomToneCurve(scale=0.2, per_channel=True, p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n    References:\n        - \"What Else Can Fool Deep Learning? Addressing Color Constancy Errors on Deep Neural Network Performance\"\n          by Mahmoud Afifi and Michael S. 
Brown, ICCV 2019.\n        - B\u00e9zier curve: https://en.wikipedia.org/wiki/B%C3%A9zier_curve#Quadratic_B%C3%A9zier_curves\n        - Tone mapping: https://en.wikipedia.org/wiki/Tone_mapping\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: float = Field(\n            ge=0,\n            le=1,\n        )\n        per_channel: bool\n\n    def __init__(\n        self,\n        scale: float = 0.1,\n        per_channel: bool = False,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale = scale\n        self.per_channel = per_channel\n\n    def apply(\n        self,\n        img: np.ndarray,\n        low_y: float | np.ndarray,\n        high_y: float | np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.move_tone_curve(img, low_y, high_y)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if self.per_channel and num_channels != 1:\n            return {\n                \"low_y\": np.clip(\n                    self.random_generator.normal(\n                        loc=0.25,\n                        scale=self.scale,\n                        size=(num_channels,),\n                    ),\n                    0,\n                    1,\n                ),\n                \"high_y\": np.clip(\n                    self.random_generator.normal(\n                        loc=0.75,\n                        scale=self.scale,\n                        size=(num_channels,),\n                    ),\n                    0,\n                    1,\n                ),\n            }\n        # Same values for all channels\n        low_y = np.clip(self.random_generator.normal(loc=0.25, scale=self.scale), 0, 1)\n        high_y = np.clip(self.random_generator.normal(loc=0.75, scale=self.scale), 0, 1)\n\n        return {\"low_y\": low_y, \"high_y\": high_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"scale\", \"per_channel\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.RingingOvershoot","title":"class RingingOvershoot (blur_limit=(7, 15), cutoff=(0.7853981633974483, 1.5707963267948966), p=0.5, always_apply=None) [view source on GitHub]","text":"

Create ringing or overshoot artifacts by convolving the image with a 2D sinc filter.

This transform simulates the ringing artifacts that can occur in digital image processing, particularly after sharpening or edge enhancement operations. It creates oscillations or overshoots near sharp transitions in the image.

Parameters:

  • blur_limit (tuple[int, int] | int): Maximum kernel size for the sinc filter. Must be an odd number in the range [3, inf). If a single int is provided, the kernel size will be randomly chosen from the range (3, blur_limit). If a tuple (min, max) is provided, the kernel size will be randomly chosen from the range (min, max). Default: (7, 15).

  • cutoff (tuple[float, float]): Range to choose the cutoff frequency in radians. Values should be in the range (0, π). A lower cutoff frequency will result in more pronounced ringing effects. Default: (π/4, π/2).

  • p (float): Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • Ringing artifacts are oscillations of the image intensity function in the neighborhood of sharp transitions, such as edges or object boundaries.
  • This transform uses a 2D sinc filter (also known as a 2D cardinal sine function) to introduce these artifacts.
  • The severity of the ringing effect is controlled by both the kernel size (blur_limit) and the cutoff frequency.
  • Larger kernel sizes and lower cutoff frequencies will generally produce more noticeable ringing effects.
  • This transform can be useful for:
  • Simulating imperfections in image processing or transmission systems
  • Testing the robustness of computer vision models to ringing artifacts
  • Creating artistic effects that emphasize edges and transitions in images

Mathematical Formulation: The 2D sinc filter kernel is defined as:

K(x, y) = cutoff * J₁(cutoff * √(x² + y²)) / (2π * √(x² + y²))\n\nwhere:\n- J₁ is the Bessel function of the first kind of order 1\n- cutoff is the chosen cutoff frequency\n- x and y are the distances from the kernel center\n\nThe filtered image I' is obtained by convolving the input image I with the kernel K:\n\nI'(x, y) = ∑∑ I(x-u, y-v) * K(u, v)\n\nThe convolution operation introduces the ringing artifacts near sharp transitions.\n
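The kernel above can be evaluated directly with SciPy's Bessel function and applied with a standard 2D convolution. The sketch below mirrors the construction shown in the source code further down; the fixed ksize and cutoff values are illustrative:

Python
>>> import cv2\n>>> import numpy as np\n>>> from scipy import special\n>>> ksize, cutoff = 15, np.pi / 4\n>>> def radius(x, y):\n...     return np.sqrt((x - (ksize - 1) / 2) ** 2 + (y - (ksize - 1) / 2) ** 2)\n>>> with np.errstate(divide='ignore', invalid='ignore'):\n...     kernel = np.fromfunction(lambda x, y: cutoff * special.j1(cutoff * radius(x, y)) / (2 * np.pi * radius(x, y)), [ksize, ksize])\n>>> kernel[(ksize - 1) // 2, (ksize - 1) // 2] = cutoff**2 / (4 * np.pi)  # limit value at the kernel center\n>>> kernel = (kernel / kernel.sum()).astype(np.float32)  # normalize so overall brightness is preserved\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> ringing = cv2.filter2D(image, -1, kernel)\n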

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-ringing-effect-with-default-parameters","title":"Apply ringing effect with default parameters","text":"Python
>>> transform = A.RingingOvershoot(p=1.0)\n>>> ringing_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-ringing-effect-with-custom-parameters","title":"Apply ringing effect with custom parameters","text":"Python
>>> transform = A.RingingOvershoot(\n...     blur_limit=(9, 17),\n...     cutoff=(np.pi/6, np.pi/3),\n...     p=1.0\n... )\n>>> ringing_image = transform(image=image)['image']\n

References

  • Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts
  • Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter
  • \"The Importance of Ringing Artifacts in Image Processing\" by Jae S. Lim, 1981
  • \"Digital Image Processing\" by Rafael C. Gonzalez and Richard E. Woods, 4th Edition

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/transforms.py Python
class RingingOvershoot(ImageOnlyTransform):\n    \"\"\"Create ringing or overshoot artifacts by convolving the image with a 2D sinc filter.\n\n    This transform simulates the ringing artifacts that can occur in digital image processing,\n    particularly after sharpening or edge enhancement operations. It creates oscillations\n    or overshoots near sharp transitions in the image.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Maximum kernel size for the sinc filter.\n            Must be an odd number in the range [3, inf).\n            If a single int is provided, the kernel size will be randomly chosen\n            from the range (3, blur_limit). If a tuple (min, max) is provided,\n            the kernel size will be randomly chosen from the range (min, max).\n            Default: (7, 15).\n        cutoff (tuple[float, float]): Range to choose the cutoff frequency in radians.\n            Values should be in the range (0, \u03c0). A lower cutoff frequency will\n            result in more pronounced ringing effects.\n            Default: (\u03c0/4, \u03c0/2).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - Ringing artifacts are oscillations of the image intensity function in the neighborhood\n          of sharp transitions, such as edges or object boundaries.\n        - This transform uses a 2D sinc filter (also known as a 2D cardinal sine function)\n          to introduce these artifacts.\n        - The severity of the ringing effect is controlled by both the kernel size (blur_limit)\n          and the cutoff frequency.\n        - Larger kernel sizes and lower cutoff frequencies will generally produce more\n          noticeable ringing effects.\n        - This transform can be useful for:\n          * Simulating imperfections in image processing or transmission systems\n          * Testing the robustness of computer vision models to ringing artifacts\n          * Creating artistic effects that emphasize edges and transitions in images\n\n    Mathematical Formulation:\n        The 2D sinc filter kernel is defined as:\n\n        K(x, y) = cutoff * J\u2081(cutoff * \u221a(x\u00b2 + y\u00b2)) / (2\u03c0 * \u221a(x\u00b2 + y\u00b2))\n\n        where:\n        - J\u2081 is the Bessel function of the first kind of order 1\n        - cutoff is the chosen cutoff frequency\n        - x and y are the distances from the kernel center\n\n        The filtered image I' is obtained by convolving the input image I with the kernel K:\n\n        I'(x, y) = \u2211\u2211 I(x-u, y-v) * K(u, v)\n\n        The convolution operation introduces the ringing artifacts near sharp transitions.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n\n        # Apply ringing effect with default parameters\n        >>> transform = A.RingingOvershoot(p=1.0)\n        >>> ringing_image = transform(image=image)['image']\n\n        # Apply ringing effect with custom parameters\n        >>> transform = A.RingingOvershoot(\n        ...     blur_limit=(9, 17),\n        ...     cutoff=(np.pi/6, np.pi/3),\n        ...     p=1.0\n        ... 
)\n        >>> ringing_image = transform(image=image)['image']\n\n    References:\n        - Ringing artifacts: https://en.wikipedia.org/wiki/Ringing_artifacts\n        - Sinc filter: https://en.wikipedia.org/wiki/Sinc_filter\n        - \"The Importance of Ringing Artifacts in Image Processing\" by Jae S. Lim, 1981\n        - \"Digital Image Processing\" by Rafael C. Gonzalez and Richard E. Woods, 4th Edition\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        blur_limit: ScaleIntType\n        cutoff: Annotated[tuple[float, float], nondecreasing]\n\n        @field_validator(\"cutoff\")\n        @classmethod\n        def check_cutoff(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 0, np.pi\n            check_range(v, *bounds, info.field_name)\n            return v\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (7, 15),\n        cutoff: tuple[float, float] = (np.pi / 4, np.pi / 2),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.cutoff = cutoff\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        ksize = self.py_random.randrange(self.blur_limit[0], self.blur_limit[1] + 1, 2)\n        if ksize % 2 == 0:\n            raise ValueError(f\"Kernel size must be odd. Got: {ksize}\")\n\n        cutoff = self.py_random.uniform(*self.cutoff)\n\n        # From dsp.stackexchange.com/questions/58301/2-d-circularly-symmetric-low-pass-filter\n        with np.errstate(divide=\"ignore\", invalid=\"ignore\"):\n            kernel = np.fromfunction(\n                lambda x, y: cutoff\n                * special.j1(\n                    cutoff * np.sqrt((x - (ksize - 1) / 2) ** 2 + (y - (ksize - 1) / 2) ** 2),\n                )\n                / (2 * np.pi * np.sqrt((x - (ksize - 1) / 2) ** 2 + (y - (ksize - 1) / 2) ** 2)),\n                [ksize, ksize],\n            )\n        kernel[(ksize - 1) // 2, (ksize - 1) // 2] = cutoff**2 / (4 * np.pi)\n\n        # Normalize kernel\n        kernel = kernel.astype(np.float32) / np.sum(kernel)\n\n        return {\"kernel\": kernel}\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fmain.convolve(img, kernel)\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"blur_limit\", \"cutoff\")\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.SaltAndPepper","title":"class SaltAndPepper (amount=(0.01, 0.06), salt_vs_pepper=(0.4, 0.6), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply salt and pepper noise to the input image.

Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt) or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.

Parameters:

Name Type Description amount float, float

Range for total amount of noise (both salt and pepper). Values between 0 and 1. For example:

  • 0.05 means 5% of all pixels will be replaced with noise
  • (0.01, 0.06) will sample amount uniformly from 1% to 6%

Default: (0.01, 0.06)

salt_vs_pepper float, float

Range for ratio of salt (white) vs pepper (black) noise. Values between 0 and 1. For example:

  • 0.5 means equal amounts of salt and pepper
  • 0.7 means 70% of noisy pixels will be salt, 30% pepper
  • (0.4, 0.6) will sample ratio uniformly from 40% to 60%

Default: (0.4, 0.6)

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Note

  • Salt noise sets pixels to maximum value (255 for uint8, 1.0 for float32)
  • Pepper noise sets pixels to 0
  • Salt and pepper masks are generated independently, so a pixel could theoretically be selected for both (in this case, pepper overrides salt)
  • The actual number of affected pixels might slightly differ from the specified amount due to random sampling and potential overlap of salt and pepper masks

Mathematical Formulation: For an input image I, the output O is:
O[x,y] = max_value, if salt_mask[x,y] = True
O[x,y] = 0,         if pepper_mask[x,y] = True
O[x,y] = I[x,y],    otherwise

where:
P(salt_mask[x,y] = True) = amount * salt_ratio
P(pepper_mask[x,y] = True) = amount * (1 - salt_ratio)
amount ∈ [amount_min, amount_max]
salt_ratio ∈ [salt_vs_pepper_min, salt_vs_pepper_max]
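To make the formulation above concrete, here is a minimal NumPy sketch of how such masks could be drawn and applied to a uint8 image; it is an illustrative reimplementation, not the library's internal helper, and the fixed amount/salt_ratio values are assumptions.

Python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (100, 100, 3), dtype=np.uint8)

amount, salt_ratio = 0.05, 0.5                      # illustrative samples from the two ranges
salt_mask = rng.random(image.shape) < amount * salt_ratio
pepper_mask = rng.random(image.shape) < amount * (1 - salt_ratio)

noisy = image.copy()
noisy[salt_mask] = 255                              # salt: maximum value for uint8
noisy[pepper_mask] = 0                              # pepper is applied last, so it overrides salt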

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-salt-and-pepper-noise-with-default-parameters","title":"Apply salt and pepper noise with default parameters","text":"Python
>>> transform = A.SaltAndPepper(p=1.0)\n>>> noisy_image = transform(image=image)[\"image\"]\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--heavy-noise-with-more-salt-than-pepper","title":"Heavy noise with more salt than pepper","text":"Python
>>> transform = A.SaltAndPepper(\n...     amount=(0.1, 0.2),       # 10-20% of pixels will be noisy\n...     salt_vs_pepper=(0.7, 0.9),  # 70-90% of noise will be salt\n...     p=1.0\n... )\n>>> noisy_image = transform(image=image)[\"image\"]\n

References

.. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\" Chapter 5: Image Restoration and Reconstruction.

.. [2] A. K. Jain, \"Fundamentals of Digital Image Processing,\" Chapter 7: Image Degradation and Restoration.

.. [3] Salt and pepper noise: https://en.wikipedia.org/wiki/Salt-and-pepper_noise

See Also: - GaussNoise: For additive Gaussian noise - MultiplicativeNoise: For multiplicative noise - ISONoise: For camera sensor noise simulation

Source code in albumentations/augmentations/transforms.py Python
class SaltAndPepper(ImageOnlyTransform):\n    \"\"\"Apply salt and pepper noise to the input image.\n\n    Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt)\n    or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.\n\n    Args:\n        amount ((float, float)): Range for total amount of noise (both salt and pepper).\n            Values between 0 and 1. For example:\n            - 0.05 means 5% of all pixels will be replaced with noise\n            - (0.01, 0.06) will sample amount uniformly from 1% to 6%\n            Default: (0.01, 0.06)\n\n        salt_vs_pepper ((float, float)): Range for ratio of salt (white) vs pepper (black) noise.\n            Values between 0 and 1. For example:\n            - 0.5 means equal amounts of salt and pepper\n            - 0.7 means 70% of noisy pixels will be salt, 30% pepper\n            - (0.4, 0.6) will sample ratio uniformly from 40% to 60%\n            Default: (0.4, 0.6)\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Salt noise sets pixels to maximum value (255 for uint8, 1.0 for float32)\n        - Pepper noise sets pixels to 0\n        - Salt and pepper masks are generated independently, so a pixel could theoretically\n          be selected for both (in this case, pepper overrides salt)\n        - The actual number of affected pixels might slightly differ from the specified amount\n          due to random sampling and potential overlap of salt and pepper masks\n\n    Mathematical Formulation:\n        For an input image I, the output O is:\n        O[x,y] = max_value,  if salt_mask[x,y] = True\n        O[x,y] = 0,         if pepper_mask[x,y] = True\n        O[x,y] = I[x,y],    otherwise\n\n        where:\n        P(salt_mask[x,y] = True) = amount * salt_ratio\n        P(pepper_mask[x,y] = True) = amount * (1 - salt_ratio)\n        amount \u2208 [amount_min, amount_max]\n        salt_ratio \u2208 [salt_vs_pepper_min, salt_vs_pepper_max]\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Apply salt and pepper noise with default parameters\n        >>> transform = A.SaltAndPepper(p=1.0)\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n        # Heavy noise with more salt than pepper\n        >>> transform = A.SaltAndPepper(\n        ...     amount=(0.1, 0.2),       # 10-20% of pixels will be noisy\n        ...     salt_vs_pepper=(0.7, 0.9),  # 70-90% of noise will be salt\n        ...     p=1.0\n        ... )\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n    References:\n        .. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\"\n               Chapter 5: Image Restoration and Reconstruction.\n\n        .. [2] A. K. Jain, \"Fundamentals of Digital Image Processing,\"\n               Chapter 7: Image Degradation and Restoration.\n\n        .. 
[3] Salt and pepper noise:\n               https://en.wikipedia.org/wiki/Salt-and-pepper_noise\n\n    See Also:\n        - GaussNoise: For additive Gaussian noise\n        - MultiplicativeNoise: For multiplicative noise\n        - ISONoise: For camera sensor noise simulation\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        amount: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        salt_vs_pepper: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n\n    def __init__(\n        self,\n        amount: tuple[float, float] = (0.01, 0.06),\n        salt_vs_pepper: tuple[float, float] = (0.4, 0.6),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.amount = amount\n        self.salt_vs_pepper = salt_vs_pepper\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        # Sample total amount and salt ratio\n        total_amount = self.py_random.uniform(*self.amount)\n        salt_ratio = self.py_random.uniform(*self.salt_vs_pepper)\n\n        # Calculate individual probabilities\n        prob_salt = total_amount * salt_ratio\n        prob_pepper = total_amount * (1 - salt_ratio)\n\n        # Generate masks\n        salt_mask = self.random_generator.random(image.shape) < prob_salt\n        pepper_mask = self.random_generator.random(image.shape) < prob_pepper\n\n        return {\n            \"salt_mask\": salt_mask,\n            \"pepper_mask\": pepper_mask,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        salt_mask: np.ndarray,\n        pepper_mask: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.apply_salt_and_pepper(img, salt_mask, pepper_mask)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"amount\", \"salt_vs_pepper\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Sharpen","title":"class Sharpen (alpha=(0.2, 0.5), lightness=(0.5, 1.0), method='kernel', kernel_size=5, sigma=1.0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Sharpen the input image using either kernel-based or Gaussian interpolation method.

Implements two different approaches to image sharpening:

1. Traditional kernel-based method using Laplacian operator
2. Gaussian interpolation method (similar to Kornia's approach)

Parameters:

Name Type Description alpha tuple[float, float]

Range for the visibility of sharpening effect. At 0, only the original image is visible, at 1.0 only its processed version is visible. Values should be in the range [0, 1]. Used in both methods. Default: (0.2, 0.5).

lightness tuple[float, float]

Range for the lightness of the sharpened image. Only used in 'kernel' method. Larger values create higher contrast. Values should be greater than 0. Default: (0.5, 1.0).

method Literal['kernel', 'gaussian']

Sharpening algorithm to use:

  • 'kernel': Traditional kernel-based sharpening using Laplacian operator
  • 'gaussian': Interpolation between Gaussian blurred and original image

Default: 'kernel'

kernel_size int

Size of the Gaussian blur kernel for 'gaussian' method. Must be odd. Default: 5

sigma float

Standard deviation for Gaussian kernel in 'gaussian' method. Default: 1.0

p float

Probability of applying the transform. Default: 0.5.

Image types: uint8, float32

Number of channels: Any

Mathematical Formulation:
1. Kernel Method:
   The sharpening operation is based on the Laplacian operator L:
   L = [[-1, -1, -1],
        [-1,  8, -1],
        [-1, -1, -1]]

   The final kernel K is a weighted sum:
   K = (1 - a)I + a(L + λI)

   where:
   - a is the alpha value
   - λ is the lightness value
   - I is the identity kernel

   The output image O is computed as:
   O = K * I  (convolution)

2. Gaussian Method:
   Based on the unsharp mask principle:
   O = aI + (1-a)G

   where:
   - I is the input image
   - G is the Gaussian blurred version of I
   - a is the alpha value (sharpness)

   The Gaussian kernel G(x,y) is defined as:
   G(x,y) = (1/(2πs²))exp(-(x²+y²)/(2s²))
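A minimal sketch of both formulations, assuming OpenCV's cv2.filter2D and cv2.GaussianBlur in place of the library's internal helpers; the alpha, lightness, kernel_size, and sigma values are illustrative.

Python
import cv2
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
alpha, lightness = 0.3, 0.7                          # illustrative samples from the ranges

# 1. Kernel method: K = (1 - a)*I + a*(L + lightness*I), matching the matrices in the source below
identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)
effect = np.array([[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]], dtype=np.float32)
kernel = (1 - alpha) * identity + alpha * effect
sharpened_kernel = cv2.filter2D(image, -1, kernel)   # convolution step O = K * I

# 2. Gaussian method: O = a*I + (1 - a)*G, interpolating with a Gaussian-blurred copy
blurred = cv2.GaussianBlur(image, (5, 5), sigmaX=1.0)
sharpened_gauss = cv2.addWeighted(image, alpha, blurred, 1 - alpha, 0)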

Note

  • Kernel sizes must be odd to maintain spatial alignment
  • Methods produce different visual results:
    • Kernel method: More pronounced edges, possible artifacts
    • Gaussian method: More natural look, limited to original sharpness

Examples:

Python
>>> import albumentations as A\n>>> import numpy as np\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--traditional-kernel-sharpening","title":"Traditional kernel sharpening","text":"Python
>>> transform = A.Sharpen(\n...     alpha=(0.2, 0.5),\n...     lightness=(0.5, 1.0),\n...     method='kernel',\n...     p=1.0\n... )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--gaussian-interpolation-sharpening","title":"Gaussian interpolation sharpening","text":"Python
>>> transform = A.Sharpen(\n...     alpha=(0.5, 1.0),\n...     method='gaussian',\n...     kernel_size=5,\n...     sigma=1.0,\n...     p=1.0\n... )\n

References

.. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\" Chapter 3: Intensity Transformations and Spatial Filtering.

.. [2] J. C. Russ, \"The Image Processing Handbook (7th Edition),\" Chapter 4: Image Enhancement.

.. [3] T. Acharya and A. K. Ray, \"Image Processing: Principles and Applications,\" Chapter 5: Image Enhancement.

.. [4] Unsharp masking: https://en.wikipedia.org/wiki/Unsharp_masking

.. [5] Laplacian operator: https://en.wikipedia.org/wiki/Laplace_operator

.. [6] Gaussian blur: https://en.wikipedia.org/wiki/Gaussian_blur

See Also: - Blur: For Gaussian blurring - UnsharpMask: Alternative sharpening method - RandomBrightnessContrast: For adjusting image contrast

Source code in albumentations/augmentations/transforms.py Python
class Sharpen(ImageOnlyTransform):\n    \"\"\"Sharpen the input image using either kernel-based or Gaussian interpolation method.\n\n    Implements two different approaches to image sharpening:\n    1. Traditional kernel-based method using Laplacian operator\n    2. Gaussian interpolation method (similar to Kornia's approach)\n\n    Args:\n        alpha (tuple[float, float]): Range for the visibility of sharpening effect.\n            At 0, only the original image is visible, at 1.0 only its processed version is visible.\n            Values should be in the range [0, 1].\n            Used in both methods. Default: (0.2, 0.5).\n\n        lightness (tuple[float, float]): Range for the lightness of the sharpened image.\n            Only used in 'kernel' method. Larger values create higher contrast.\n            Values should be greater than 0. Default: (0.5, 1.0).\n\n        method (Literal['kernel', 'gaussian']): Sharpening algorithm to use:\n            - 'kernel': Traditional kernel-based sharpening using Laplacian operator\n            - 'gaussian': Interpolation between Gaussian blurred and original image\n            Default: 'kernel'\n\n        kernel_size (int): Size of the Gaussian blur kernel for 'gaussian' method.\n            Must be odd. Default: 5\n\n        sigma (float): Standard deviation for Gaussian kernel in 'gaussian' method.\n            Default: 1.0\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Mathematical Formulation:\n        1. Kernel Method:\n           The sharpening operation is based on the Laplacian operator L:\n           L = [[-1, -1, -1],\n                [-1,  8, -1],\n                [-1, -1, -1]]\n\n           The final kernel K is a weighted sum:\n           K = (1 - a)I + a(L + \u03bbI)\n\n           where:\n           - a is the alpha value\n           - \u03bb is the lightness value\n           - I is the identity kernel\n\n           The output image O is computed as:\n           O = K * I  (convolution)\n\n        2. Gaussian Method:\n           Based on the unsharp mask principle:\n           O = aI + (1-a)G\n\n           where:\n           - I is the input image\n           - G is the Gaussian blurred version of I\n           - a is the alpha value (sharpness)\n\n           The Gaussian kernel G(x,y) is defined as:\n           G(x,y) = (1/(2\u03c0s\u00b2))exp(-(x\u00b2+y\u00b2)/(2s\u00b2))\n\n    Note:\n        - Kernel sizes must be odd to maintain spatial alignment\n        - Methods produce different visual results:\n          * Kernel method: More pronounced edges, possible artifacts\n          * Gaussian method: More natural look, limited to original sharpness\n\n    Examples:\n        >>> import albumentations as A\n        >>> import numpy as np\n\n        # Traditional kernel sharpening\n        >>> transform = A.Sharpen(\n        ...     alpha=(0.2, 0.5),\n        ...     lightness=(0.5, 1.0),\n        ...     method='kernel',\n        ...     p=1.0\n        ... )\n\n        # Gaussian interpolation sharpening\n        >>> transform = A.Sharpen(\n        ...     alpha=(0.5, 1.0),\n        ...     method='gaussian',\n        ...     kernel_size=5,\n        ...     sigma=1.0,\n        ...     p=1.0\n        ... )\n\n    References:\n        .. [1] R. C. Gonzalez and R. E. Woods, \"Digital Image Processing (4th Edition),\"\n               Chapter 3: Intensity Transformations and Spatial Filtering.\n\n        .. [2] J. C. 
Russ, \"The Image Processing Handbook (7th Edition),\"\n               Chapter 4: Image Enhancement.\n\n        .. [3] T. Acharya and A. K. Ray, \"Image Processing: Principles and Applications,\"\n               Chapter 5: Image Enhancement.\n\n        .. [4] Unsharp masking:\n               https://en.wikipedia.org/wiki/Unsharp_masking\n\n        .. [5] Laplacian operator:\n               https://en.wikipedia.org/wiki/Laplace_operator\n\n        .. [6] Gaussian blur:\n               https://en.wikipedia.org/wiki/Gaussian_blur\n\n    See Also:\n        - Blur: For Gaussian blurring\n        - UnsharpMask: Alternative sharpening method\n        - RandomBrightnessContrast: For adjusting image contrast\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        alpha: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        lightness: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, None))]\n        method: Literal[\"kernel\", \"gaussian\"]\n        kernel_size: int = Field(ge=3)\n        sigma: float = Field(gt=0)\n\n    @field_validator(\"kernel_size\")\n    @classmethod\n    def check_kernel_size(cls, value: int) -> int:\n        return value + 1 if value % 2 == 0 else value\n\n    def __init__(\n        self,\n        alpha: tuple[float, float] = (0.2, 0.5),\n        lightness: tuple[float, float] = (0.5, 1.0),\n        method: Literal[\"kernel\", \"gaussian\"] = \"kernel\",\n        kernel_size: int = 5,\n        sigma: float = 1.0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.alpha = alpha\n        self.lightness = lightness\n        self.method = method\n        self.kernel_size = kernel_size\n        self.sigma = sigma\n\n    @staticmethod\n    def __generate_sharpening_matrix(\n        alpha: np.ndarray,\n        lightness: np.ndarray,\n    ) -> np.ndarray:\n        matrix_nochange = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32)\n        matrix_effect = np.array(\n            [[-1, -1, -1], [-1, 8 + lightness, -1], [-1, -1, -1]],\n            dtype=np.float32,\n        )\n\n        return (1 - alpha) * matrix_nochange + alpha * matrix_effect\n\n    def get_params(self) -> dict[str, Any]:\n        alpha = self.py_random.uniform(*self.alpha)\n\n        if self.method == \"kernel\":\n            lightness = self.py_random.uniform(*self.lightness)\n            return {\n                \"alpha\": alpha,\n                \"sharpening_matrix\": self.__generate_sharpening_matrix(\n                    alpha,\n                    lightness,\n                ),\n            }\n\n        return {\"alpha\": alpha, \"sharpening_matrix\": None}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        alpha: float,\n        sharpening_matrix: np.ndarray | None,\n        **params: Any,\n    ) -> np.ndarray:\n        if self.method == \"kernel\":\n            return fmain.convolve(img, sharpening_matrix)\n        return fmain.sharpen_gaussian(img, alpha, self.kernel_size, self.sigma)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"alpha\", \"lightness\", \"method\", \"kernel_size\", \"sigma\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ShotNoise","title":"class ShotNoise (scale_range=(0.1, 0.3), p=0.5, always_apply=False) [view source on GitHub]","text":"

Apply shot noise to the image by modeling photon counting as a Poisson process.

Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light. When photons hit an imaging sensor, they arrive at random times following Poisson statistics. This transform simulates this physical process in linear light space by:

1. Converting to linear space (removing gamma)
2. Treating each pixel value as an expected photon count
3. Sampling actual photon counts from a Poisson distribution
4. Converting back to display space (reapplying gamma)

The noise characteristics follow real camera behavior:

  • Noise variance equals signal mean in linear space (Poisson statistics)
  • Brighter regions have more absolute noise but less relative noise
  • Darker regions have less absolute noise but more relative noise
  • Noise is generated independently for each pixel and color channel
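The four steps above can be sketched directly in NumPy; this is an illustrative approximation of the idea rather than the library's internal implementation, and the gamma value (2.2, as stated in the Note below) and the fixed scale are assumptions.

Python
import numpy as np

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (100, 100, 3), dtype=np.uint8)
scale = 0.3                                        # illustrative noise scale

linear = (image / 255.0) ** 2.2                    # 1. remove gamma
expected = linear / scale                          # 2. expected photon count per pixel
photons = rng.poisson(expected)                    # 3. sample actual counts (Poisson)
noisy_linear = np.clip(photons * scale, 0.0, 1.0)  #    rescale back to [0, 1]
noisy = (noisy_linear ** (1 / 2.2) * 255).astype(np.uint8)  # 4. reapply gamma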

Parameters:

Name Type Description scale_range tuple[float, float]

Range for sampling the noise scale factor. Represents the reciprocal of the expected photon count per unit intensity. Higher values mean more noise:

  • scale = 0.1: ~100 photons per unit intensity (low noise)
  • scale = 1.0: ~1 photon per unit intensity (moderate noise)
  • scale = 10.0: ~0.1 photons per unit intensity (high noise)

Default: (0.1, 0.3)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • Performs calculations in linear light space (gamma = 2.2)
  • Preserves the image's mean intensity
  • Memory efficient with in-place operations
  • Thread-safe with independent random seeds

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> # Generate synthetic image\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> # Apply moderate shot noise\n>>> transform = A.ShotNoise(scale_range=(0.1, 1.0), p=1.0)\n>>> noisy_image = transform(image=image)[\"image\"]\n

References

  • Shot noise: https://en.wikipedia.org/wiki/Shot_noise
  • Original paper: https://doi.org/10.1002/andp.19183622304 (Schottky, 1918)
  • Poisson process: https://en.wikipedia.org/wiki/Poisson_point_process
  • Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction

Source code in albumentations/augmentations/transforms.py Python
class ShotNoise(ImageOnlyTransform):\n    \"\"\"Apply shot noise to the image by modeling photon counting as a Poisson process.\n\n    Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light.\n    When photons hit an imaging sensor, they arrive at random times following Poisson statistics.\n    This transform simulates this physical process in linear light space by:\n    1. Converting to linear space (removing gamma)\n    2. Treating each pixel value as an expected photon count\n    3. Sampling actual photon counts from a Poisson distribution\n    4. Converting back to display space (reapplying gamma)\n\n    The noise characteristics follow real camera behavior:\n    - Noise variance equals signal mean in linear space (Poisson statistics)\n    - Brighter regions have more absolute noise but less relative noise\n    - Darker regions have less absolute noise but more relative noise\n    - Noise is generated independently for each pixel and color channel\n\n    Args:\n        scale_range (tuple[float, float]): Range for sampling the noise scale factor.\n            Represents the reciprocal of the expected photon count per unit intensity.\n            Higher values mean more noise:\n            - scale = 0.1: ~100 photons per unit intensity (low noise)\n            - scale = 1.0: ~1 photon per unit intensity (moderate noise)\n            - scale = 10.0: ~0.1 photons per unit intensity (high noise)\n            Default: (0.1, 0.3)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Performs calculations in linear light space (gamma = 2.2)\n        - Preserves the image's mean intensity\n        - Memory efficient with in-place operations\n        - Thread-safe with independent random seeds\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Generate synthetic image\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> # Apply moderate shot noise\n        >>> transform = A.ShotNoise(scale_range=(0.1, 1.0), p=1.0)\n        >>> noisy_image = transform(image=image)[\"image\"]\n\n    References:\n        - Shot noise: https://en.wikipedia.org/wiki/Shot_noise\n        - Original paper: https://doi.org/10.1002/andp.19183622304 (Schottky, 1918)\n        - Poisson process: https://en.wikipedia.org/wiki/Poisson_point_process\n        - Gamma correction: https://en.wikipedia.org/wiki/Gamma_correction\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.1, 0.3),\n        p: float = 0.5,\n        always_apply: bool = False,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_range = scale_range\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        random_seed: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.shot_noise(img, scale, np.random.default_rng(random_seed))\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"scale\": self.py_random.uniform(*self.scale_range),\n            \"random_seed\": self.random_generator.integers(0, 2**32 - 1),\n        }\n\n    def 
get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"scale_range\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Solarize","title":"class Solarize (threshold=None, threshold_range=(0.5, 0.5), p=0.5, always_apply=None) [view source on GitHub]","text":"

Invert all pixel values above a threshold.

This transform applies a solarization effect to the input image. Solarization is a phenomenon in photography in which the image recorded on a negative or on a photographic print is wholly or partially reversed in tone. Dark areas appear light or light areas appear dark.

In this implementation, all pixel values above a threshold are inverted.

Parameters:

Name Type Description threshold_range tuple[float, float]

Range for solarizing threshold as a fraction of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the maximum value of the image type (255 for uint8 images or 1.0 for float images). Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value
  • For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value
  • The threshold is applied to each channel independently
  • The threshold is calculated in two steps:
    1. Sample a value from threshold_range
    2. Multiply by the image's maximum value:
      • For uint8: threshold = sampled_value * 255
      • For float32: threshold = sampled_value * 1.0
  • This transform can create interesting artistic effects or be used for data augmentation

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Solarize uint8 image with fixed threshold at 50% of max value (127.5)\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n>>>\n# Solarize uint8 image with random threshold between 40-60% of max value (102-153)\n>>> transform = A.Solarize(threshold_range=(0.4, 0.6), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n>>>\n# Solarize float32 image at 50% of max value (0.5)\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)\n>>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n>>> solarized_image = transform(image=image)['image']\n

Mathematical Formulation: Let f be a value sampled from threshold_range (min, max). For each pixel value p:
threshold = f * max_value
if p > threshold: p_new = max_value - p
else: p_new = p

Where max_value is 255 for uint8 images and 1.0 for float32 images.
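A minimal sketch of this per-pixel rule for a uint8 image (illustrative, not the library's own implementation):

Python
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
f = 0.5                          # illustrative sample from threshold_range
threshold = f * 255              # max_value is 255 for uint8

solarized = np.where(image > threshold, 255 - image, image).astype(np.uint8)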

See Also: Invert: For inverting all pixel values regardless of a threshold.

Source code in albumentations/augmentations/transforms.py Python
class Solarize(ImageOnlyTransform):\n    \"\"\"Invert all pixel values above a threshold.\n\n    This transform applies a solarization effect to the input image. Solarization is a phenomenon in\n    photography in which the image recorded on a negative or on a photographic print is wholly or\n    partially reversed in tone. Dark areas appear light or light areas appear dark.\n\n    In this implementation, all pixel values above a threshold are inverted.\n\n    Args:\n        threshold_range (tuple[float, float]): Range for solarizing threshold as a fraction\n            of maximum value. The threshold_range should be in the range [0, 1] and will be multiplied by the\n            maximum value of the image type (255 for uint8 images or 1.0 for float images).\n            Default: (0.5, 0.5) (corresponds to 127.5 for uint8 and 0.5 for float32).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - For uint8 images, pixel values above the threshold are inverted as: 255 - pixel_value\n        - For float32 images, pixel values above the threshold are inverted as: 1.0 - pixel_value\n        - The threshold is applied to each channel independently\n        - The threshold is calculated in two steps:\n          1. Sample a value from threshold_range\n          2. Multiply by the image's maximum value:\n             * For uint8: threshold = sampled_value * 255\n             * For float32: threshold = sampled_value * 1.0\n        - This transform can create interesting artistic effects or be used for data augmentation\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Solarize uint8 image with fixed threshold at 50% of max value (127.5)\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n        >>>\n        # Solarize uint8 image with random threshold between 40-60% of max value (102-153)\n        >>> transform = A.Solarize(threshold_range=(0.4, 0.6), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n        >>>\n        # Solarize float32 image at 50% of max value (0.5)\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> transform = A.Solarize(threshold_range=(0.5, 0.5), p=1.0)\n        >>> solarized_image = transform(image=image)['image']\n\n    Mathematical Formulation:\n        Let f be a value sampled from threshold_range (min, max).\n        For each pixel value p:\n        threshold = f * max_value\n        if p > threshold:\n            p_new = max_value - p\n        else:\n            p_new = p\n\n        Where max_value is 255 for uint8 images and 1.0 for float32 images.\n\n    See Also:\n        Invert: For inverting all pixel values regardless of a threshold.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        threshold: ScaleFloatType | None\n        threshold_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n\n        @staticmethod\n        def normalize_threshold(\n            threshold: ScaleFloatType | None,\n            threshold_range: tuple[float, float],\n        ) -> tuple[float, float]:\n            \"\"\"Convert legacy 
threshold or use threshold_range, normalizing to [0,1] range.\"\"\"\n            if threshold is not None:\n                warn(\"`threshold` deprecated. Use `threshold_range` instead.\", DeprecationWarning, stacklevel=2)\n                value = to_tuple(threshold, threshold)\n                return (value[0] / 255, value[1] / 255) if value[1] > 1 else value\n            return threshold_range\n\n        @model_validator(mode=\"after\")\n        def process_threshold(self) -> Self:\n            self.threshold_range = self.normalize_threshold(\n                self.threshold,\n                self.threshold_range,\n            )\n            return self\n\n    def __init__(\n        self,\n        threshold: ScaleFloatType | None = None,\n        threshold_range: tuple[float, float] = (0.5, 0.5),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.threshold_range = threshold_range\n\n    def apply(self, img: np.ndarray, threshold: float, **params: Any) -> np.ndarray:\n        return fmain.solarize(img, threshold)\n\n    def get_params(self) -> dict[str, float]:\n        return {\"threshold\": self.py_random.uniform(*self.threshold_range)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"threshold_range\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Spatter","title":"class Spatter (mean=(0.65, 0.65), std=(0.3, 0.3), gauss_sigma=(2, 2), cutout_threshold=(0.68, 0.68), intensity=(0.6, 0.6), mode='rain', color=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.

Parameters:

Name Type Description mean tuple[float, float] | float

Mean value of normal distribution for generating liquid layer. If a single float is given, the mean will be sampled from (0, mean). If a tuple of floats is given, the mean will be sampled from the range (mean[0], mean[1]). For a constant value use (mean, mean). Default: (0.65, 0.65)

std tuple[float, float] | float

Standard deviation value of normal distribution for generating liquid layer. If a single float is given, the value will be sampled from (0, std). If a tuple of floats is given, std will be sampled from the range (std[0], std[1]). For a constant value use (std, std). Default: (0.3, 0.3).

gauss_sigma tuple[float, float] | float

Sigma value for gaussian filtering of liquid layer. If a single float is given, the value will be sampled from (0, gauss_sigma). If a tuple of floats is given, gauss_sigma will be sampled from the range (gauss_sigma[0], gauss_sigma[1]). For a constant value use (gauss_sigma, gauss_sigma). Default: (2, 2).

cutout_threshold tuple[float, float] | float

Threshold for filtering the liquid layer (determines the number of drops). If a single float is given, the value will be sampled from (0, cutout_threshold). If a tuple of floats is given, cutout_threshold will be sampled from the range (cutout_threshold[0], cutout_threshold[1]). For a constant value use (cutout_threshold, cutout_threshold). Default: (0.68, 0.68).

intensity tuple[float, float] | float

Intensity of corruption. If a single float is given, the value will be sampled from (0, intensity). If a tuple of floats is given, intensity will be sampled from the range (intensity[0], intensity[1]). For a constant value use (intensity, intensity). Default: (0.6, 0.6).

mode str or list[str]

Type of corruption. Currently, supported options are 'rain' and 'mud'. If a list is provided, the type of corruption will be sampled from that list. Default: "rain".

color list of (r, g, b) or dict or None

Corruption elements color. If a list is given, it is used as the color for the single specified mode. If a dict is given, it must map each specified mode to its color. If None, default colors are used (rain: (238, 238, 175), mud: (20, 42, 63)).

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Reference

https://arxiv.org/abs/1903.12261 https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py
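The reference above gives no usage example, so here is a hedged one; the parameter ranges below and the per-mode color dict are illustrative choices rather than recommended settings (the colors match the documented defaults).

Python
import albumentations as A
import numpy as np

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

# Sample the corruption type per call from both modes, with an explicit color for each mode.
transform = A.Spatter(
    mean=(0.6, 0.7),
    std=(0.25, 0.35),
    cutout_threshold=(0.65, 0.7),
    intensity=(0.5, 0.7),
    mode=["rain", "mud"],
    color={"rain": (238, 238, 175), "mud": (20, 42, 63)},
    p=1.0,
)
spattered = transform(image=image)["image"]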

Source code in albumentations/augmentations/transforms.py Python
class Spatter(ImageOnlyTransform):\n    \"\"\"Apply spatter transform. It simulates corruption which can occlude a lens in the form of rain or mud.\n\n    Args:\n        mean (tuple[float, float] | float): Mean value of normal distribution for generating liquid layer.\n            If single float mean will be sampled from `(0, mean)`\n            If tuple of float mean will be sampled from range `(mean[0], mean[1])`.\n            If you want constant value use (mean, mean).\n            Default (0.65, 0.65)\n        std (tuple[float, float] | float): Standard deviation value of normal distribution for generating liquid layer.\n            If single float the number will be sampled from `(0, std)`.\n            If tuple of float std will be sampled from range `(std[0], std[1])`.\n            If you want constant value use (std, std).\n            Default: (0.3, 0.3).\n        gauss_sigma (tuple[float, float] | floats): Sigma value for gaussian filtering of liquid layer.\n            If single float the number will be sampled from `(0, gauss_sigma)`.\n            If tuple of float gauss_sigma will be sampled from range `(gauss_sigma[0], gauss_sigma[1])`.\n            If you want constant value use (gauss_sigma, gauss_sigma).\n            Default: (2, 3).\n        cutout_threshold (tuple[float, float] | floats): Threshold for filtering liqued layer\n            (determines number of drops). If single float it will used as cutout_threshold.\n            If single float the number will be sampled from `(0, cutout_threshold)`.\n            If tuple of float cutout_threshold will be sampled from range `(cutout_threshold[0], cutout_threshold[1])`.\n            If you want constant value use `(cutout_threshold, cutout_threshold)`.\n            Default: (0.68, 0.68).\n        intensity (tuple[float, float] | floats): Intensity of corruption.\n            If single float the number will be sampled from `(0, intensity)`.\n            If tuple of float intensity will be sampled from range `(intensity[0], intensity[1])`.\n            If you want constant value use `(intensity, intensity)`.\n            Default: (0.6, 0.6).\n        mode (str, or list[str]): Type of corruption. Currently, supported options are 'rain' and 'mud'.\n             If list is provided type of corruption will be sampled list. Default: (\"rain\").\n        color (list of (r, g, b) or dict or None): Corruption elements color.\n            If list uses provided list as color for specified mode.\n            If dict uses provided color for specified mode. Color for each specified mode should be provided in dict.\n            If None uses default colors (rain: (238, 238, 175), mud: (20, 42, 63)).\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://arxiv.org/abs/1903.12261\n        https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        mean: ZeroOneRangeType = (0.65, 0.65)\n        std: ZeroOneRangeType = (0.3, 0.3)\n        gauss_sigma: NonNegativeFloatRangeType = (2, 2)\n        cutout_threshold: ZeroOneRangeType = (0.68, 0.68)\n        intensity: ZeroOneRangeType = (0.6, 0.6)\n        mode: SpatterMode | Sequence[SpatterMode]\n        color: Sequence[int] | dict[str, Sequence[int]] | None = None\n\n        @field_validator(\"mode\")\n        @classmethod\n        def check_mode(\n            cls,\n            mode: SpatterMode | Sequence[SpatterMode],\n        ) -> Sequence[SpatterMode]:\n            if isinstance(mode, str):\n                return [mode]\n            return mode\n\n        @model_validator(mode=\"after\")\n        def check_color(self) -> Self:\n            if self.color is None:\n                self.color = {\"rain\": [238, 238, 175], \"mud\": [20, 42, 63]}\n\n            elif isinstance(self.color, (list, tuple)) and len(self.mode) == 1:\n                if len(self.color) != NUM_RGB_CHANNELS:\n                    msg = \"Color must be a list of three integers for RGB format.\"\n                    raise ValueError(msg)\n                self.color = {self.mode[0]: self.color}\n            elif isinstance(self.color, dict):\n                result = {}\n                for mode in self.mode:\n                    if mode not in self.color:\n                        raise ValueError(f\"Color for mode {mode} is not specified.\")\n                    if len(self.color[mode]) != NUM_RGB_CHANNELS:\n                        raise ValueError(\n                            f\"Color for mode {mode} must be in RGB format.\",\n                        )\n                    result[mode] = self.color[mode]\n            else:\n                msg = \"Color must be a list of RGB values or a dict mapping mode to RGB values.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        mean: ScaleFloatType = (0.65, 0.65),\n        std: ScaleFloatType = (0.3, 0.3),\n        gauss_sigma: ScaleFloatType = (2, 2),\n        cutout_threshold: ScaleFloatType = (0.68, 0.68),\n        intensity: ScaleFloatType = (0.6, 0.6),\n        mode: SpatterMode | Sequence[SpatterMode] = \"rain\",\n        color: Sequence[int] | dict[str, Sequence[int]] | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.mean = cast(tuple[float, float], mean)\n        self.std = cast(tuple[float, float], std)\n        self.gauss_sigma = cast(tuple[float, float], gauss_sigma)\n        self.cutout_threshold = cast(tuple[float, float], cutout_threshold)\n        self.intensity = cast(tuple[float, float], intensity)\n        self.mode = mode\n        self.color = cast(dict[str, Sequence[int]], color)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        non_mud: np.ndarray,\n        mud: np.ndarray,\n        drops: np.ndarray,\n        mode: SpatterMode,\n        **params: dict[str, Any],\n    ) -> np.ndarray:\n        non_rgb_error(img)\n        return fmain.spatter(img, non_mud, mud, drops, mode)\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        mean = self.py_random.uniform(*self.mean)\n        std = self.py_random.uniform(*self.std)\n        cutout_threshold = self.py_random.uniform(*self.cutout_threshold)\n        sigma = self.py_random.uniform(*self.gauss_sigma)\n        mode = self.py_random.choice(self.mode)\n        intensity = self.py_random.uniform(*self.intensity)\n        color = np.array(self.color[mode]) / 255.0\n\n        liquid_layer = self.random_generator.normal(\n            size=(height, width),\n            loc=mean,\n            scale=std,\n        )\n        liquid_layer = gaussian_filter(liquid_layer, sigma=sigma, mode=\"nearest\")\n        liquid_layer[liquid_layer < cutout_threshold] = 0\n\n        if mode == \"rain\":\n            liquid_layer = clip(liquid_layer * 255, np.uint8, inplace=False)\n            dist = 255 - cv2.Canny(liquid_layer, 50, 150)\n            dist = cv2.distanceTransform(dist, cv2.DIST_L2, 5)\n            _, dist = cv2.threshold(dist, 20, 20, cv2.THRESH_TRUNC)\n            dist = clip(fblur.blur(dist, 3), np.uint8, inplace=True)\n            dist = fmain.equalize(dist)\n\n            ker = np.array([[-2, -1, 0], [-1, 1, 1], [0, 1, 2]])\n            dist = fmain.convolve(dist, ker)\n            dist = fblur.blur(dist, 3).astype(np.float32)\n\n            m = liquid_layer * dist\n            m *= 1 / np.max(m, axis=(0, 1))\n\n            drops = m[:, :, None] * color * intensity\n            mud = None\n            non_mud = None\n        else:\n            m = np.where(liquid_layer > cutout_threshold, 1, 0)\n            m = gaussian_filter(m.astype(np.float32), sigma=sigma, mode=\"nearest\")\n            m[m < 1.2 * cutout_threshold] = 0\n            m = m[..., np.newaxis]\n\n            mud = m * color\n            non_mud = 1 - m\n            drops = None\n\n        return {\n            \"non_mud\": non_mud,\n            \"mud\": mud,\n            \"drops\": drops,\n            \"mode\": mode,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str, str, str, str]:\n        return (\n            \"mean\",\n            \"std\",\n            \"gauss_sigma\",\n            \"intensity\",\n            \"cutout_threshold\",\n            \"mode\",\n            \"color\",\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.Superpixels","title":"class Superpixels (p_replace=(0, 0.1), n_segments=(100, 100), max_size=128, interpolation=1, p=0.5, always_apply=None) [view source on GitHub]","text":"

Transform images partially/completely to their superpixel representation.

Parameters:

Name Type Description p_replace tuple[float, float] | float

Defines for any segment the probability that the pixels within that segment are replaced by their average color (otherwise, the pixels are not changed).

  • A probability of 0.0 would mean, that the pixels in no segment are replaced by their average color (image is not changed at all).
  • A probability of 0.5 would mean, that around half of all segments are replaced by their average color.
  • A probability of 1.0 would mean, that all segments are replaced by their average color (resulting in a voronoi image).

Behavior based on chosen data types for this parameter:

  • If a float, then that float will always be used.
  • If tuple (a, b), then a random probability will be sampled from the interval [a, b] per image.

Default: (0, 0.1)

n_segments tuple[int, int] | int

Rough target number of how many superpixels to generate. The algorithm may deviate from this number. A lower value will lead to coarser superpixels. Higher values are computationally more intensive and will hence lead to a slowdown. If tuple (a, b), then a value from the discrete interval [a..b] will be sampled per image. Default: (100, 100)

max_size int | None

Maximum image size at which the augmentation is performed. If the width or height of an image exceeds this value, it will be downscaled before the augmentation so that the longest side matches max_size. This is done to speed up the process. The final output image has the same size as the input image. Note that in case p_replace is below 1.0, the down-/upscaling will affect the not-replaced pixels too. Use None to apply no down-/upscaling. Default: 128

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: Any

Note

  • This transform can significantly change the visual appearance of the image.
  • The transform makes use of a superpixel algorithm, which tends to be slow. If performance is a concern, consider using max_size to limit the image size.
  • The effect of this transform can vary greatly depending on the p_replace and n_segments parameters.
  • When p_replace is high, the image can become highly abstracted, resembling a voronoi diagram.
  • The transform preserves the original image type (uint8 or float32).

Mathematical Formulation:
1. The image is segmented into approximately n_segments superpixels using the SLIC algorithm.
2. For each superpixel:
   - With probability p_replace, all pixels in the superpixel are replaced with their mean color.
   - With probability 1 - p_replace, the superpixel is left unchanged.
3. If the image was resized due to max_size, it is resized back to its original dimensions.
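This formulation can be sketched with scikit-image's SLIC implementation; it is an illustrative approximation (skimage is an assumed extra dependency here, and the transform's own internals may differ), with p_replace and n_segments fixed to example values.

Python
import numpy as np
from skimage.segmentation import slic

rng = np.random.default_rng(0)
image = rng.integers(0, 256, (100, 100, 3), dtype=np.uint8)
p_replace, n_segments = 0.5, 100                   # illustrative samples from the parameter ranges

segments = slic(image, n_segments=n_segments, start_label=0)
out = image.astype(np.float32)
for seg_id in np.unique(segments):
    if rng.random() < p_replace:                   # replace this segment by its mean color
        mask = segments == seg_id
        out[mask] = out[mask].mean(axis=0)
out = out.astype(np.uint8)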

Examples:

Python
>>> import cv2\n>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-superpixels-with-default-parameters","title":"Apply superpixels with default parameters","text":"Python
>>> transform = A.Superpixels(p=1.0)\n>>> augmented_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms--apply-superpixels-with-custom-parameters","title":"Apply superpixels with custom parameters","text":"Python
>>> transform = A.Superpixels(\n...     p_replace=(0.5, 0.7),\n...     n_segments=(50, 100),\n...     max_size=None,\n...     interpolation=cv2.INTER_NEAREST,\n...     p=1.0\n... )\n>>> augmented_image = transform(image=image)['image']\n

Source code in albumentations/augmentations/transforms.py Python
class Superpixels(ImageOnlyTransform):\n    \"\"\"Transform images partially/completely to their superpixel representation.\n\n    Args:\n        p_replace (tuple[float, float] | float): Defines for any segment the probability that the pixels within that\n            segment are replaced by their average color (otherwise, the pixels are not changed).\n\n\n            * A probability of ``0.0`` would mean, that the pixels in no\n                segment are replaced by their average color (image is not\n                changed at all).\n            * A probability of ``0.5`` would mean, that around half of all\n                segments are replaced by their average color.\n            * A probability of ``1.0`` would mean, that all segments are\n                replaced by their average color (resulting in a voronoi\n                image).\n\n            Behavior based on chosen data types for this parameter:\n            * If a ``float``, then that ``float`` will always be used.\n            * If ``tuple`` ``(a, b)``, then a random probability will be\n            sampled from the interval ``[a, b]`` per image.\n            Default: (0.1, 0.3)\n\n        n_segments (tuple[int, int] | int): Rough target number of how many superpixels to generate.\n            The algorithm may deviate from this number.\n            Lower value will lead to coarser superpixels.\n            Higher values are computationally more intensive and will hence lead to a slowdown.\n            If tuple ``(a, b)``, then a value from the discrete interval ``[a..b]`` will be sampled per image.\n            Default: (15, 120)\n\n        max_size (int | None): Maximum image size at which the augmentation is performed.\n            If the width or height of an image exceeds this value, it will be\n            downscaled before the augmentation so that the longest side matches `max_size`.\n            This is done to speed up the process. The final output image has the same size as the input image.\n            Note that in case `p_replace` is below ``1.0``,\n            the down-/upscaling will affect the not-replaced pixels too.\n            Use ``None`` to apply no down-/upscaling.\n            Default: 128\n\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform can significantly change the visual appearance of the image.\n        - The transform makes use of a superpixel algorithm, which tends to be slow.\n        If performance is a concern, consider using `max_size` to limit the image size.\n        - The effect of this transform can vary greatly depending on the `p_replace` and `n_segments` parameters.\n        - When `p_replace` is high, the image can become highly abstracted, resembling a voronoi diagram.\n        - The transform preserves the original image type (uint8 or float32).\n\n    Mathematical Formulation:\n        1. The image is segmented into approximately `n_segments` superpixels using the SLIC algorithm.\n        2. 
For each superpixel:\n        - With probability `p_replace`, all pixels in the superpixel are replaced with their mean color.\n        - With probability `1 - p_replace`, the superpixel is left unchanged.\n        3. If the image was resized due to `max_size`, it is resized back to its original dimensions.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply superpixels with default parameters\n        >>> transform = A.Superpixels(p=1.0)\n        >>> augmented_image = transform(image=image)['image']\n\n        # Apply superpixels with custom parameters\n        >>> transform = A.Superpixels(\n        ...     p_replace=(0.5, 0.7),\n        ...     n_segments=(50, 100),\n        ...     max_size=None,\n        ...     interpolation=cv2.INTER_NEAREST,\n        ...     p=1.0\n        ... )\n        >>> augmented_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        p_replace: ZeroOneRangeType\n        n_segments: OnePlusIntRangeType\n        max_size: int | None = Field(ge=1)\n        interpolation: InterpolationType\n\n    def __init__(\n        self,\n        p_replace: ScaleFloatType = (0, 0.1),\n        n_segments: ScaleIntType = (100, 100),\n        max_size: int | None = 128,\n        interpolation: int = cv2.INTER_LINEAR,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.p_replace = cast(tuple[float, float], p_replace)\n        self.n_segments = cast(tuple[int, int], n_segments)\n        self.max_size = max_size\n        self.interpolation = interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"p_replace\", \"n_segments\", \"max_size\", \"interpolation\"\n\n    def get_params(self) -> dict[str, Any]:\n        n_segments = self.py_random.randint(*self.n_segments)\n        p = self.py_random.uniform(*self.p_replace)\n        return {\n            \"replace_samples\": self.random_generator.random(n_segments) < p,\n            \"n_segments\": n_segments,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        replace_samples: Sequence[bool],\n        n_segments: int,\n        **kwargs: Any,\n    ) -> np.ndarray:\n        return fmain.superpixels(\n            img,\n            n_segments,\n            replace_samples,\n            self.max_size,\n            self.interpolation,\n        )\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToFloat","title":"class ToFloat (max_value=None, p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert the input image to a floating-point representation.

This transform divides pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. It's useful for normalizing image data before feeding it into neural networks or other algorithms that expect float input.
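In effect this is a division plus a dtype cast; a minimal NumPy sketch of the equivalent operation for a uint8 input (illustrative, not the library's helper):

Python
import numpy as np

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
max_value = 255.0                 # inferred from the uint8 dtype when max_value is None

float_image = image.astype(np.float32) / max_value
assert float_image.dtype == np.float32 and 0.0 <= float_image.min() <= float_image.max() <= 1.0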

Parameters:

Name Type Description max_value float | None

The maximum possible input value. If None, the transform will try to infer the maximum value by inspecting the data type of the input image:

  • uint8: 255
  • uint16: 65535
  • uint32: 4294967295
  • float32: 1.0

Default: None.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: uint8, uint16, uint32, float32

Returns:

Type Description np.ndarray

Image in floating point representation, with values in range [0, 1.0].

Note

  • If the input image is already float32 with values in [0, 1], it will be returned unchanged.
  • For integer types (uint8, uint16, uint32), the function will scale the values to [0, 1] range.
  • The output will always be float32, regardless of the input type.
  • This transform is often used as a preprocessing step before applying other transformations or feeding the image into a neural network.
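
Conceptually, the conversion is just a cast followed by a division by the inferred or supplied maximum. A minimal NumPy sketch of the same arithmetic (illustrative only, not the library's internal implementation):

Python
>>> import numpy as np
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # For an integer input with max_value=None, the dtype maximum is used as the divisor
>>> manual = image.astype(np.float32) / np.iinfo(image.dtype).max
>>> assert manual.dtype == np.float32
>>> assert 0 <= manual.min() <= manual.max() <= 1.0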

Exceptions:

Type Description TypeError

If the input image data type is not supported.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Convert uint8 image to float\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ToFloat(max_value=None)\n>>> float_image = transform(image=image)['image']\n>>> assert float_image.dtype == np.float32\n>>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n>>>\n# Convert uint16 image to float with custom max_value\n>>> image = np.random.randint(0, 4096, (100, 100, 3), dtype=np.uint16)\n>>> transform = A.ToFloat(max_value=4095)\n>>> float_image = transform(image=image)['image']\n>>> assert float_image.dtype == np.float32\n>>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n

See Also: FromFloat: The inverse operation, converting from float back to the original data type.

Source code in albumentations/augmentations/transforms.py Python
class ToFloat(ImageOnlyTransform):\n    \"\"\"Convert the input image to a floating-point representation.\n\n    This transform divides pixel values by `max_value` to get a float32 output array\n    where all values lie in the range [0, 1.0]. It's useful for normalizing image data\n    before feeding it into neural networks or other algorithms that expect float input.\n\n    Args:\n        max_value (float | None): The maximum possible input value. If None, the transform\n            will try to infer the maximum value by inspecting the data type of the input image:\n            - uint8: 255\n            - uint16: 65535\n            - uint32: 4294967295\n            - float32: 1.0\n            Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, uint16, uint32, float32\n\n    Returns:\n        np.ndarray: Image in floating point representation, with values in range [0, 1.0].\n\n    Note:\n        - If the input image is already float32 with values in [0, 1], it will be returned unchanged.\n        - For integer types (uint8, uint16, uint32), the function will scale the values to [0, 1] range.\n        - The output will always be float32, regardless of the input type.\n        - This transform is often used as a preprocessing step before applying other transformations\n          or feeding the image into a neural network.\n\n    Raises:\n        TypeError: If the input image data type is not supported.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Convert uint8 image to float\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ToFloat(max_value=None)\n        >>> float_image = transform(image=image)['image']\n        >>> assert float_image.dtype == np.float32\n        >>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n        >>>\n        # Convert uint16 image to float with custom max_value\n        >>> image = np.random.randint(0, 4096, (100, 100, 3), dtype=np.uint16)\n        >>> transform = A.ToFloat(max_value=4095)\n        >>> float_image = transform(image=image)['image']\n        >>> assert float_image.dtype == np.float32\n        >>> assert 0 <= float_image.min() <= float_image.max() <= 1.0\n\n    See Also:\n        FromFloat: The inverse operation, converting from float back to the original data type.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        max_value: float | None\n\n    def __init__(\n        self,\n        max_value: float | None = None,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply)\n        self.max_value = max_value\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return to_float(img, self.max_value)\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"max_value\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToGray","title":"class ToGray (num_output_channels=3, method='weighted_average', always_apply=None, p=0.5) [view source on GitHub]","text":"

Convert an image to grayscale and optionally replicate the grayscale channel.

This transform first converts a color image to a single-channel grayscale image using various methods, then replicates the grayscale channel if num_output_channels is greater than 1.

Parameters:

Name Type Description num_output_channels int

The number of channels in the output image. If greater than 1, the grayscale channel will be replicated. Default: 3.

method Literal[\"weighted_average\", \"from_lab\", \"desaturation\", \"average\", \"max\", \"pca\"]

The method used for grayscale conversion:

  • \"weighted_average\": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception.
  • \"from_lab\": Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results.
  • \"desaturation\": Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well.
  • \"average\": Simple average of all channels. Works with any number of channels. Fast but may not give realistic results.
  • \"max\": Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results.
  • \"pca\": Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.

p float

Probability of applying the transform. Default: 0.5.

Exceptions:

Type Description TypeError

If the input image doesn't have 3 channels for methods that require it.

Note

  • The transform first converts the input image to single-channel grayscale, then replicates this channel if num_output_channels > 1.
  • \"weighted_average\" and \"from_lab\" are typically used in image processing and computer vision applications where accurate representation of human perception is important.
  • \"desaturation\" and \"average\" are often used in simple image manipulation tools or when computational speed is a priority.
  • \"max\" method can be useful in scenarios where preserving bright features is important, such as in some medical imaging applications.
  • \"pca\" might be used in advanced image analysis tasks or when dealing with hyperspectral images.

Image types: uint8, float32

Returns:

Type Description np.ndarray

Grayscale image with the specified number of channels.
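
The conversion methods above reduce to simple per-pixel arithmetic. A minimal NumPy sketch of two of them, plus the documented transform usage (the manual formulas are illustrative only, not the library's internal implementation):

Python
>>> import numpy as np
>>> import albumentations as A
>>>
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>>
>>> # "weighted_average": perceptual weights applied to the R, G, B channels
>>> weighted = image.astype(np.float32) @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
>>>
>>> # "desaturation": mean of the per-pixel channel maximum and minimum
>>> desaturated = (image.max(axis=-1) + image.min(axis=-1).astype(np.float32)) / 2
>>>
>>> # The transform itself replicates the grayscale channel num_output_channels times
>>> gray = A.ToGray(method="weighted_average", num_output_channels=3, p=1.0)(image=image)["image"]
>>> assert gray.shape == (100, 100, 3)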

Source code in albumentations/augmentations/transforms.py Python
class ToGray(ImageOnlyTransform):\n    \"\"\"Convert an image to grayscale and optionally replicate the grayscale channel.\n\n    This transform first converts a color image to a single-channel grayscale image using various methods,\n    then replicates the grayscale channel if num_output_channels is greater than 1.\n\n    Args:\n        num_output_channels (int): The number of channels in the output image. If greater than 1,\n            the grayscale channel will be replicated. Default: 3.\n        method (Literal[\"weighted_average\", \"from_lab\", \"desaturation\", \"average\", \"max\", \"pca\"]):\n            The method used for grayscale conversion:\n            - \"weighted_average\": Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B).\n              Works only with 3-channel images. Provides realistic results based on human perception.\n            - \"from_lab\": Extracts the L channel from the LAB color space.\n              Works only with 3-channel images. Gives perceptually uniform results.\n            - \"desaturation\": Averages the maximum and minimum values across channels.\n              Works with any number of channels. Fast but may not preserve perceived brightness well.\n            - \"average\": Simple average of all channels.\n              Works with any number of channels. Fast but may not give realistic results.\n            - \"max\": Takes the maximum value across all channels.\n              Works with any number of channels. Tends to produce brighter results.\n            - \"pca\": Applies Principal Component Analysis to reduce channels.\n              Works with any number of channels. Can preserve more information but is computationally intensive.\n        p (float): Probability of applying the transform. 
Default: 0.5.\n\n    Raises:\n        TypeError: If the input image doesn't have 3 channels for methods that require it.\n\n    Note:\n        - The transform first converts the input image to single-channel grayscale, then replicates\n          this channel if num_output_channels > 1.\n        - \"weighted_average\" and \"from_lab\" are typically used in image processing and computer vision\n          applications where accurate representation of human perception is important.\n        - \"desaturation\" and \"average\" are often used in simple image manipulation tools or when\n          computational speed is a priority.\n        - \"max\" method can be useful in scenarios where preserving bright features is important,\n          such as in some medical imaging applications.\n        - \"pca\" might be used in advanced image analysis tasks or when dealing with hyperspectral images.\n\n    Image types:\n        uint8, float32\n\n    Returns:\n        np.ndarray: Grayscale image with the specified number of channels.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_output_channels: int = Field(\n            default=3,\n            description=\"The number of output channels.\",\n            ge=1,\n        )\n        method: Literal[\n            \"weighted_average\",\n            \"from_lab\",\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        ]\n\n    def __init__(\n        self,\n        num_output_channels: int = 3,\n        method: Literal[\n            \"weighted_average\",\n            \"from_lab\",\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        ] = \"weighted_average\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_output_channels = num_output_channels\n        self.method = method\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_grayscale_image(img):\n            warnings.warn(\"The image is already gray.\", stacklevel=2)\n            return img\n\n        num_channels = get_num_channels(img)\n\n        if num_channels != NUM_RGB_CHANNELS and self.method not in {\n            \"desaturation\",\n            \"average\",\n            \"max\",\n            \"pca\",\n        }:\n            msg = \"ToGray transformation expects 3-channel images.\"\n            raise TypeError(msg)\n\n        return fmain.to_gray(img, self.num_output_channels, self.method)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_output_channels\", \"method\"\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToRGB","title":"class ToRGB (num_output_channels=3, p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert an input image from grayscale to RGB format.

Parameters:

Name Type Description num_output_channels int

The number of channels in the output image. Default: 3.

p float

Probability of applying the transform. Default: 1.0.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1

Note

  • For single-channel (grayscale) images, the channel is replicated to create an RGB image.
  • If the input is already a 3-channel RGB image, it is returned unchanged.
  • This transform does not change the data type of the image (e.g., uint8 remains uint8).

Exceptions:

Type Description TypeError

If the input image has more than 1 channel.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n>>> # Convert a grayscale image to RGB\n>>> transform = A.Compose([A.ToRGB(p=1.0)])\n>>> grayscale_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n>>> rgb_image = transform(image=grayscale_image)['image']\n>>> assert rgb_image.shape == (100, 100, 3)\n

Source code in albumentations/augmentations/transforms.py Python
class ToRGB(ImageOnlyTransform):\n    \"\"\"Convert an input image from grayscale to RGB format.\n\n    Args:\n        num_output_channels (int): The number of channels in the output image. Default: 3.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1\n\n    Note:\n        - For single-channel (grayscale) images, the channel is replicated to create an RGB image.\n        - If the input is already a 3-channel RGB image, it is returned unchanged.\n        - This transform does not change the data type of the image (e.g., uint8 remains uint8).\n\n    Raises:\n        TypeError: If the input image has more than 1 channel.\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        >>> # Convert a grayscale image to RGB\n        >>> transform = A.Compose([A.ToRGB(p=1.0)])\n        >>> grayscale_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n        >>> rgb_image = transform(image=grayscale_image)['image']\n        >>> assert rgb_image.shape == (100, 100, 3)\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_output_channels: int = Field(ge=1)\n\n    def __init__(\n        self,\n        num_output_channels: int = 3,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.num_output_channels = num_output_channels\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_rgb_image(img):\n            warnings.warn(\"The image is already an RGB.\", stacklevel=2)\n            return np.ascontiguousarray(img)\n        if not is_grayscale_image(img):\n            msg = \"ToRGB transformation expects 2-dim images or 3-dim with the last dimension equal to 1.\"\n            raise TypeError(msg)\n\n        return fmain.grayscale_to_multichannel(\n            img,\n            num_output_channels=self.num_output_channels,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str]:\n        return (\"num_output_channels\",)\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.ToSepia","title":"class ToSepia (p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply a sepia filter to the input image.

This transform converts a color image to a sepia tone, giving it a warm, brownish tint that is reminiscent of old photographs. The sepia effect is achieved by applying a specific color transformation matrix to the RGB channels of the input image. For grayscale images, the transform is a no-op and returns the original image.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Number of channels: 1,3

Note

  • The sepia effect only works with RGB images (3 channels). For grayscale images, the original image is returned unchanged since the sepia transformation would have no visible effect when R=G=B.
  • The sepia effect is created using a fixed color transformation matrix: [[0.393, 0.769, 0.189], [0.349, 0.686, 0.168], [0.272, 0.534, 0.131]]
  • The output image will have the same data type as the input image.
  • For float32 images, ensure the input values are in the range [0, 1].

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n# Apply sepia effect to a uint8 RGB image\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ToSepia(p=1.0)\n>>> sepia_image = transform(image=image)['image']\n>>> assert sepia_image.shape == image.shape\n>>> assert sepia_image.dtype == np.uint8\n>>>\n# Apply sepia effect to a float32 RGB image\n>>> image = np.random.rand(100, 100, 3).astype(np.float32)\n>>> transform = A.ToSepia(p=1.0)\n>>> sepia_image = transform(image=image)['image']\n>>> assert sepia_image.shape == image.shape\n>>> assert sepia_image.dtype == np.float32\n>>> assert 0 <= sepia_image.min() <= sepia_image.max() <= 1.0\n>>>\n# No effect on grayscale images\n>>> gray_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n>>> transform = A.ToSepia(p=1.0)\n>>> result = transform(image=gray_image)['image']\n>>> assert np.array_equal(result, gray_image)\n

Mathematical Formulation: Given an input pixel [R, G, B], the sepia tone is calculated as:

  R_sepia = 0.393*R + 0.769*G + 0.189*B
  G_sepia = 0.349*R + 0.686*G + 0.168*B
  B_sepia = 0.272*R + 0.534*G + 0.131*B

For grayscale images where R=G=B, this transformation would result in a simple scaling of the original value, so we skip it.

The output values are clipped to the valid range for the image's data type.
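
For reference, the same matrix can be applied by hand with NumPy; a minimal sketch of the formulation above (illustrative only, the transform handles dtype and clipping internally):

Python
>>> import numpy as np
>>>
>>> sepia_matrix = np.array(
...     [[0.393, 0.769, 0.189],
...      [0.349, 0.686, 0.168],
...      [0.272, 0.534, 0.131]],
...     dtype=np.float32,
... )
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # Each output channel is a weighted sum of the input R, G, B channels
>>> sepia = image.astype(np.float32) @ sepia_matrix.T
>>> sepia = np.clip(sepia, 0, 255).astype(np.uint8)
>>> assert sepia.shape == image.shape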

See Also: ToGray: For converting images to grayscale instead of sepia.

Source code in albumentations/augmentations/transforms.py Python
class ToSepia(ImageOnlyTransform):\n    \"\"\"Apply a sepia filter to the input image.\n\n    This transform converts a color image to a sepia tone, giving it a warm, brownish tint\n    that is reminiscent of old photographs. The sepia effect is achieved by applying a\n    specific color transformation matrix to the RGB channels of the input image.\n    For grayscale images, the transform is a no-op and returns the original image.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        1,3\n\n    Note:\n        - The sepia effect only works with RGB images (3 channels). For grayscale images,\n          the original image is returned unchanged since the sepia transformation would\n          have no visible effect when R=G=B.\n        - The sepia effect is created using a fixed color transformation matrix:\n          [[0.393, 0.769, 0.189],\n           [0.349, 0.686, 0.168],\n           [0.272, 0.534, 0.131]]\n        - The output image will have the same data type as the input image.\n        - For float32 images, ensure the input values are in the range [0, 1].\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        # Apply sepia effect to a uint8 RGB image\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> sepia_image = transform(image=image)['image']\n        >>> assert sepia_image.shape == image.shape\n        >>> assert sepia_image.dtype == np.uint8\n        >>>\n        # Apply sepia effect to a float32 RGB image\n        >>> image = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> sepia_image = transform(image=image)['image']\n        >>> assert sepia_image.shape == image.shape\n        >>> assert sepia_image.dtype == np.float32\n        >>> assert 0 <= sepia_image.min() <= sepia_image.max() <= 1.0\n        >>>\n        # No effect on grayscale images\n        >>> gray_image = np.random.randint(0, 256, (100, 100), dtype=np.uint8)\n        >>> transform = A.ToSepia(p=1.0)\n        >>> result = transform(image=gray_image)['image']\n        >>> assert np.array_equal(result, gray_image)\n\n    Mathematical Formulation:\n        Given an input pixel [R, G, B], the sepia tone is calculated as:\n        R_sepia = 0.393*R + 0.769*G + 0.189*B\n        G_sepia = 0.349*R + 0.686*G + 0.168*B\n        B_sepia = 0.272*R + 0.534*G + 0.131*B\n\n        For grayscale images where R=G=B, this transformation would result in a simple\n        scaling of the original value, so we skip it.\n\n        The output values are clipped to the valid range for the image's data type.\n\n    See Also:\n        ToGray: For converting images to grayscale instead of sepia.\n    \"\"\"\n\n    def __init__(self, p: float = 0.5, always_apply: bool | None = None):\n        super().__init__(p, always_apply)\n        self.sepia_transformation_matrix = np.array(\n            [[0.393, 0.769, 0.189], [0.349, 0.686, 0.168], [0.272, 0.534, 0.131]],\n        )\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        if is_grayscale_image(img):\n            return img\n\n        if not is_rgb_image(img):\n            msg = \"ToSepia transformation expects 1 or 3-channel images.\"\n            raise TypeError(msg)\n        return fmain.linear_transformation_rgb(img, 
self.sepia_transformation_matrix)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.UniformParams","title":"class UniformParams [view source on GitHub]","text":"

Source code in albumentations/augmentations/transforms.py Python
class UniformParams(NoiseParamsBase):\n    noise_type: Literal[\"uniform\"] = \"uniform\"\n    ranges: list[Sequence[float]] = Field(\n        description=\"List of (min, max) ranges for each channel\",\n        min_length=1,\n    )\n\n    @field_validator(\"ranges\", mode=\"after\")\n    @classmethod\n    def validate_ranges(cls, v: list[Sequence[float]]) -> list[tuple[float, float]]:\n        result = []\n        for range_values in v:\n            if len(range_values) != PAIR:\n                raise ValueError(\"Each range must have exactly 2 values\")\n            min_val, max_val = range_values\n            if not (-1 <= min_val <= max_val <= 1):\n                raise ValueError(\"Range values must be in [-1, 1] and min <= max\")\n            result.append((float(min_val), float(max_val)))\n        return result\n
"},{"location":"api_reference/augmentations/transforms/#albumentations.augmentations.transforms.UnsharpMask","title":"class UnsharpMask (blur_limit=(3, 7), sigma_limit=0.0, alpha=(0.2, 0.5), threshold=10, p=0.5, always_apply=None) [view source on GitHub]","text":"

Sharpen the input image using Unsharp Masking and overlay the result with the original image.

Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased sharpness. This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask which is combined with the original image to enhance edges and fine details.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Maximum Gaussian kernel size for blurring the input image. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1. If a single value is provided, blur_limit will be in range (0, blur_limit). Default: (3, 7).

sigma_limit tuple[float, float] | float

Gaussian kernel standard deviation. Must be in range [0, inf). If a single value is provided, sigma_limit will be in range (0, sigma_limit). If set to 0, sigma will be computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Default: 0.

alpha tuple[float, float]

Range to choose the visibility of the sharpened image. At 0, only the original image is visible; at 1.0, only its sharpened version is visible. Default: (0.2, 0.5).

threshold int

Value to limit sharpening only to areas with a high pixel difference between the original image and its smoothed version. Higher threshold means less sharpening on flat areas. Must be in range [0, 255]. Default: 10.

p float

probability of applying the transform. Default: 0.5.

Targets

image, volume

Image types: uint8, float32

Note

  • The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian blurred version.
  • The final image is computed as: output = I + M if |I - G| > threshold, else I.
  • Higher alpha values increase the strength of the sharpening effect.
  • Higher threshold values limit the sharpening effect to areas with more significant edges or details.
  • The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.
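
A rough NumPy/OpenCV sketch of the masking logic in the note above (illustrative only; the library's actual implementation lives in fmain.unsharp_mask and differs in details such as kernel handling and normalization):

Python
>>> import cv2
>>> import numpy as np
>>>
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> ksize, sigma, alpha, threshold = 5, 1.0, 0.3, 10
>>>
>>> blurred = cv2.GaussianBlur(image, (ksize, ksize), sigmaX=sigma)
>>> diff = image.astype(np.float32) - blurred.astype(np.float32)
>>> mask = alpha * diff
>>> # Sharpen only where the local difference exceeds the threshold
>>> sharpened = np.where(np.abs(diff) > threshold, image + mask, image.astype(np.float32))
>>> sharpened = np.clip(sharpened, 0, 255).astype(np.uint8)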

References

  • https://en.wikipedia.org/wiki/Unsharp_masking
  • https://arxiv.org/pdf/2107.10833.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>>\n# Apply UnsharpMask with default parameters\n>>> transform = A.UnsharpMask(p=1.0)\n>>> sharpened_image = transform(image=image)['image']\n>>>\n# Apply UnsharpMask with custom parameters\n>>> transform = A.UnsharpMask(\n...     blur_limit=(3, 7),\n...     sigma_limit=(0.1, 0.5),\n...     alpha=(0.2, 0.7),\n...     threshold=15,\n...     p=1.0\n... )\n>>> sharpened_image = transform(image=image)['image']\n

Source code in albumentations/augmentations/transforms.py Python
class UnsharpMask(ImageOnlyTransform):\n    \"\"\"Sharpen the input image using Unsharp Masking processing and overlays the result with the original image.\n\n    Unsharp masking is a technique that enhances edge contrast in an image, creating the illusion of increased\n        sharpness.\n    This transform applies Gaussian blur to create a blurred version of the image, then uses this to create a mask\n    which is combined with the original image to enhance edges and fine details.\n\n    Args:\n        blur_limit (tuple[int, int] | int): maximum Gaussian kernel size for blurring the input image.\n            Must be zero or odd and in range [0, inf). If set to 0 it will be computed from sigma\n            as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`.\n            If set single value `blur_limit` will be in range (0, blur_limit).\n            Default: (3, 7).\n        sigma_limit (tuple[float, float] | float): Gaussian kernel standard deviation. Must be in range [0, inf).\n            If set single value `sigma_limit` will be in range (0, sigma_limit).\n            If set to 0 sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. Default: 0.\n        alpha (tuple[float, float]): range to choose the visibility of the sharpened image.\n            At 0, only the original image is visible, at 1.0 only its sharpened version is visible.\n            Default: (0.2, 0.5).\n        threshold (int): Value to limit sharpening only for areas with high pixel difference between original image\n            and it's smoothed version. Higher threshold means less sharpening on flat areas.\n            Must be in range [0, 255]. Default: 10.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The algorithm creates a mask M = (I - G) * alpha, where I is the original image and G is the Gaussian\n            blurred version.\n        - The final image is computed as: output = I + M if |I - G| > threshold, else I.\n        - Higher alpha values increase the strength of the sharpening effect.\n        - Higher threshold values limit the sharpening effect to areas with more significant edges or details.\n        - The blur_limit and sigma_limit parameters control the Gaussian blur used to create the mask.\n\n    References:\n        - https://en.wikipedia.org/wiki/Unsharp_masking\n        - https://arxiv.org/pdf/2107.10833.pdf\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>>\n        # Apply UnsharpMask with default parameters\n        >>> transform = A.UnsharpMask(p=1.0)\n        >>> sharpened_image = transform(image=image)['image']\n        >>>\n        # Apply UnsharpMask with custom parameters\n        >>> transform = A.UnsharpMask(\n        ...     blur_limit=(3, 7),\n        ...     sigma_limit=(0.1, 0.5),\n        ...     alpha=(0.2, 0.7),\n        ...     threshold=15,\n        ...     p=1.0\n        ... 
)\n        >>> sharpened_image = transform(image=image)['image']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        sigma_limit: NonNegativeFloatRangeType\n        alpha: ZeroOneRangeType\n        threshold: int = Field(ge=0, le=255)\n        blur_limit: ScaleIntType\n\n        @field_validator(\"blur_limit\")\n        @classmethod\n        def process_blur(\n            cls,\n            value: ScaleIntType,\n            info: ValidationInfo,\n        ) -> tuple[int, int]:\n            return fblur.process_blur_limit(value, info, min_value=3)\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_limit: ScaleFloatType = 0.0,\n        alpha: ScaleFloatType = (0.2, 0.5),\n        threshold: int = 10,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_limit = cast(tuple[float, float], sigma_limit)\n        self.alpha = cast(tuple[float, float], alpha)\n        self.threshold = threshold\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"ksize\": self.py_random.randrange(\n                self.blur_limit[0],\n                self.blur_limit[1] + 1,\n                2,\n            ),\n            \"sigma\": self.py_random.uniform(*self.sigma_limit),\n            \"alpha\": self.py_random.uniform(*self.alpha),\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        ksize: int,\n        sigma: int,\n        alpha: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fmain.unsharp_mask(\n            img,\n            ksize,\n            sigma=sigma,\n            alpha=alpha,\n            threshold=self.threshold,\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"blur_limit\", \"sigma_limit\", \"alpha\", \"threshold\"\n
"},{"location":"api_reference/augmentations/blur/","title":"Index","text":"
  • Blur transforms (albumentations.augmentations.blur.transforms)
"},{"location":"api_reference/augmentations/blur/functional/","title":"Blur functional transforms (augmentations.blur.functional)","text":""},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.create_motion_kernel","title":"def create_motion_kernel (kernel_size, angle, direction, allow_shifted, random_state) [view source on GitHub]","text":"

Create a motion blur kernel.

Parameters:

Name Type Description kernel_size int

Size of the kernel (must be odd)

angle float

Angle in degrees (counter-clockwise)

direction float

Blur direction (-1.0 to 1.0)

allow_shifted bool

Allow kernel to be randomly shifted from center

random_state Random

Python's random.Random instance

Returns:

Type Description np.ndarray

Motion blur kernel
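
A short usage sketch, assuming the documented import path; applying the kernel with cv2.filter2D is our own illustration of how such a kernel can be used, not necessarily how the library applies it:

Python
>>> import random
>>> import cv2
>>> import numpy as np
>>> from albumentations.augmentations.blur.functional import create_motion_kernel
>>>
>>> kernel = create_motion_kernel(
...     kernel_size=7,              # must be odd
...     angle=45.0,                 # degrees, counter-clockwise
...     direction=0.5,              # bias the blur towards one end of the line
...     allow_shifted=True,
...     random_state=random.Random(0),
... )
>>> kernel = kernel / kernel.sum()  # normalize so overall brightness is preserved
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> blurred = cv2.filter2D(image, ddepth=-1, kernel=kernel)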

Source code in albumentations/augmentations/blur/functional.py Python
def create_motion_kernel(\n    kernel_size: int,\n    angle: float,\n    direction: float,\n    allow_shifted: bool,\n    random_state: Random,\n) -> np.ndarray:\n    \"\"\"Create a motion blur kernel.\n\n    Args:\n        kernel_size: Size of the kernel (must be odd)\n        angle: Angle in degrees (counter-clockwise)\n        direction: Blur direction (-1.0 to 1.0)\n        allow_shifted: Allow kernel to be randomly shifted from center\n        random_state: Python's random.Random instance\n\n    Returns:\n        Motion blur kernel\n    \"\"\"\n    kernel = np.zeros((kernel_size, kernel_size), dtype=np.float32)\n    center = kernel_size // 2\n\n    # Convert angle to radians\n    angle_rad = np.deg2rad(angle)\n\n    # Calculate direction vector\n    dx = np.cos(angle_rad)\n    dy = np.sin(angle_rad)\n\n    # Create line points with direction bias\n    line_length = kernel_size // 2\n    t = np.linspace(-line_length, line_length, kernel_size * 2)\n\n    # Apply direction bias\n    if direction != 0:\n        t = t * (1 + abs(direction))\n        if direction < 0:\n            t = t * -1\n\n    # Generate line coordinates\n    x = center + dx * t\n    y = center + dy * t\n\n    # Apply random shift if allowed\n    if allow_shifted and random_state is not None:\n        shift_x = random_state.uniform(-1, 1) * line_length / 2\n        shift_y = random_state.uniform(-1, 1) * line_length / 2\n        x += shift_x\n        y += shift_y\n\n    # Round coordinates and clip to kernel bounds\n    x = np.clip(np.round(x), 0, kernel_size - 1).astype(int)\n    y = np.clip(np.round(y), 0, kernel_size - 1).astype(int)\n\n    # Keep only unique points to avoid multiple assignments\n    points = np.unique(np.column_stack([y, x]), axis=0)\n    kernel[points[:, 0], points[:, 1]] = 1\n\n    # Ensure at least one point is set\n    if not kernel.any():\n        kernel[center, center] = 1\n\n    return kernel\n
"},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.process_blur_limit","title":"def process_blur_limit (value, info, min_value=0) [view source on GitHub]","text":"

Process blur limit to ensure valid kernel sizes.

Source code in albumentations/augmentations/blur/functional.py Python
def process_blur_limit(value: ScaleIntType, info: ValidationInfo, min_value: int = 0) -> tuple[int, int]:\n    \"\"\"Process blur limit to ensure valid kernel sizes.\"\"\"\n    result = value if isinstance(value, Sequence) else (min_value, value)\n\n    result = _ensure_min_value(result, min_value, info.field_name)\n    result = _ensure_odd_values(result, info.field_name)\n\n    if result[0] > result[1]:\n        final_result = (result[1], result[1])\n        warn(\n            f\"{info.field_name}: Invalid range {result} (min > max). \"\n            f\"Range automatically adjusted to {final_result}.\",\n            UserWarning,\n            stacklevel=2,\n        )\n        return final_result\n\n    return result\n
"},{"location":"api_reference/augmentations/blur/functional/#albumentations.augmentations.blur.functional.sample_odd_from_range","title":"def sample_odd_from_range (random_state, low, high) [view source on GitHub]","text":"

Sample an odd number from the range [low, high] (inclusive).

Parameters:

Name Type Description random_state Random

instance of random.Random

low int

lower bound (will be converted to nearest valid odd number)

high int

upper bound (will be converted to nearest valid odd number)

Returns:

Type Description int

Randomly sampled odd number from the range

Note

  • Input values will be converted to nearest valid odd numbers:
      • Values less than 3 will become 3
      • Even values will be rounded up to next odd number
  • After normalization, high must be >= low
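
A brief usage sketch, assuming the documented import path (the exact value depends on the seed):

Python
>>> import random
>>> from albumentations.augmentations.blur.functional import sample_odd_from_range
>>>
>>> rng = random.Random(0)
>>> k = sample_odd_from_range(rng, low=4, high=9)
>>> # low=4 is rounded up to 5, high=9 stays 9, so k is one of 5, 7, 9
>>> assert k in (5, 7, 9)
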
Source code in albumentations/augmentations/blur/functional.py Python
def sample_odd_from_range(random_state: Random, low: int, high: int) -> int:\n    \"\"\"Sample an odd number from the range [low, high] (inclusive).\n\n    Args:\n        random_state: instance of random.Random\n        low: lower bound (will be converted to nearest valid odd number)\n        high: upper bound (will be converted to nearest valid odd number)\n\n    Returns:\n        Randomly sampled odd number from the range\n\n    Note:\n        - Input values will be converted to nearest valid odd numbers:\n          * Values less than 3 will become 3\n          * Even values will be rounded up to next odd number\n        - After normalization, high must be >= low\n    \"\"\"\n    # Normalize low value\n    low = max(3, low + (low % 2 == 0))\n    # Normalize high value\n    high = max(3, high + (high % 2 == 0))\n\n    # Ensure high >= low after normalization\n    high = max(high, low)\n\n    if low == high:\n        return low\n\n    # Calculate number of possible odd values\n    num_odd_values = (high - low) // 2 + 1\n    # Generate random index and convert to corresponding odd number\n    rand_idx = random_state.randint(0, num_odd_values - 1)\n    return low + (2 * rand_idx)\n
"},{"location":"api_reference/augmentations/blur/transforms/","title":"Blur transforms (augmentations.blur.transforms)","text":""},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.AdvancedBlur","title":"class AdvancedBlur (blur_limit=(3, 7), sigma_x_limit=(0.2, 1.0), sigma_y_limit=(0.2, 1.0), sigmaX_limit=None, sigmaY_limit=None, rotate_limit=(-90, 90), beta_limit=(0.5, 8.0), noise_limit=(0.9, 1.1), always_apply=None, p=0.5) [view source on GitHub]","text":"

Applies a Generalized Gaussian blur to the input image with randomized parameters for advanced data augmentation.

This transform creates a custom blur kernel based on the Generalized Gaussian distribution, which allows for a wide range of blur effects beyond standard Gaussian blur. It then applies this kernel to the input image through convolution. The transform also incorporates noise into the kernel, resulting in a unique combination of blurring and noise injection.

Key features of this augmentation:

  1. Generalized Gaussian Kernel: Uses a generalized normal distribution to create kernels that can range from box-like blurs to very peaked blurs, controlled by the beta parameter.

  2. Anisotropic Blurring: Allows for different blur strengths in horizontal and vertical directions (controlled by sigma_x and sigma_y), and rotation of the kernel.

  3. Kernel Noise: Adds multiplicative noise to the kernel before applying it to the image, creating more diverse and realistic blur effects.

Implementation Details: The kernel is generated using a 2D Generalized Gaussian function. The process involves:

  1. Creating a 2D grid based on the kernel size
  2. Applying rotation to this grid
  3. Calculating the kernel values using the Generalized Gaussian formula
  4. Adding multiplicative noise to the kernel
  5. Normalizing the kernel

The resulting kernel is then applied to the image using convolution.
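
A condensed NumPy sketch of the kernel construction outlined above (illustrative only; the transform's own get_params, shown in the source below, is the authoritative version):

Python
>>> import numpy as np
>>>
>>> ksize, sigma_x, sigma_y, beta = 5, 0.6, 0.8, 2.0
>>> angle = np.deg2rad(30.0)
>>>
>>> # 1. 2D grid centered at zero
>>> ax = np.arange(-(ksize // 2), ksize // 2 + 1, dtype=np.float32)
>>> grid = np.stack(np.meshgrid(ax, ax), axis=-1)   # shape (ksize, ksize, 2)
>>>
>>> # 2.-3. Rotated covariance matrix and the Generalized Gaussian formula
>>> d = np.diag([sigma_x**2, sigma_y**2])
>>> u = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
>>> inv_sigma = np.linalg.inv(u @ d @ u.T)
>>> kernel = np.exp(-0.5 * np.power(np.sum((grid @ inv_sigma) * grid, axis=-1), beta))
>>>
>>> # 4. Multiplicative noise and 5. normalization
>>> kernel *= np.random.default_rng(0).uniform(0.9, 1.1, size=(ksize, ksize))
>>> kernel = (kernel / kernel.sum()).astype(np.float32)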

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the size of the blur kernel. If a single int is provided, the kernel size will be randomly chosen between 3 and that value. Must be odd and ≥ 3. Larger values create stronger blur effects. Default: (3, 7)

sigma_x_limit tuple[float, float] | float

Controls the spread of the blur in the x direction. Higher values increase blur strength. If a single float is provided, the range will be (0, limit). Default: (0.2, 1.0)

sigma_y_limit tuple[float, float] | float

Controls the spread of the blur in the y direction. Higher values increase blur strength. If a single float is provided, the range will be (0, limit). Default: (0.2, 1.0)

rotate_limit tuple[int, int] | int

Range of angles (in degrees) for rotating the kernel. This rotation allows for diagonal blur directions. If limit is a single int, an angle is picked from (-rotate_limit, rotate_limit). Default: (-90, 90)

beta_limit tuple[float, float] | float

Shape parameter of the Generalized Gaussian distribution:

  • beta = 1 gives a standard Gaussian distribution
  • beta < 1 creates heavier tails, resulting in more uniform, box-like blur
  • beta > 1 creates lighter tails, resulting in more peaked, focused blur

Default: (0.5, 8.0)

noise_limit tuple[float, float] | float

Controls the strength of multiplicative noise applied to the kernel. Values around 1.0 keep the original kernel mostly intact, while values further from 1.0 introduce more variation. Default: (0.9, 1.1)

p float

Probability of applying the transform. Default: 0.5

Notes

  • This transform is particularly useful for simulating complex, real-world blur effects that go beyond simple Gaussian blur.
  • The combination of blur and noise can help in creating more robust models by simulating a wider range of image degradations.
  • Extreme values, especially for beta and noise, may result in unrealistic effects and should be used cautiously.

Reference

This transform is inspired by techniques described in: \"Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data\" https://arxiv.org/abs/2107.10833

Targets

image

Image types: uint8, float32

Source code in albumentations/augmentations/blur/transforms.py Python
class AdvancedBlur(ImageOnlyTransform):\n    \"\"\"Applies a Generalized Gaussian blur to the input image with randomized parameters for advanced data augmentation.\n\n    This transform creates a custom blur kernel based on the Generalized Gaussian distribution,\n    which allows for a wide range of blur effects beyond standard Gaussian blur. It then applies\n    this kernel to the input image through convolution. The transform also incorporates noise\n    into the kernel, resulting in a unique combination of blurring and noise injection.\n\n    Key features of this augmentation:\n\n    1. Generalized Gaussian Kernel: Uses a generalized normal distribution to create kernels\n       that can range from box-like blurs to very peaked blurs, controlled by the beta parameter.\n\n    2. Anisotropic Blurring: Allows for different blur strengths in horizontal and vertical\n       directions (controlled by sigma_x and sigma_y), and rotation of the kernel.\n\n    3. Kernel Noise: Adds multiplicative noise to the kernel before applying it to the image,\n       creating more diverse and realistic blur effects.\n\n    Implementation Details:\n        The kernel is generated using a 2D Generalized Gaussian function. The process involves:\n        1. Creating a 2D grid based on the kernel size\n        2. Applying rotation to this grid\n        3. Calculating the kernel values using the Generalized Gaussian formula\n        4. Adding multiplicative noise to the kernel\n        5. Normalizing the kernel\n\n        The resulting kernel is then applied to the image using convolution.\n\n    Args:\n        blur_limit (tuple[int, int] | int, optional): Controls the size of the blur kernel. If a single int\n            is provided, the kernel size will be randomly chosen between 3 and that value.\n            Must be odd and \u2265 3. Larger values create stronger blur effects.\n            Default: (3, 7)\n\n        sigma_x_limit (tuple[float, float] | float): Controls the spread of the blur in the x direction.\n            Higher values increase blur strength.\n            If a single float is provided, the range will be (0, limit).\n            Default: (0.2, 1.0)\n\n        sigma_y_limit (tuple[float, float] | float): Controls the spread of the blur in the y direction.\n            Higher values increase blur strength.\n            If a single float is provided, the range will be (0, limit).\n            Default: (0.2, 1.0)\n\n        rotate_limit (tuple[int, int] | int): Range of angles (in degrees) for rotating the kernel.\n            This rotation allows for diagonal blur directions. If limit is a single int, an angle is picked\n            from (-rotate_limit, rotate_limit).\n            Default: (-90, 90)\n\n        beta_limit (tuple[float, float] | float): Shape parameter of the Generalized Gaussian distribution.\n            - beta = 1 gives a standard Gaussian distribution\n            - beta < 1 creates heavier tails, resulting in more uniform, box-like blur\n            - beta > 1 creates lighter tails, resulting in more peaked, focused blur\n            Default: (0.5, 8.0)\n\n        noise_limit (tuple[float, float] | float): Controls the strength of multiplicative noise\n            applied to the kernel. Values around 1.0 keep the original kernel mostly intact,\n            while values further from 1.0 introduce more variation.\n            Default: (0.75, 1.25)\n\n        p (float): Probability of applying the transform. 
Default: 0.5\n\n    Notes:\n        - This transform is particularly useful for simulating complex, real-world blur effects\n          that go beyond simple Gaussian blur.\n        - The combination of blur and noise can help in creating more robust models by simulating\n          a wider range of image degradations.\n        - Extreme values, especially for beta and noise, may result in unrealistic effects and\n          should be used cautiously.\n\n    Reference:\n        This transform is inspired by techniques described in:\n        \"Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data\"\n        https://arxiv.org/abs/2107.10833\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        sigma_x_limit: NonNegativeFloatRangeType\n        sigma_y_limit: NonNegativeFloatRangeType\n        beta_limit: NonNegativeFloatRangeType\n        noise_limit: NonNegativeFloatRangeType\n        rotate_limit: SymmetricRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_beta_limit(cls, value: ScaleFloatType) -> tuple[float, float]:\n            result = to_tuple(value, low=0)\n            if not (result[0] < 1.0 < result[1]):\n                msg = \"beta_limit is expected to include 1.0.\"\n                raise ValueError(msg)\n            return result\n\n        @model_validator(mode=\"after\")\n        def validate_limits(self) -> Self:\n            if (\n                isinstance(self.sigma_x_limit, (tuple, list))\n                and self.sigma_x_limit[0] == 0\n                and isinstance(self.sigma_y_limit, (tuple, list))\n                and self.sigma_y_limit[0] == 0\n            ):\n                msg = \"sigma_x_limit and sigma_y_limit minimum value cannot be both equal to 0.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_x_limit: ScaleFloatType = (0.2, 1.0),\n        sigma_y_limit: ScaleFloatType = (0.2, 1.0),\n        sigmaX_limit: ScaleFloatType | None = None,  # noqa: N803\n        sigmaY_limit: ScaleFloatType | None = None,  # noqa: N803\n        rotate_limit: ScaleIntType = (-90, 90),\n        beta_limit: ScaleFloatType = (0.5, 8.0),\n        noise_limit: ScaleFloatType = (0.9, 1.1),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        if sigmaX_limit is not None:\n            warnings.warn(\n                \"sigmaX_limit is deprecated; use sigma_x_limit instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            sigma_x_limit = sigmaX_limit\n\n        if sigmaY_limit is not None:\n            warnings.warn(\n                \"sigmaY_limit is deprecated; use sigma_y_limit instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            sigma_y_limit = sigmaY_limit\n\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_x_limit = cast(tuple[float, float], sigma_x_limit)\n        self.sigma_y_limit = cast(tuple[float, float], sigma_y_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n        self.noise_limit = cast(tuple[float, float], noise_limit)\n\n    def apply(self, img: np.ndarray, kernel: np.ndarray, **params: Any) -> 
np.ndarray:\n        return fmain.convolve(img, kernel=kernel)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n        sigma_x = self.py_random.uniform(*self.sigma_x_limit)\n        sigma_y = self.py_random.uniform(*self.sigma_y_limit)\n        angle = np.deg2rad(self.py_random.uniform(*self.rotate_limit))\n\n        # Split into 2 cases to avoid selection of narrow kernels (beta > 1) too often.\n        beta = (\n            self.py_random.uniform(self.beta_limit[0], 1)\n            if self.py_random.random() < HALF\n            else self.py_random.uniform(1, self.beta_limit[1])\n        )\n\n        noise_matrix = self.random_generator.uniform(\n            *self.noise_limit,\n            size=(ksize, ksize),\n        )\n\n        # Generate mesh grid centered at zero.\n        ax = np.arange(-ksize // 2 + 1.0, ksize // 2 + 1.0)\n        # > Shape (ksize, ksize, 2)\n        grid = np.stack(np.meshgrid(ax, ax), axis=-1)\n\n        # Calculate rotated sigma matrix\n        d_matrix = np.array([[sigma_x**2, 0], [0, sigma_y**2]])\n        u_matrix = np.array(\n            [[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]],\n        )\n        sigma_matrix = np.dot(u_matrix, np.dot(d_matrix, u_matrix.T))\n\n        inverse_sigma = np.linalg.inv(sigma_matrix)\n        # Described in \"Parameter Estimation For Multivariate Generalized Gaussian Distributions\"\n        kernel = np.exp(\n            -0.5 * np.power(np.sum(np.dot(grid, inverse_sigma) * grid, 2), beta),\n        )\n        # Add noise\n        kernel *= noise_matrix\n\n        # Normalize kernel\n        kernel = kernel.astype(np.float32) / np.sum(kernel)\n        return {\"kernel\": kernel}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str, str, str]:\n        return (\n            \"blur_limit\",\n            \"sigma_x_limit\",\n            \"sigma_y_limit\",\n            \"rotate_limit\",\n            \"beta_limit\",\n            \"noise_limit\",\n        )\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.Blur","title":"class Blur (blur_limit=(3, 7), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply uniform box blur to the input image using a randomly sized square kernel.

This transform uses OpenCV's cv2.blur function, which performs a simple box filter blur. The size of the blur kernel is randomly selected for each application, allowing for varying degrees of blur intensity.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the range of the blur kernel size:

  • If a single int is provided, the kernel size will be randomly chosen between 3 and that value.
  • If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes.

The kernel size must be odd and greater than or equal to 3. Larger kernel sizes produce stronger blur effects. Default: (3, 7)

p float

Probability of applying the transform. Default: 0.5

Notes

  • The blur kernel is always square (same width and height).
  • Only odd kernel sizes are used to ensure the blur has a clear center pixel.
  • Box blur is faster than Gaussian blur but may produce less natural results.
  • This blur method averages all pixels under the kernel area, which can reduce noise but also reduce image detail.

Targets

image

Image types: uint8, float32

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Blur(blur_limit=(3, 7), p=1.0)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n

Source code in albumentations/augmentations/blur/transforms.py Python
class Blur(ImageOnlyTransform):\n    \"\"\"Apply uniform box blur to the input image using a randomly sized square kernel.\n\n    This transform uses OpenCV's cv2.blur function, which performs a simple box filter blur.\n    The size of the blur kernel is randomly selected for each application, allowing for\n    varying degrees of blur intensity.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Controls the range of the blur kernel size.\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 3 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            The kernel size must be odd and greater than or equal to 3.\n            Larger kernel sizes produce stronger blur effects.\n            Default: (3, 7)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Notes:\n        - The blur kernel is always square (same width and height).\n        - Only odd kernel sizes are used to ensure the blur has a clear center pixel.\n        - Box blur is faster than Gaussian blur but may produce less natural results.\n        - This blur method averages all pixels under the kernel area, which can\n          reduce noise but also reduce image detail.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Blur(blur_limit=(3, 7), p=1.0)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        pass\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fblur.blur(img, kernel)\n\n    def get_params(self) -> dict[str, Any]:\n        kernel = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n        return {\"kernel\": kernel}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"blur_limit\",)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.BlurInitSchema","title":"class BlurInitSchema [view source on GitHub]","text":"

Source code in albumentations/augmentations/blur/transforms.py Python
class BlurInitSchema(BaseTransformInitSchema):\n    blur_limit: ScaleIntType\n\n    @field_validator(\"blur_limit\")\n    @classmethod\n    def process_blur(cls, value: ScaleIntType, info: ValidationInfo) -> tuple[int, int]:\n        return fblur.process_blur_limit(value, info, min_value=3)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.Defocus","title":"class Defocus (radius=(3, 10), alias_blur=(0.1, 0.5), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply defocus blur to the input image.

This transform simulates the effect of an out-of-focus camera by applying a defocus blur to the image. It uses a combination of disc kernels and Gaussian blur to create a realistic defocus effect.

Parameters:

Name Type Description radius tuple[int, int] | int

Range for the radius of the defocus blur. If a single int is provided, the range will be [1, radius]. Larger values create a stronger blur effect. Default: (3, 10)

alias_blur tuple[float, float] | float

Range for the standard deviation of the Gaussian blur applied after the main defocus blur. This helps to reduce aliasing artifacts. If a single float is provided, the range will be (0, alias_blur). Larger values create a smoother, less aliased effect. Default: (0.1, 0.5)

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Note

  • The defocus effect is created using a disc kernel, which simulates the shape of a camera's aperture.
  • The additional Gaussian blur (alias_blur) helps to soften the edges of the disc kernel, creating a more natural-looking defocus effect.
  • Larger radius values will create a stronger, more noticeable defocus effect.
  • The alias_blur parameter can be used to fine-tune the appearance of the defocus, with larger values creating a smoother, potentially more realistic effect.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Defocus(radius=(4, 8), alias_blur=(0.2, 0.4), always_apply=True)\n>>> result = transform(image=image)\n>>> defocused_image = result['image']\n

References

  • https://en.wikipedia.org/wiki/Defocus_aberration
  • https://www.researchgate.net/publication/261311609_Realistic_Defocus_Blur_for_Multiplane_Computer-Generated_Holography

Source code in albumentations/augmentations/blur/transforms.py Python
class Defocus(ImageOnlyTransform):\n    \"\"\"Apply defocus blur to the input image.\n\n    This transform simulates the effect of an out-of-focus camera by applying a defocus blur\n    to the image. It uses a combination of disc kernels and Gaussian blur to create a realistic\n    defocus effect.\n\n    Args:\n        radius (tuple[int, int] | int): Range for the radius of the defocus blur.\n            If a single int is provided, the range will be [1, radius].\n            Larger values create a stronger blur effect.\n            Default: (3, 10)\n\n        alias_blur (tuple[float, float] | float): Range for the standard deviation of the Gaussian blur\n            applied after the main defocus blur. This helps to reduce aliasing artifacts.\n            If a single float is provided, the range will be (0, alias_blur).\n            Larger values create a smoother, more aliased effect.\n            Default: (0.1, 0.5)\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The defocus effect is created using a disc kernel, which simulates the shape of a camera's aperture.\n        - The additional Gaussian blur (alias_blur) helps to soften the edges of the disc kernel, creating a\n          more natural-looking defocus effect.\n        - Larger radius values will create a stronger, more noticeable defocus effect.\n        - The alias_blur parameter can be used to fine-tune the appearance of the defocus, with larger values\n          creating a smoother, potentially more realistic effect.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Defocus(radius=(4, 8), alias_blur=(0.2, 0.4), always_apply=True)\n        >>> result = transform(image=image)\n        >>> defocused_image = result['image']\n\n    References:\n        - https://en.wikipedia.org/wiki/Defocus_aberration\n        - https://www.researchgate.net/publication/261311609_Realistic_Defocus_Blur_for_Multiplane_Computer-Generated_Holography\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        radius: OnePlusIntRangeType\n        alias_blur: NonNegativeFloatRangeType\n\n    def __init__(\n        self,\n        radius: ScaleIntType = (3, 10),\n        alias_blur: ScaleFloatType = (0.1, 0.5),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.radius = cast(tuple[int, int], radius)\n        self.alias_blur = cast(tuple[float, float], alias_blur)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        radius: int,\n        alias_blur: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.defocus(img, radius, alias_blur)\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"radius\": self.py_random.randint(*self.radius),\n            \"alias_blur\": self.py_random.uniform(*self.alias_blur),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"radius\", \"alias_blur\")\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.GaussianBlur","title":"class GaussianBlur (blur_limit=(3, 7), sigma_limit=0, always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply Gaussian blur to the input image using a randomly sized kernel.

This transform blurs the input image using a Gaussian filter with a random kernel size and sigma value. Gaussian blur is a widely used image processing technique that reduces image noise and detail, creating a smoothing effect.

Parameters:

Name Type Description blur_limit tuple[int, int] | int

Controls the range of the Gaussian kernel size. - If a single int is provided, the kernel size will be randomly chosen between 0 and that value. - If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes. Must be zero or odd and in range [0, inf). If set to 0, it will be computed from sigma as round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1. Larger kernel sizes produce stronger blur effects. Default: (3, 7)

sigma_limit tuple[float, float] | float

Range for the Gaussian kernel standard deviation (sigma). Must be in range [0, inf). - If a single float is provided, sigma will be randomly chosen between 0 and that value. - If a tuple of two floats is provided, it defines the inclusive range of possible sigma values. If set to 0, sigma will be computed as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8. Larger sigma values produce stronger blur effects. Default: 0

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The relationship between kernel size and sigma affects the blur strength: larger kernel sizes allow for stronger blurring effects.
  • When both blur_limit and sigma_limit are set to ranges starting from 0, the blur_limit minimum is automatically set to 3 to ensure a valid kernel size.
  • For uint8 images, the computation might be faster than for floating-point images.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.GaussianBlur(blur_limit=(3, 7), sigma_limit=(0.1, 2), p=1)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n

Source code in albumentations/augmentations/blur/transforms.py Python
class GaussianBlur(ImageOnlyTransform):\n    \"\"\"Apply Gaussian blur to the input image using a randomly sized kernel.\n\n    This transform blurs the input image using a Gaussian filter with a random kernel size\n    and sigma value. Gaussian blur is a widely used image processing technique that reduces\n    image noise and detail, creating a smoothing effect.\n\n    Args:\n        blur_limit (tuple[int, int] | int): Controls the range of the Gaussian kernel size.\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 0 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            Must be zero or odd and in range [0, inf). If set to 0, it will be computed\n            from sigma as `round(sigma * (3 if img.dtype == np.uint8 else 4) * 2 + 1) + 1`.\n            Larger kernel sizes produce stronger blur effects.\n            Default: (3, 7)\n\n        sigma_limit (tuple[float, float] | float): Range for the Gaussian kernel standard\n            deviation (sigma). Must be in range [0, inf).\n            - If a single float is provided, sigma will be randomly chosen\n              between 0 and that value.\n            - If a tuple of two floats is provided, it defines the inclusive range\n              of possible sigma values.\n            If set to 0, sigma will be computed as `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`.\n            Larger sigma values produce stronger blur effects.\n            Default: 0\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The relationship between kernel size and sigma affects the blur strength:\n          larger kernel sizes allow for stronger blurring effects.\n        - When both blur_limit and sigma_limit are set to ranges starting from 0,\n          the blur_limit minimum is automatically set to 3 to ensure a valid kernel size.\n        - For uint8 images, the computation might be faster than for floating-point images.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.GaussianBlur(blur_limit=(3, 7), sigma_limit=(0.1, 2), p=1)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        sigma_limit: NonNegativeFloatRangeType\n\n        @field_validator(\"blur_limit\")\n        @classmethod\n        def process_blur(\n            cls,\n            value: ScaleIntType,\n            info: ValidationInfo,\n        ) -> tuple[int, int]:\n            return fblur.process_blur_limit(value, info, min_value=0)\n\n        @model_validator(mode=\"after\")\n        def validate_limits(self) -> Self:\n            if (\n                isinstance(self.blur_limit, (tuple, list))\n                and self.blur_limit[0] == 0\n                and isinstance(self.sigma_limit, (tuple, list))\n                and self.sigma_limit[0] == 0\n            ):\n                self.blur_limit = 3, max(3, self.blur_limit[1])\n                warnings.warn(\n                    \"blur_limit and sigma_limit minimum value can not be both equal to 0. 
\"\n                    \"blur_limit minimum value changed to 3.\",\n                    stacklevel=2,\n                )\n\n            return self\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = (3, 7),\n        sigma_limit: ScaleFloatType = 0,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p, always_apply)\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.sigma_limit = cast(tuple[float, float], sigma_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        ksize: int,\n        sigma: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.gaussian_blur(img, ksize, sigma=sigma)\n\n    def get_params(self) -> dict[str, float]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n\n        return {\"ksize\": ksize, \"sigma\": self.py_random.uniform(*self.sigma_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"blur_limit\", \"sigma_limit\"\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.GlassBlur","title":"class GlassBlur (sigma=0.7, max_delta=4, iterations=2, mode='fast', always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply a glass blur effect to the input image.

This transform simulates the effect of looking through textured glass by locally shuffling pixels in the image. It creates a distorted, frosted glass-like appearance.

Parameters:

Name Type Description sigma float

Standard deviation for the Gaussian kernel used in the process. Higher values increase the blur effect. Must be non-negative. Default: 0.7

max_delta int

Maximum distance in pixels for shuffling. Determines how far pixels can be moved. Larger values create more distortion. Must be a positive integer. Default: 4

iterations int

Number of times to apply the glass blur effect. More iterations create a stronger effect but increase computation time. Must be a positive integer. Default: 2

mode Literal[\"fast\", \"exact\"]

Mode of computation. Options are: - "fast": Uses a faster but potentially less accurate method. - "exact": Uses a slower but more precise method. Default: "fast"

p float

Probability of applying the transform. Should be in the range [0, 1]. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • This transform is particularly effective for creating a 'looking through glass' effect or simulating the view through a frosted window.
  • The 'fast' mode is recommended for most use cases as it provides a good balance between effect quality and computation speed.
  • Increasing 'iterations' will strengthen the effect but also increase the processing time linearly.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.GlassBlur(sigma=0.7, max_delta=4, iterations=3, mode=\"fast\", p=1)\n>>> result = transform(image=image)\n>>> glass_blurred_image = result[\"image\"]\n

References

  • This implementation is based on the technique described in: \"ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness\" https://arxiv.org/abs/1903.12261
  • Original implementation: https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py


Source code in albumentations/augmentations/blur/transforms.py Python
class GlassBlur(ImageOnlyTransform):\n    \"\"\"Apply a glass blur effect to the input image.\n\n    This transform simulates the effect of looking through textured glass by locally\n    shuffling pixels in the image. It creates a distorted, frosted glass-like appearance.\n\n    Args:\n        sigma (float): Standard deviation for the Gaussian kernel used in the process.\n            Higher values increase the blur effect. Must be non-negative.\n            Default: 0.7\n\n        max_delta (int): Maximum distance in pixels for shuffling.\n            Determines how far pixels can be moved. Larger values create more distortion.\n            Must be a positive integer.\n            Default: 4\n\n        iterations (int): Number of times to apply the glass blur effect.\n            More iterations create a stronger effect but increase computation time.\n            Must be a positive integer.\n            Default: 2\n\n        mode (Literal[\"fast\", \"exact\"]): Mode of computation. Options are:\n            - \"fast\": Uses a faster but potentially less accurate method.\n            - \"exact\": Uses a slower but more precise method.\n            Default: \"fast\"\n\n        p (float): Probability of applying the transform. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - This transform is particularly effective for creating a 'looking through\n          glass' effect or simulating the view through a frosted window.\n        - The 'fast' mode is recommended for most use cases as it provides a good\n          balance between effect quality and computation speed.\n        - Increasing 'iterations' will strengthen the effect but also increase the\n          processing time linearly.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.GlassBlur(sigma=0.7, max_delta=4, iterations=3, mode=\"fast\", p=1)\n        >>> result = transform(image=image)\n        >>> glass_blurred_image = result[\"image\"]\n\n    References:\n        - This implementation is based on the technique described in:\n          \"ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness\"\n          https://arxiv.org/abs/1903.12261\n        - Original implementation:\n          https://github.com/hendrycks/robustness/blob/master/ImageNet-C/create_c/make_imagenet_c.py\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        sigma: float = Field(ge=0)\n        max_delta: int = Field(ge=1)\n        iterations: int = Field(ge=1)\n        mode: Literal[\"fast\", \"exact\"]\n\n    def __init__(\n        self,\n        sigma: float = 0.7,\n        max_delta: int = 4,\n        iterations: int = 2,\n        mode: Literal[\"fast\", \"exact\"] = \"fast\",\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.sigma = sigma\n        self.max_delta = max_delta\n        self.iterations = iterations\n        self.mode = mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        *args: Any,\n        dxy: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.glass_blur(\n            img,\n            self.sigma,\n            self.max_delta,\n            
self.iterations,\n            dxy,\n            self.mode,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n\n        # generate array containing all necessary values for transformations\n        width_pixels = height - self.max_delta * 2\n        height_pixels = width - self.max_delta * 2\n        total_pixels = int(width_pixels * height_pixels)\n        dxy = self.random_generator.integers(\n            -self.max_delta,\n            self.max_delta,\n            size=(total_pixels, self.iterations, 2),\n        )\n\n        return {\"dxy\": dxy}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"sigma\", \"max_delta\", \"iterations\", \"mode\"\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.MedianBlur","title":"class MedianBlur (blur_limit=7, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply median blur to the input image.

This transform uses a median filter to blur the input image. Median filtering is particularly effective at removing salt-and-pepper noise while preserving edges, making it a popular choice for noise reduction in image processing.

Parameters:

Name Type Description blur_limit int | tuple[int, int]

Maximum aperture linear size for blurring the input image. Must be odd and in the range [3, inf). - If a single int is provided, the kernel size will be randomly chosen between 3 and that value. - If a tuple of two ints is provided, it defines the inclusive range of possible kernel sizes. Default: (3, 7)

p float

Probability of applying the transform. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The kernel size (aperture linear size) must always be odd and greater than 1.
  • Unlike mean blur or Gaussian blur, median blur uses the median of all pixels under the kernel area, making it more robust to outliers.
  • This transform is particularly useful for:
  • Removing salt-and-pepper noise
  • Preserving edges while smoothing images
  • Pre-processing images for edge detection algorithms
  • For color images, the median is calculated independently for each channel.
  • Larger kernel sizes result in stronger blurring effects but may also remove fine details from the image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.MedianBlur(blur_limit=(3, 7), p=0.5)\n>>> result = transform(image=image)\n>>> blurred_image = result[\"image\"]\n

References

  • Median filter: https://en.wikipedia.org/wiki/Median_filter
  • OpenCV medianBlur: https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga564869aa33e58769b4469101aac458f9

Source code in albumentations/augmentations/blur/transforms.py Python
class MedianBlur(Blur):\n    \"\"\"Apply median blur to the input image.\n\n    This transform uses a median filter to blur the input image. Median filtering is particularly\n    effective at removing salt-and-pepper noise while preserving edges, making it a popular choice\n    for noise reduction in image processing.\n\n    Args:\n        blur_limit (int | tuple[int, int]): Maximum aperture linear size for blurring the input image.\n            Must be odd and in the range [3, inf).\n            - If a single int is provided, the kernel size will be randomly chosen\n              between 3 and that value.\n            - If a tuple of two ints is provided, it defines the inclusive range\n              of possible kernel sizes.\n            Default: (3, 7)\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The kernel size (aperture linear size) must always be odd and greater than 1.\n        - Unlike mean blur or Gaussian blur, median blur uses the median of all pixels under\n          the kernel area, making it more robust to outliers.\n        - This transform is particularly useful for:\n          * Removing salt-and-pepper noise\n          * Preserving edges while smoothing images\n          * Pre-processing images for edge detection algorithms\n        - For color images, the median is calculated independently for each channel.\n        - Larger kernel sizes result in stronger blurring effects but may also remove\n          fine details from the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.MedianBlur(blur_limit=(3, 7), p=0.5)\n        >>> result = transform(image=image)\n        >>> blurred_image = result[\"image\"]\n\n    References:\n        - Median filter: https://en.wikipedia.org/wiki/Median_filter\n        - OpenCV medianBlur: https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga564869aa33e58769b4469101aac458f9\n    \"\"\"\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = 7,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(blur_limit=blur_limit, p=p, always_apply=always_apply)\n\n    def apply(self, img: np.ndarray, kernel: int, **params: Any) -> np.ndarray:\n        return fblur.median_blur(img, kernel)\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.MotionBlur","title":"class MotionBlur (blur_limit=7, allow_shifted=True, angle_range=(0, 360), direction_range=(-1.0, 1.0), p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply motion blur to the input image using a directional kernel.

This transform simulates motion blur effects that occur during image capture, such as camera shake or object movement. It creates a directional blur using a line-shaped kernel with controllable angle, direction, and position.

Parameters:

Name Type Description blur_limit int | tuple[int, int]

Maximum kernel size for blurring. Should be in range [3, inf). - If int: kernel size will be randomly chosen from [3, blur_limit] - If tuple: kernel size will be randomly chosen from [min, max] Larger values create stronger blur effects. Default: (3, 7)

angle_range tuple[float, float]

Range of possible angles in degrees. Controls the rotation of the motion blur line: - 0°: Horizontal motion blur → - 45°: Diagonal motion blur ↗ - 90°: Vertical motion blur ↑ - 135°: Diagonal motion blur ↖ Default: (0, 360)

direction_range tuple[float, float]

Range for motion bias. Controls how the blur extends from the center: - -1.0: Blur extends only backward (←) - 0.0: Blur extends equally in both directions (←→) - 1.0: Blur extends only forward (→) For example, with angle=0: - direction=-1.0: ←• - direction=0.0: ←•→ - direction=1.0: •→ Default: (-1.0, 1.0)

allow_shifted bool

Allow random kernel position shifts. - If True: Kernel can be randomly offset from center - If False: Kernel will always be centered Default: True

p float

Probability of applying the transform. Default: 0.5

Examples of angle vs direction:

1. Horizontal motion (angle=0°):
   - direction=0.0:  ←•→  (symmetric blur)
   - direction=1.0:   •→  (forward blur)
   - direction=-1.0: ←•   (backward blur)

2. Vertical motion (angle=90°):
   - direction=0.0:  ↑•↓  (symmetric blur)
   - direction=1.0:   •↑  (upward blur)
   - direction=-1.0: ↓•   (downward blur)

3. Diagonal motion (angle=45°):
   - direction=0.0:  ↙•↗  (symmetric blur)
   - direction=1.0:   •↗  (forward diagonal blur)
   - direction=-1.0: ↙•   (backward diagonal blur)

Note

  • angle controls the orientation of the motion line
  • direction controls the distribution of the blur along that line
  • Together they can simulate various motion effects:
  • Camera shake: Small angle range + direction near 0
  • Object motion: Specific angle + direction=1.0
  • Complex motion: Random angle + random direction

Examples:

Python
>>> import albumentations as A\n>>> # Horizontal camera shake (symmetric)\n>>> transform = A.MotionBlur(\n...     angle_range=(-5, 5),      # Near-horizontal motion\n...     direction_range=(0, 0),    # Symmetric blur\n...     p=1.0\n... )\n>>>\n>>> # Object moving right\n>>> transform = A.MotionBlur(\n...     angle_range=(0, 0),        # Horizontal motion\n...     direction_range=(0.8, 1.0), # Strong forward bias\n...     p=1.0\n... )\n

References

  • Motion blur fundamentals: https://en.wikipedia.org/wiki/Motion_blur

  • Directional blur kernels: https://www.sciencedirect.com/topics/computer-science/directional-blur

  • OpenCV filter2D (used for convolution): https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga27c049795ce870216ddfb366086b5a04

  • Research on motion blur simulation: \"Understanding and Evaluating Blind Deconvolution Algorithms\" (CVPR 2009) https://doi.org/10.1109/CVPR.2009.5206815

  • Motion blur in photography: \"The Manual of Photography\", Chapter 7: Motion in Photography ISBN: 978-0240520377

  • Kornia's implementation (similar approach): https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomMotionBlur

See Also: - GaussianBlur: For uniform blur effects - MedianBlur: For noise reduction while preserving edges - RandomRain: Another motion-based effect - Perspective: For geometric motion-like distortions

Source code in albumentations/augmentations/blur/transforms.py Python
class MotionBlur(Blur):\n    \"\"\"Apply motion blur to the input image using a directional kernel.\n\n    This transform simulates motion blur effects that occur during image capture,\n    such as camera shake or object movement. It creates a directional blur using\n    a line-shaped kernel with controllable angle, direction, and position.\n\n    Args:\n        blur_limit (int | tuple[int, int]): Maximum kernel size for blurring.\n            Should be in range [3, inf).\n            - If int: kernel size will be randomly chosen from [3, blur_limit]\n            - If tuple: kernel size will be randomly chosen from [min, max]\n            Larger values create stronger blur effects.\n            Default: (3, 7)\n\n        angle_range (tuple[float, float]): Range of possible angles in degrees.\n            Controls the rotation of the motion blur line:\n            - 0\u00b0: Horizontal motion blur \u2192\n            - 45\u00b0: Diagonal motion blur \u2197\n            - 90\u00b0: Vertical motion blur \u2191\n            - 135\u00b0: Diagonal motion blur \u2196\n            Default: (0, 360)\n\n        direction_range (tuple[float, float]): Range for motion bias.\n            Controls how the blur extends from the center:\n            - -1.0: Blur extends only backward (\u2190)\n            -  0.0: Blur extends equally in both directions (\u2190\u2192)\n            -  1.0: Blur extends only forward (\u2192)\n            For example, with angle=0:\n            - direction=-1.0: \u2190\u2022\n            - direction=0.0:  \u2190\u2022\u2192\n            - direction=1.0:   \u2022\u2192\n            Default: (-1.0, 1.0)\n\n        allow_shifted (bool): Allow random kernel position shifts.\n            - If True: Kernel can be randomly offset from center\n            - If False: Kernel will always be centered\n            Default: True\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Examples of angle vs direction:\n        1. Horizontal motion (angle=0\u00b0):\n           - direction=0.0:   \u2190\u2022\u2192   (symmetric blur)\n           - direction=1.0:    \u2022\u2192   (forward blur)\n           - direction=-1.0:  \u2190\u2022    (backward blur)\n\n        2. Vertical motion (angle=90\u00b0):\n           - direction=0.0:   \u2191\u2022\u2193   (symmetric blur)\n           - direction=1.0:    \u2022\u2191   (upward blur)\n           - direction=-1.0:  \u2193\u2022    (downward blur)\n\n        3. Diagonal motion (angle=45\u00b0):\n           - direction=0.0:   \u2199\u2022\u2197   (symmetric blur)\n           - direction=1.0:    \u2022\u2197   (forward diagonal blur)\n           - direction=-1.0:  \u2199\u2022    (backward diagonal blur)\n\n    Note:\n        - angle controls the orientation of the motion line\n        - direction controls the distribution of the blur along that line\n        - Together they can simulate various motion effects:\n          * Camera shake: Small angle range + direction near 0\n          * Object motion: Specific angle + direction=1.0\n          * Complex motion: Random angle + random direction\n\n    Example:\n        >>> import albumentations as A\n        >>> # Horizontal camera shake (symmetric)\n        >>> transform = A.MotionBlur(\n        ...     angle_range=(-5, 5),      # Near-horizontal motion\n        ...     direction_range=(0, 0),    # Symmetric blur\n        ...     p=1.0\n        ... )\n        >>>\n        >>> # Object moving right\n        >>> transform = A.MotionBlur(\n        ...     
angle_range=(0, 0),        # Horizontal motion\n        ...     direction_range=(0.8, 1.0), # Strong forward bias\n        ...     p=1.0\n        ... )\n\n    References:\n        - Motion blur fundamentals:\n          https://en.wikipedia.org/wiki/Motion_blur\n\n        - Directional blur kernels:\n          https://www.sciencedirect.com/topics/computer-science/directional-blur\n\n        - OpenCV filter2D (used for convolution):\n          https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#ga27c049795ce870216ddfb366086b5a04\n\n        - Research on motion blur simulation:\n          \"Understanding and Evaluating Blind Deconvolution Algorithms\" (CVPR 2009)\n          https://doi.org/10.1109/CVPR.2009.5206815\n\n        - Motion blur in photography:\n          \"The Manual of Photography\", Chapter 7: Motion in Photography\n          ISBN: 978-0240520377\n\n        - Kornia's implementation (similar approach):\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomMotionBlur\n\n    See Also:\n        - GaussianBlur: For uniform blur effects\n        - MedianBlur: For noise reduction while preserving edges\n        - RandomRain: Another motion-based effect\n        - Perspective: For geometric motion-like distortions\n\n    \"\"\"\n\n    class InitSchema(BlurInitSchema):\n        allow_shifted: bool\n        angle_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 360)),\n        ]\n        direction_range: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(min_val=-1.0, max_val=1.0)),\n        ]\n\n    def __init__(\n        self,\n        blur_limit: ScaleIntType = 7,\n        allow_shifted: bool = True,\n        angle_range: tuple[float, float] = (0, 360),\n        direction_range: tuple[float, float] = (-1.0, 1.0),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(blur_limit=blur_limit, p=p)\n        self.allow_shifted = allow_shifted\n        self.blur_limit = cast(tuple[int, int], blur_limit)\n        self.angle_range = angle_range\n        self.direction_range = direction_range\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"allow_shifted\",\n            \"angle_range\",\n            \"direction_range\",\n        )\n\n    def apply(self, img: np.ndarray, kernel: np.ndarray, **params: Any) -> np.ndarray:\n        return fmain.convolve(img, kernel=kernel)\n\n    def get_params(self) -> dict[str, Any]:\n        ksize = fblur.sample_odd_from_range(\n            self.py_random,\n            self.blur_limit[0],\n            self.blur_limit[1],\n        )\n\n        angle = self.py_random.uniform(*self.angle_range)\n        direction = self.py_random.uniform(*self.direction_range)\n\n        # Create motion blur kernel\n        kernel = fblur.create_motion_kernel(\n            ksize,\n            angle,\n            direction,\n            allow_shifted=self.allow_shifted,\n            random_state=self.py_random,\n        )\n\n        return {\"kernel\": kernel.astype(np.float32) / np.sum(kernel)}\n
"},{"location":"api_reference/augmentations/blur/transforms/#albumentations.augmentations.blur.transforms.ZoomBlur","title":"class ZoomBlur (max_factor=(1, 1.31), step_factor=(0.01, 0.03), always_apply=None, p=0.5) [view source on GitHub]","text":"

Apply zoom blur transform.

Parameters:

Name Type Description max_factor (float, float) or float

Range for the maximum zoom factor used for blurring. If max_factor is a single float, the range will be (1, max_factor). Default: (1, 1.31). All max_factor values should be larger than 1.

step_factor (float, float) or float

If a single float, it is used as the step parameter for np.arange. If a tuple of floats, step_factor will be sampled from the range [step_factor[0], step_factor[1]). Default: (0.01, 0.03). All step_factor values should be positive.

p float

probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32
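Example (an illustrative usage sketch; parameter values are arbitrary):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.ZoomBlur(max_factor=(1.05, 1.2), step_factor=(0.01, 0.02), p=1.0)
>>> result = transform(image=image)
>>> zoom_blurred_image = result["image"]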

Reference

https://arxiv.org/abs/1903.12261

Source code in albumentations/augmentations/blur/transforms.py Python
class ZoomBlur(ImageOnlyTransform):\n    \"\"\"Apply zoom blur transform.\n\n    Args:\n        max_factor ((float, float) or float): range for max factor for blurring.\n            If max_factor is a single float, the range will be (1, limit). Default: (1, 1.31).\n            All max_factor values should be larger than 1.\n        step_factor ((float, float) or float): If single float will be used as step parameter for np.arange.\n            If tuple of float step_factor will be in range `[step_factor[0], step_factor[1])`. Default: (0.01, 0.03).\n            All step_factor values should be positive.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        unit8, float32\n\n    Reference:\n        https://arxiv.org/abs/1903.12261\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        max_factor: OnePlusFloatRangeType\n        step_factor: NonNegativeFloatRangeType\n\n    def __init__(\n        self,\n        max_factor: ScaleFloatType = (1, 1.31),\n        step_factor: ScaleFloatType = (0.01, 0.03),\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_factor = cast(tuple[float, float], max_factor)\n        self.step_factor = cast(tuple[float, float], step_factor)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        zoom_factors: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fblur.zoom_blur(img, zoom_factors)\n\n    def get_params(self) -> dict[str, Any]:\n        step_factor = self.py_random.uniform(*self.step_factor)\n        max_factor = max(1 + step_factor, self.py_random.uniform(*self.max_factor))\n        return {\"zoom_factors\": np.arange(1.0, max_factor, step_factor)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str]:\n        return (\"max_factor\", \"step_factor\")\n
"},{"location":"api_reference/augmentations/crops/","title":"Index","text":"
  • Crop functional transforms (albumentations.augmentations.crops.functional)
  • Crop transforms (albumentations.augmentations.crops.transforms)
"},{"location":"api_reference/augmentations/crops/functional/","title":"Crop functional transforms (augmentations.crops.functional)","text":""},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_and_pad_keypoints","title":"def crop_and_pad_keypoints (keypoints, crop_params=None, pad_params=None, image_shape=(0, 0), result_shape=(0, 0), keep_size=False) [view source on GitHub]","text":"

Crop and pad multiple keypoints simultaneously.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).

crop_params Sequence[int]

Crop parameters [crop_x1, crop_y1, ...].

pad_params Sequence[int]

Pad parameters [top, bottom, left, right].

image_shape Tuple[int, int]

Original image shape (rows, cols).

result_shape Tuple[int, int]

Result image shape (rows, cols).

keep_size bool

Whether to keep the original size.

Returns:

Type Description np.ndarray

Array of transformed keypoints with the same shape as input.
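A worked example (illustrative values; the result follows from the shift logic described above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.crops import functional as fcrops
>>> keypoints = np.array([[50.0, 40.0, 0.0, 1.0]])  # (x, y, angle, scale)
>>> fcrops.crop_and_pad_keypoints(
...     keypoints,
...     crop_params=(10, 20, 110, 120),  # crop origin at (x1=10, y1=20)
...     pad_params=(5, 0, 3, 0),         # pad 5 px on top, 3 px on the left
...     image_shape=(200, 200),
...     result_shape=(105, 103),
...     keep_size=False,
... )
>>> # x: 50 - 10 + 3 = 43, y: 40 - 20 + 5 = 25 -> [[43., 25., 0., 1.]]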

Source code in albumentations/augmentations/crops/functional.py Python
@handle_empty_array(\"keypoints\")\ndef crop_and_pad_keypoints(\n    keypoints: np.ndarray,\n    crop_params: tuple[int, int, int, int] | None = None,\n    pad_params: tuple[int, int, int, int] | None = None,\n    image_shape: tuple[int, int] = (0, 0),\n    result_shape: tuple[int, int] = (0, 0),\n    keep_size: bool = False,\n) -> np.ndarray:\n    \"\"\"Crop and pad multiple keypoints simultaneously.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).\n        crop_params (Sequence[int], optional): Crop parameters [crop_x1, crop_y1, ...].\n        pad_params (Sequence[int], optional): Pad parameters [top, bottom, left, right].\n        image_shape (Tuple[int, int]): Original image shape (rows, cols).\n        result_shape (Tuple[int, int]): Result image shape (rows, cols).\n        keep_size (bool): Whether to keep the original size.\n\n    Returns:\n        np.ndarray: Array of transformed keypoints with the same shape as input.\n    \"\"\"\n    transformed_keypoints = keypoints.copy()\n\n    if crop_params is not None:\n        crop_x1, crop_y1 = crop_params[:2]\n        transformed_keypoints[:, 0] -= crop_x1\n        transformed_keypoints[:, 1] -= crop_y1\n\n    if pad_params is not None:\n        top, _, left, _ = pad_params\n        transformed_keypoints[:, 0] += left\n        transformed_keypoints[:, 1] += top\n\n    rows, cols = image_shape[:2]\n    result_rows, result_cols = result_shape[:2]\n\n    if keep_size and (result_cols != cols or result_rows != rows):\n        scale_x = cols / result_cols\n        scale_y = rows / result_rows\n        return fgeometric.keypoints_scale(transformed_keypoints, scale_x, scale_y)\n\n    return transformed_keypoints\n
"},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_bboxes_by_coords","title":"def crop_bboxes_by_coords (bboxes, crop_coords, image_shape, normalized_input=True) [view source on GitHub]","text":"

Crop bounding boxes based on given crop coordinates.

This function adjusts bounding boxes to fit within a cropped image.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where each row is [x_min, y_min, x_max, y_max, ...]. The bounding box coordinates can be either normalized (in [0, 1]) if normalized_input=True or absolute pixel values if normalized_input=False.

crop_coords tuple[int, int, int, int]

Crop coordinates (x_min, y_min, x_max, y_max) in absolute pixel values.

image_shape tuple[int, int]

Original image shape (height, width).

normalized_input bool

Whether input boxes are in normalized coordinates. If True, assumes input is normalized [0,1] and returns normalized coordinates. If False, assumes input is in absolute pixels and returns absolute coordinates. Default: True for backward compatibility.

Returns:

Type Description np.ndarray

Array of cropped bounding boxes. Coordinates will be in the same format as input (normalized if normalized_input=True, absolute pixels if normalized_input=False).

Note

Bounding boxes that fall completely outside the crop area will be removed. Bounding boxes that partially overlap with the crop area will be adjusted to fit within it.
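A short example of the coordinate bookkeeping (illustrative values; the output stays normalized because normalized_input defaults to True):

Python
>>> import numpy as np
>>> from albumentations.augmentations.crops import functional as fcrops
>>> bboxes = np.array([[0.2, 0.3, 0.6, 0.7]])  # one normalized box on a 100x100 image
>>> fcrops.crop_bboxes_by_coords(bboxes, crop_coords=(10, 20, 60, 80), image_shape=(100, 100))
>>> # In pixels the box is (20, 30, 60, 70); shifting by the crop origin gives (10, 10, 50, 50),
>>> # which re-normalized to the 60x50 (height x width) crop is approximately [[0.2, 0.167, 1.0, 0.833]]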

Source code in albumentations/augmentations/crops/functional.py Python
def crop_bboxes_by_coords(\n    bboxes: np.ndarray,\n    crop_coords: tuple[int, int, int, int],\n    image_shape: tuple[int, int],\n    normalized_input: bool = True,\n) -> np.ndarray:\n    \"\"\"Crop bounding boxes based on given crop coordinates.\n\n    This function adjusts bounding boxes to fit within a cropped image.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (N, 4+) where each row is\n                             [x_min, y_min, x_max, y_max, ...]. The bounding box coordinates\n                             can be either normalized (in [0, 1]) if normalized_input=True or\n                             absolute pixel values if normalized_input=False.\n        crop_coords (tuple[int, int, int, int]): Crop coordinates (x_min, y_min, x_max, y_max)\n                                                 in absolute pixel values.\n        image_shape (tuple[int, int]): Original image shape (height, width).\n        normalized_input (bool): Whether input boxes are in normalized coordinates.\n                               If True, assumes input is normalized [0,1] and returns normalized coordinates.\n                               If False, assumes input is in absolute pixels and returns absolute coordinates.\n                               Default: True for backward compatibility.\n\n    Returns:\n        np.ndarray: Array of cropped bounding boxes. Coordinates will be in the same format as input\n                   (normalized if normalized_input=True, absolute pixels if normalized_input=False).\n\n    Note:\n        Bounding boxes that fall completely outside the crop area will be removed.\n        Bounding boxes that partially overlap with the crop area will be adjusted to fit within it.\n    \"\"\"\n    if not bboxes.size:\n        return bboxes\n\n    # Convert to absolute coordinates if needed\n    if normalized_input:\n        cropped_bboxes = denormalize_bboxes(bboxes.copy().astype(np.float32), image_shape)\n    else:\n        cropped_bboxes = bboxes.copy().astype(np.float32)\n\n    x_min, y_min = crop_coords[:2]\n\n    # Subtract crop coordinates\n    cropped_bboxes[:, [0, 2]] -= x_min\n    cropped_bboxes[:, [1, 3]] -= y_min\n\n    # Calculate crop shape\n    crop_height = crop_coords[3] - crop_coords[1]\n    crop_width = crop_coords[2] - crop_coords[0]\n    crop_shape = (crop_height, crop_width)\n\n    # Return in same format as input\n    return normalize_bboxes(cropped_bboxes, crop_shape) if normalized_input else cropped_bboxes\n
"},{"location":"api_reference/augmentations/crops/functional/#albumentations.augmentations.crops.functional.crop_keypoints_by_coords","title":"def crop_keypoints_by_coords (keypoints, crop_coords) [view source on GitHub]","text":"

Crop keypoints using the provided crop box coordinates (top-left and bottom-right corners, in pixels).

Parameters:

Name Type Description keypoints np.ndarray

An array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).

crop_coords tuple

Crop box coords (x1, y1, x2, y2).

Returns:

Type Description np.ndarray

An array of cropped keypoints with the same shape as the input.
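For example (illustrative values):

Python
>>> import numpy as np
>>> from albumentations.augmentations.crops import functional as fcrops
>>> keypoints = np.array([[30.0, 45.0, 0.0, 1.0]])
>>> fcrops.crop_keypoints_by_coords(keypoints, crop_coords=(10, 20, 110, 120))
>>> # x and y are shifted by the crop origin (10, 20) -> [[20., 25., 0., 1.]]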

Source code in albumentations/augmentations/crops/functional.py Python
@handle_empty_array(\"keypoints\")\ndef crop_keypoints_by_coords(\n    keypoints: np.ndarray,\n    crop_coords: tuple[int, int, int, int],\n) -> np.ndarray:\n    \"\"\"Crop keypoints using the provided coordinates of bottom-left and top-right corners in pixels.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) where each row is (x, y, angle, scale, ...).\n        crop_coords (tuple): Crop box coords (x1, y1, x2, y2).\n\n    Returns:\n        np.ndarray: An array of cropped keypoints with the same shape as the input.\n    \"\"\"\n    x1, y1 = crop_coords[:2]\n\n    cropped_keypoints = keypoints.copy()\n    cropped_keypoints[:, 0] -= x1  # Adjust x coordinates\n    cropped_keypoints[:, 1] -= y1  # Adjust y coordinates\n\n    return cropped_keypoints\n
"},{"location":"api_reference/augmentations/crops/transforms/","title":"Crop transforms (augmentations.crops.transforms)","text":""},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.AtLeastOneBBoxRandomCrop","title":"class AtLeastOneBBoxRandomCrop (height, width, erosion_factor=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crops an image to a fixed resolution, while ensuring that at least one bounding box is always in the crop. The erosion factor defines how much the target bounding box can be thinned out. For example, erosion_factor = 0.2 means that the bounding box dimensions can be thinned by up to 20%.

Parameters:

Name Type Description height int

Height of the crop.

width int

Width of the crop.

erosion_factor float

Maximal erosion factor of the height and width of the target bounding box. Default: 0.0.

p float

The probability of applying the transform. Default: 1.0.

always_apply bool | None

Whether to apply the transform systematically.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
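Example (an illustrative usage sketch following the same pattern as the other crop transforms in this reference):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.zeros((300, 300, 3), dtype=np.uint8)
>>> bboxes = [(10, 10, 50, 50), (200, 200, 260, 260)]
>>> transform = A.Compose(
...     [A.AtLeastOneBBoxRandomCrop(height=128, width=128, erosion_factor=0.2, p=1.0)],
...     bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
... )
>>> out = transform(image=image, bboxes=bboxes, labels=["cat", "dog"])
>>> out["image"].shape  # fixed-size crop containing at least one of the boxes
(128, 128, 3)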

Source code in albumentations/augmentations/crops/transforms.py Python
class AtLeastOneBBoxRandomCrop(BaseCrop):\n    \"\"\"Crops an image to a fixed resolution, while ensuring that at least one bounding box is always in the crop.\n    The maximal erosion factor define by how much the target bounding box can be thinned out.\n    For example, erosion_factor = 0.2 means that the bounding box dimensions can be thinned by up to 20%.\n\n    Args:\n        height: Height of the crop.\n        width: Width of the crop.\n        erosion_factor: Maximal erosion factor of the height and width of the target bounding box. Default: 0.0.\n        p: The probability of applying the transform. Default: 1.0.\n        always_apply: Whether to apply the transform systematically.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseCrop.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        erosion_factor: Annotated[float, Field(ge=0.0, le=1.0)]\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        erosion_factor: float = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.erosion_factor = erosion_factor\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_height, image_width = params[\"shape\"][:2]\n        bboxes = data.get(\"bboxes\", [])\n\n        if self.height > image_height or self.width > image_width:\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_height, image_width}\",\n            )\n\n        if len(bboxes) > 0:\n            # Pick a bbox amongst all possible as our reference bbox.\n            bboxes = denormalize_bboxes(bboxes, shape=(image_height, image_width))\n            bbox = self.py_random.choice(bboxes)\n\n            x1, y1, x2, y2 = bbox[:4]\n\n            w = x2 - x1\n            h = y2 - y1\n\n            # Compute the eroded width and height\n            ew = w * (1.0 - self.erosion_factor)\n            eh = h * (1.0 - self.erosion_factor)\n\n            # Compute the lower and upper bounds for the x-axis and y-axis.\n            ax1 = np.clip(\n                a=x1 + ew - self.width,\n                a_min=0.0,\n                a_max=image_width - self.width,\n            )\n            bx1 = np.clip(\n                a=x2 - ew,\n                a_min=0.0,\n                a_max=image_width - self.width,\n            )\n\n            ay1 = np.clip(\n                a=y1 + eh - self.height,\n                a_min=0.0,\n                a_max=image_height - self.height,\n            )\n            by1 = np.clip(\n                a=y2 - eh,\n                a_min=0.0,\n                a_max=image_height - self.height,\n            )\n        else:\n            # If there are no bboxes, just crop anywhere in the image.\n            ax1 = 0.0\n            bx1 = image_width - self.width\n\n            ay1 = 0.0\n            by1 = image_height - self.height\n\n        # Randomly draw the upper-left corner.\n        x1 = int(self.py_random.uniform(a=ax1, b=bx1))\n        y1 = int(self.py_random.uniform(a=ay1, b=by1))\n\n    
    x2 = x1 + self.width\n        y2 = y1 + self.height\n\n        return {\"crop_coords\": (x1, y1, x2, y2)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"erosion_factor\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BBoxSafeRandomCrop","title":"class BBoxSafeRandomCrop (erosion_rate=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input without loss of bounding boxes.

This transform performs a random crop of the input image while ensuring that all bounding boxes remain within the cropped area. It's particularly useful for object detection tasks where preserving all objects in the image is crucial.

Parameters:

Name Type Description erosion_rate float

A value between 0.0 and 1.0 that determines the minimum allowable size of the crop as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be at least 80% of the original image height. Default: 0.0 (no minimum size).

p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

This transform ensures that all bounding boxes in the original image are fully contained within the cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out), it will default to cropping the entire image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.ones((300, 300, 3), dtype=np.uint8)\n>>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n>>> transform = A.Compose([\n...     A.BBoxSafeRandomCrop(erosion_rate=0.2, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n>>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n>>> transformed_image = transformed['image']\n>>> transformed_bboxes = transformed['bboxes']\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/crops/transforms.py Python
class BBoxSafeRandomCrop(BaseCrop):\n    \"\"\"Crop a random part of the input without loss of bounding boxes.\n\n    This transform performs a random crop of the input image while ensuring that all bounding boxes remain within\n    the cropped area. It's particularly useful for object detection tasks where preserving all objects in the image\n    is crucial.\n\n    Args:\n        erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop\n            as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be\n            at least 80% of the original image height. Default: 0.0 (no minimum size).\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transform ensures that all bounding boxes in the original image are fully contained within the\n        cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out),\n        it will default to cropping the entire image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.ones((300, 300, 3), dtype=np.uint8)\n        >>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n        >>> transform = A.Compose([\n        ...     A.BBoxSafeRandomCrop(erosion_rate=0.2, p=1.0),\n        ... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n        >>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n        >>> transformed_image = transformed['image']\n        >>> transformed_bboxes = transformed['bboxes']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        erosion_rate: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n\n    def __init__(self, erosion_rate: float = 0.0, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p)\n        self.erosion_rate = erosion_rate\n\n    def _get_coords_no_bbox(self, image_shape: tuple[int, int]) -> tuple[int, int, int, int]:\n        image_height, image_width = image_shape\n\n        erosive_h = int(image_height * (1.0 - self.erosion_rate))\n        crop_height = image_height if erosive_h >= image_height else self.py_random.randint(erosive_h, image_height)\n\n        crop_width = int(crop_height * image_width / image_height)\n\n        h_start = self.py_random.random()\n        w_start = self.py_random.random()\n\n        crop_shape = (crop_height, crop_width)\n\n        return fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n\n        if len(data[\"bboxes\"]) == 0:  # less likely, this class is for use with bboxes.\n            crop_coords = self._get_coords_no_bbox(image_shape)\n            return {\"crop_coords\": crop_coords}\n\n        bbox_union = union_of_bboxes(bboxes=data[\"bboxes\"], erosion_rate=self.erosion_rate)\n\n        if bbox_union is None:\n            crop_coords = self._get_coords_no_bbox(image_shape)\n            return {\"crop_coords\": crop_coords}\n\n        x_min, y_min, x_max, y_max = bbox_union\n\n        x_min = np.clip(x_min, 0, 1)\n        y_min = np.clip(y_min, 0, 1)\n    
    x_max = np.clip(x_max, x_min, 1)\n        y_max = np.clip(y_max, y_min, 1)\n\n        image_height, image_width = image_shape\n\n        crop_x_min = int(x_min * self.py_random.random() * image_width)\n        crop_y_min = int(y_min * self.py_random.random() * image_height)\n\n        bbox_xmax = x_max + (1 - x_max) * self.py_random.random()\n        bbox_ymax = y_max + (1 - y_max) * self.py_random.random()\n        crop_x_max = int(bbox_xmax * image_width)\n        crop_y_max = int(bbox_ymax * image_height)\n\n        return {\"crop_coords\": (crop_x_min, crop_y_min, crop_x_max, crop_y_max)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"erosion_rate\",)\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseCrop","title":"class BaseCrop [view source on GitHub]","text":"

Base class for transforms that only perform cropping.


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseCrop(DualTransform):\n    \"\"\"Base class for transforms that only perform cropping.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop(img, x_min=crop_coords[0], y_min=crop_coords[1], x_max=crop_coords[2], y_max=crop_coords[3])\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_bboxes_by_coords(bboxes, crop_coords, params[\"shape\"][:2])\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_keypoints_by_coords(keypoints, crop_coords)\n\n    @staticmethod\n    def _clip_bbox(bbox: tuple[int, int, int, int], image_shape: tuple[int, int]) -> tuple[int, int, int, int]:\n        height, width = image_shape[:2]\n        x_min, y_min, x_max, y_max = bbox\n        x_min = np.clip(x_min, 0, width)\n        y_min = np.clip(y_min, 0, height)\n\n        x_max = np.clip(x_max, x_min, width)\n        y_max = np.clip(y_max, y_min, height)\n        return x_min, y_min, x_max, y_max\n
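The class above only applies crop coordinates that a subclass supplies; the convention is absolute pixel values in (x_min, y_min, x_max, y_max) order. A minimal sketch of that convention follows, assuming the internal functional module is importable at the path shown below (an implementation detail that may move between versions).

Python
# Sketch of the coordinate convention used by BaseCrop.apply:
# crop_coords are absolute pixel values (x_min, y_min, x_max, y_max).
# The import path is an assumption based on the source location shown above.
import numpy as np
from albumentations.augmentations.crops import functional as fcrops

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
crop_coords = (10, 20, 60, 80)  # x_min, y_min, x_max, y_max

cropped = fcrops.crop(
    image,
    x_min=crop_coords[0],
    y_min=crop_coords[1],
    x_max=crop_coords[2],
    y_max=crop_coords[3],
)
print(cropped.shape)  # (60, 50, 3): (y_max - y_min, x_max - x_min, channels)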
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseCropAndPad","title":"class BaseCropAndPad (pad_if_needed, border_mode, fill, fill_mask, pad_position, p, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that need both cropping and padding.


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseCropAndPad(BaseCrop):\n    \"\"\"Base class for transforms that need both cropping and padding.\"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        pad_if_needed: bool\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_position: PositionType\n\n    def __init__(\n        self,\n        pad_if_needed: bool,\n        border_mode: int,\n        fill: ColorType,\n        fill_mask: ColorType,\n        pad_position: PositionType,\n        p: float,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p)\n        self.pad_if_needed = pad_if_needed\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.pad_position = pad_position\n\n    def _get_pad_params(self, image_shape: tuple[int, int], target_shape: tuple[int, int]) -> dict[str, Any] | None:\n        \"\"\"Calculate padding parameters if needed.\"\"\"\n        if not self.pad_if_needed:\n            return None\n\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=image_shape,\n            min_height=target_shape[0],\n            min_width=target_shape[1],\n            pad_height_divisor=None,\n            pad_width_divisor=None,\n        )\n\n        if h_pad_top == h_pad_bottom == w_pad_left == w_pad_right == 0:\n            return None\n\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.pad_position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        if pad_params is not None:\n            img = fgeometric.pad_with_params(\n                img,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                border_mode=self.border_mode,\n                value=self.fill,\n            )\n        return BaseCrop.apply(self, img, crop_coords, **params)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_coords: Any,\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        if pad_params is not None:\n            mask = fgeometric.pad_with_params(\n                mask,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                border_mode=self.border_mode,\n                value=self.fill_mask,\n            )\n        # Note' that super().apply would apply the padding twice as it is looped to this.apply\n        return BaseCrop.apply(self, mask, crop_coords=crop_coords, **params)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = 
params.get(\"pad_params\")\n        image_shape = params[\"shape\"][:2]\n\n        if pad_params is not None:\n            # First denormalize bboxes to absolute coordinates\n            bboxes_np = denormalize_bboxes(bboxes, image_shape)\n\n            # Apply padding to bboxes (already works with absolute coordinates)\n            bboxes_np = fgeometric.pad_bboxes(\n                bboxes_np,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                self.border_mode,\n                image_shape=image_shape,\n            )\n\n            # Update shape to padded dimensions\n            padded_height = image_shape[0] + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            padded_width = image_shape[1] + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n            padded_shape = (padded_height, padded_width)\n\n            bboxes_np = normalize_bboxes(bboxes_np, padded_shape)\n\n            params[\"shape\"] = padded_shape\n\n            return BaseCrop.apply_to_bboxes(self, bboxes_np, crop_coords, **params)\n\n        # If no padding, use original function behavior\n        return BaseCrop.apply_to_bboxes(self, bboxes, crop_coords, **params)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        pad_params = params.get(\"pad_params\")\n        image_shape = params[\"shape\"][:2]\n\n        if pad_params is not None:\n            # Calculate padded dimensions\n            padded_height = image_shape[0] + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            padded_width = image_shape[1] + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n            # First apply padding to keypoints using original image shape\n            keypoints = fgeometric.pad_keypoints(\n                keypoints,\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n                self.border_mode,\n                image_shape=image_shape,\n            )\n\n            # Update image shape for subsequent crop operation\n            params = {**params, \"shape\": (padded_height, padded_width)}\n\n        return BaseCrop.apply_to_keypoints(self, keypoints, crop_coords, **params)\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.BaseRandomSizedCropInitSchema","title":"class BaseRandomSizedCropInitSchema ","text":"


Source code in albumentations/augmentations/crops/transforms.py Python
class BaseRandomSizedCropInitSchema(BaseTransformInitSchema):\n    size: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CenterCrop","title":"class CenterCrop (height, width, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0.0, fill_mask=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop the central part of the input.

This transform crops the center of the input image, mask, bounding boxes, and keypoints to the specified dimensions. It's useful when you want to focus on the central region of the input, discarding peripheral information.

Parameters:

  • height (int): The height of the crop. Must be greater than 0.
  • width (int): The width of the crop. Must be greater than 0.
  • pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.
  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): Padding value for images if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of padding. Default: 'center'.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If pad_if_needed is False and crop size exceeds image dimensions, it will raise a CropSizeError.
  • If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded.
  • For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.
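A short usage sketch (shapes chosen for illustration) cropping the central 224x224 region from an image together with its mask:

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (256, 320, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (256, 320), dtype=np.uint8)

transform = A.Compose([A.CenterCrop(height=224, width=224, p=1.0)])

result = transform(image=image, mask=mask)
print(result["image"].shape, result["mask"].shape)  # (224, 224, 3) (224, 224)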


Source code in albumentations/augmentations/crops/transforms.py Python
class CenterCrop(BaseCropAndPad):\n    \"\"\"Crop the central part of the input.\n\n    This transform crops the center of the input image, mask, bounding boxes, and keypoints to the specified dimensions.\n    It's useful when you want to focus on the central region of the input, discarding peripheral information.\n\n    Args:\n        height (int): The height of the crop. Must be greater than 0.\n        width (int): The width of the crop. Must be greater than 0.\n        pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value for images if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If pad_if_needed is False and crop size exceeds image dimensions, it will raise a CropSizeError.\n        - If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded.\n        - For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_dimensions(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n            return self\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        pad_if_needed: bool = False,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0.0,\n        fill_mask: ColorType = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.height = height\n        self.width = width\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n          
  \"height\",\n            \"width\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        if not self.pad_if_needed and (self.height > image_height or self.width > image_width):\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_shape[:2]}\",\n            )\n\n        # Get padding params first if needed\n        pad_params = self._get_pad_params(image_shape, (self.height, self.width))\n\n        # If padding is needed, adjust the image shape for crop calculation\n        if pad_params is not None:\n            pad_top = pad_params[\"pad_top\"]\n            pad_bottom = pad_params[\"pad_bottom\"]\n            pad_left = pad_params[\"pad_left\"]\n            pad_right = pad_params[\"pad_right\"]\n\n            padded_height = image_height + pad_top + pad_bottom\n            padded_width = image_width + pad_left + pad_right\n            padded_shape = (padded_height, padded_width)\n\n            # Get crop coordinates based on padded dimensions\n            crop_coords = fcrops.get_center_crop_coords(padded_shape, (self.height, self.width))\n        else:\n            # Get crop coordinates based on original dimensions\n            crop_coords = fcrops.get_center_crop_coords(image_shape, (self.height, self.width))\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.Crop","title":"class Crop (x_min=0, y_min=0, x_max=1024, y_max=1024, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a specific region from the input image.

This transform crops a rectangular region from the input image, mask, bounding boxes, and keypoints based on specified coordinates. It's useful when you want to extract a specific area of interest from your inputs.

Parameters:

  • x_min (int): Minimum x-coordinate of the crop region (left edge). Must be >= 0. Default: 0.
  • y_min (int): Minimum y-coordinate of the crop region (top edge). Must be >= 0. Default: 0.
  • x_max (int): Maximum x-coordinate of the crop region (right edge). Must be > x_min. Default: 1024.
  • y_max (int): Maximum y-coordinate of the crop region (bottom edge). Must be > y_min. Default: 1024.
  • pad_if_needed (bool): Whether to pad if crop coordinates exceed image dimensions. Default: False.
  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Padding value for masks. Default: 0.
  • pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of padding. Default: 'center'.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The crop coordinates are applied as follows: x_min <= x < x_max and y_min <= y < y_max.
  • If pad_if_needed is False and crop region extends beyond image boundaries, it will be clipped.
  • If pad_if_needed is True, image will be padded to accommodate the full crop region.
  • For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.
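A usage sketch (coordinates chosen for illustration) extracting the region x in [32, 160), y in [16, 144):

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (200, 200, 3), dtype=np.uint8)

transform = A.Compose([
    A.Crop(x_min=32, y_min=16, x_max=160, y_max=144, p=1.0),
])

cropped = transform(image=image)["image"]
print(cropped.shape)  # (128, 128, 3): (y_max - y_min, x_max - x_min, channels)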


Source code in albumentations/augmentations/crops/transforms.py Python
class Crop(BaseCropAndPad):\n    \"\"\"Crop a specific region from the input image.\n\n    This transform crops a rectangular region from the input image, mask, bounding boxes, and keypoints\n    based on specified coordinates. It's useful when you want to extract a specific area of interest\n    from your inputs.\n\n    Args:\n        x_min (int): Minimum x-coordinate of the crop region (left edge). Must be >= 0. Default: 0.\n        y_min (int): Minimum y-coordinate of the crop region (top edge). Must be >= 0. Default: 0.\n        x_max (int): Maximum x-coordinate of the crop region (right edge). Must be > x_min. Default: 1024.\n        y_max (int): Maximum y-coordinate of the crop region (bottom edge). Must be > y_min. Default: 1024.\n        pad_if_needed (bool): Whether to pad if crop coordinates exceed image dimensions. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The crop coordinates are applied as follows: x_min <= x < x_max and y_min <= y < y_max.\n        - If pad_if_needed is False and crop region extends beyond image boundaries, it will be clipped.\n        - If pad_if_needed is True, image will be padded to accommodate the full crop region.\n        - For bounding boxes and keypoints, coordinates are adjusted appropriately for both padding and cropping.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        x_min: Annotated[int, Field(ge=0)]\n        y_min: Annotated[int, Field(ge=0)]\n        x_max: Annotated[int, Field(gt=0)]\n        y_max: Annotated[int, Field(gt=0)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_coordinates(self) -> Self:\n            if not self.x_min < self.x_max:\n                msg = \"x_max must be greater than x_min\"\n                raise ValueError(msg)\n            if not self.y_min < self.y_max:\n                msg = \"y_max must be greater than y_min\"\n                raise ValueError(msg)\n\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n\n            return self\n\n    def __init__(\n        self,\n        x_min: int = 0,\n        y_min: int = 0,\n        x_max: int = 1024,\n        y_max: int = 1024,\n        pad_if_needed: bool = False,\n        
pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.x_min = x_min\n        self.y_min = y_min\n        self.x_max = x_max\n        self.y_max = y_max\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        crop_height = self.y_max - self.y_min\n        crop_width = self.x_max - self.x_min\n\n        if not self.pad_if_needed:\n            # If no padding, clip coordinates to image boundaries\n            x_min = np.clip(self.x_min, 0, image_width)\n            y_min = np.clip(self.y_min, 0, image_height)\n            x_max = np.clip(self.x_max, x_min, image_width)\n            y_max = np.clip(self.y_max, y_min, image_height)\n            return {\"crop_coords\": (x_min, y_min, x_max, y_max)}\n\n        # Calculate padding if needed\n        pad_params = self._get_pad_params(\n            image_shape=image_shape,\n            target_shape=(max(crop_height, image_height), max(crop_width, image_width)),\n        )\n\n        if pad_params is not None:\n            # Adjust crop coordinates based on padding\n            x_min = self.x_min + pad_params[\"pad_left\"]\n            y_min = self.y_min + pad_params[\"pad_top\"]\n            x_max = self.x_max + pad_params[\"pad_left\"]\n            y_max = self.y_max + pad_params[\"pad_top\"]\n            crop_coords = (x_min, y_min, x_max, y_max)\n        else:\n            crop_coords = (self.x_min, self.y_min, self.x_max, self.y_max)\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"x_min\",\n            \"y_min\",\n            \"x_max\",\n            \"y_max\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CropAndPad","title":"class CropAndPad (px=None, percent=None, pad_mode=None, pad_cval=None, pad_cval_mask=None, keep_size=True, sample_independently=True, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop and pad images by pixel amounts or fractions of image sizes.

This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of cropping/padding can be specified either in absolute pixels or as a fraction of the image size.

Parameters:

  • px (int, tuple of int, tuple of tuples of int, or None): The number of pixels to crop (negative values) or pad (positive values) on each side of the image. Either this or the parameter percent may be set, not both at the same time. Default: None.
    - If int: crop/pad all sides by this value.
    - If tuple of 2 ints: crop/pad by (top/bottom, left/right).
    - If tuple of 4 ints: crop/pad by (top, right, bottom, left).
    - Each int can also be a tuple of 2 ints for a range, or a list of ints for discrete choices.
  • percent (float, tuple of float, tuple of tuples of float, or None): The fraction of the image size to crop (negative values) or pad (positive values) on each side. Either this or the parameter px may be set, not both at the same time. Default: None.
    - If float: crop/pad all sides by this fraction.
    - If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.
    - If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.
    - Each float can also be a tuple of 2 floats for a range, or a list of floats for discrete choices.
  • border_mode (int): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): The constant value to use for padding if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Same as fill but used for mask padding. Default: 0.
  • keep_size (bool): If True, the output image will be resized to the input image size after cropping/padding. Default: True.
  • sample_independently (bool): If True and ranges are used for px/percent, sample a value for each side independently. If False, sample one value and use it for all sides. Default: True.
  • interpolation (int): OpenCV interpolation flag used for resizing if keep_size is True. Default: cv2.INTER_LINEAR.
  • mask_interpolation (int): OpenCV interpolation flag used for resizing masks if keep_size is True. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform will never crop images below a height or width of 1.
  • When using pixel values (px), the image will be cropped/padded by exactly that many pixels.
  • When using percentages (percent), the amount of crop/pad will be calculated based on the image size.
  • Bounding boxes that end up fully outside the image after cropping will be removed.
  • Keypoints that end up outside the image after cropping will be removed.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.CropAndPad(px=(-10, 20, 30, -40), border_mode=cv2.BORDER_REFLECT, fill=128, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n
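The example above uses absolute pixel amounts; a complementary sketch (values chosen for illustration) uses percent, where negative fractions crop and positive fractions pad:

Python
# Crop 10% from every side, then resize back to the input size (keep_size=True).
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

transform = A.Compose([
    A.CropAndPad(percent=-0.1, keep_size=True, p=1.0),
])

out = transform(image=image)["image"]
print(out.shape)  # (100, 100, 3): cropped to 80x80, then resized back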


Source code in albumentations/augmentations/crops/transforms.py Python
class CropAndPad(DualTransform):\n    \"\"\"Crop and pad images by pixel amounts or fractions of image sizes.\n\n    This transform allows for simultaneous cropping and padding of images. Cropping removes pixels from the sides\n    (i.e., extracts a subimage), while padding adds pixels to the sides (e.g., black pixels). The amount of\n    cropping/padding can be specified either in absolute pixels or as a fraction of the image size.\n\n    Args:\n        px (int, tuple of int, tuple of tuples of int, or None):\n            The number of pixels to crop (negative values) or pad (positive values) on each side of the image.\n            Either this or the parameter `percent` may be set, not both at the same time.\n            - If int: crop/pad all sides by this value.\n            - If tuple of 2 ints: crop/pad by (top/bottom, left/right).\n            - If tuple of 4 ints: crop/pad by (top, right, bottom, left).\n            - Each int can also be a tuple of 2 ints for a range, or a list of ints for discrete choices.\n            Default: None.\n\n        percent (float, tuple of float, tuple of tuples of float, or None):\n            The fraction of the image size to crop (negative values) or pad (positive values) on each side.\n            Either this or the parameter `px` may be set, not both at the same time.\n            - If float: crop/pad all sides by this fraction.\n            - If tuple of 2 floats: crop/pad by (top/bottom, left/right) fractions.\n            - If tuple of 4 floats: crop/pad by (top, right, bottom, left) fractions.\n            - Each float can also be a tuple of 2 floats for a range, or a list of floats for discrete choices.\n            Default: None.\n\n        border_mode (int):\n            OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n\n        fill (ColorType):\n            The constant value to use for padding if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n\n        fill_mask (ColorType):\n            Same as fill but used for mask padding. Default: 0.\n\n        keep_size (bool):\n            If True, the output image will be resized to the input image size after cropping/padding.\n            Default: True.\n\n        sample_independently (bool):\n            If True and ranges are used for px/percent, sample a value for each side independently.\n            If False, sample one value and use it for all sides. Default: True.\n\n        interpolation (int):\n            OpenCV interpolation flag used for resizing if keep_size is True.\n            Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (int):\n            OpenCV interpolation flag used for resizing if keep_size is True.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float):\n            Probability of applying the transform. 
Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform will never crop images below a height or width of 1.\n        - When using pixel values (px), the image will be cropped/padded by exactly that many pixels.\n        - When using percentages (percent), the amount of crop/pad will be calculated based on the image size.\n        - Bounding boxes that end up fully outside the image after cropping will be removed.\n        - Keypoints that end up outside the image after cropping will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.CropAndPad(px=(-10, 20, 30, -40), border_mode=cv2.BORDER_REFLECT, fill=128, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        px: PxType | None\n        percent: PercentType | None\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n        keep_size: bool\n        sample_independently: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def check_px_percent(self) -> Self:\n            if self.px is None and self.percent is None:\n                msg = \"Both px and percent parameters cannot be None simultaneously.\"\n                raise ValueError(msg)\n            if self.px is not None and self.percent is not None:\n                msg = \"Only px or percent may be set!\"\n                raise ValueError(msg)\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n\n            return self\n\n    def __init__(\n        self,\n        px: int | list[int] | None = None,\n        percent: float | list[float] | None = None,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        keep_size: bool = True,\n        sample_independently: bool = True,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.px = px\n        self.percent = percent\n\n        self.border_mode = border_mode\n        self.fill = 
fill\n        self.fill_mask = fill_mask\n\n        self.keep_size = keep_size\n        self.sample_independently = sample_independently\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_params: Sequence[int],\n        pad_params: Sequence[int],\n        fill: ColorType,\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad(\n            img,\n            crop_params,\n            pad_params,\n            fill,\n            params[\"shape\"][:2],\n            self.interpolation,\n            self.border_mode,\n            self.keep_size,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_params: Sequence[int],\n        pad_params: Sequence[int],\n        fill_mask: ColorType,\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad(\n            mask,\n            crop_params,\n            pad_params,\n            fill_mask,\n            params[\"shape\"][:2],\n            self.mask_interpolation,\n            self.border_mode,\n            self.keep_size,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        crop_params: tuple[int, int, int, int],\n        pad_params: tuple[int, int, int, int],\n        result_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad_bboxes(bboxes, crop_params, pad_params, params[\"shape\"][:2], result_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_params: tuple[int, int, int, int],\n        pad_params: tuple[int, int, int, int],\n        result_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fcrops.crop_and_pad_keypoints(\n            keypoints,\n            crop_params,\n            pad_params,\n            params[\"shape\"][:2],\n            result_shape,\n            self.keep_size,\n        )\n\n    @staticmethod\n    def __prevent_zero(val1: int, val2: int, max_val: int) -> tuple[int, int]:\n        regain = abs(max_val) + 1\n        regain1 = regain // 2\n        regain2 = regain // 2\n        if regain1 + regain2 < regain:\n            regain1 += 1\n\n        if regain1 > val1:\n            diff = regain1 - val1\n            regain1 = val1\n            regain2 += diff\n        elif regain2 > val2:\n            diff = regain2 - val2\n            regain2 = val2\n            regain1 += diff\n\n        return val1 - regain1, val2 - regain2\n\n    @staticmethod\n    def _prevent_zero(crop_params: list[int], height: int, width: int) -> list[int]:\n        top, right, bottom, left = crop_params\n\n        remaining_height = height - (top + bottom)\n        remaining_width = width - (left + right)\n\n        if remaining_height < 1:\n            top, bottom = CropAndPad.__prevent_zero(top, bottom, height)\n        if remaining_width < 1:\n            left, right = CropAndPad.__prevent_zero(left, right, width)\n\n        return [max(top, 0), max(right, 0), max(bottom, 0), max(left, 0)]\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        if self.px is not None:\n            new_params = self._get_px_params()\n        else:\n            percent_params = self._get_percent_params()\n            new_params = [\n                int(percent_params[0] * 
height),\n                int(percent_params[1] * width),\n                int(percent_params[2] * height),\n                int(percent_params[3] * width),\n            ]\n\n        pad_params = [max(i, 0) for i in new_params]\n\n        crop_params = self._prevent_zero([-min(i, 0) for i in new_params], height, width)\n\n        top, right, bottom, left = crop_params\n        crop_params = [left, top, width - right, height - bottom]\n        result_rows = crop_params[3] - crop_params[1]\n        result_cols = crop_params[2] - crop_params[0]\n        if result_cols == width and result_rows == height:\n            crop_params = []\n\n        top, right, bottom, left = pad_params\n        pad_params = [top, bottom, left, right]\n        if any(pad_params):\n            result_rows += top + bottom\n            result_cols += left + right\n        else:\n            pad_params = []\n\n        return {\n            \"crop_params\": crop_params or None,\n            \"pad_params\": pad_params or None,\n            \"fill\": None if pad_params is None else self._get_pad_value(cast(ColorType, self.fill)),\n            \"fill_mask\": None if pad_params is None else self._get_pad_value(cast(ColorType, self.fill_mask)),\n            \"result_shape\": (result_rows, result_cols),\n        }\n\n    def _get_px_params(self) -> list[int]:\n        if self.px is None:\n            msg = \"px is not set\"\n            raise ValueError(msg)\n\n        if isinstance(self.px, int):\n            params = [self.px] * 4\n        elif len(self.px) == PAIR:\n            if self.sample_independently:\n                params = [self.py_random.randrange(*self.px) for _ in range(4)]\n            else:\n                px = self.py_random.randrange(*self.px)\n                params = [px] * 4\n        elif isinstance(self.px[0], int):\n            params = self.px\n        elif len(self.px[0]) == PAIR:\n            params = [self.py_random.randrange(*i) for i in self.px]\n        else:\n            params = [self.py_random.choice(i) for i in self.px]\n\n        return params\n\n    def _get_percent_params(self) -> list[float]:\n        if self.percent is None:\n            msg = \"percent is not set\"\n            raise ValueError(msg)\n\n        if isinstance(self.percent, float):\n            params = [self.percent] * 4\n        elif len(self.percent) == PAIR:\n            if self.sample_independently:\n                params = [self.py_random.uniform(*self.percent) for _ in range(4)]\n            else:\n                px = self.py_random.uniform(*self.percent)\n                params = [px] * 4\n        elif isinstance(self.percent[0], (int, float)):\n            params = self.percent\n        elif len(self.percent[0]) == PAIR:\n            params = [self.py_random.uniform(*i) for i in self.percent]\n        else:\n            params = [self.py_random.choice(i) for i in self.percent]\n\n        return params  # params = [top, right, bottom, left]\n\n    def _get_pad_value(\n        self,\n        fill: ColorType,\n    ) -> int | float:\n        if isinstance(fill, (list, tuple)):\n            if len(fill) == PAIR:\n                a, b = fill\n                if isinstance(a, int) and isinstance(b, int):\n                    return self.py_random.randint(a, b)\n                return self.py_random.uniform(a, b)\n            return self.py_random.choice(fill)\n\n        if isinstance(fill, Real):\n            return fill\n\n        msg = \"fill should be a number or list, or tuple of two numbers.\"\n        raise 
ValueError(msg)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"px\",\n            \"percent\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"keep_size\",\n            \"sample_independently\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.CropNonEmptyMaskIfExists","title":"class CropNonEmptyMaskIfExists (height, width, ignore_values=None, ignore_channels=None, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop area with mask if mask is non-empty, else make random crop.

This transform attempts to crop a region containing a mask (non-zero pixels). If the mask is empty or not provided, it falls back to a random crop. This is particularly useful for segmentation tasks where you want to focus on regions of interest defined by the mask.

Parameters:

  • height (int): Vertical size of crop in pixels. Must be > 0.
  • width (int): Horizontal size of crop in pixels. Must be > 0.
  • ignore_values (list of int): Values to ignore in mask; 0 values are always ignored. For example, if background value is 5, set ignore_values=[5] to ignore it. Default: None.
  • ignore_channels (list of int): Channels to ignore in mask. For example, if background is the first channel, set ignore_channels=[0] to ignore it. Default: None.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If a mask is provided, the transform will try to crop an area containing non-zero (or non-ignored) pixels.
  • If no suitable area is found in the mask or no mask is provided, it will perform a random crop.
  • The crop size (height, width) must not exceed the original image dimensions.
  • Bounding boxes and keypoints are also cropped along with the image and mask.

Exceptions:

  • ValueError: If the specified crop size is larger than the input image dimensions.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.zeros((100, 100), dtype=np.uint8)\n>>> mask[25:75, 25:75] = 1  # Create a non-empty region in the mask\n>>> transform = A.Compose([\n...     A.CropNonEmptyMaskIfExists(height=50, width=50, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n# The resulting crop will likely include part of the non-zero region in the mask\n
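A complementary sketch (label values chosen for illustration) showing ignore_values: label 5 is treated as background, so only the remaining labels guide where the crop is placed:

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
mask = np.full((100, 100), 5, dtype=np.uint8)   # background label 5 everywhere
mask[10:30, 60:90] = 1                          # small foreground region

transform = A.Compose([
    A.CropNonEmptyMaskIfExists(height=40, width=40, ignore_values=[5], p=1.0),
])

result = transform(image=image, mask=mask)
print(result["mask"].shape)  # (40, 40); the crop overlaps the label-1 region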


Source code in albumentations/augmentations/crops/transforms.py Python
class CropNonEmptyMaskIfExists(BaseCrop):\n    \"\"\"Crop area with mask if mask is non-empty, else make random crop.\n\n    This transform attempts to crop a region containing a mask (non-zero pixels). If the mask is empty or not provided,\n    it falls back to a random crop. This is particularly useful for segmentation tasks where you want to focus on\n    regions of interest defined by the mask.\n\n    Args:\n        height (int): Vertical size of crop in pixels. Must be > 0.\n        width (int): Horizontal size of crop in pixels. Must be > 0.\n        ignore_values (list of int, optional): Values to ignore in mask, `0` values are always ignored.\n            For example, if background value is 5, set `ignore_values=[5]` to ignore it. Default: None.\n        ignore_channels (list of int, optional): Channels to ignore in mask.\n            For example, if background is the first channel, set `ignore_channels=[0]` to ignore it. Default: None.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If a mask is provided, the transform will try to crop an area containing non-zero (or non-ignored) pixels.\n        - If no suitable area is found in the mask or no mask is provided, it will perform a random crop.\n        - The crop size (height, width) must not exceed the original image dimensions.\n        - Bounding boxes and keypoints are also cropped along with the image and mask.\n\n    Raises:\n        ValueError: If the specified crop size is larger than the input image dimensions.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.zeros((100, 100), dtype=np.uint8)\n        >>> mask[25:75, 25:75] = 1  # Create a non-empty region in the mask\n        >>> transform = A.Compose([\n        ...     A.CropNonEmptyMaskIfExists(height=50, width=50, p=1.0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        # The resulting crop will likely include part of the non-zero region in the mask\n    \"\"\"\n\n    class InitSchema(BaseCrop.InitSchema):\n        ignore_values: list[int] | None\n        ignore_channels: list[int] | None\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        ignore_values: list[int] | None = None,\n        ignore_channels: list[int] | None = None,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p)\n\n        self.height = height\n        self.width = width\n        self.ignore_values = ignore_values\n        self.ignore_channels = ignore_channels\n\n    def _preprocess_mask(self, mask: np.ndarray) -> np.ndarray:\n        mask_height, mask_width = mask.shape[:2]\n\n        if self.ignore_values is not None:\n            ignore_values_np = np.array(self.ignore_values)\n            mask = np.where(np.isin(mask, ignore_values_np), 0, mask)\n\n        if mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS and self.ignore_channels is not None:\n            target_channels = np.array([ch for ch in range(mask.shape[-1]) if ch not in self.ignore_channels])\n            mask = np.take(mask, target_channels, axis=-1)\n\n        if self.height > mask_height or self.width > mask_width:\n            raise ValueError(\n                f\"Crop size ({self.height},{self.width}) is larger than image ({mask_height},{mask_width})\",\n            )\n\n        return mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        \"\"\"Get crop coordinates based on mask content.\"\"\"\n        if \"mask\" in data:\n            mask = self._preprocess_mask(data[\"mask\"])\n        elif \"masks\" in data and len(data[\"masks\"]):\n            masks = data[\"masks\"]\n            mask = self._preprocess_mask(np.copy(masks[0]))\n            for m in masks[1:]:\n                mask |= self._preprocess_mask(m)\n        else:\n            msg = \"Can not find mask for CropNonEmptyMaskIfExists\"\n            raise RuntimeError(msg)\n\n        mask_height, mask_width = mask.shape[:2]\n\n        if mask.any():\n            # Find non-zero regions in mask\n            mask_sum = mask.sum(axis=-1) if mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS else mask\n            non_zero_yx = np.argwhere(mask_sum)\n            y, x = self.py_random.choice(non_zero_yx)\n\n            # Calculate crop coordinates centered around chosen point\n            x_min = x - self.py_random.randint(0, self.width - 1)\n            y_min = y - self.py_random.randint(0, self.height - 1)\n            x_min = np.clip(x_min, 0, mask_width - self.width)\n            y_min = np.clip(y_min, 0, mask_height - self.height)\n        else:\n            # Random crop if no non-zero regions\n            x_min = self.py_random.randint(0, mask_width - self.width)\n            y_min = self.py_random.randint(0, mask_height - self.height)\n\n        x_max = x_min + self.width\n        y_max = y_min + self.height\n\n        return {\"crop_coords\": (x_min, y_min, x_max, y_max)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"ignore_values\", 
\"ignore_channels\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCrop","title":"class RandomCrop (height, width, pad_if_needed=False, pad_mode=None, pad_cval=None, pad_cval_mask=None, pad_position='center', border_mode=0, fill=0.0, fill_mask=0.0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input.

Parameters:

  • height (int): Height of the crop.
  • width (int): Width of the crop.
  • pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.
  • border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.
  • fill (ColorType): Padding value for images if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • fill_mask (ColorType): Padding value for masks if border_mode is cv2.BORDER_CONSTANT. Default: 0.
  • pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']): Position of padding. Default: 'center'.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded before applying the random crop.
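A usage sketch (sizes chosen for illustration) with pad_if_needed=True, so inputs smaller than the crop are padded instead of raising CropSizeError:

Python
import cv2
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (50, 80, 3), dtype=np.uint8)

transform = A.Compose([
    A.RandomCrop(
        height=64,
        width=64,
        pad_if_needed=True,
        border_mode=cv2.BORDER_CONSTANT,
        fill=0,
        p=1.0,
    ),
])

out = transform(image=image)["image"]
print(out.shape)  # (64, 64, 3): the 50x80 input is padded in height before cropping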


Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCrop(BaseCropAndPad):\n    \"\"\"Crop a random part of the input.\n\n    Args:\n        height: height of the crop.\n        width: width of the crop.\n        pad_if_needed (bool): Whether to pad if crop size exceeds image size. Default: False.\n        border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value for images if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fill_mask (ColorType): Padding value for masks if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        pad_position (Literal['center', 'top_left', 'top_right', 'bottom_left', 'bottom_right', 'random']):\n            Position of padding. Default: 'center'.\n        p: probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If pad_if_needed is True and crop size exceeds image dimensions, the image will be padded\n        before applying the random crop.\n    \"\"\"\n\n    class InitSchema(BaseCropAndPad.InitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        border_mode: BorderModeType\n        fill: ColorType\n        fill_mask: ColorType\n        pad_mode: BorderModeType | None\n        pad_cval: ColorType | None\n        pad_cval_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_dimensions(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_cval is not None:\n                warn(\"pad_cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_cval\n            if self.pad_cval_mask is not None:\n                warn(\"pad_cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.pad_cval_mask\n            return self\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        pad_if_needed: bool = False,\n        pad_mode: int | None = None,\n        pad_cval: ColorType | None = None,\n        pad_cval_mask: ColorType | None = None,\n        pad_position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0.0,\n        fill_mask: ColorType = 0.0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=pad_position,\n            p=p,\n        )\n        self.height = height\n        self.width = width\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:  # Changed return type to be more flexible\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        if not self.pad_if_needed and (self.height > image_height or self.width > image_width):\n            raise CropSizeError(\n                f\"Crop size (height, width) exceeds image dimensions (height, width):\"\n                f\" {(self.height, self.width)} vs {image_shape[:2]}\",\n            
)\n\n        # Get padding params first if needed\n        pad_params = self._get_pad_params(image_shape, (self.height, self.width))\n\n        # If padding is needed, adjust the image shape for crop calculation\n        if pad_params is not None:\n            pad_top = pad_params[\"pad_top\"]\n            pad_bottom = pad_params[\"pad_bottom\"]\n            pad_left = pad_params[\"pad_left\"]\n            pad_right = pad_params[\"pad_right\"]\n\n            padded_height = image_height + pad_top + pad_bottom\n            padded_width = image_width + pad_left + pad_right\n            padded_shape = (padded_height, padded_width)\n\n            # Get random crop coordinates based on padded dimensions\n            h_start = self.py_random.random()\n            w_start = self.py_random.random()\n            crop_coords = fcrops.get_crop_coords(padded_shape, (self.height, self.width), h_start, w_start)\n        else:\n            # Get random crop coordinates based on original dimensions\n            h_start = self.py_random.random()\n            w_start = self.py_random.random()\n            crop_coords = fcrops.get_crop_coords(image_shape, (self.height, self.width), h_start, w_start)\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"height\",\n            \"width\",\n            \"pad_if_needed\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"pad_position\",\n        )\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCropFromBorders","title":"class RandomCropFromBorders (crop_left=0.1, crop_right=0.1, crop_top=0.1, crop_bottom=0.1, always_apply=None, p=1.0) [view source on GitHub]","text":"

Randomly crops the input from its borders without resizing.

This transform randomly crops parts of the input (image, mask, bounding boxes, or keypoints) from each of its borders. The amount of cropping is specified as a fraction of the input's dimensions for each side independently.

Parameters:

  • crop_left (float): The maximum fraction of width to crop from the left side. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_right (float): The maximum fraction of width to crop from the right side. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_top (float): The maximum fraction of height to crop from the top. Must be in the range [0.0, 1.0]. Default: 0.1
  • crop_bottom (float): The maximum fraction of height to crop from the bottom. Must be in the range [0.0, 1.0]. Default: 0.1
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The actual amount of cropping for each side is randomly chosen between 0 and the specified maximum for each application of the transform.
  • The sum of crop_left and crop_right must not exceed 1.0, and the sum of crop_top and crop_bottom must not exceed 1.0. Otherwise, a ValueError will be raised.
  • This transform does not resize the input after cropping, so the output dimensions will be smaller than the input dimensions.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomCropFromBorders(\n...     crop_left=0.1, crop_right=0.2, crop_top=0.2, crop_bottom=0.1, p=1.0\n... )\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image will have random crops from each border, with the maximum\n# possible crops being 10% from the left, 20% from the right, 20% from the top,\n# and 10% from the bottom. The image size will be reduced accordingly.\n

Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCropFromBorders(BaseCrop):\n    \"\"\"Randomly crops the input from its borders without resizing.\n\n    This transform randomly crops parts of the input (image, mask, bounding boxes, or keypoints)\n    from each of its borders. The amount of cropping is specified as a fraction of the input's\n    dimensions for each side independently.\n\n    Args:\n        crop_left (float): The maximum fraction of width to crop from the left side.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_right (float): The maximum fraction of width to crop from the right side.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_top (float): The maximum fraction of height to crop from the top.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        crop_bottom (float): The maximum fraction of height to crop from the bottom.\n            Must be in the range [0.0, 1.0]. Default: 0.1\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual amount of cropping for each side is randomly chosen between 0 and\n          the specified maximum for each application of the transform.\n        - The sum of crop_left and crop_right must not exceed 1.0, and the sum of\n          crop_top and crop_bottom must not exceed 1.0. Otherwise, a ValueError will be raised.\n        - This transform does not resize the input after cropping, so the output dimensions\n          will be smaller than the input dimensions.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomCropFromBorders(\n        ...     crop_left=0.1, crop_right=0.2, crop_top=0.2, crop_bottom=0.1, p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image will have random crops from each border, with the maximum\n        # possible crops being 10% from the left, 20% from the right, 20% from the top,\n        # and 10% from the bottom. 
The image size will be reduced accordingly.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        crop_left: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_right: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_top: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        crop_bottom: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n\n        @model_validator(mode=\"after\")\n        def validate_crop_values(self) -> Self:\n            if self.crop_left + self.crop_right > 1.0:\n                msg = \"The sum of crop_left and crop_right must be <= 1.\"\n                raise ValueError(msg)\n            if self.crop_top + self.crop_bottom > 1.0:\n                msg = \"The sum of crop_top and crop_bottom must be <= 1.\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        crop_left: float = 0.1,\n        crop_right: float = 0.1,\n        crop_top: float = 0.1,\n        crop_bottom: float = 0.1,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p)\n        self.crop_left = crop_left\n        self.crop_right = crop_right\n        self.crop_top = crop_top\n        self.crop_bottom = crop_bottom\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        height, width = params[\"shape\"][:2]\n\n        x_min = self.py_random.randint(0, int(self.crop_left * width))\n        x_max = self.py_random.randint(max(x_min + 1, int((1 - self.crop_right) * width)), width)\n\n        y_min = self.py_random.randint(0, int(self.crop_top * height))\n        y_max = self.py_random.randint(max(y_min + 1, int((1 - self.crop_bottom) * height)), height)\n\n        crop_coords = x_min, y_min, x_max, y_max\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"crop_left\", \"crop_right\", \"crop_top\", \"crop_bottom\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomCropNearBBox","title":"class RandomCropNearBBox (max_part_shift=(0, 0.3), cropping_bbox_key='cropping_bbox', cropping_box_key=None, always_apply=None, p=1.0) [view source on GitHub]","text":"

Crop a bbox from the image with a random shift along the x and y coordinates.

Parameters:

  • max_part_shift (float | tuple[float, float]): Max shift in the height and width dimensions relative to the cropping_bbox dimensions. If max_part_shift is a single float, the range will be (0, max_part_shift). Default: (0, 0.3).
  • cropping_bbox_key (str): Additional target key for the cropping box. Default: cropping_bbox.
  • cropping_box_key (str): [Deprecated] Use cropping_bbox_key instead.
  • p (float): Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_bbox_key='test_bbox')],\n>>>              bbox_params=BboxParams(\"pascal_voc\"))\n>>> result = aug(image=image, bboxes=bboxes, test_bbox=[0, 5, 10, 20])\n

Source code in albumentations/augmentations/crops/transforms.py Python
class RandomCropNearBBox(BaseCrop):\n    \"\"\"Crop bbox from image with random shift by x,y coordinates\n\n    Args:\n        max_part_shift (float, (float, float)): Max shift in `height` and `width` dimensions relative\n            to `cropping_bbox` dimension.\n            If max_part_shift is a single float, the range will be (0, max_part_shift).\n            Default (0, 0.3).\n        cropping_bbox_key (str): Additional target key for cropping box. Default `cropping_bbox`.\n        cropping_box_key (str): [Deprecated] Use `cropping_bbox_key` instead.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Examples:\n        >>> aug = Compose([RandomCropNearBBox(max_part_shift=(0.1, 0.5), cropping_bbox_key='test_bbox')],\n        >>>              bbox_params=BboxParams(\"pascal_voc\"))\n        >>> result = aug(image=image, bboxes=bboxes, test_bbox=[0, 5, 10, 20])\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_part_shift: ZeroOneRangeType\n        cropping_bbox_key: str\n\n    def __init__(\n        self,\n        max_part_shift: ScaleFloatType = (0, 0.3),\n        cropping_bbox_key: str = \"cropping_bbox\",\n        cropping_box_key: str | None = None,  # Deprecated\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(p=p)\n        # Check for deprecated parameter and issue warning\n        if cropping_box_key is not None:\n            warn(\n                \"The parameter 'cropping_box_key' is deprecated and will be removed in future versions. \"\n                \"Use 'cropping_bbox_key' instead.\",\n                DeprecationWarning,\n                stacklevel=2,\n            )\n            # Ensure the new parameter is used even if the old one is passed\n            cropping_bbox_key = cropping_box_key\n\n        self.max_part_shift = cast(tuple[float, float], max_part_shift)\n        self.cropping_bbox_key = cropping_bbox_key\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[float, ...]]:\n        bbox = data[self.cropping_bbox_key]\n\n        image_shape = params[\"shape\"][:2]\n\n        bbox = self._clip_bbox(bbox, image_shape)\n\n        h_max_shift = round((bbox[3] - bbox[1]) * self.max_part_shift[0])\n        w_max_shift = round((bbox[2] - bbox[0]) * self.max_part_shift[1])\n\n        x_min = bbox[0] - self.py_random.randint(-w_max_shift, w_max_shift)\n        x_max = bbox[2] + self.py_random.randint(-w_max_shift, w_max_shift)\n\n        y_min = bbox[1] - self.py_random.randint(-h_max_shift, h_max_shift)\n        y_max = bbox[3] + self.py_random.randint(-h_max_shift, h_max_shift)\n\n        crop_coords = self._clip_bbox((x_min, y_min, x_max, y_max), image_shape)\n\n        if crop_coords[0] == crop_coords[2] or crop_coords[1] == crop_coords[3]:\n            crop_shape = (bbox[3] - bbox[1], bbox[2] - bbox[0])\n            crop_coords = fcrops.get_center_crop_coords(image_shape, crop_shape)\n\n        return {\"crop_coords\": crop_coords}\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [self.cropping_bbox_key]\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_part_shift\", \"cropping_bbox_key\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomResizedCrop","title":"class RandomResizedCrop (size=None, width=None, height=None, *, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specified size.

This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints) and then resizes the crop to a specified size. It's particularly useful for training neural networks on images of varying sizes and aspect ratios.

Parameters:

  • size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.
  • scale (tuple[float, float]): Range of the random size of the crop relative to the input size. For example, (0.08, 1.0) means the crop size will be between 8% and 100% of the input size. Default: (0.08, 1.0)
  • ratio (tuple[float, float]): Range of aspect ratios of the random crop. For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3. Default: (0.75, 1.3333333333333333)
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform attempts to crop a random area with an aspect ratio and relative size specified by 'ratio' and 'scale' parameters. If it fails to find a suitable crop after 10 attempts, it will return a crop from the center of the image.
  • The crop's aspect ratio is defined as width / height.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.
  • After cropping, the result is resized to the specified size.

Mathematical Details:

  1. A target area A is sampled from the range [scale[0] * input_area, scale[1] * input_area].
  2. A target aspect ratio r is sampled from the range [ratio[0], ratio[1]].
  3. The crop width and height are computed as w = sqrt(A * r) and h = sqrt(A / r).
  4. If w and h are within the input image dimensions, the crop is accepted. Otherwise, steps 1-3 are repeated (up to 10 times).
  5. If no valid crop is found after 10 attempts, a centered crop is taken.
  6. The crop is then resized to the specified size.
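
The following is a minimal, self-contained sketch of steps 1-5 above, written against the plain random and math modules rather than the transform's internal RNG; sample_rrc_crop is a hypothetical helper name, and the fallback is simplified to the full image (the actual transform additionally clamps the centered fallback crop to the ratio bounds).

Python
import math
import random

def sample_rrc_crop(image_height, image_width, scale=(0.08, 1.0), ratio=(0.75, 4 / 3), attempts=10):
    """Illustrative re-implementation of the sampling loop; returns (top, left, height, width)."""
    area = image_height * image_width
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        target_area = random.uniform(*scale) * area           # step 1: sample the crop area
        aspect_ratio = math.exp(random.uniform(*log_ratio))   # step 2: sample the aspect ratio (log-uniform)
        w = round(math.sqrt(target_area * aspect_ratio))      # step 3: derive width and height
        h = round(math.sqrt(target_area / aspect_ratio))
        if 0 < w <= image_width and 0 < h <= image_height:    # step 4: accept if the crop fits
            top = random.randint(0, image_height - h)
            left = random.randint(0, image_width - w)
            return top, left, h, w
    # step 5 (simplified): fall back to the whole image, taken as a centered "crop"
    return 0, 0, image_height, image_width

print(sample_rrc_crop(100, 100))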

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomResizedCrop(size=(80, 80), scale=(0.5, 1.0), ratio=(0.75, 1.33), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# transformed_image will be an 80x80 crop from a random location in the original image,\n# with the crop's size between 50% and 100% of the original image size,\n# and the crop's aspect ratio between 3:4 and 4:3.\n

Source code in albumentations/augmentations/crops/transforms.py Python
class RandomResizedCrop(_BaseRandomSizedCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specified size.\n\n    This transform first crops a random portion of the input image (or mask, bounding boxes, keypoints)\n    and then resizes the crop to a specified size. It's particularly useful for training neural networks\n    on images of varying sizes and aspect ratios.\n\n    Args:\n        size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.\n        scale (tuple[float, float]): Range of the random size of the crop relative to the input size.\n            For example, (0.08, 1.0) means the crop size will be between 8% and 100% of the input size.\n            Default: (0.08, 1.0)\n        ratio (tuple[float, float]): Range of aspect ratios of the random crop.\n            For example, (0.75, 1.3333) allows crop aspect ratios from 3:4 to 4:3.\n            Default: (0.75, 1.3333333333333333)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform attempts to crop a random area with an aspect ratio and relative size\n          specified by 'ratio' and 'scale' parameters. If it fails to find a suitable crop after\n          10 attempts, it will return a crop from the center of the image.\n        - The crop's aspect ratio is defined as width / height.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n        - After cropping, the result is resized to the specified size.\n\n    Mathematical Details:\n        1. A target area A is sampled from the range [scale[0] * input_area, scale[1] * input_area].\n        2. A target aspect ratio r is sampled from the range [ratio[0], ratio[1]].\n        3. The crop width and height are computed as:\n           w = sqrt(A * r)\n           h = sqrt(A / r)\n        4. If w and h are within the input image dimensions, the crop is accepted.\n           Otherwise, steps 1-3 are repeated (up to 10 times).\n        5. If no valid crop is found after 10 attempts, a centered crop is taken.\n        6. 
The crop is then resized to the specified size.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomResizedCrop(size=80, scale=(0.5, 1.0), ratio=(0.75, 1.33), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # transformed_image will be a 80x80 crop from a random location in the original image,\n        # with the crop's size between 50% and 100% of the original image size,\n        # and the crop's aspect ratio between 3:4 and 4:3.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1)), AfterValidator(nondecreasing)]\n        ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        width: int | None\n        height: int | None\n        size: ScaleIntType | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def process(self) -> Self:\n            if isinstance(self.size, int):\n                if isinstance(self.width, int):\n                    warn(\n                        \"Initializing with 'size' as an integer and a separate 'width', `height` are deprecated. \"\n                        \"Please use a tuple (height, width) for the 'size' argument.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                    self.size = (self.size, self.width)\n                else:\n                    msg = \"If size is an integer, width as integer must be specified.\"\n                    raise TypeError(msg)\n\n            if self.size is None:\n                if self.height is None or self.width is None:\n                    message = \"If 'size' is not provided, both 'height' and 'width' must be specified.\"\n                    raise ValueError(message)\n                self.size = (self.height, self.width)\n\n            return self\n\n    def __init__(\n        self,\n        # NOTE @zetyquickly: when (width, height) are deprecated, make 'size' non optional\n        size: ScaleIntType | None = None,\n        width: int | None = None,\n        height: int | None = None,\n        *,\n        scale: tuple[float, float] = (0.08, 1.0),\n        ratio: tuple[float, float] = (0.75, 1.3333333333333333),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            size=cast(tuple[int, int], size),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale = scale\n        self.ratio = ratio\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n        image_height, image_width = image_shape\n\n        area = image_height * image_width\n\n        for _ in range(10):\n            target_area = self.py_random.uniform(*self.scale) * area\n            log_ratio = (math.log(self.ratio[0]), 
math.log(self.ratio[1]))\n            aspect_ratio = math.exp(self.py_random.uniform(*log_ratio))\n\n            width = int(round(math.sqrt(target_area * aspect_ratio)))\n            height = int(round(math.sqrt(target_area / aspect_ratio)))\n\n            if 0 < width <= image_width and 0 < height <= image_height:\n                i = self.py_random.randint(0, image_height - height)\n                j = self.py_random.randint(0, image_width - width)\n\n                h_start = i * 1.0 / (image_height - height + 1e-10)\n                w_start = j * 1.0 / (image_width - width + 1e-10)\n\n                crop_shape = (height, width)\n\n                crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n                return {\"crop_coords\": crop_coords}\n\n        # Fallback to central crop\n        in_ratio = image_width / image_height\n        if in_ratio < min(self.ratio):\n            width = image_width\n            height = int(round(image_width / min(self.ratio)))\n        elif in_ratio > max(self.ratio):\n            height = image_height\n            width = int(round(height * max(self.ratio)))\n        else:  # whole image\n            width = image_width\n            height = image_height\n\n        i = (image_height - height) // 2\n        j = (image_width - width) // 2\n\n        h_start = i * 1.0 / (image_height - height + 1e-10)\n        w_start = j * 1.0 / (image_width - width + 1e-10)\n\n        crop_shape = (height, width)\n\n        crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"scale\", \"ratio\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomSizedBBoxSafeCrop","title":"class RandomSizedBBoxSafeCrop (height, width, erosion_rate=0.0, interpolation=1, mask_interpolation=0, always_apply=None, p=1.0) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specific size without loss of bounding boxes.

This transform first attempts to crop a random portion of the input image while ensuring that all bounding boxes remain within the cropped area. It then resizes the crop to the specified size. This is particularly useful for object detection tasks where preserving all objects in the image is crucial while also standardizing the image size.

Parameters:

  • height (int): Height of the output image after resizing.
  • width (int): Width of the output image after resizing.
  • erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be at least 80% of the original image height and width. Default: 0.0 (no minimum size).
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform ensures that all bounding boxes in the original image are fully contained within the cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out), it will default to cropping the entire image.
  • After cropping, the result is resized to the specified (height, width) size.
  • Bounding box coordinates are adjusted to match the new image size.
  • Keypoints are moved along with the crop and scaled to the new image size.
  • If there are no bounding boxes in the image, it will fall back to a random crop.

Mathematical Details:

  1. A crop region is selected that includes all bounding boxes.
  2. The crop size is determined by the erosion_rate: min_crop_size = (1 - erosion_rate) * original_size.
  3. If the selected crop is smaller than min_crop_size, it is expanded to meet this requirement.
  4. The crop is then resized to the specified (height, width) size.
  5. Bounding box coordinates are transformed to match the new image size: new_coord = (old_coord - crop_start) * (new_size / crop_size).
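
As a concrete illustration of step 5, the sketch below remaps one pixel-space bounding box from the original image into the resized output; remap_bbox is a hypothetical helper, and the real transform operates on normalized boxes through its bbox processors rather than on raw pixel tuples.

Python
def remap_bbox(bbox, crop_coords, out_size):
    """Shift an (x_min, y_min, x_max, y_max) box by the crop origin, then scale it to out_size=(height, width)."""
    crop_x1, crop_y1, crop_x2, crop_y2 = crop_coords
    scale_x = out_size[1] / (crop_x2 - crop_x1)
    scale_y = out_size[0] / (crop_y2 - crop_y1)
    x_min, y_min, x_max, y_max = bbox
    return (
        (x_min - crop_x1) * scale_x,
        (y_min - crop_y1) * scale_y,
        (x_max - crop_x1) * scale_x,
        (y_max - crop_y1) * scale_y,
    )

# A 50x50 box inside a 200x200 crop that is resized to 224x224:
print(remap_bbox((60, 60, 110, 110), (50, 50, 250, 250), (224, 224)))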

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)\n>>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n>>> transform = A.Compose([\n...     A.RandomSizedBBoxSafeCrop(height=224, width=224, erosion_rate=0.2, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n>>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n>>> transformed_image = transformed['image']\n>>> transformed_bboxes = transformed['bboxes']\n# transformed_image will be a 224x224 image containing all original bounding boxes,\n# with their coordinates adjusted to the new image size.\n

Source code in albumentations/augmentations/crops/transforms.py Python
class RandomSizedBBoxSafeCrop(BBoxSafeRandomCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specific size without loss of bounding boxes.\n\n    This transform first attempts to crop a random portion of the input image while ensuring that all bounding boxes\n    remain within the cropped area. It then resizes the crop to the specified size. This is particularly useful for\n    object detection tasks where preserving all objects in the image is crucial while also standardizing the image size.\n\n    Args:\n        height (int): Height of the output image after resizing.\n        width (int): Width of the output image after resizing.\n        erosion_rate (float): A value between 0.0 and 1.0 that determines the minimum allowable size of the crop\n            as a fraction of the original image size. For example, an erosion_rate of 0.2 means the crop will be\n            at least 80% of the original image height and width. Default: 0.0 (no minimum size).\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform ensures that all bounding boxes in the original image are fully contained within the\n          cropped area. If it's not possible to find such a crop (e.g., when bounding boxes are too spread out),\n          it will default to cropping the entire image.\n        - After cropping, the result is resized to the specified (height, width) size.\n        - Bounding box coordinates are adjusted to match the new image size.\n        - Keypoints are moved along with the crop and scaled to the new image size.\n        - If there are no bounding boxes in the image, it will fall back to a random crop.\n\n    Mathematical Details:\n        1. A crop region is selected that includes all bounding boxes.\n        2. The crop size is determined by the erosion_rate:\n           min_crop_size = (1 - erosion_rate) * original_size\n        3. If the selected crop is smaller than min_crop_size, it's expanded to meet this requirement.\n        4. The crop is then resized to the specified (height, width) size.\n        5. Bounding box coordinates are transformed to match the new image size:\n           new_coord = (old_coord - crop_start) * (new_size / crop_size)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)\n        >>> bboxes = [(10, 10, 50, 50), (100, 100, 150, 150)]\n        >>> transform = A.Compose([\n        ...     A.RandomSizedBBoxSafeCrop(height=224, width=224, erosion_rate=0.2, p=1.0),\n        ... 
], bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']))\n        >>> transformed = transform(image=image, bboxes=bboxes, labels=['cat', 'dog'])\n        >>> transformed_image = transformed['image']\n        >>> transformed_bboxes = transformed['bboxes']\n        # transformed_image will be a 224x224 image containing all original bounding boxes,\n        # with their coordinates adjusted to the new image size.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: Annotated[int, Field(ge=1)]\n        width: Annotated[int, Field(ge=1)]\n        erosion_rate: float = Field(\n            ge=0.0,\n            le=1.0,\n        )\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        erosion_rate: float = 0.0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        always_apply: bool | None = None,\n        p: float = 1.0,\n    ):\n        super().__init__(erosion_rate=erosion_rate, p=p)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        crop = fcrops.crop(img, *crop_coords)\n        return fgeometric.resize(crop, (self.height, self.width), self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        crop = fcrops.crop(mask, *crop_coords)\n        return fgeometric.resize(crop, (self.height, self.width), self.mask_interpolation)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints = fcrops.crop_keypoints_by_coords(keypoints, crop_coords)\n\n        crop_height = crop_coords[3] - crop_coords[1]\n        crop_width = crop_coords[2] - crop_coords[0]\n\n        scale_y = self.height / crop_height\n        scale_x = self.width / crop_width\n        return fgeometric.keypoints_scale(keypoints, scale_x=scale_x, scale_y=scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"height\", \"width\", \"interpolation\", \"mask_interpolation\")\n
"},{"location":"api_reference/augmentations/crops/transforms/#albumentations.augmentations.crops.transforms.RandomSizedCrop","title":"class RandomSizedCrop (min_max_height, size=None, width=None, height=None, *, w2h_ratio=1.0, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of the input and rescale it to a specific size.

This transform first crops a random portion of the input and then resizes it to a specified size. The size of the random crop is controlled by the 'min_max_height' parameter.

Parameters:

  • min_max_height (tuple[int, int]): Minimum and maximum height of the crop in pixels.
  • size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.
  • w2h_ratio (float): Aspect ratio (width/height) of crop. Default: 1.0
  • interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
  • mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.
  • p (float): Probability of applying the transform. Default: 1.0

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The crop size is randomly selected for each execution within the range specified by 'min_max_height'.
  • The aspect ratio of the crop is determined by the 'w2h_ratio' parameter.
  • After cropping, the result is resized to the specified 'size'.
  • Bounding boxes that end up fully outside the cropped area will be removed.
  • Keypoints that end up outside the cropped area will be removed.
  • This transform differs from RandomResizedCrop in that it allows more control over the crop size through the 'min_max_height' parameter, rather than using a scale parameter.

Mathematical Details:

  1. A random crop height h is sampled from the range [min_max_height[0], min_max_height[1]].
  2. The crop width w is calculated as w = h * w2h_ratio.
  3. A random location for the crop is selected within the input image.
  4. The image is cropped to the size (h, w).
  5. The crop is then resized to the specified 'size'.
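
A minimal sketch of steps 1-2, assuming the plain random module in place of the transform's internal RNG; sample_crop_shape is a hypothetical helper name used only for illustration.

Python
import random

def sample_crop_shape(min_max_height=(50, 80), w2h_ratio=1.0):
    """Sample the crop height, then derive the width from the fixed width/height ratio."""
    crop_height = random.randint(*min_max_height)
    crop_width = int(crop_height * w2h_ratio)
    return crop_height, crop_width

print(sample_crop_shape())  # e.g. (63, 63) for w2h_ratio=1.0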

Examples:

Python
>>> import cv2\n>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomSizedCrop(\n...     min_max_height=(50, 80),\n...     size=(64, 64),\n...     w2h_ratio=1.0,\n...     interpolation=cv2.INTER_LINEAR,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# transformed_image will be a 64x64 image, resulting from a crop with height\n# between 50 and 80 pixels, and the same aspect ratio as specified by w2h_ratio,\n# taken from a random location in the original image and then resized.\n

Source code in albumentations/augmentations/crops/transforms.py Python
class RandomSizedCrop(_BaseRandomSizedCrop):\n    \"\"\"Crop a random part of the input and rescale it to a specific size.\n\n    This transform first crops a random portion of the input and then resizes it to a specified size.\n    The size of the random crop is controlled by the 'min_max_height' parameter.\n\n    Args:\n        min_max_height (tuple[int, int]): Minimum and maximum height of the crop in pixels.\n        size (tuple[int, int]): Target size for the output image, i.e. (height, width) after crop and resize.\n        w2h_ratio (float): Aspect ratio (width/height) of crop. Default: 1.0\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The crop size is randomly selected for each execution within the range specified by 'min_max_height'.\n        - The aspect ratio of the crop is determined by the 'w2h_ratio' parameter.\n        - After cropping, the result is resized to the specified 'size'.\n        - Bounding boxes that end up fully outside the cropped area will be removed.\n        - Keypoints that end up outside the cropped area will be removed.\n        - This transform differs from RandomResizedCrop in that it allows more control over the crop size\n          through the 'min_max_height' parameter, rather than using a scale parameter.\n\n    Mathematical Details:\n        1. A random crop height h is sampled from the range [min_max_height[0], min_max_height[1]].\n        2. The crop width w is calculated as: w = h * w2h_ratio\n        3. A random location for the crop is selected within the input image.\n        4. The image is cropped to the size (h, w).\n        5. The crop is then resized to the specified 'size'.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomSizedCrop(\n        ...     min_max_height=(50, 80),\n        ...     size=(64, 64),\n        ...     w2h_ratio=1.0,\n        ...     interpolation=cv2.INTER_LINEAR,\n        ...     p=1.0\n        ... 
)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # transformed_image will be a 64x64 image, resulting from a crop with height\n        # between 50 and 80 pixels, and the same aspect ratio as specified by w2h_ratio,\n        # taken from a random location in the original image and then resized.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        min_max_height: OnePlusIntRangeType\n        w2h_ratio: Annotated[float, Field(gt=0)]\n        width: int | None\n        height: int | None\n        size: ScaleIntType | None\n\n        @model_validator(mode=\"after\")\n        def process(self) -> Self:\n            if isinstance(self.size, int):\n                if isinstance(self.width, int):\n                    warn(\n                        \"Initializing with 'size' as an integer and a separate 'width', `height` are deprecated. \"\n                        \"Please use a tuple (height, width) for the 'size' argument.\",\n                        DeprecationWarning,\n                        stacklevel=2,\n                    )\n                    self.size = (self.size, self.width)\n                else:\n                    msg = \"If size is an integer, width as integer must be specified.\"\n                    raise TypeError(msg)\n\n            if self.size is None:\n                if self.height is None or self.width is None:\n                    message = \"If 'size' is not provided, both 'height' and 'width' must be specified.\"\n                    raise ValueError(message)\n                self.size = (self.height, self.width)\n            return self\n\n    def __init__(\n        self,\n        min_max_height: tuple[int, int],\n        # NOTE @zetyquickly: when (width, height) are deprecated, make 'size' non optional\n        size: ScaleIntType | None = None,\n        width: int | None = None,\n        height: int | None = None,\n        *,\n        w2h_ratio: float = 1.0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            size=cast(tuple[int, int], size),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.min_max_height = min_max_height\n        self.w2h_ratio = w2h_ratio\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, tuple[int, int, int, int]]:\n        image_shape = params[\"shape\"][:2]\n\n        crop_height = self.py_random.randint(*self.min_max_height)\n        crop_width = int(crop_height * self.w2h_ratio)\n\n        crop_shape = (crop_height, crop_width)\n\n        h_start = self.py_random.random()\n        w_start = self.py_random.random()\n\n        crop_coords = fcrops.get_crop_coords(image_shape, crop_shape, h_start, w_start)\n\n        return {\"crop_coords\": crop_coords}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"min_max_height\", \"w2h_ratio\")\n
"},{"location":"api_reference/augmentations/domain_adaptation/","title":"Index","text":"
  • Domain Adaptation functional transforms (albumentations.augmentations.domain_adaptation.functional)
  • Domain Adaptation transforms (albumentations.augmentations.domain_adaptation.transforms)
"},{"location":"api_reference/augmentations/domain_adaptation/functional/","title":"Domain Adaptation functional transforms (augmentations.domain_adaptation.functional)","text":""},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.apply_histogram","title":"def apply_histogram (img, reference_image, blend_ratio) [view source on GitHub]","text":"

Apply histogram matching to an input image using a reference image and blend the result.

This function performs histogram matching between the input image and a reference image, then blends the result with the original input image based on the specified blend ratio.
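
The blending step can be summarized as a per-pixel weighted average of the matched image and the original. Below is a minimal NumPy sketch of that step only (blend is a hypothetical helper name); the library performs the equivalent weighted sum with its own utilities and clips the result via decorators.

Python
import numpy as np

def blend(matched: np.ndarray, original: np.ndarray, blend_ratio: float) -> np.ndarray:
    """Weighted average: blend_ratio=0 keeps the original, blend_ratio=1 keeps the fully matched result."""
    out = blend_ratio * matched.astype(np.float32) + (1.0 - blend_ratio) * original.astype(np.float32)
    if original.dtype == np.uint8:
        return np.clip(out, 0, 255).astype(np.uint8)
    return out

img = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
matched = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)
print(blend(matched, img, 0.5).dtype)  # uint8, same as the input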

Parameters:

  • img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB. Supported dtypes: uint8, float32 (values should be in [0, 1] range).
  • reference_image (np.ndarray): The reference image used for histogram matching. Should have the same number of channels as the input image. Supported dtypes: uint8, float32 (values should be in [0, 1] range).
  • blend_ratio (float): The ratio for blending the matched image with the original image. Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.

Returns:

  • np.ndarray: The transformed image after histogram matching and blending. The output will have the same shape and dtype as the input image.

Supported image types:

  • Grayscale images: 2D arrays
  • RGB images: 3D arrays with 3 channels
  • Multispectral images: 3D arrays with more than 3 channels

Note

  • If the input and reference images have different sizes, the reference image will be resized to match the input image's dimensions.
  • The function uses a custom implementation of histogram matching based on OpenCV and NumPy.
  • The @clipped and @preserve_channel_dim decorators ensure the output is within the valid range and maintains the original number of dimensions.
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef apply_histogram(img: np.ndarray, reference_image: np.ndarray, blend_ratio: float) -> np.ndarray:\n    \"\"\"Apply histogram matching to an input image using a reference image and blend the result.\n\n    This function performs histogram matching between the input image and a reference image,\n    then blends the result with the original input image based on the specified blend ratio.\n\n    Args:\n        img (np.ndarray): The input image to be transformed. Can be either grayscale or RGB.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        reference_image (np.ndarray): The reference image used for histogram matching.\n            Should have the same number of channels as the input image.\n            Supported dtypes: uint8, float32 (values should be in [0, 1] range).\n        blend_ratio (float): The ratio for blending the matched image with the original image.\n            Should be in the range [0, 1], where 0 means no change and 1 means full histogram matching.\n\n    Returns:\n        np.ndarray: The transformed image after histogram matching and blending.\n            The output will have the same shape and dtype as the input image.\n\n    Supported image types:\n        - Grayscale images: 2D arrays\n        - RGB images: 3D arrays with 3 channels\n        - Multispectral images: 3D arrays with more than 3 channels\n\n    Note:\n        - If the input and reference images have different sizes, the reference image\n          will be resized to match the input image's dimensions.\n        - The function uses a custom implementation of histogram matching based on OpenCV and NumPy.\n        - The @clipped and @preserve_channel_dim decorators ensure the output is within\n          the valid range and maintains the original number of dimensions.\n    \"\"\"\n    # Resize reference image only if necessary\n    if img.shape[:2] != reference_image.shape[:2]:\n        reference_image = cv2.resize(reference_image, dsize=(img.shape[1], img.shape[0]))\n\n    img = np.squeeze(img)\n    reference_image = np.squeeze(reference_image)\n\n    # Match histograms between the images\n    matched = match_histograms(img, reference_image)\n\n    # Blend the original image and the matched image\n    return add_weighted(matched, blend_ratio, img, 1 - blend_ratio)\n
"},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.fourier_domain_adaptation","title":"def fourier_domain_adaptation (img, target_img, beta) [view source on GitHub]","text":"

Apply Fourier Domain Adaptation to the input image using a target image.

This function performs domain adaptation in the frequency domain by modifying the amplitude spectrum of the source image based on the target image's amplitude spectrum. It preserves the phase information of the source image, which helps maintain its content while adapting its style to match the target image.

Parameters:

  • img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.
  • target_img (np.ndarray): The target image used as a reference for adaptation. Should have the same dimensions as the source image.
  • beta (float): The adaptation strength, typically in the range [0, 1]. Higher values result in stronger adaptation towards the target image's style.

Returns:

  • np.ndarray: The adapted image with the same shape and type as the input image.

Exceptions:

  • ValueError: If the source and target images have different shapes.

Note

  • Both input images are converted to float32 for processing.
  • The function handles both grayscale (2D) and color (3D) images.
  • For grayscale images, an extra dimension is added to facilitate uniform processing.
  • The adaptation is performed channel-wise for color images.
  • The output is clipped to the valid range and preserves the original number of channels.

The adaptation process involves the following steps for each channel:

  1. Compute the 2D Fourier Transform of both source and target images.
  2. Shift the zero frequency component to the center of the spectrum.
  3. Extract amplitude and phase information from the source image's spectrum.
  4. Mutate the source amplitude using the target amplitude and the beta parameter.
  5. Combine the mutated amplitude with the original phase.
  6. Perform the inverse Fourier Transform to obtain the adapted channel.

The low_freq_mutate function (not shown here) is responsible for the actual amplitude mutation, focusing on low-frequency components which carry style information.
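
Since low_freq_mutate itself is not shown here, the sketch below illustrates the idea behind step 4 in the spirit of the FDA paper: a centered (low-frequency) square of the source amplitude spectrum, whose side is controlled by beta, is replaced with the corresponding region of the target amplitude. low_freq_swap is a hypothetical stand-in and may differ in details from the library's actual implementation.

Python
import numpy as np

def low_freq_swap(amp_src: np.ndarray, amp_trg: np.ndarray, beta: float) -> np.ndarray:
    """Replace the centered low-frequency block of the source amplitude with the target's."""
    height, width = amp_src.shape[:2]
    border = int(np.floor(min(height, width) * beta))
    if border == 0:
        return amp_src
    center_y, center_x = height // 2, width // 2
    y1, y2 = center_y - border, center_y + border
    x1, x2 = center_x - border, center_x + border
    out = amp_src.copy()
    out[y1:y2, x1:x2] = amp_trg[y1:y2, x1:x2]
    return out

amp_a = np.abs(np.fft.fftshift(np.fft.fft2(np.random.rand(64, 64))))
amp_b = np.abs(np.fft.fftshift(np.fft.fft2(np.random.rand(64, 64))))
print(low_freq_swap(amp_a, amp_b, beta=0.1).shape)  # (64, 64)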

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n>>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n>>> assert adapted_img.shape == source_img.shape\n

References

  • \"FDA: Fourier Domain Adaptation for Semantic Segmentation\" (Yang and Soatto, 2020, CVPR) https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf
Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@clipped\n@preserve_channel_dim\ndef fourier_domain_adaptation(img: np.ndarray, target_img: np.ndarray, beta: float) -> np.ndarray:\n    \"\"\"Apply Fourier Domain Adaptation to the input image using a target image.\n\n    This function performs domain adaptation in the frequency domain by modifying the amplitude\n    spectrum of the source image based on the target image's amplitude spectrum. It preserves\n    the phase information of the source image, which helps maintain its content while adapting\n    its style to match the target image.\n\n    Args:\n        img (np.ndarray): The source image to be adapted. Can be grayscale or RGB.\n        target_img (np.ndarray): The target image used as a reference for adaptation.\n            Should have the same dimensions as the source image.\n        beta (float): The adaptation strength, typically in the range [0, 1].\n            Higher values result in stronger adaptation towards the target image's style.\n\n    Returns:\n        np.ndarray: The adapted image with the same shape and type as the input image.\n\n    Raises:\n        ValueError: If the source and target images have different shapes.\n\n    Note:\n        - Both input images are converted to float32 for processing.\n        - The function handles both grayscale (2D) and color (3D) images.\n        - For grayscale images, an extra dimension is added to facilitate uniform processing.\n        - The adaptation is performed channel-wise for color images.\n        - The output is clipped to the valid range and preserves the original number of channels.\n\n    The adaptation process involves the following steps for each channel:\n    1. Compute the 2D Fourier Transform of both source and target images.\n    2. Shift the zero frequency component to the center of the spectrum.\n    3. Extract amplitude and phase information from the source image's spectrum.\n    4. Mutate the source amplitude using the target amplitude and the beta parameter.\n    5. Combine the mutated amplitude with the original phase.\n    6. 
Perform the inverse Fourier Transform to obtain the adapted channel.\n\n    The `low_freq_mutate` function (not shown here) is responsible for the actual\n    amplitude mutation, focusing on low-frequency components which carry style information.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> source_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> target_img = np.random.rand(100, 100, 3).astype(np.float32)\n        >>> adapted_img = A.fourier_domain_adaptation(source_img, target_img, beta=0.5)\n        >>> assert adapted_img.shape == source_img.shape\n\n    References:\n        - \"FDA: Fourier Domain Adaptation for Semantic Segmentation\"\n          (Yang and Soatto, 2020, CVPR)\n          https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n    \"\"\"\n    src_img = img.astype(np.float32)\n    trg_img = target_img.astype(np.float32)\n\n    if src_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        src_img = np.expand_dims(src_img, axis=-1)\n    if trg_img.ndim == MONO_CHANNEL_DIMENSIONS:\n        trg_img = np.expand_dims(trg_img, axis=-1)\n\n    num_channels = src_img.shape[-1]\n\n    # Prepare container for the output image\n    src_in_trg = np.zeros_like(src_img)\n\n    for channel_id in range(num_channels):\n        # Perform FFT on each channel\n        fft_src = np.fft.fft2(src_img[:, :, channel_id])\n        fft_trg = np.fft.fft2(trg_img[:, :, channel_id])\n\n        # Shift the zero frequency component to the center\n        fft_src_shifted = np.fft.fftshift(fft_src)\n        fft_trg_shifted = np.fft.fftshift(fft_trg)\n\n        # Extract amplitude and phase\n        amp_src, pha_src = np.abs(fft_src_shifted), np.angle(fft_src_shifted)\n        amp_trg = np.abs(fft_trg_shifted)\n\n        # Mutate the amplitude part of the source with the target\n        mutated_amp = low_freq_mutate(amp_src.copy(), amp_trg, beta)\n\n        # Combine the mutated amplitude with the original phase\n        fft_src_mutated = np.fft.ifftshift(mutated_amp * np.exp(1j * pha_src))\n\n        # Perform inverse FFT\n        src_in_trg_channel = np.fft.ifft2(fft_src_mutated)\n\n        # Store the result in the corresponding channel of the output image\n        src_in_trg[:, :, channel_id] = np.real(src_in_trg_channel)\n\n    return src_in_trg\n
"},{"location":"api_reference/augmentations/domain_adaptation/functional/#albumentations.augmentations.domain_adaptation.functional.match_histograms","title":"def match_histograms (image, reference) [view source on GitHub]","text":"

Adjust an image so that its cumulative histogram matches that of another.

The adjustment is applied separately for each channel.
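
The per-channel matching relies on a private cumulative-CDF helper that is not rendered on this page. A standard way to implement that step is shown below, assuming a single channel; match_cumulative_cdf_1ch is an illustrative name, and the library's own helper may differ in details such as how values are quantized back to uint8.

Python
import numpy as np

def match_cumulative_cdf_1ch(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Map source intensities onto reference intensities with the same empirical quantile."""
    src_values, src_indices, src_counts = np.unique(source.ravel(), return_inverse=True, return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)

    # Empirical CDFs of both images
    src_quantiles = np.cumsum(src_counts) / source.size
    ref_quantiles = np.cumsum(ref_counts) / reference.size

    # Interpolate each source quantile onto the reference intensity scale
    interpolated = np.interp(src_quantiles, ref_quantiles, ref_values)
    return interpolated[src_indices].reshape(source.shape)

src = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
ref = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
print(match_cumulative_cdf_1ch(src, ref).shape)  # (32, 32)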

Parameters:

  • image (np.ndarray): Input image. Can be grayscale or in color.
  • reference (np.ndarray): Image whose histogram is to be matched. Must have the same number of channels as image.
  • channel_axis: If None, the image is assumed to be a grayscale (single-channel) image. Otherwise, this parameter indicates which axis of the array corresponds to channels.

Returns:

  • np.ndarray: Transformed input image.

Exceptions:

  • ValueError: Raised when the number of channels in the input image and the reference differ.

Source code in albumentations/augmentations/domain_adaptation/functional.py Python
@uint8_io\n@preserve_channel_dim\ndef match_histograms(image: np.ndarray, reference: np.ndarray) -> np.ndarray:\n    \"\"\"Adjust an image so that its cumulative histogram matches that of another.\n\n    The adjustment is applied separately for each channel.\n\n    Args:\n        image: Input image. Can be gray-scale or in color.\n        reference: Image to match histogram of. Must have the same number of channels as image.\n        channel_axis: If None, the image is assumed to be a grayscale (single channel) image.\n            Otherwise, this parameter indicates which axis of the array corresponds to channels.\n\n    Returns:\n        np.ndarray: Transformed input image.\n\n    Raises:\n        ValueError: Thrown when the number of channels in the input image and the reference differ.\n    \"\"\"\n    if reference.dtype != np.uint8:\n        reference = from_float(reference, np.uint8)\n\n    if image.ndim != reference.ndim:\n        raise ValueError(\"Image and reference must have the same number of dimensions.\")\n\n    # Expand dimensions for grayscale images\n    if image.ndim == 2:\n        image = np.expand_dims(image, axis=-1)\n    if reference.ndim == 2:\n        reference = np.expand_dims(reference, axis=-1)\n\n    matched = np.empty(image.shape, dtype=np.uint8)\n\n    num_channels = image.shape[-1]\n\n    for channel in range(num_channels):\n        matched_channel = _match_cumulative_cdf(image[..., channel], reference[..., channel]).astype(np.uint8)\n        matched[..., channel] = matched_channel\n\n    return matched\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/","title":"Domain Adaptation transforms (augmentations.domain_adaptation.transforms)","text":""},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.FDA","title":"class FDA (reference_images, beta_limit=(0, 0.1), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source and target datasets, effectively adapting images from one domain to closely resemble those from another without altering their semantic content.

This transform is particularly beneficial in scenarios where the training (source) and testing (target) images come from different distributions, such as synthetic versus real images, or day versus night scenes. Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain alignment by swapping low-frequency components of the Fourier transform between the source and target images. This technique has been shown to improve the performance of models on the target domain, particularly for tasks like semantic segmentation, without additional training for domain invariance.

The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more of the original image's characteristics and higher values leading to more pronounced adaptation effects. It is recommended to use beta values less than 0.3 to avoid introducing artifacts.
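As a rough intuition for how beta scales the adaptation, the sketch below assumes the common FDA convention that the swapped low-frequency window is a centered square whose side is proportional to beta times the smaller image dimension (the exact behaviour depends on the internal low_freq_mutate helper):

Python
>>> import numpy as np
>>> height, width = 256, 256
>>> for beta in (0.01, 0.1, 0.3):
...     side = int(np.floor(min(height, width) * beta))  # assumed side of the swapped low-frequency square
...     print(beta, side)
0.01 2
0.1 25
0.3 76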

Parameters:

Name Type Description

reference_images Sequence[Any]

Sequence of objects to be converted into images by read_fn. This typically involves paths to images that serve as target domain examples for adaptation.

beta_limit tuple[float, float] | float

Coefficient beta from the paper, controlling the extent of frequency-component swapping. If a single value is provided, beta will be sampled from the uniform distribution [0, beta_limit]. Values should be less than 0.5.

read_fn Callable

User-defined function for reading images. It takes an element from reference_images and returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a numpy array.

Targets

image

Image types: uint8, float32

Reference

  • https://github.com/YanchaoYang/FDA
  • https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n>>> result = aug(image=image)\n

Note

FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target domain samples are unavailable. It enables significant improvements in model generalization by aligning the low-level statistics of source and target images through a simple yet effective Fourier-based method.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class FDA(ImageOnlyTransform):\n    \"\"\"Fourier Domain Adaptation (FDA) for simple \"style transfer\" in the context of unsupervised domain adaptation\n    (UDA). FDA manipulates the frequency components of images to reduce the domain gap between source\n    and target datasets, effectively adapting images from one domain to closely resemble those from another without\n    altering their semantic content.\n\n    This transform is particularly beneficial in scenarios where the training (source) and testing (target) images\n    come from different distributions, such as synthetic versus real images, or day versus night scenes.\n    Unlike traditional domain adaptation methods that may require complex adversarial training, FDA achieves domain\n    alignment by swapping low-frequency components of the Fourier transform between the source and target images.\n    This technique has shown to improve the performance of models on the target domain, particularly for tasks\n    like semantic segmentation, without additional training for domain invariance.\n\n    The 'beta_limit' parameter controls the extent of frequency component swapping, with lower values preserving more\n    of the original image's characteristics and higher values leading to more pronounced adaptation effects.\n    It is recommended to use beta values less than 0.3 to avoid introducing artifacts.\n\n    Args:\n        reference_images (Sequence[Any]): Sequence of objects to be converted into images by `read_fn`. This typically\n            involves paths to images that serve as target domain examples for adaptation.\n        beta_limit (tuple[float, float] | float): Coefficient beta from the paper, controlling the swapping extent of\n            frequency components. If one value is provided beta will be sampled from uniform\n            distribution [0, beta_limit]. Values should be less than 0.5.\n        read_fn (Callable): User-defined function for reading images. It takes an element from `reference_images` and\n            returns a numpy array of image pixels. By default, it is expected to take a path to an image and return a\n            numpy array.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        - https://github.com/YanchaoYang/FDA\n        - https://openaccess.thecvf.com/content_CVPR_2020/papers/Yang_FDA_Fourier_Domain_Adaptation_for_Semantic_Segmentation_CVPR_2020_paper.pdf\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> target_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> aug = A.Compose([A.FDA([target_image], p=1, read_fn=lambda x: x)])\n        >>> result = aug(image=image)\n\n    Note:\n        FDA is a powerful tool for domain adaptation, particularly in unsupervised settings where annotated target\n        domain samples are unavailable. 
It enables significant improvements in model generalization by aligning\n        the low-level statistics of source and target images through a simple yet effective Fourier-based method.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        read_fn: Callable[[Any], np.ndarray]\n        beta_limit: ZeroOneRangeType\n\n        @field_validator(\"beta_limit\")\n        @classmethod\n        def check_ranges(cls, value: tuple[float, float]) -> tuple[float, float]:\n            bounds = 0, MAX_BETA_LIMIT\n            if not bounds[0] <= value[0] <= value[1] <= bounds[1]:\n                raise ValueError(f\"Values should be in the range {bounds} got {value} \")\n            return value\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        beta_limit: ScaleFloatType = (0, 0.1),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.beta_limit = cast(tuple[float, float], beta_limit)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        target_image: np.ndarray,\n        beta: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fourier_domain_adaptation(img, target_image, beta)\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, np.ndarray]:\n        height, width = params[\"shape\"][:2]\n        target_img = self.read_fn(self.py_random.choice(self.reference_images))\n        target_img = cv2.resize(target_img, dsize=(width, height))\n\n        return {\"target_image\": target_img, \"beta\": self.py_random.uniform(*self.beta_limit)}\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str]:\n        return \"reference_images\", \"beta_limit\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"FDA can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.HistogramMatching","title":"class HistogramMatching (reference_images, blend_ratio=(0.5, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, p=0.5, always_apply=None) [view source on GitHub]","text":"

Adjust the pixel values of an input image to match the histogram of a reference image.

This transform applies histogram matching, a technique that modifies the distribution of pixel intensities in the input image to closely resemble that of a reference image. This process is performed independently for each channel in multi-channel images, provided both the input and reference images have the same number of channels.

Histogram matching is particularly useful for:

  • Normalizing images from different sources or captured under varying conditions.
  • Preparing images for feature matching or other computer vision tasks where consistent tone and contrast are important.
  • Simulating different lighting or camera conditions in a controlled manner.

Parameters:

Name Type Description

reference_images Sequence[Any]

A sequence of reference image sources. These can be file paths, URLs, or any objects that can be converted to images by the read_fn.

blend_ratio tuple[float, float]

Range for the blending factor between the original and the matched image. Must be two floats between 0 and 1, where:

  • 0 means no blending (the original image is returned)
  • 1 means full histogram matching

A random value within this range is chosen for each application. Default: (0.5, 1.0)

read_fn Callable[[Any], np.ndarray]

A function that takes an element from reference_images and returns a numpy array representing the image. Default: read_rgb_image (reads image file from disk)

p float

Probability of applying the transform. Default: 0.5
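To make the blend_ratio parameter above concrete, here is a minimal numeric sketch of a linear blend between the original and the histogram-matched image (an illustration of the documented behaviour, not the library's internal code):

Python
>>> import numpy as np
>>> original = np.full((2, 2), 100.0, dtype=np.float32)
>>> matched = np.full((2, 2), 200.0, dtype=np.float32)
>>> blend_ratio = 0.75
>>> blended = blend_ratio * matched + (1 - blend_ratio) * original
>>> float(blended[0, 0])
175.0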

Targets

image

Image types: uint8, float32

Note

  • This transform cannot be directly serialized due to its dependency on external image data.
  • The effectiveness of the matching depends on the similarity between the input and reference images.
  • For best results, choose reference images that represent the desired tone and contrast.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.HistogramMatching(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     read_fn=lambda x: x,\n...     p=1\n... )\n>>> result = transform(image=image)\n>>> matched_image = result[\"image\"]\n

References

  • Histogram Matching in scikit-image: https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class HistogramMatching(ImageOnlyTransform):\n    \"\"\"Adjust the pixel values of an input image to match the histogram of a reference image.\n\n    This transform applies histogram matching, a technique that modifies the distribution of pixel\n    intensities in the input image to closely resemble that of a reference image. This process is\n    performed independently for each channel in multi-channel images, provided both the input and\n    reference images have the same number of channels.\n\n    Histogram matching is particularly useful for:\n    - Normalizing images from different sources or captured under varying conditions.\n    - Preparing images for feature matching or other computer vision tasks where consistent\n      tone and contrast are important.\n    - Simulating different lighting or camera conditions in a controlled manner.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of reference image sources. These can be\n            file paths, URLs, or any objects that can be converted to images by the `read_fn`.\n        blend_ratio (tuple[float, float]): Range for the blending factor between the original\n            and the matched image. Must be two floats between 0 and 1, where:\n            - 0 means no blending (original image is returned)\n            - 1 means full histogram matching\n            A random value within this range is chosen for each application.\n            Default: (0.5, 1.0)\n        read_fn (Callable[[Any], np.ndarray]): A function that takes an element from\n            `reference_images` and returns a numpy array representing the image.\n            Default: read_rgb_image (reads image file from disk)\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform cannot be directly serialized due to its dependency on external image data.\n        - The effectiveness of the matching depends on the similarity between the input and reference images.\n        - For best results, choose reference images that represent the desired tone and contrast.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.HistogramMatching(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     read_fn=lambda x: x,\n        ...     p=1\n        ... 
)\n        >>> result = transform(image=image)\n        >>> matched_image = result[\"image\"]\n\n    References:\n        - Histogram Matching in scikit-image:\n          https://scikit-image.org/docs/dev/auto_examples/color_exposure/plot_histogram_matching.html\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.5, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n\n    def apply(\n        self: np.ndarray,\n        img: np.ndarray,\n        reference_image: np.ndarray,\n        blend_ratio: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return apply_histogram(img, reference_image, blend_ratio)\n\n    def get_params(self) -> dict[str, np.ndarray]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"HistogramMatching can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.PixelDistributionAdaptation","title":"class PixelDistributionAdaptation (reference_images, blend_ratio=(0.25, 1.0), read_fn=<function read_rgb_image at 0x7fcff8b62f20>, transform_type='pca', p=0.5, always_apply=None) [view source on GitHub]","text":"

Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image with that of a reference image. This process involves fitting a simple statistical transformation (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images, transforming the original image with the transformation trained on it, and then applying the inverse transformation using the transform fitted on the reference image. The result is an adapted image that retains the original content while mimicking the pixel value distribution of the reference domain.

The process can be visualized as two main steps:

  1. Adjusting the original image to a standard distribution space using a selected transform.
  2. Moving the adjusted image into the distribution space of the reference image by applying the inverse of the transform fitted on the reference image.

This technique is especially useful in scenarios where images from different domains (e.g., synthetic vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in image processing tasks.
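The two-step process above can be sketched in plain NumPy for the \"standard\" case (an illustration of the idea only, not the transform's internal implementation):

Python
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> src = rng.random((100, 100, 3)).reshape(-1, 3)                # source pixels
>>> ref = rng.random((100, 100, 3)).reshape(-1, 3) * 0.5 + 0.25   # reference pixels, different distribution
>>> standardized = (src - src.mean(axis=0)) / src.std(axis=0)     # step 1: source -> standard space
>>> adapted = standardized * ref.std(axis=0) + ref.mean(axis=0)   # step 2: standard space -> reference space
>>> bool(np.allclose(adapted.mean(axis=0), ref.mean(axis=0)))
True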

Parameters:

Name Type Description

reference_images Sequence[Any]

A sequence of objects (typically image paths) that will be converted into images by read_fn. These images serve as references for the domain adaptation.

blend_ratio tuple[float, float]

Specifies the minimum and maximum blend ratio for mixing the adapted image with the original. This enhances the diversity of the output images. Values should be in the range [0, 1]. Default: (0.25, 1.0)

read_fn Callable

A user-defined function for reading and converting the objects in reference_images into numpy arrays. By default, it assumes these objects are image paths.

transform_type Literal[\"pca\", \"standard\", \"minmax\"]

Specifies the type of statistical transformation to apply:

  • \"pca\": Principal Component Analysis
  • \"standard\": StandardScaler (zero mean and unit variance)
  • \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])

Default: \"pca\"

p float

The probability of applying the transform to any given image. Default: 0.5

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The effectiveness of the adaptation depends on the similarity between the input and reference domains.
  • PCA transformation may alter color relationships more significantly than other methods.
  • StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.
  • The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.
  • This transform cannot be directly serialized due to its dependency on external image data.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n>>> transform = A.PixelDistributionAdaptation(\n...     reference_images=[reference_image],\n...     blend_ratio=(0.5, 1.0),\n...     transform_type=\"standard\",\n...     read_fn=lambda x: x,\n...     p=1.0\n... )\n>>> result = transform(image=image)\n>>> adapted_image = result[\"image\"]\n

References

  • https://github.com/arsenyinfo/qudida
  • https://arxiv.org/abs/1911.11483

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class PixelDistributionAdaptation(ImageOnlyTransform):\n    \"\"\"Performs pixel-level domain adaptation by aligning the pixel value distribution of an input image\n    with that of a reference image. This process involves fitting a simple statistical transformation\n    (such as PCA, StandardScaler, or MinMaxScaler) to both the original and the reference images,\n    transforming the original image with the transformation trained on it, and then applying the inverse\n    transformation using the transform fitted on the reference image. The result is an adapted image\n    that retains the original content while mimicking the pixel value distribution of the reference domain.\n\n    The process can be visualized as two main steps:\n    1. Adjusting the original image to a standard distribution space using a selected transform.\n    2. Moving the adjusted image into the distribution space of the reference image by applying the inverse\n       of the transform fitted on the reference image.\n\n    This technique is especially useful in scenarios where images from different domains (e.g., synthetic\n    vs. real images, day vs. night scenes) need to be harmonized for better consistency or performance in\n    image processing tasks.\n\n    Args:\n        reference_images (Sequence[Any]): A sequence of objects (typically image paths) that will be\n            converted into images by `read_fn`. These images serve as references for the domain adaptation.\n        blend_ratio (tuple[float, float]): Specifies the minimum and maximum blend ratio for mixing\n            the adapted image with the original. This enhances the diversity of the output images.\n            Values should be in the range [0, 1]. Default: (0.25, 1.0)\n        read_fn (Callable): A user-defined function for reading and converting the objects in\n            `reference_images` into numpy arrays. By default, it assumes these objects are image paths.\n        transform_type (Literal[\"pca\", \"standard\", \"minmax\"]): Specifies the type of statistical\n            transformation to apply.\n            - \"pca\": Principal Component Analysis\n            - \"standard\": StandardScaler (zero mean and unit variance)\n            - \"minmax\": MinMaxScaler (scales to a fixed range, usually [0, 1])\n            Default: \"pca\"\n        p (float): The probability of applying the transform to any given image. Default: 0.5\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The effectiveness of the adaptation depends on the similarity between the input and reference domains.\n        - PCA transformation may alter color relationships more significantly than other methods.\n        - StandardScaler and MinMaxScaler preserve color relationships better but may provide less dramatic adaptations.\n        - The blend_ratio parameter allows for a smooth transition between the original and fully adapted image.\n        - This transform cannot be directly serialized due to its dependency on external image data.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> reference_image = np.random.randint(0, 256, [100, 100, 3], dtype=np.uint8)\n        >>> transform = A.PixelDistributionAdaptation(\n        ...     reference_images=[reference_image],\n        ...     blend_ratio=(0.5, 1.0),\n        ...     
transform_type=\"standard\",\n        ...     read_fn=lambda x: x,\n        ...     p=1.0\n        ... )\n        >>> result = transform(image=image)\n        >>> adapted_image = result[\"image\"]\n\n    References:\n        - https://github.com/arsenyinfo/qudida\n        - https://arxiv.org/abs/1911.11483\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        reference_images: Sequence[Any]\n        blend_ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, 1)),\n        ]\n        read_fn: Callable[[Any], np.ndarray]\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"]\n\n    def __init__(\n        self,\n        reference_images: Sequence[Any],\n        blend_ratio: tuple[float, float] = (0.25, 1.0),\n        read_fn: Callable[[Any], np.ndarray] = read_rgb_image,\n        transform_type: Literal[\"pca\", \"standard\", \"minmax\"] = \"pca\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.reference_images = reference_images\n        self.read_fn = read_fn\n        self.blend_ratio = blend_ratio\n        self.transform_type = transform_type\n\n    def apply(self, img: np.ndarray, reference_image: np.ndarray, blend_ratio: float, **params: Any) -> np.ndarray:\n        return adapt_pixel_distribution(\n            img,\n            ref=reference_image,\n            weight=blend_ratio,\n            transform_type=self.transform_type,\n        )\n\n    def get_params(self) -> dict[str, Any]:\n        return {\n            \"reference_image\": self.read_fn(self.py_random.choice(self.reference_images)),\n            \"blend_ratio\": self.py_random.uniform(*self.blend_ratio),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, str, str, str]:\n        return \"reference_images\", \"blend_ratio\", \"read_fn\", \"transform_type\"\n\n    def to_dict_private(self) -> dict[str, Any]:\n        msg = \"PixelDistributionAdaptation can not be serialized.\"\n        raise NotImplementedError(msg)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms.TemplateTransform","title":"class TemplateTransform (templates, img_weight=(0.5, 0.5), template_weight=None, template_transform=None, name=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply blending of input image with specified templates.

This transform overlays one or more template images onto the input image using alpha blending. It allows for creating complex composite images or simulating various visual effects.

Parameters:

Name Type Description

templates numpy array | list[np.ndarray]

Images to use as templates for the transform. If a single numpy array is provided, it will be used as the only template. If a list of numpy arrays is provided, one will be randomly chosen for each application.

img_weight tuple[float, float] | float

Weight of the original image in the blend. If a single float, that value will always be used. If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application. To use a fixed weight, use (weight, weight). Default: (0.5, 0.5).

template_transform A.Compose | None

A composition of Albumentations transforms to apply to the template before blending. This should be an instance of A.Compose containing one or more Albumentations transforms. Default: None.

name str | None

Name of the transform instance. Used for serialization purposes. Default: None.

p float

Probability of applying the transform. Default: 0.5.

Targets

image

Image types: uint8, float32

Number of channels: Any

Note

  • The template(s) must have the same number of channels as the input image or be single-channel.
  • If a single-channel template is used with a multi-channel image, the template will be replicated across all channels.
  • The template(s) will be resized to match the input image size if they differ.
  • To make this transform serializable, provide a name when initializing it.

Mathematical Formulation: Given:

  • I: Input image
  • T: Template image
  • w_i: Weight of the input image (sampled from img_weight)

The blended image B is computed as:

B = w_i * I + (1 - w_i) * T
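A quick numeric check of this formula (illustrative values only):

Python
>>> import numpy as np
>>> input_image = np.full((2, 2), 200.0, dtype=np.float32)  # I
>>> template = np.full((2, 2), 100.0, dtype=np.float32)     # T
>>> img_weight = 0.6                                        # w_i
>>> blended = img_weight * input_image + (1 - img_weight) * template
>>> float(blended[0, 0])
160.0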

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-a-single-template","title":"Apply template transform with a single template","text":"Python
>>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-multiple-templates-and-custom-weights","title":"Apply template transform with multiple templates and custom weights","text":"Python
>>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n>>> transform = A.TemplateTransform(\n...     templates=templates,\n...     img_weight=(0.3, 0.7),\n...     name=\"multi_template_transform\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n
"},{"location":"api_reference/augmentations/domain_adaptation/transforms/#albumentations.augmentations.domain_adaptation.transforms--apply-template-transform-with-additional-transforms-on-the-template","title":"Apply template transform with additional transforms on the template","text":"Python
>>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n>>> transform = A.TemplateTransform(\n...     templates=template,\n...     img_weight=0.6,\n...     template_transform=template_transform,\n...     name=\"transformed_template\",\n...     p=1.0\n... )\n>>> blended_image = transform(image=image)['image']\n
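A further sketch, following the Note above, that uses a single-channel template with an RGB input (illustrative; the template is replicated across the image channels):

Python
>>> gray_template = np.random.randint(0, 256, (100, 100), dtype=np.uint8)
>>> transform = A.TemplateTransform(templates=gray_template, name='gray_template', p=1.0)
>>> blended_image = transform(image=image)['image']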

References

  • Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing
  • Image blending: https://en.wikipedia.org/wiki/Image_blending

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/domain_adaptation/transforms.py Python
class TemplateTransform(ImageOnlyTransform):\n    \"\"\"Apply blending of input image with specified templates.\n\n    This transform overlays one or more template images onto the input image using alpha blending.\n    It allows for creating complex composite images or simulating various visual effects.\n\n    Args:\n        templates (numpy array | list[np.ndarray]): Images to use as templates for the transform.\n            If a single numpy array is provided, it will be used as the only template.\n            If a list of numpy arrays is provided, one will be randomly chosen for each application.\n\n        img_weight (tuple[float, float]  | float): Weight of the original image in the blend.\n            If a single float, that value will always be used.\n            If a tuple (min, max), the weight will be randomly sampled from the range [min, max) for each application.\n            To use a fixed weight, use (weight, weight).\n            Default: (0.5, 0.5).\n\n        template_transform (A.Compose | None): A composition of Albumentations transforms to apply to the template\n            before blending.\n            This should be an instance of A.Compose containing one or more Albumentations transforms.\n            Default: None.\n\n        name (str | None): Name of the transform instance. Used for serialization purposes.\n            Default: None.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image\n\n    Image types:\n        uint8, float32\n\n    Number of channels:\n        Any\n\n    Note:\n        - The template(s) must have the same number of channels as the input image or be single-channel.\n        - If a single-channel template is used with a multi-channel image, the template will be replicated across\n          all channels.\n        - The template(s) will be resized to match the input image size if they differ.\n        - To make this transform serializable, provide a name when initializing it.\n\n    Mathematical Formulation:\n        Given:\n        - I: Input image\n        - T: Template image\n        - w_i: Weight of input image (sampled from img_weight)\n\n        The blended image B is computed as:\n\n        B = w_i * I + (1 - w_i) * T\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> template = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n\n        # Apply template transform with a single template\n        >>> transform = A.TemplateTransform(templates=template, name=\"my_template_transform\", p=1.0)\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with multiple templates and custom weights\n        >>> templates = [np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8) for _ in range(3)]\n        >>> transform = A.TemplateTransform(\n        ...     templates=templates,\n        ...     img_weight=(0.3, 0.7),\n        ...     name=\"multi_template_transform\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n        # Apply template transform with additional transforms on the template\n        >>> template_transform = A.Compose([A.RandomBrightnessContrast(p=1)])\n        >>> transform = A.TemplateTransform(\n        ...     templates=template,\n        ...     img_weight=0.6,\n        ...     template_transform=template_transform,\n        ...     
name=\"transformed_template\",\n        ...     p=1.0\n        ... )\n        >>> blended_image = transform(image=image)['image']\n\n    References:\n        - Alpha compositing: https://en.wikipedia.org/wiki/Alpha_compositing\n        - Image blending: https://en.wikipedia.org/wiki/Image_blending\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        templates: np.ndarray | Sequence[np.ndarray]\n        img_weight: ZeroOneRangeType\n        template_weight: ZeroOneRangeType | None = Field(\n            deprecated=\"Template_weight is deprecated. Computed automatically as (1 - img_weight)\",\n        )\n        template_transform: Compose | BasicTransform | None = None\n        name: str | None\n\n        @field_validator(\"templates\")\n        @classmethod\n        def validate_templates(cls, v: np.ndarray | list[np.ndarray]) -> list[np.ndarray]:\n            if isinstance(v, np.ndarray):\n                return [v]\n            if isinstance(v, list):\n                if not all(isinstance(item, np.ndarray) for item in v):\n                    msg = \"All templates must be numpy arrays.\"\n                    raise ValueError(msg)\n                return v\n            msg = \"Templates must be a numpy array or a list of numpy arrays.\"\n            raise TypeError(msg)\n\n    def __init__(\n        self,\n        templates: np.ndarray | list[np.ndarray],\n        img_weight: ScaleFloatType = (0.5, 0.5),\n        template_weight: None = None,\n        template_transform: Compose | BasicTransform | None = None,\n        name: str | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.templates = templates\n        self.img_weight = cast(tuple[float, float], img_weight)\n        self.template_transform = template_transform\n        self.name = name\n\n    def apply(\n        self,\n        img: np.ndarray,\n        template: np.ndarray,\n        img_weight: float,\n        **params: Any,\n    ) -> np.ndarray:\n        if img_weight == 0:\n            return template\n        if img_weight == 1:\n            return img\n\n        return add_weighted(img, img_weight, template, 1 - img_weight)\n\n    def get_params(self) -> dict[str, float]:\n        return {\n            \"img_weight\": self.py_random.uniform(*self.img_weight),\n        }\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        template = self.py_random.choice(self.templates)\n\n        if self.template_transform is not None:\n            template = self.template_transform(image=template)[\"image\"]\n\n        if get_num_channels(template) not in [1, get_num_channels(image)]:\n            msg = (\n                \"Template must be a single channel or \"\n                \"has the same number of channels as input \"\n                f\"image ({get_num_channels(image)}), got {get_num_channels(template)}\"\n            )\n            raise ValueError(msg)\n\n        if template.dtype != image.dtype:\n            msg = \"Image and template must be the same image type\"\n            raise ValueError(msg)\n\n        if image.shape[:2] != template.shape[:2]:\n            template = fgeometric.resize(template, image.shape[:2], interpolation=cv2.INTER_AREA)\n\n        if get_num_channels(template) == 1 and get_num_channels(image) > 1:\n            # Replicate single 
channel template across all channels to match input image\n            template = cv2.merge([template] * get_num_channels(image))\n        # in order to support grayscale image with dummy dim\n        template = template.reshape(image.shape)\n\n        return {\"template\": template}\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    def to_dict_private(self) -> dict[str, Any]:\n        if self.name is None:\n            msg = (\n                \"To make a TemplateTransform serializable you should provide the `name` argument, \"\n                \"e.g. `TemplateTransform(name='my_transform', ...)`.\"\n            )\n            raise ValueError(msg)\n        return {\"__class_fullname__\": self.get_class_fullname(), \"__name__\": self.name}\n
"},{"location":"api_reference/augmentations/dropout/","title":"Index","text":"
  • ChannelDropout augmentation (albumentations.augmentations.dropout.channel_dropout)
  • CoarseDropout augmentation (albumentations.augmentations.dropout.coarse_dropout)
  • GridDropout augmentation (albumentations.augmentations.dropout.grid_dropout)
  • MaskDropout augmentation (albumentations.augmentations.dropout.mask_dropout)
"},{"location":"api_reference/augmentations/dropout/channel_dropout/","title":"ChannelDropout augmentation (augmentations.dropout.channel_dropout)","text":""},{"location":"api_reference/augmentations/dropout/channel_dropout/#albumentations.augmentations.dropout.channel_dropout.ChannelDropout","title":"class ChannelDropout (channel_drop_range=(1, 1), fill_value=None, fill=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly drop channels in the input image.

This transform randomly selects a number of channels to drop from the input image and replaces them with a specified fill value. This can improve model robustness to missing or corrupted channels.

The technique is conceptually similar to:

  • Dropout layers in neural networks, which randomly set input units to 0 during training.
  • CoarseDropout augmentation, which drops out regions in the spatial dimensions of the image.

However, ChannelDropout operates on the channel dimension, effectively \"dropping out\" entire color channels or feature maps.
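A minimal sketch of this idea on multi-band imagery (the channel count and fill value here are illustrative):

Python
>>> import numpy as np
>>> import albumentations as A
>>> multispectral = np.random.randint(0, 256, (64, 64, 6), dtype=np.uint8)  # e.g. a 6-band image
>>> transform = A.ChannelDropout(channel_drop_range=(1, 3), fill=0, p=1.0)
>>> dropped = transform(image=multispectral)['image']
>>> assert dropped.shape == multispectral.shape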

Parameters:

Name Type Description

channel_drop_range tuple[int, int]

Range from which to choose the number of channels to drop. The actual number will be randomly selected from the inclusive range [min, max]. Default: (1, 1).

fill float

Pixel value used to fill the dropped channels. Default: 0.

p float

Probability of applying the transform. Must be in the range [0, 1]. Default: 0.5.

Exceptions:

Type Description

NotImplementedError

If the input image has only one channel.

ValueError

If the upper bound of channel_drop_range is greater than or equal to the number of channels in the input image.

Targets

image, volume

Image types: uint8, float32

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.ChannelDropout(channel_drop_range=(1, 2), fill=128, p=1.0)\n>>> result = transform(image=image)\n>>> dropped_image = result['image']\n>>> assert dropped_image.shape == image.shape\n>>> assert np.any(dropped_image != image)  # Some channels should be different\n

Note

  • The number of channels to drop is randomly chosen within the specified range.
  • Channels are randomly selected for dropping.
  • This transform is not applicable to single-channel (grayscale) images.
  • The transform will raise an error if it's not possible to drop the specified number of channels (e.g., trying to drop 3 channels from an RGB image).
  • This augmentation can be particularly useful for training models to be robust against missing or corrupted channel data in multi-spectral or hyperspectral imagery.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/dropout/channel_dropout.py Python
class ChannelDropout(ImageOnlyTransform):\n    \"\"\"Randomly drop channels in the input image.\n\n    This transform randomly selects a number of channels to drop from the input image\n    and replaces them with a specified fill value. This can improve model robustness\n    to missing or corrupted channels.\n\n    The technique is conceptually similar to:\n    - Dropout layers in neural networks, which randomly set input units to 0 during training.\n    - CoarseDropout augmentation, which drops out regions in the spatial dimensions of the image.\n\n    However, ChannelDropout operates on the channel dimension, effectively \"dropping out\"\n    entire color channels or feature maps.\n\n    Args:\n        channel_drop_range (tuple[int, int]): Range from which to choose the number\n            of channels to drop. The actual number will be randomly selected from\n            the inclusive range [min, max]. Default: (1, 1).\n        fill (float): Pixel value used to fill the dropped channels.\n            Default: 0.\n        p (float): Probability of applying the transform. Must be in the range\n            [0, 1]. Default: 0.5.\n\n    Raises:\n        NotImplementedError: If the input image has only one channel.\n        ValueError: If the upper bound of channel_drop_range is greater than or\n            equal to the number of channels in the input image.\n\n    Targets:\n        image, volume\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.ChannelDropout(channel_drop_range=(1, 2), fill=128, p=1.0)\n        >>> result = transform(image=image)\n        >>> dropped_image = result['image']\n        >>> assert dropped_image.shape == image.shape\n        >>> assert np.any(dropped_image != image)  # Some channels should be different\n\n    Note:\n        - The number of channels to drop is randomly chosen within the specified range.\n        - Channels are randomly selected for dropping.\n        - This transform is not applicable to single-channel (grayscale) images.\n        - The transform will raise an error if it's not possible to drop the specified\n          number of channels (e.g., trying to drop 3 channels from an RGB image).\n        - This augmentation can be particularly useful for training models to be robust\n          against missing or corrupted channel data in multi-spectral or hyperspectral imagery.\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        channel_drop_range: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        fill_value: float | None\n        fill: float\n\n        @model_validator(mode=\"after\")\n        def validate_fill(self) -> Self:\n            if self.fill_value is not None:\n                self.fill = self.fill_value\n                warn(\n                    \"`fill_value` deprecated. 
Use `fill` instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n            return self\n\n    def __init__(\n        self,\n        channel_drop_range: tuple[int, int] = (1, 1),\n        fill_value: float | None = None,\n        fill: float = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.channel_drop_range = channel_drop_range\n        self.fill = fill\n\n    def apply(self, img: np.ndarray, channels_to_drop: tuple[int, ...], **params: Any) -> np.ndarray:\n        return channel_dropout(img, channels_to_drop, self.fill)\n\n    def get_params_dependent_on_data(self, params: Mapping[str, Any], data: Mapping[str, Any]) -> dict[str, Any]:\n        image = data[\"image\"] if \"image\" in data else data[\"images\"][0]\n\n        num_channels = get_num_channels(image)\n\n        if num_channels == 1:\n            msg = \"Images has one channel. ChannelDropout is not defined.\"\n            raise NotImplementedError(msg)\n\n        if self.channel_drop_range[1] >= num_channels:\n            msg = \"Can not drop all channels in ChannelDropout.\"\n            raise ValueError(msg)\n\n        num_drop_channels = self.py_random.randint(*self.channel_drop_range)\n\n        channels_to_drop = self.py_random.sample(range(num_channels), k=num_drop_channels)\n\n        return {\"channels_to_drop\": channels_to_drop}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"channel_drop_range\", \"fill\"\n
"},{"location":"api_reference/augmentations/dropout/coarse_dropout/","title":"CoarseDropout augmentation (augmentations.dropout.coarse_dropout)","text":""},{"location":"api_reference/augmentations/dropout/coarse_dropout/#albumentations.augmentations.dropout.coarse_dropout.CoarseDropout","title":"class CoarseDropout (max_holes=None, max_height=None, max_width=None, min_holes=None, min_height=None, min_width=None, fill_value=None, mask_fill_value=None, num_holes_range=(1, 1), hole_height_range=(8, 8), hole_width_range=(8, 8), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

CoarseDropout randomly drops out rectangular regions from the image and, optionally, the corresponding regions in an associated mask, to simulate occlusion and the varied object sizes found in real-world settings.

This transformation is an evolution of CutOut and RandomErasing, offering more flexibility in the size, number of dropout regions, and fill values.

Parameters:

Name Type Description

num_holes_range tuple[int, int]

Range (min, max) for the number of rectangular regions to drop out. Default: (1, 1)

hole_height_range tuple[Real, Real]

Range (min, max) for the height of dropout regions. If int, specifies absolute pixel values. If float, interpreted as a fraction of the image height. Default: (8, 8)

hole_width_range tuple[Real, Real]

Range (min, max) for the width of dropout regions. If int, specifies absolute pixel values. If float, interpreted as a fraction of the image width. Default: (8, 8)

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be:

  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel
  • 'random': each pixel is filled with random values
  • 'random_uniform': each hole is filled with a single random color
  • 'inpaint_telea': uses OpenCV Telea inpainting method
  • 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method

Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the mask. If None, mask regions corresponding to image dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges for each application.
  • When using float values for hole_height_range and hole_width_range, ensure they are between 0 and 1.
  • This implementation includes deprecation warnings for older parameter names (min_holes, max_holes, etc.).
  • Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.
  • For 'random_uniform' fill, each hole gets a single random color, unlike 'random' where each pixel gets its own random value.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> # Example with random uniform fill\n>>> aug_random = A.CoarseDropout(\n...     num_holes_range=(3, 6),\n...     hole_height_range=(10, 20),\n...     hole_width_range=(10, 20),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> # Example with inpainting\n>>> aug_inpaint = A.CoarseDropout(\n...     num_holes_range=(3, 6),\n...     hole_height_range=(10, 20),\n...     hole_width_range=(10, 20),\n...     fill=\"inpaint_ns\",\n...     p=1.0\n... )\n>>> transformed = aug_random(image=image, mask=mask)\n>>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n
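A further sketch showing fill_mask, so that dropped regions are also filled in the accompanying mask (parameter values are illustrative):

Python
>>> aug_with_mask_fill = A.CoarseDropout(
...     num_holes_range=(2, 4),
...     hole_height_range=(8, 16),
...     hole_width_range=(8, 16),
...     fill=0,
...     fill_mask=0,
...     p=1.0
... )
>>> transformed = aug_with_mask_fill(image=image, mask=mask)
>>> dropped_image, dropped_mask = transformed['image'], transformed['mask']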

References

  • CutOut: https://arxiv.org/abs/1708.04552
  • Random Erasing: https://arxiv.org/abs/1708.04896
  • OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/dropout/coarse_dropout.py Python
class CoarseDropout(BaseDropout):\n    \"\"\"CoarseDropout randomly drops out rectangular regions from the image and optionally,\n    the corresponding regions in an associated mask, to simulate occlusion and\n    varied object sizes found in real-world settings.\n\n    This transformation is an evolution of CutOut and RandomErasing, offering more\n    flexibility in the size, number of dropout regions, and fill values.\n\n    Args:\n        num_holes_range (tuple[int, int]): Range (min, max) for the number of rectangular\n            regions to drop out. Default: (1, 1)\n        hole_height_range (tuple[Real, Real]): Range (min, max) for the height\n            of dropout regions. If int, specifies absolute pixel values. If float,\n            interpreted as a fraction of the image height. Default: (8, 8)\n        hole_width_range (tuple[Real, Real]): Range (min, max) for the width\n            of dropout regions. If int, specifies absolute pixel values. If float,\n            interpreted as a fraction of the image width. Default: (8, 8)\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        fill_mask (ColorType | None): Fill value for dropout regions in the mask.\n            If None, mask regions corresponding to image dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual number and size of dropout regions are randomly chosen within the specified ranges for each\n            application.\n        - When using float values for hole_height_range and hole_width_range, ensure they are between 0 and 1.\n        - This implementation includes deprecation warnings for older parameter names (min_holes, max_holes, etc.).\n        - Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.\n        - For 'random_uniform' fill, each hole gets a single random color, unlike 'random' where each pixel\n            gets its own random value.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> # Example with random uniform fill\n        >>> aug_random = A.CoarseDropout(\n        ...     num_holes_range=(3, 6),\n        ...     hole_height_range=(10, 20),\n        ...     hole_width_range=(10, 20),\n        ...     fill=\"random_uniform\",\n        ...     p=1.0\n        ... )\n        >>> # Example with inpainting\n        >>> aug_inpaint = A.CoarseDropout(\n        ...     num_holes_range=(3, 6),\n        ...     hole_height_range=(10, 20),\n        ...     hole_width_range=(10, 20),\n        ...     fill=\"inpaint_ns\",\n        ...     p=1.0\n        ... 
)\n        >>> transformed = aug_random(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n\n    References:\n        - CutOut: https://arxiv.org/abs/1708.04552\n        - Random Erasing: https://arxiv.org/abs/1708.04896\n        - OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        min_holes: int | None = Field(ge=0)\n        max_holes: int | None = Field(ge=0)\n        num_holes_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(1, None)),\n            AfterValidator(nondecreasing),\n        ]\n\n        min_height: ScalarType | None = Field(ge=0)\n        max_height: ScalarType | None = Field(ge=0)\n        hole_height_range: Annotated[\n            tuple[ScalarType, ScalarType],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(1, None)),\n        ]\n\n        min_width: ScalarType | None = Field(ge=0)\n        max_width: ScalarType | None = Field(ge=0)\n        hole_width_range: Annotated[\n            tuple[ScalarType, ScalarType],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(1, None)),\n        ]\n\n        @staticmethod\n        def update_range(\n            min_value: Number | None,\n            max_value: Number | None,\n            default_range: tuple[Number, Number],\n        ) -> tuple[Number, Number]:\n            return (min_value or max_value, max_value) if max_value is not None else default_range\n\n        @staticmethod\n        def validate_range(range_value: tuple[float, float], range_name: str, minimum: float = 0) -> None:\n            if not minimum <= range_value[0] <= range_value[1]:\n                raise ValueError(\n                    f\"First value in {range_name} should be less or equal than the second value \"\n                    f\"and at least {minimum}. Got: {range_value}\",\n                )\n            if isinstance(range_value[0], float) and not all(0 <= x <= 1 for x in range_value):\n                raise ValueError(f\"All values in {range_name} should be in [0, 1] range. Got: {range_value}\")\n\n        @model_validator(mode=\"after\")\n        def check_num_holes_and_dimensions(self) -> Self:\n            if self.min_holes is not None:\n                warn(\"`min_holes` is deprecated. Use num_holes_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_holes is not None:\n                warn(\"`max_holes` is deprecated. Use num_holes_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.min_height is not None:\n                warn(\"`min_height` is deprecated. Use hole_height_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_height is not None:\n                warn(\"`max_height` is deprecated. Use hole_height_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.min_width is not None:\n                warn(\"`min_width` is deprecated. Use hole_width_range instead.\", DeprecationWarning, stacklevel=2)\n            if self.max_width is not None:\n                warn(\"`max_width` is deprecated. 
Use hole_width_range instead.\", DeprecationWarning, stacklevel=2)\n\n            if self.max_holes is not None:\n                self.num_holes_range = self.update_range(self.min_holes, self.max_holes, self.num_holes_range)\n\n            self.validate_range(self.num_holes_range, \"num_holes_range\", minimum=1)\n\n            if self.max_height is not None:\n                self.hole_height_range = self.update_range(self.min_height, self.max_height, self.hole_height_range)\n            self.validate_range(self.hole_height_range, \"hole_height_range\")\n\n            if self.max_width is not None:\n                self.hole_width_range = self.update_range(self.min_width, self.max_width, self.hole_width_range)\n            self.validate_range(self.hole_width_range, \"hole_width_range\")\n\n            return self\n\n    def __init__(\n        self,\n        max_holes: int | None = None,\n        max_height: ScalarType | None = None,\n        max_width: ScalarType | None = None,\n        min_holes: int | None = None,\n        min_height: ScalarType | None = None,\n        min_width: ScalarType | None = None,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        num_holes_range: tuple[int, int] = (1, 1),\n        hole_height_range: tuple[ScalarType, ScalarType] = (8, 8),\n        hole_width_range: tuple[ScalarType, ScalarType] = (8, 8),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.num_holes_range = num_holes_range\n        self.hole_height_range = hole_height_range\n        self.hole_width_range = hole_width_range\n\n    def calculate_hole_dimensions(\n        self,\n        image_shape: tuple[int, int],\n        height_range: tuple[float, float],\n        width_range: tuple[float, float],\n        size: int,\n    ) -> tuple[np.ndarray, np.ndarray]:\n        \"\"\"Calculate random hole dimensions based on the provided ranges.\"\"\"\n        height, width = image_shape[:2]\n\n        if isinstance(height_range[0], int):\n            min_height = height_range[0]\n            max_height = min(height_range[1], height)\n\n            min_width = width_range[0]\n            max_width = min(width_range[1], width)\n\n            hole_heights = self.random_generator.integers(int(min_height), int(max_height + 1), size=size)\n            hole_widths = self.random_generator.integers(int(min_width), int(max_width + 1), size=size)\n\n        else:  # Assume float\n            hole_heights = (height * self.random_generator.uniform(*height_range, size=size)).astype(int)\n            hole_widths = (width * self.random_generator.uniform(*width_range, size=size)).astype(int)\n\n        return hole_heights, hole_widths\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        num_holes = self.py_random.randint(*self.num_holes_range)\n\n        hole_heights, hole_widths = self.calculate_hole_dimensions(\n            image_shape,\n            self.hole_height_range,\n            self.hole_width_range,\n            size=num_holes,\n        )\n\n        height, width = image_shape[:2]\n\n        y_min = self.random_generator.integers(0, height - hole_heights + 1, size=num_holes)\n        x_min = self.random_generator.integers(0, width - hole_widths + 1, 
size=num_holes)\n        y_max = y_min + hole_heights\n        x_max = x_min + hole_widths\n\n        holes = np.stack([x_min, y_min, x_max, y_max], axis=-1)\n\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (*super().get_transform_init_args_names(), \"num_holes_range\", \"hole_height_range\", \"hole_width_range\")\n
"},{"location":"api_reference/augmentations/dropout/coarse_dropout/#albumentations.augmentations.dropout.coarse_dropout.Erasing","title":"class Erasing (scale=(0.02, 0.33), ratio=(0.3, 3.3), fill=0, fill_mask=None, always_apply=None, p=0.5) [view source on GitHub]","text":"

Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.

This augmentation helps improve model robustness by randomly masking out rectangular regions in the image, simulating occlusions and encouraging the model to learn from partial information. It's particularly effective for image classification and person re-identification tasks.

Parameters:

Name Type Description scale tuple[float, float]

Range for the proportion of image area to erase. The actual area will be randomly sampled from (scale[0] * image_area, scale[1] * image_area). Default: (0.02, 0.33)

ratio tuple[float, float]

Range for the aspect ratio (width/height) of the erased region. The actual ratio will be randomly sampled from (ratio[0], ratio[1]). Default: (0.3, 3.3)

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value used to fill the erased regions. Can be: - int or float: fills all channels with this value - tuple: fills each channel with corresponding value - \"random\": fills each pixel with random values - \"random_uniform\": fills entire erased region with a single random color - \"inpaint_telea\": uses OpenCV Telea inpainting method - \"inpaint_ns\": uses OpenCV Navier-Stokes inpainting method Default: 0

fill_mask ColorType | None

Value used to fill erased regions in the mask. If None, mask regions are not modified. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The transform derives the erased region's height and width directly from the sampled area and aspect ratio. If no region satisfying the scale and ratio constraints fits within the image, no erasing is performed.
  • The actual erased area and aspect ratio are randomly sampled within the specified ranges for each application.
  • When using inpainting methods, only grayscale or RGB images are supported.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> # Basic usage with default parameters\n>>> transform = A.Erasing()\n>>> transformed = transform(image=image)\n>>> # Custom configuration\n>>> transform = A.Erasing(\n...     scale=(0.1, 0.4),\n...     ratio=(0.5, 2.0),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> transformed = transform(image=image)\n

References

  • Paper: https://arxiv.org/abs/1708.04896
  • Implementation inspired by torchvision: https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.RandomErasing


Source code in albumentations/augmentations/dropout/coarse_dropout.py Python
class Erasing(BaseDropout):\n    \"\"\"Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.\n\n    This augmentation helps improve model robustness by randomly masking out rectangular regions in the image,\n    simulating occlusions and encouraging the model to learn from partial information. It's particularly\n    effective for image classification and person re-identification tasks.\n\n    Args:\n        scale (tuple[float, float]): Range for the proportion of image area to erase.\n            The actual area will be randomly sampled from (scale[0] * image_area, scale[1] * image_area).\n            Default: (0.02, 0.33)\n        ratio (tuple[float, float]): Range for the aspect ratio (width/height) of the erased region.\n            The actual ratio will be randomly sampled from (ratio[0], ratio[1]).\n            Default: (0.3, 3.3)\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value used to fill the erased regions. Can be:\n            - int or float: fills all channels with this value\n            - tuple: fills each channel with corresponding value\n            - \"random\": fills each pixel with random values\n            - \"random_uniform\": fills entire erased region with a single random color\n            - \"inpaint_telea\": uses OpenCV Telea inpainting method\n            - \"inpaint_ns\": uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        mask_fill (ColorType | None): Value used to fill erased regions in the mask.\n            If None, mask regions are not modified. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform attempts to find valid erasing parameters up to 10 times.\n          If unsuccessful, no erasing is performed.\n        - The actual erased area and aspect ratio are randomly sampled within\n          the specified ranges for each application.\n        - When using inpainting methods, only grayscale or RGB images are supported.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> # Basic usage with default parameters\n        >>> transform = A.Erasing()\n        >>> transformed = transform(image=image)\n        >>> # Custom configuration\n        >>> transform = A.Erasing(\n        ...     scale=(0.1, 0.4),\n        ...     ratio=(0.5, 2.0),\n        ...     fill_value=\"random_uniform\",\n        ...     p=1.0\n        ... 
)\n        >>> transformed = transform(image=image)\n\n    References:\n        - Paper: https://arxiv.org/abs/1708.04896\n        - Implementation inspired by torchvision:\n          https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.RandomErasing\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        scale: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n        ratio: Annotated[\n            tuple[float, float],\n            AfterValidator(nondecreasing),\n            AfterValidator(check_range_bounds(0, None)),\n        ]\n\n    def __init__(\n        self,\n        scale: tuple[float, float] = (0.02, 0.33),\n        ratio: tuple[float, float] = (0.3, 3.3),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        always_apply: bool | None = None,\n        p: float = 0.5,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n\n        self.scale = scale\n        self.ratio = ratio\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Calculate erasing parameters using direct mathematical derivation.\n\n        Given:\n        - Image dimensions (H, W)\n        - Target area (A)\n        - Aspect ratio (r = w/h)\n\n        We know:\n        - h * w = A (area equation)\n        - w = r * h (aspect ratio equation)\n\n        Therefore:\n        - h * (r * h) = A\n        - h\u00b2 = A/r\n        - h = sqrt(A/r)\n        - w = r * sqrt(A/r) = sqrt(A*r)\n        \"\"\"\n        height, width = params[\"shape\"][:2]\n        total_area = height * width\n\n        # Calculate maximum valid area based on dimensions and aspect ratio\n        max_area = total_area * self.scale[1]\n        min_area = total_area * self.scale[0]\n\n        # For each aspect ratio r, the maximum area is constrained by:\n        # h = sqrt(A/r) \u2264 H and w = sqrt(A*r) \u2264 W\n        # Therefore: A \u2264 min(r*H\u00b2, W\u00b2/r)\n        r_min, r_max = self.ratio\n\n        def area_constraint_h(r: float) -> float:\n            return r * height * height\n\n        def area_constraint_w(r: float) -> float:\n            return width * width / r\n\n        # Find maximum valid area considering aspect ratio constraints\n        max_area_h = min(area_constraint_h(r_min), area_constraint_h(r_max))\n        max_area_w = min(area_constraint_w(r_min), area_constraint_w(r_max))\n        max_valid_area = min(max_area, max_area_h, max_area_w)\n\n        if max_valid_area < min_area:\n            return {\"holes\": np.array([], dtype=np.int32).reshape((0, 4))}\n\n        # Sample valid area and aspect ratio\n        erase_area = self.py_random.uniform(min_area, max_valid_area)\n\n        # Calculate valid aspect ratio range for this area\n        max_r = min(r_max, width * width / erase_area)\n        min_r = max(r_min, erase_area / (height * height))\n\n        if min_r > max_r:\n            return {\"holes\": np.array([], dtype=np.int32).reshape((0, 4))}\n\n        aspect_ratio = self.py_random.uniform(min_r, max_r)\n\n        # Calculate dimensions\n        h = int(round(np.sqrt(erase_area / aspect_ratio)))\n        w = int(round(np.sqrt(erase_area * aspect_ratio)))\n\n        # Sample position\n        top = self.py_random.randint(0, height - h)\n        left = self.py_random.randint(0, width - w)\n\n        holes = np.array([[left, top, left + w, top 
+ h]], dtype=np.int32)\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"scale\", \"ratio\", \"fill\", \"fill_mask\"\n
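The hole size in the source above follows directly from the two constraints stated in its docstring: h * w = A (the sampled area) and w = r * h (the sampled aspect ratio), giving h = sqrt(A / r) and w = sqrt(A * r). A minimal standalone sketch of that derivation with illustrative numbers (this is not the transform's own sampling code):

Python
import numpy as np

# Illustrative values: a 100x100 image, erase 10% of its area, aspect ratio w/h = 2.0
height, width = 100, 100
erase_area = 0.10 * height * width   # A
aspect_ratio = 2.0                   # r = w / h

# From h * w = A and w = r * h:  h = sqrt(A / r),  w = sqrt(A * r)
h = int(round(np.sqrt(erase_area / aspect_ratio)))
w = int(round(np.sqrt(erase_area * aspect_ratio)))

print(h, w)     # 22 45
print(h * w)    # 990, close to the requested 1000-pixel area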
"},{"location":"api_reference/augmentations/dropout/functional/","title":"Geometric functional transforms (augmentations.dropout.functional)","text":""},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.apply_inpainting","title":"def apply_inpainting (img, holes, method) [view source on GitHub]","text":"

Apply OpenCV inpainting to fill the holes in the image.

Parameters:

Name Type Description img np.ndarray

Input image (grayscale or BGR)

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

method InpaintMethod

Inpainting method to use (\"inpaint_telea\" or \"inpaint_ns\")

Returns:

Type Description np.ndarray

Inpainted image

Exceptions:

Type Description NotImplementedError

If image has more than 3 channels

Source code in albumentations/augmentations/dropout/functional.py Python
@uint8_io\ndef apply_inpainting(img: np.ndarray, holes: np.ndarray, method: InpaintMethod) -> np.ndarray:\n    \"\"\"Apply OpenCV inpainting to fill the holes in the image.\n\n    Args:\n        img: Input image (grayscale or BGR)\n        holes: Array of [x1, y1, x2, y2] coordinates\n        method: Inpainting method to use (\"inpaint_telea\" or \"inpaint_ns\")\n\n    Returns:\n        np.ndarray: Inpainted image\n\n    Raises:\n        NotImplementedError: If image has more than 3 channels\n    \"\"\"\n    num_channels = get_num_channels(img)\n    # Create inpainting mask\n    mask = np.zeros(img.shape[:2], dtype=np.uint8)\n    for x_min, y_min, x_max, y_max in holes:\n        mask[y_min:y_max, x_min:x_max] = 255\n\n    inpaint_method = cv2.INPAINT_TELEA if method == \"inpaint_telea\" else cv2.INPAINT_NS\n\n    # Handle grayscale images by converting to 3 channels and back\n    if num_channels == 1:\n        if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n            img = img.squeeze()\n        img_3ch = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)\n        result = cv2.inpaint(img_3ch, mask, 3, inpaint_method)\n        return (\n            cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)[..., None]\n            if num_channels == NUM_MULTI_CHANNEL_DIMENSIONS\n            else cv2.cvtColor(result, cv2.COLOR_BGR2GRAY)\n        )\n\n    return cv2.inpaint(img, mask, 3, inpaint_method)\n
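For orientation, a minimal usage sketch of apply_inpainting, assuming it is imported from albumentations.augmentations.dropout.functional (the module shown above); the image and hole coordinates are illustrative:

Python
import numpy as np
from albumentations.augmentations.dropout.functional import apply_inpainting

# Illustrative uint8 RGB image and one hole given as [x1, y1, x2, y2]
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
holes = np.array([[10, 10, 30, 30]], dtype=np.int64)

# Fill the hole using the Telea inpainting method
result = apply_inpainting(image, holes, "inpaint_telea")
print(result.shape, result.dtype)  # (64, 64, 3) uint8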
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.calculate_grid_dimensions","title":"def calculate_grid_dimensions (image_shape, unit_size_range, holes_number_xy, random_generator) [view source on GitHub]","text":"

Calculate the dimensions of grid units for GridDropout.

This function determines the size of grid units based on the input parameters. It supports three modes of operation: (1) using a range of unit sizes, (2) using a specified number of holes in the x and y directions, or (3) falling back to a default calculation.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

unit_size_range tuple[int, int] | None

A range of possible unit sizes. If provided, a random size within this range will be chosen for both height and width.

holes_number_xy tuple[int, int] | None

The number of holes in the x and y directions. If provided, the grid dimensions will be calculated to fit this number of holes.

random_generator np.random.Generator

The random generator to use for generating random values.

Returns:

Type Description tuple[int, int]

The calculated grid unit dimensions as (unit_height, unit_width).

Exceptions:

Type Description ValueError

If the upper limit of unit_size_range is greater than the shortest image edge.

Notes

  • If both unit_size_range and holes_number_xy are None, the function falls back to a default calculation, where the grid unit size is set to max(2, image_dimension // 10) for both height and width.
  • The function prioritizes unit_size_range over holes_number_xy if both are provided.
  • When using holes_number_xy, the actual number of holes may be slightly different due to integer division.

Examples:

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(42)\n>>> image_shape = (100, 200)\n>>> calculate_grid_dimensions(image_shape, unit_size_range=(10, 20), holes_number_xy=None, random_generator=rng)\n(15, 15)  # Random value between 10 and 20\n
Python
>>> calculate_grid_dimensions(image_shape, unit_size_range=None, holes_number_xy=(5, 10), random_generator=rng)\n(20, 20)  # 100 // 5 and 200 // 10\n
Python
>>> calculate_grid_dimensions(image_shape, unit_size_range=None, holes_number_xy=None, random_generator=rng)\n(10, 20)  # Default calculation: max(2, dimension // 10)\n
Source code in albumentations/augmentations/dropout/functional.py Python
def calculate_grid_dimensions(\n    image_shape: tuple[int, int],\n    unit_size_range: tuple[int, int] | None,\n    holes_number_xy: tuple[int, int] | None,\n    random_generator: np.random.Generator,\n) -> tuple[int, int]:\n    \"\"\"Calculate the dimensions of grid units for GridDropout.\n\n    This function determines the size of grid units based on the input parameters.\n    It supports three modes of operation:\n    1. Using a range of unit sizes\n    2. Using a specified number of holes in x and y directions\n    3. Falling back to a default calculation\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        unit_size_range (tuple[int, int] | None, optional): A range of possible unit sizes.\n            If provided, a random size within this range will be chosen for both height and width.\n        holes_number_xy (tuple[int, int] | None, optional): The number of holes in the x and y directions.\n            If provided, the grid dimensions will be calculated to fit this number of holes.\n        random_generator (np.random.Generator): The random generator to use for generating random values.\n\n    Returns:\n        tuple[int, int]: The calculated grid unit dimensions as (unit_height, unit_width).\n\n    Raises:\n        ValueError: If the upper limit of unit_size_range is greater than the shortest image edge.\n\n    Notes:\n        - If both unit_size_range and holes_number_xy are None, the function falls back to a default calculation,\n          where the grid unit size is set to max(2, image_dimension // 10) for both height and width.\n        - The function prioritizes unit_size_range over holes_number_xy if both are provided.\n        - When using holes_number_xy, the actual number of holes may be slightly different due to integer division.\n\n    Examples:\n        >>> image_shape = (100, 200)\n        >>> calculate_grid_dimensions(image_shape, unit_size_range=(10, 20))\n        (15, 15)  # Random value between 10 and 20\n\n        >>> calculate_grid_dimensions(image_shape, holes_number_xy=(5, 10))\n        (20, 20)  # 100 // 5 and 200 // 10\n\n        >>> calculate_grid_dimensions(image_shape)\n        (10, 20)  # Default calculation: max(2, dimension // 10)\n    \"\"\"\n    height, width = image_shape[:2]\n\n    if unit_size_range is not None:\n        if unit_size_range[1] > min(image_shape[:2]):\n            raise ValueError(\"Grid size limits must be within the shortest image edge.\")\n        unit_size = random_generator.integers(*unit_size_range)\n        return unit_size, unit_size\n\n    if holes_number_xy:\n        holes_number_x, holes_number_y = holes_number_xy\n        unit_width = width // holes_number_x\n        unit_height = height // holes_number_y\n        return unit_height, unit_width\n\n    # Default fallback\n    unit_width = max(2, width // 10)\n    unit_height = max(2, height // 10)\n    return unit_height, unit_width\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.cutout","title":"def cutout (img, holes, fill_value, random_generator) [view source on GitHub]","text":"

Apply cutout augmentation to the image by cutting out holes and filling them.

Parameters:

Name Type Description img np.ndarray

The image to augment

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

fill_value DropoutFillValue

Value to fill holes with. Can be: - number (int/float): Will be broadcast to all channels - sequence (tuple/list/ndarray): Must match number of channels - \"random\": Different random values for each pixel - \"random_uniform\": Same random value for entire hole - \"inpaint_telea\"/\"inpaint_ns\": OpenCV inpainting methods

random_generator np.random.Generator

Random number generator for random fills

Exceptions:

Type Description ValueError

If fill_value length doesn't match number of channels

Source code in albumentations/augmentations/dropout/functional.py Python
def cutout(\n    img: np.ndarray,\n    holes: np.ndarray,\n    fill_value: DropoutFillValue,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Apply cutout augmentation to the image by cutting out holes and filling them.\n\n    Args:\n        img: The image to augment\n        holes: Array of [x1, y1, x2, y2] coordinates\n        fill_value: Value to fill holes with. Can be:\n            - number (int/float): Will be broadcast to all channels\n            - sequence (tuple/list/ndarray): Must match number of channels\n            - \"random\": Different random values for each pixel\n            - \"random_uniform\": Same random value for entire hole\n            - \"inpaint_telea\"/\"inpaint_ns\": OpenCV inpainting methods\n        random_generator: Random number generator for random fills\n\n    Raises:\n        ValueError: If fill_value length doesn't match number of channels\n    \"\"\"\n    img = img.copy()\n\n    # Handle inpainting methods\n    if isinstance(fill_value, str):\n        if fill_value in {\"inpaint_telea\", \"inpaint_ns\"}:\n            return apply_inpainting(img, holes, cast(InpaintMethod, fill_value))\n        if fill_value == \"random\":\n            return fill_holes_with_random(img, holes, random_generator, uniform=False)\n        if fill_value == \"random_uniform\":\n            return fill_holes_with_random(img, holes, random_generator, uniform=True)\n        raise ValueError(f\"Unsupported string fill_value: {fill_value}\")\n\n    # Convert numeric fill values to numpy array\n    if isinstance(fill_value, (int, float)):\n        fill_array = np.array(fill_value, dtype=img.dtype)\n        return fill_holes_with_value(img, holes, fill_array)\n\n    # Handle sequence fill values\n    fill_array = np.array(fill_value, dtype=img.dtype)\n\n    # For multi-channel images, verify fill_value matches number of channels\n    if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n        fill_array = fill_array.ravel()\n        if fill_array.size != img.shape[2]:\n            raise ValueError(\n                f\"Fill value must have same number of channels as image. \"\n                f\"Got {fill_array.size}, expected {img.shape[2]}\",\n            )\n\n    return fill_holes_with_value(img, holes, fill_array)\n
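A minimal usage sketch contrasting the main fill modes (module path assumed as above; the image, holes, and fill values are illustrative):

Python
import numpy as np
from albumentations.augmentations.dropout.functional import cutout

rng = np.random.default_rng(0)
image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
holes = np.array([[5, 5, 20, 20], [40, 40, 60, 60]], dtype=np.int64)

# Constant fill: both holes become black
black = cutout(image, holes, 0, rng)

# Per-channel fill: the tuple length must match the number of channels
red = cutout(image, holes, (255, 0, 0), rng)

# Per-pixel random noise inside each hole
noisy = cutout(image, holes, "random", rng)

print(black[10, 10], red[10, 10])  # [0 0 0] [255 0 0]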
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.fill_holes_with_random","title":"def fill_holes_with_random (img, holes, random_generator, uniform) [view source on GitHub]","text":"

Fill holes with random values.

Parameters:

Name Type Description img np.ndarray

Input image

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

random_generator np.random.Generator

Random number generator

uniform bool

If True, use same random value for entire hole

Source code in albumentations/augmentations/dropout/functional.py Python
def fill_holes_with_random(\n    img: np.ndarray,\n    holes: np.ndarray,\n    random_generator: np.random.Generator,\n    uniform: bool,\n) -> np.ndarray:\n    \"\"\"Fill holes with random values.\n\n    Args:\n        img: Input image\n        holes: Array of [x1, y1, x2, y2] coordinates\n        random_generator: Random number generator\n        uniform: If True, use same random value for entire hole\n    \"\"\"\n    for x_min, y_min, x_max, y_max in holes:\n        shape = (1,) if uniform else (y_max - y_min, x_max - x_min)\n        if img.ndim != MONO_CHANNEL_DIMENSIONS:\n            shape = (1, img.shape[2]) if uniform else (*shape, img.shape[2])\n\n        random_fill = generate_random_fill(img.dtype, shape, random_generator)\n        img[y_min:y_max, x_min:x_max] = random_fill\n    return img\n
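A small sketch contrasting the two modes: uniform=True paints each hole with a single random color, while uniform=False draws a new random value per pixel (module path assumed as above):

Python
import numpy as np
from albumentations.augmentations.dropout.functional import fill_holes_with_random

rng = np.random.default_rng(0)
image = np.zeros((32, 32, 3), dtype=np.uint8)
holes = np.array([[4, 4, 12, 12]], dtype=np.int64)

# uniform=True: the whole hole gets one random color
uniform = fill_holes_with_random(image.copy(), holes, rng, uniform=True)
print(len(np.unique(uniform[4:12, 4:12].reshape(-1, 3), axis=0)))  # 1

# uniform=False: every pixel in the hole gets its own random value
noisy = fill_holes_with_random(image.copy(), holes, rng, uniform=False)
print(len(np.unique(noisy[4:12, 4:12].reshape(-1, 3), axis=0)))    # typically much greater than 1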
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.fill_holes_with_value","title":"def fill_holes_with_value (img, holes, fill_value) [view source on GitHub]","text":"

Fill holes with a constant value.

Parameters:

Name Type Description img np.ndarray

Input image

holes np.ndarray

Array of [x1, y1, x2, y2] coordinates

fill_value np.ndarray

Value to fill the holes with

Source code in albumentations/augmentations/dropout/functional.py Python
def fill_holes_with_value(img: np.ndarray, holes: np.ndarray, fill_value: np.ndarray) -> np.ndarray:\n    \"\"\"Fill holes with a constant value.\n\n    Args:\n        img: Input image\n        holes: Array of [x1, y1, x2, y2] coordinates\n        fill_value: Value to fill the holes with\n    \"\"\"\n    for x_min, y_min, x_max, y_max in holes:\n        img[y_min:y_max, x_min:x_max] = fill_value\n    return img\n
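A tiny sketch showing a per-channel constant fill (module path assumed as above; the values are illustrative):

Python
import numpy as np
from albumentations.augmentations.dropout.functional import fill_holes_with_value

image = np.zeros((32, 32, 3), dtype=np.uint8)
holes = np.array([[0, 0, 8, 8], [16, 16, 24, 24]], dtype=np.int64)

# The fill value is broadcast over every hole region
filled = fill_holes_with_value(image, holes, np.array([127, 127, 127], dtype=np.uint8))
print(filled[4, 4], filled[20, 20], filled[30, 30])  # [127 127 127] [127 127 127] [0 0 0]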
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.filter_bboxes_by_holes","title":"def filter_bboxes_by_holes (bboxes, holes, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Filter bounding boxes based on their remaining visible area and visibility ratio after intersection with holes.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes, each represented as [x_min, y_min, x_max, y_max].

holes np.ndarray

Array of holes, each represented as [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image (height, width).

min_area float

Minimum remaining visible area to keep the bounding box.

min_visibility float

Minimum visibility ratio to keep the bounding box. Calculated as 1 - (intersection_area / bbox_area).

Returns:

Type Description np.ndarray

Filtered array of bounding boxes.

Source code in albumentations/augmentations/dropout/functional.py Python
def filter_bboxes_by_holes(\n    bboxes: np.ndarray,\n    holes: np.ndarray,\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Filter bounding boxes based on their remaining visible area and visibility ratio after intersection with holes.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes, each represented as [x_min, y_min, x_max, y_max].\n        holes (np.ndarray): Array of holes, each represented as [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        min_area (int): Minimum remaining visible area to keep the bounding box.\n        min_visibility (float): Minimum visibility ratio to keep the bounding box.\n            Calculated as 1 - (intersection_area / bbox_area).\n\n    Returns:\n        np.ndarray: Filtered array of bounding boxes.\n    \"\"\"\n    if len(bboxes) == 0 or len(holes) == 0:\n        return bboxes\n\n    # Create a blank mask for holes\n    hole_mask = np.zeros(image_shape, dtype=np.uint8)\n\n    # Fill in the holes on the mask\n    for hole in holes:\n        x_min, y_min, x_max, y_max = hole.astype(int)\n        hole_mask[y_min:y_max, x_min:x_max] = 1\n\n    # Vectorized calculation\n    bboxes_int = bboxes.astype(int)\n    x_min, y_min, x_max, y_max = bboxes_int[:, 0], bboxes_int[:, 1], bboxes_int[:, 2], bboxes_int[:, 3]\n\n    # Calculate box areas\n    box_areas = (x_max - x_min) * (y_max - y_min)\n\n    # Create a mask of the same shape as bboxes\n    mask = np.zeros(len(bboxes), dtype=bool)\n\n    for i in range(len(bboxes)):\n        intersection_area = np.sum(hole_mask[y_min[i] : y_max[i], x_min[i] : x_max[i]])\n        remaining_area = box_areas[i] - intersection_area\n        visibility_ratio = 1 - (intersection_area / box_areas[i])\n        mask[i] = (remaining_area >= min_area) and (visibility_ratio >= min_visibility)\n\n    return bboxes[mask]\n
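A sketch of the filtering behaviour (module path assumed as above; the boxes and hole use illustrative pixel coordinates in [x_min, y_min, x_max, y_max] format):

Python
import numpy as np
from albumentations.augmentations.dropout.functional import filter_bboxes_by_holes

image_shape = (100, 100)
bboxes = np.array([[10, 10, 30, 30],    # 20x20 box, fully covered by the hole below
                   [60, 60, 90, 90]])   # 30x30 box, untouched by any hole
holes = np.array([[10, 10, 30, 30]])

kept = filter_bboxes_by_holes(bboxes, holes, image_shape, min_area=50, min_visibility=0.3)
print(kept)  # only the second box survives: [[60 60 90 90]]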
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.filter_keypoints_in_holes","title":"def filter_keypoints_in_holes (keypoints, holes) [view source on GitHub]","text":"

Filter out keypoints that are inside any of the holes.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (num_keypoints, 2+). The first two columns are x and y coordinates.

holes np.ndarray

Array of holes with shape (num_holes, 4). Each hole is represented as [x1, y1, x2, y2].

Returns:

Type Description np.ndarray

Array of keypoints that are not inside any hole.

Source code in albumentations/augmentations/dropout/functional.py Python
@handle_empty_array(\"keypoints\")\ndef filter_keypoints_in_holes(keypoints: np.ndarray, holes: np.ndarray) -> np.ndarray:\n    \"\"\"Filter out keypoints that are inside any of the holes.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (num_keypoints, 2+).\n                                The first two columns are x and y coordinates.\n        holes (np.ndarray): Array of holes with shape (num_holes, 4).\n                            Each hole is represented as [x1, y1, x2, y2].\n\n    Returns:\n        np.ndarray: Array of keypoints that are not inside any hole.\n    \"\"\"\n    # Broadcast keypoints and holes for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    hole_x1 = holes[:, 0]  # Shape: (num_holes,)\n    hole_y1 = holes[:, 1]  # Shape: (num_holes,)\n    hole_x2 = holes[:, 2]  # Shape: (num_holes,)\n    hole_y2 = holes[:, 3]  # Shape: (num_holes,)\n\n    # Check if each keypoint is inside each hole\n    inside_hole = (kp_x >= hole_x1) & (kp_x < hole_x2) & (kp_y >= hole_y1) & (kp_y < hole_y2)\n\n    # A keypoint is valid if it's not inside any hole\n    valid_keypoints = ~np.any(inside_hole, axis=1)\n\n    return keypoints[valid_keypoints]\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.generate_grid_holes","title":"def generate_grid_holes (image_shape, grid, ratio, random_offset, shift_xy, random_generator) [view source on GitHub]","text":"

Generate a list of holes for GridDropout using a uniform grid.

This function creates a grid of holes for use in the GridDropout augmentation technique. It allows for customization of the grid size, hole size ratio, and positioning of holes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

grid tuple[int, int]

The grid size as (rows, columns). This determines the number of cells in the grid, where each cell may contain a hole.

ratio float

The ratio of the hole size to the grid cell size. Should be between 0 and 1. A ratio of 1 means the hole will fill the entire grid cell.

random_offset bool

If True, applies random offsets to each hole within its grid cell. If False, uses the global shift specified by shift_xy.

shift_xy tuple[int, int]

The global shift to apply to all holes as (shift_x, shift_y). Only used when random_offset is False.

random_generator np.random.Generator

The random generator for generating random offsets and shuffling. If None, a new Generator will be created.

Returns:

Type Description np.ndarray

An array of hole coordinates, where each hole is represented as [x1, y1, x2, y2]. The shape of the array is (n_holes, 4), where n_holes is determined by the grid size.

Notes

  • The function first creates a uniform grid based on the image shape and specified grid size.
  • Hole sizes are calculated based on the provided ratio and grid cell sizes.
  • If random_offset is True, each hole is randomly positioned within its grid cell.
  • If random_offset is False, all holes are shifted by the global shift_xy value.
  • The function ensures that all holes remain within the image boundaries.

Examples:

Python
>>> image_shape = (100, 100)\n>>> grid = (5, 5)\n>>> ratio = 0.5\n>>> random_offset = True\n>>> shift_xy = (0, 0)\n>>> rng = np.random.default_rng(42)\n>>> holes = generate_grid_holes(image_shape, grid, ratio, random_offset, shift_xy, rng)\n>>> print(holes.shape)\n(25, 4)\n>>> print(holes[0])  # Example output: [x1, y1, x2, y2] of the first hole\n[ 1 21 11 31]\n
Source code in albumentations/augmentations/dropout/functional.py Python
def generate_grid_holes(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    ratio: float,\n    random_offset: bool,\n    shift_xy: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a list of holes for GridDropout using a uniform grid.\n\n    This function creates a grid of holes for use in the GridDropout augmentation technique.\n    It allows for customization of the grid size, hole size ratio, and positioning of holes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns). This determines the number of cells\n            in the grid, where each cell may contain a hole.\n        ratio (float): The ratio of the hole size to the grid cell size. Should be between 0 and 1.\n            A ratio of 1 means the hole will fill the entire grid cell.\n        random_offset (bool): If True, applies random offsets to each hole within its grid cell.\n            If False, uses the global shift specified by shift_xy.\n        shift_xy (tuple[int, int]): The global shift to apply to all holes as (shift_x, shift_y).\n            Only used when random_offset is False.\n        random_generator (np.random.Generator): The random generator for generating random offsets\n            and shuffling. If None, a new Generator will be created.\n\n    Returns:\n        np.ndarray: An array of hole coordinates, where each hole is represented as\n            [x1, y1, x2, y2]. The shape of the array is (n_holes, 4), where n_holes\n            is determined by the grid size.\n\n    Notes:\n        - The function first creates a uniform grid based on the image shape and specified grid size.\n        - Hole sizes are calculated based on the provided ratio and grid cell sizes.\n        - If random_offset is True, each hole is randomly positioned within its grid cell.\n        - If random_offset is False, all holes are shifted by the global shift_xy value.\n        - The function ensures that all holes remain within the image boundaries.\n\n    Examples:\n        >>> image_shape = (100, 100)\n        >>> grid = (5, 5)\n        >>> ratio = 0.5\n        >>> random_offset = True\n        >>> random_state = np.random.RandomState(42)\n        >>> shift_xy = (0, 0)\n        >>> holes = generate_grid_holes(image_shape, grid, ratio, random_offset, random_state, shift_xy)\n        >>> print(holes.shape)\n        (25, 4)\n        >>> print(holes[0])  # Example output: [x1, y1, x2, y2] of the first hole\n        [ 1 21 11 31]\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Generate the uniform grid\n    cells = split_uniform_grid(image_shape, grid, random_generator)\n\n    # Calculate hole sizes based on the ratio\n    cell_heights = cells[:, 2] - cells[:, 0]\n    cell_widths = cells[:, 3] - cells[:, 1]\n    hole_heights = np.clip(cell_heights * ratio, 1, cell_heights - 1).astype(int)\n    hole_widths = np.clip(cell_widths * ratio, 1, cell_widths - 1).astype(int)\n\n    # Calculate maximum possible offsets\n    max_offset_y = cell_heights - hole_heights\n    max_offset_x = cell_widths - hole_widths\n\n    if random_offset:\n        # Generate random offsets for each hole\n        offset_y = random_generator.integers(0, max_offset_y + 1)\n        offset_x = random_generator.integers(0, max_offset_x + 1)\n    else:\n        # Use global shift\n        offset_y = np.full_like(max_offset_y, shift_xy[1])\n        offset_x = np.full_like(max_offset_x, 
shift_xy[0])\n\n    # Calculate hole coordinates\n    x_min = np.clip(cells[:, 1] + offset_x, 0, width - hole_widths)\n    y_min = np.clip(cells[:, 0] + offset_y, 0, height - hole_heights)\n    x_max = np.minimum(x_min + hole_widths, width)\n    y_max = np.minimum(y_min + hole_heights, height)\n\n    return np.column_stack((x_min, y_min, x_max, y_max))\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.generate_random_fill","title":"def generate_random_fill (dtype, shape, random_generator) [view source on GitHub]","text":"

Generate a random fill array based on the given dtype and target shape.

This function creates a numpy array filled with random values. The range and type of these values depend on the input dtype. For integer dtypes, it generates random integers. For floating-point dtypes, it generates random floats.

Parameters:

Name Type Description dtype np.dtype

The data type of the array to be generated.

shape tuple[int, ...]

The shape of the array to be generated.

random_generator np.random.Generator

The random generator to use for generating values. If None, the default numpy random generator is used.

Returns:

Type Description np.ndarray

A numpy array of the specified shape and dtype, filled with random values.

Exceptions:

Type Description ValueError

If the input dtype is neither integer nor floating-point.

Examples:

Python
>>> import numpy as np\n>>> rng = np.random.default_rng(42)\n>>> result = generate_random_fill(np.dtype('uint8'), (2, 2), rng)\n>>> print(result)  # Example output; actual values depend on the generator state\n[[172 251]\n [ 80 141]]\n
Source code in albumentations/augmentations/dropout/functional.py Python
def generate_random_fill(\n    dtype: np.dtype,\n    shape: tuple[int, ...],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate a random fill array based on the given dtype and target shape.\n\n    This function creates a numpy array filled with random values. The range and type of these values\n    depend on the input dtype. For integer dtypes, it generates random integers. For floating-point\n    dtypes, it generates random floats.\n\n    Args:\n        dtype (np.dtype): The data type of the array to be generated.\n        shape (tuple[int, ...]): The shape of the array to be generated.\n        random_generator (np.random.Generator): The random generator to use for generating values.\n            If None, the default numpy random generator is used.\n\n    Returns:\n        np.ndarray: A numpy array of the specified shape and dtype, filled with random values.\n\n    Raises:\n        ValueError: If the input dtype is neither integer nor floating-point.\n\n    Examples:\n        >>> import numpy as np\n        >>> random_state = np.random.RandomState(42)\n        >>> result = generate_random_fill(np.dtype('uint8'), (2, 2), random_state)\n        >>> print(result)\n        [[172 251]\n         [ 80 141]]\n    \"\"\"\n    max_value = MAX_VALUES_BY_DTYPE[dtype]\n    if np.issubdtype(dtype, np.integer):\n        return random_generator.integers(0, max_value + 1, size=shape, dtype=dtype)\n    if np.issubdtype(dtype, np.floating):\n        return random_generator.uniform(0, max_value, size=shape).astype(dtype)\n    raise ValueError(f\"Unsupported dtype: {dtype}\")\n
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.label","title":"def label (mask, return_num=False, connectivity=2) [view source on GitHub]","text":"

Label connected regions of an integer array.

This function uses OpenCV's connectedComponents under the hood but mimics the behavior of scikit-image's label function.

Parameters:

Name Type Description mask np.ndarray

The array to label. Must be of integer type.

return_num bool

If True, return the number of labels (default: False).

connectivity int

Maximum number of orthogonal hops to consider a pixel/voxel as a neighbor. Accepted values are 1 or 2. Default is 2.

Returns:

Type Description np.ndarray | tuple[np.ndarray, int]

Labeled array, where all connected regions are assigned the same integer value. If return_num is True, it also returns the number of labels.

Source code in albumentations/augmentations/dropout/functional.py Python
def label(mask: np.ndarray, return_num: bool = False, connectivity: int = 2) -> np.ndarray | tuple[np.ndarray, int]:\n    \"\"\"Label connected regions of an integer array.\n\n    This function uses OpenCV's connectedComponents under the hood but mimics\n    the behavior of scikit-image's label function.\n\n    Args:\n        mask (np.ndarray): The array to label. Must be of integer type.\n        return_num (bool): If True, return the number of labels (default: False).\n        connectivity (int): Maximum number of orthogonal hops to consider a pixel/voxel\n                            as a neighbor. Accepted values are 1 or 2. Default is 2.\n\n    Returns:\n        np.ndarray | tuple[np.ndarray, int]: Labeled array, where all connected regions are\n        assigned the same integer value. If return_num is True, it also returns the number of labels.\n    \"\"\"\n    # Create a copy of the original mask\n    labeled = np.zeros_like(mask, dtype=np.int32)\n\n    # Get unique non-zero values from the original mask\n    unique_values = np.unique(mask[mask != 0])\n\n    # Label each unique value separately\n    next_label = 1\n    for value in unique_values:\n        binary_mask = (mask == value).astype(np.uint8)\n\n        # Set connectivity for OpenCV (4 or 8)\n        cv2_connectivity = 4 if connectivity == 1 else 8\n\n        # Use OpenCV's connectedComponents\n        num_labels, labels = cv2.connectedComponents(binary_mask, connectivity=cv2_connectivity)\n\n        # Assign new labels\n        for i in range(1, num_labels):\n            labeled[labels == i] = next_label\n            next_label += 1\n\n    num_labels = next_label - 1\n\n    return (labeled, num_labels) if return_num else labeled\n
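A sketch of how label behaves on a small mask containing two disconnected regions with the same value (module path assumed as above):

Python
import numpy as np
from albumentations.augmentations.dropout.functional import label

mask = np.zeros((6, 6), dtype=np.uint8)
mask[0:2, 0:2] = 1   # first region
mask[4:6, 4:6] = 1   # second region, disconnected from the first

labeled, num = label(mask, return_num=True)
print(num)                 # 2 connected regions
print(np.unique(labeled))  # [0 1 2]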
"},{"location":"api_reference/augmentations/dropout/functional/#albumentations.augmentations.dropout.functional.mask_dropout_bboxes","title":"def mask_dropout_bboxes (bboxes, dropout_mask, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Filter out bounding boxes based on their intersection with the dropout mask.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) in format [x_min, y_min, x_max, y_max, ...].

dropout_mask np.ndarray

Boolean mask of shape (height, width) where True values indicate dropped out regions.

image_shape tuple[int, int]

The shape of the original image as (height, width).

min_area float

Minimum area of the bounding box to be kept.

min_visibility float

Minimum visibility ratio of the bounding box to be kept.

Returns:

Type Description np.ndarray

Filtered array of bounding boxes.

Source code in albumentations/augmentations/dropout/functional.py Python
@handle_empty_array(\"bboxes\")\ndef mask_dropout_bboxes(\n    bboxes: np.ndarray,\n    dropout_mask: np.ndarray,\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Filter out bounding boxes based on their intersection with the dropout mask.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (N, 4+) in format [x_min, y_min, x_max, y_max, ...].\n        dropout_mask (np.ndarray): Boolean mask of shape (height, width) where True values indicate dropped out regions.\n        image_shape (Tuple[int, int]): The shape of the original image as (height, width).\n        min_area (float): Minimum area of the bounding box to be kept.\n        min_visibility (float): Minimum visibility ratio of the bounding box to be kept.\n\n    Returns:\n        np.ndarray: Filtered array of bounding boxes.\n    \"\"\"\n    height, width = image_shape\n\n    # Create binary masks for each bounding box\n    y, x = np.ogrid[:height, :width]\n    box_masks = (\n        (x[None, :] >= bboxes[:, 0, None, None])\n        & (x[None, :] <= bboxes[:, 2, None, None])\n        & (y[None, :] >= bboxes[:, 1, None, None])\n        & (y[None, :] <= bboxes[:, 3, None, None])\n    )\n\n    # Calculate the area of each bounding box\n    box_areas = (bboxes[:, 2] - bboxes[:, 0]) * (bboxes[:, 3] - bboxes[:, 1])\n\n    # Calculate the visible area of each box (non-intersecting area with dropout mask)\n    visible_areas = np.sum(box_masks & ~dropout_mask.squeeze(), axis=(1, 2))\n\n    # Calculate visibility ratio (visible area / total box area)\n    visibility_ratio = visible_areas / box_areas\n\n    # Create a boolean mask for boxes to keep\n    keep_mask = (visible_areas >= min_area) & (visibility_ratio >= min_visibility)\n\n    return bboxes[keep_mask]\n
"},{"location":"api_reference/augmentations/dropout/grid_dropout/","title":"GridDropout augmentation (augmentations.dropout.grid_dropout)","text":""},{"location":"api_reference/augmentations/dropout/grid_dropout/#albumentations.augmentations.dropout.grid_dropout.GridDropout","title":"class GridDropout (ratio=0.5, unit_size_min=None, unit_size_max=None, holes_number_x=None, holes_number_y=None, shift_x=None, shift_y=None, random_offset=True, fill_value=None, mask_fill_value=None, unit_size_range=None, holes_number_xy=None, shift_xy=(0, 0), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply GridDropout augmentation to images, masks, bounding boxes, and keypoints.

GridDropout drops out rectangular regions of an image and the corresponding mask in a grid fashion. This technique can help improve model robustness by forcing the network to rely on a broader context rather than specific local features.

Parameters:

Name Type Description ratio float

The ratio of the mask holes to the unit size (same for horizontal and vertical directions). Must be between 0 and 1. Default: 0.5.

unit_size_range tuple[int, int] | None

Range from which to sample grid size. Default: None. Must be between 2 and the image's shorter edge. If None, grid size is calculated based on image size.

holes_number_xy tuple[int, int] | None

The number of grid units in the x and y directions. The first value should be between 1 and image width // 2; the second value should be between 1 and image height // 2. Default: None. If provided, overrides unit_size_range.

random_offset bool

Whether to offset the grid randomly between 0 and (grid unit size - hole size). If True, entered shift_xy is ignored and set randomly. Default: True.

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be: - int or float: all channels are filled with this value - tuple: tuple of values for each channel - 'random': each pixel is filled with random values - 'random_uniform': each hole is filled with a single random color - 'inpaint_telea': uses OpenCV Telea inpainting method - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method Default: 0

fill_mask ColorType | None

Value for the dropped pixels in mask. If None, the mask is not modified. Default: None.

shift_xy tuple[int, int]

Offsets of the grid start in x and y directions from (0,0) coordinate. Only used when random_offset is False. Default: (0, 0).

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If both unit_size_range and holes_number_xy are None, the grid size is calculated based on the image size.
  • The actual number of dropped regions may differ slightly from holes_number_xy due to rounding.
  • Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.
  • For 'random_uniform' fill, each grid cell gets a single random color, unlike 'random' where each pixel gets its own random value.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n>>> # Example with standard fill value\n>>> aug_basic = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     random_offset=True,\n...     p=1.0\n... )\n>>> # Example with random uniform fill\n>>> aug_random = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     fill=\"random_uniform\",\n...     p=1.0\n... )\n>>> # Example with inpainting\n>>> aug_inpaint = A.GridDropout(\n...     ratio=0.3,\n...     unit_size_range=(10, 20),\n...     fill=\"inpaint_ns\",\n...     p=1.0\n... )\n>>> transformed = aug_random(image=image, mask=mask)\n>>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n

Reference

  • Paper: https://arxiv.org/abs/2001.04086
  • OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html


Source code in albumentations/augmentations/dropout/grid_dropout.py Python
class GridDropout(BaseDropout):\n    \"\"\"Apply GridDropout augmentation to images, masks, bounding boxes, and keypoints.\n\n    GridDropout drops out rectangular regions of an image and the corresponding mask in a grid fashion.\n    This technique can help improve model robustness by forcing the network to rely on a broader context\n    rather than specific local features.\n\n    Args:\n        ratio (float): The ratio of the mask holes to the unit size (same for horizontal and vertical directions).\n            Must be between 0 and 1. Default: 0.5.\n        unit_size_range (tuple[int, int] | None): Range from which to sample grid size. Default: None.\n            Must be between 2 and the image's shorter edge. If None, grid size is calculated based on image size.\n        holes_number_xy (tuple[int, int] | None): The number of grid units in x and y directions.\n            First value should be between 1 and image width//2,\n            Second value should be between 1 and image height//2.\n            Default: None. If provided, overrides unit_size_range.\n        random_offset (bool): Whether to offset the grid randomly between 0 and (grid unit size - hole size).\n            If True, entered shift_xy is ignored and set randomly. Default: True.\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        fill_mask (ColorType | None): Value for the dropped pixels in mask.\n            If None, the mask is not modified. Default: None.\n        shift_xy (tuple[int, int]): Offsets of the grid start in x and y directions from (0,0) coordinate.\n            Only used when random_offset is False. Default: (0, 0).\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If both unit_size_range and holes_number_xy are None, the grid size is calculated based on the image size.\n        - The actual number of dropped regions may differ slightly from holes_number_xy due to rounding.\n        - Inpainting methods ('inpaint_telea', 'inpaint_ns') work only with grayscale or RGB images.\n        - For 'random_uniform' fill, each grid cell gets a single random color, unlike 'random' where each pixel\n            gets its own random value.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)\n        >>> # Example with standard fill value\n        >>> aug_basic = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     random_offset=True,\n        ...     p=1.0\n        ... )\n        >>> # Example with random uniform fill\n        >>> aug_random = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     fill=\"random_uniform\",\n        ...     
p=1.0\n        ... )\n        >>> # Example with inpainting\n        >>> aug_inpaint = A.GridDropout(\n        ...     ratio=0.3,\n        ...     unit_size_range=(10, 20),\n        ...     fill=\"inpaint_ns\",\n        ...     p=1.0\n        ... )\n        >>> transformed = aug_random(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = transformed[\"image\"], transformed[\"mask\"]\n\n    Reference:\n        - Paper: https://arxiv.org/abs/2001.04086\n        - OpenCV Inpainting methods: https://docs.opencv.org/master/df/d3d/tutorial_py_inpainting.html\n    \"\"\"\n\n    class InitSchema(BaseDropout.InitSchema):\n        ratio: float = Field(gt=0, le=1)\n\n        unit_size_min: int | None = Field(ge=2)\n        unit_size_max: int | None = Field(ge=2)\n\n        holes_number_x: int | None = Field(ge=1)\n        holes_number_y: int | None = Field(ge=1)\n\n        shift_x: int | None = Field(ge=0)\n        shift_y: int | None = Field(ge=0)\n\n        random_offset: bool\n        fill_value: DropoutFillValue | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_fill_value: ColorType | None = Field(deprecated=\"Deprecated use fill_mask instead\")\n\n        unit_size_range: (\n            Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None)), AfterValidator(nondecreasing)]\n            | None\n        )\n        shift_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(0, None))]\n\n        holes_number_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))] | None\n\n        @model_validator(mode=\"after\")\n        def validate_normalization(self) -> Self:\n            if self.unit_size_min is not None and self.unit_size_max is not None:\n                self.unit_size_range = self.unit_size_min, self.unit_size_max\n                warn(\n                    \"unit_size_min and unit_size_max are deprecated. Use unit_size_range instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.shift_x is not None and self.shift_y is not None:\n                self.shift_xy = self.shift_x, self.shift_y\n                warn(\"shift_x and shift_y are deprecated. Use shift_xy instead.\", DeprecationWarning, stacklevel=2)\n\n            if self.holes_number_x is not None and self.holes_number_y is not None:\n                self.holes_number_xy = self.holes_number_x, self.holes_number_y\n                warn(\n                    \"holes_number_x and holes_number_y are deprecated. 
Use holes_number_xy instead.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n\n            if self.unit_size_range and not MIN_UNIT_SIZE <= self.unit_size_range[0] <= self.unit_size_range[1]:\n                raise ValueError(\"Max unit size should be >= min size, both at least 2 pixels.\")\n\n            return self\n\n    def __init__(\n        self,\n        ratio: float = 0.5,\n        unit_size_min: int | None = None,\n        unit_size_max: int | None = None,\n        holes_number_x: int | None = None,\n        holes_number_y: int | None = None,\n        shift_x: int | None = None,\n        shift_y: int | None = None,\n        random_offset: bool = True,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        unit_size_range: tuple[int, int] | None = None,\n        holes_number_xy: tuple[int, int] | None = None,\n        shift_xy: tuple[int, int] = (0, 0),\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.ratio = ratio\n        self.unit_size_range = unit_size_range\n        self.holes_number_xy = holes_number_xy\n        self.random_offset = random_offset\n        self.shift_xy = shift_xy\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        image_shape = params[\"shape\"]\n        if self.holes_number_xy:\n            grid = self.holes_number_xy\n        else:\n            # Calculate grid based on unit_size_range or default\n            unit_height, unit_width = fdropout.calculate_grid_dimensions(\n                image_shape,\n                self.unit_size_range,\n                self.holes_number_xy,\n                self.random_generator,\n            )\n            grid = (image_shape[0] // unit_height, image_shape[1] // unit_width)\n\n        holes = fdropout.generate_grid_holes(\n            image_shape,\n            grid,\n            self.ratio,\n            self.random_offset,\n            self.shift_xy,\n            self.random_generator,\n        )\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"ratio\",\n            \"unit_size_range\",\n            \"holes_number_xy\",\n            \"shift_xy\",\n            \"random_offset\",\n        )\n
"},{"location":"api_reference/augmentations/dropout/mask_dropout/","title":"MaskDropout augmentation (augmentations.dropout.mask_dropout)","text":""},{"location":"api_reference/augmentations/dropout/mask_dropout/#albumentations.augmentations.dropout.mask_dropout.MaskDropout","title":"class MaskDropout (max_objects=(1, 1), image_fill_value=None, mask_fill_value=None, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply dropout to random objects in a mask, zeroing out the corresponding regions in both the image and mask.

This transform identifies objects in the mask (where each unique non-zero value represents a distinct object), randomly selects a number of these objects, and sets their corresponding regions to zero in both the image and mask. It can also handle bounding boxes and keypoints, removing or adjusting them based on the dropout regions.

Parameters:

Name Type Description max_objects int | tuple[int, int]

Maximum number of objects to dropout. If a single int is provided, it's treated as the upper bound. If a tuple of two ints is provided, it's treated as a range [min, max].

fill float | str | Literal[\"inpaint\"]

Value to fill the dropped out regions in the image. If set to 'inpaint', it applies inpainting to the dropped out regions (works only for 3-channel images).

fill_mask float | int

Value to fill the dropped out regions in the mask.

min_area float

Minimum area (in pixels) of a bounding box that must remain visible after dropout to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0

min_visibility float

Minimum visibility ratio (visible area / total area) of a bounding box after dropout to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The mask should be a single-channel image where 0 represents the background and non-zero values represent different object instances.
  • For bounding box and keypoint augmentation, make sure to set up the corresponding processors in the pipeline.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>>\n>>> # Define a sample image, mask, and bounding boxes\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> mask = np.zeros((100, 100), dtype=np.uint8)\n>>> mask[20:40, 20:40] = 1  # Object 1\n>>> mask[60:80, 60:80] = 2  # Object 2\n>>> bboxes = np.array([[20, 20, 40, 40], [60, 60, 80, 80]])\n>>>\n>>> # Define the transform\n>>> transform = A.Compose([\n...     A.MaskDropout(max_objects=1, mask_fill_value=0, min_area=100, min_visibility=0.5, p=1.0),\n... ], bbox_params=A.BboxParams(format='pascal_voc', min_area=1, min_visibility=0.1))\n>>>\n>>> # Apply the transform\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes)\n>>>\n>>> # The result will have one of the objects dropped out in both image and mask,\n>>> # and the corresponding bounding box removed if it doesn't meet the area and visibility criteria\n

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/dropout/mask_dropout.py Python
class MaskDropout(DualTransform):\n    \"\"\"Apply dropout to random objects in a mask, zeroing out the corresponding regions in both the image and mask.\n\n    This transform identifies objects in the mask (where each unique non-zero value represents a distinct object),\n    randomly selects a number of these objects, and sets their corresponding regions to zero in both the image and mask.\n    It can also handle bounding boxes and keypoints, removing or adjusting them based on the dropout regions.\n\n    Args:\n        max_objects (int | tuple[int, int]): Maximum number of objects to dropout. If a single int is provided,\n            it's treated as the upper bound. If a tuple of two ints is provided, it's treated as a range [min, max].\n        fill (float | str | Literal[\"inpaint\"]): Value to fill the dropped out regions in the image.\n            If set to 'inpaint', it applies inpainting to the dropped out regions (works only for 3-channel images).\n        fill_mask (float | int): Value to fill the dropped out regions in the mask.\n        min_area (float): Minimum area (in pixels) of a bounding box that must remain visible after dropout to be kept.\n            Only applicable if bounding box augmentation is enabled. Default: 0.0\n        min_visibility (float): Minimum visibility ratio (visible area / total area) of a bounding box after dropout\n            to be kept. Only applicable if bounding box augmentation is enabled. Default: 0.0\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The mask should be a single-channel image where 0 represents the background and non-zero values represent\n          different object instances.\n        - For bounding box and keypoint augmentation, make sure to set up the corresponding processors in the pipeline.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>>\n        >>> # Define a sample image, mask, and bounding boxes\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> mask = np.zeros((100, 100), dtype=np.uint8)\n        >>> mask[20:40, 20:40] = 1  # Object 1\n        >>> mask[60:80, 60:80] = 2  # Object 2\n        >>> bboxes = np.array([[20, 20, 40, 40], [60, 60, 80, 80]])\n        >>>\n        >>> # Define the transform\n        >>> transform = A.Compose([\n        ...     A.MaskDropout(max_objects=1, mask_fill_value=0, min_area=100, min_visibility=0.5, p=1.0),\n        ... 
], bbox_params=A.BboxParams(format='pascal_voc', min_area=1, min_visibility=0.1))\n        >>>\n        >>> # Apply the transform\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes)\n        >>>\n        >>> # The result will have one of the objects dropped out in both image and mask,\n        >>> # and the corresponding bounding box removed if it doesn't meet the area and visibility criteria\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_objects: OnePlusIntRangeType\n\n        image_fill_value: float | Literal[\"inpaint\"] | None = Field(deprecated=\"Deprecated use fill instead\")\n        mask_fill_value: float | None = Field(deprecated=\"Deprecated use fill_mask instead\")\n\n        fill: float | Literal[\"inpaint\"]\n        fill_mask: float\n\n    def __init__(\n        self,\n        max_objects: ScaleIntType = (1, 1),\n        image_fill_value: float | Literal[\"inpaint\"] | None = None,\n        mask_fill_value: float | None = None,\n        fill: float | Literal[\"inpaint\"] = 0,\n        fill_mask: float = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_objects = cast(tuple[int, int], max_objects)\n        self.fill = fill  # type: ignore[assignment]\n        self.fill_mask = fill_mask\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [\"mask\"]\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        mask = data[\"mask\"]\n\n        label_image, num_labels = fdropout.label(mask, return_num=True)\n\n        if num_labels == 0:\n            dropout_mask = None\n        else:\n            objects_to_drop = self.py_random.randint(*self.max_objects)\n            objects_to_drop = min(num_labels, objects_to_drop)\n\n            if objects_to_drop == num_labels:\n                dropout_mask = mask > 0\n            else:\n                labels_index = self.py_random.sample(range(1, num_labels + 1), objects_to_drop)\n                dropout_mask = np.zeros(mask.shape[:2], dtype=bool)\n                for label_index in labels_index:\n                    dropout_mask |= label_image == label_index\n\n        return {\"dropout_mask\": dropout_mask}\n\n    def apply(self, img: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return img\n\n        if self.fill == \"inpaint\":\n            dropout_mask = dropout_mask.astype(np.uint8)\n            _, _, width, height = cv2.boundingRect(dropout_mask)\n            radius = min(3, max(width, height) // 2)\n            return cv2.inpaint(img, dropout_mask, radius, cv2.INPAINT_NS)\n\n        img = img.copy()\n        img[dropout_mask] = self.fill\n\n        return img\n\n    def apply_to_mask(self, mask: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return mask\n\n        mask = mask.copy()\n        mask[dropout_mask] = self.fill_mask\n        return mask\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return bboxes\n\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n\n        image_shape = params[\"shape\"][:2]\n\n        
denormalized_bboxes = denormalize_bboxes(bboxes, image_shape)\n\n        result = fdropout.mask_dropout_bboxes(\n            denormalized_bboxes,\n            dropout_mask,\n            image_shape,\n            processor.params.min_area,\n            processor.params.min_visibility,\n        )\n\n        return normalize_bboxes(result, image_shape)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, dropout_mask: np.ndarray | None, **params: Any) -> np.ndarray:\n        if dropout_mask is None:\n            return keypoints\n\n        processor = cast(KeypointsProcessor, self.get_processor(\"keypoints\"))\n\n        if processor is None or not processor.params.remove_invisible:\n            return keypoints\n\n        return fdropout.mask_dropout_keypoints(keypoints, dropout_mask)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_objects\", \"fill\", \"fill_mask\"\n
"},{"location":"api_reference/augmentations/dropout/xy_masking/","title":"XYMasking augmentation (augmentations.dropout.xy_masking)","text":""},{"location":"api_reference/augmentations/dropout/xy_masking/#albumentations.augmentations.dropout.xy_masking.XYMasking","title":"class XYMasking (num_masks_x=0, num_masks_y=0, mask_x_length=0, mask_y_length=0, fill_value=None, mask_fill_value=None, fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

Applies masking strips to an image, either horizontally (X axis) or vertically (Y axis), simulating occlusions. This transform is useful for training models to recognize images with varied visibility conditions. It's particularly effective for spectrogram images, allowing spectral and frequency masking to improve model robustness.

At least one of mask_x_length or mask_y_length must be specified, dictating the mask's maximum size along each axis.

Parameters:

Name Type Description num_masks_x int | tuple[int, int]

Number or range of horizontal regions to mask. Defaults to 0.

num_masks_y int | tuple[int, int]

Number or range of vertical regions to mask. Defaults to 0.

mask_x_length int | tuple[int, int]

Specifies the length of the masks along the X (horizontal) axis. If an integer is provided, it sets a fixed mask length. If a tuple of two integers (min, max) is provided, the mask length is randomly chosen within this range for each mask. This allows for variable-length masks in the horizontal direction.

mask_y_length int | tuple[int, int]

Specifies the height of the masks along the Y (vertical) axis. Similar to mask_x_length, an integer sets a fixed mask height, while a tuple (min, max) allows for variable-height masks, chosen randomly within the specified range for each mask. This flexibility facilitates creating masks of various sizes in the vertical direction.

fill ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]

Value for the dropped pixels. Can be:

  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel
  • 'random': each pixel is filled with random values
  • 'random_uniform': each hole is filled with a single random color
  • 'inpaint_telea': uses OpenCV Telea inpainting method
  • 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method

Default: 0

mask_fill_value ColorType | None

Fill value for dropout regions in the mask. If None, mask regions corresponding to image dropouts are unchanged. Default: None

p float

Probability of applying the transform. Defaults to 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note: Either mask_x_length or mask_y_length (or both) must be defined.
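
Examples:

A minimal usage sketch (not part of the original docstring; the parameter values below are illustrative and follow the signature shown above):

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
>>> # Cut 1-3 full-height strips (10-30 px wide) and 1-3 full-width strips (10-30 px tall)
>>> transform = A.XYMasking(
...     num_masks_x=(1, 3),
...     num_masks_y=(1, 3),
...     mask_x_length=(10, 30),
...     mask_y_length=(10, 30),
...     fill=0,
...     p=1.0,
... )
>>> masked = transform(image=image)["image"]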

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/dropout/xy_masking.py Python
class XYMasking(BaseDropout):\n    \"\"\"Applies masking strips to an image, either horizontally (X axis) or vertically (Y axis),\n    simulating occlusions. This transform is useful for training models to recognize images\n    with varied visibility conditions. It's particularly effective for spectrogram images,\n    allowing spectral and frequency masking to improve model robustness.\n\n    At least one of `max_x_length` or `max_y_length` must be specified, dictating the mask's\n    maximum size along each axis.\n\n    Args:\n        num_masks_x (int | tuple[int, int]): Number or range of horizontal regions to mask. Defaults to 0.\n        num_masks_y (int | tuple[int, int]): Number or range of vertical regions to mask. Defaults to 0.\n        mask_x_length (int | tuple[int, int]): Specifies the length of the masks along\n            the X (horizontal) axis. If an integer is provided, it sets a fixed mask length.\n            If a tuple of two integers (min, max) is provided,\n            the mask length is randomly chosen within this range for each mask.\n            This allows for variable-length masks in the horizontal direction.\n        mask_y_length (int | tuple[int, int]): Specifies the height of the masks along\n            the Y (vertical) axis. Similar to `mask_x_length`, an integer sets a fixed mask height,\n            while a tuple (min, max) allows for variable-height masks, chosen randomly\n            within the specified range for each mask. This flexibility facilitates creating masks of various\n            sizes in the vertical direction.\n        fill (ColorType | Literal[\"random\", \"random_uniform\", \"inpaint_telea\", \"inpaint_ns\"]):\n            Value for the dropped pixels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            - 'random': each pixel is filled with random values\n            - 'random_uniform': each hole is filled with a single random color\n            - 'inpaint_telea': uses OpenCV Telea inpainting method\n            - 'inpaint_ns': uses OpenCV Navier-Stokes inpainting method\n            Default: 0\n        mask_fill_value (ColorType | None): Fill value for dropout regions in the mask.\n            If None, mask regions corresponding to image dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. 
Defaults to 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note: Either `max_x_length` or `max_y_length` or both must be defined.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        num_masks_x: NonNegativeIntRangeType\n        num_masks_y: NonNegativeIntRangeType\n        mask_x_length: NonNegativeIntRangeType\n        mask_y_length: NonNegativeIntRangeType\n\n        fill_value: DropoutFillValue | None\n        mask_fill_value: ColorType | None\n\n        fill: DropoutFillValue\n        fill_mask: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def check_mask_length(self) -> Self:\n            if (\n                isinstance(self.mask_x_length, int)\n                and self.mask_x_length <= 0\n                and isinstance(self.mask_y_length, int)\n                and self.mask_y_length <= 0\n            ):\n                msg = \"At least one of `mask_x_length` or `mask_y_length` Should be a positive number.\"\n                raise ValueError(msg)\n\n            if self.fill_value is not None:\n                warn(\"fill_value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.fill_value\n\n            if self.mask_fill_value is not None:\n                warn(\"mask_fill_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_fill_value\n\n            return self\n\n    def __init__(\n        self,\n        num_masks_x: ScaleIntType = 0,\n        num_masks_y: ScaleIntType = 0,\n        mask_x_length: ScaleIntType = 0,\n        mask_y_length: ScaleIntType = 0,\n        fill_value: DropoutFillValue | None = None,\n        mask_fill_value: ColorType | None = None,\n        fill: DropoutFillValue = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, fill=fill, fill_mask=fill_mask)\n        self.num_masks_x = cast(tuple[int, int], num_masks_x)\n        self.num_masks_y = cast(tuple[int, int], num_masks_y)\n\n        self.mask_x_length = cast(tuple[int, int], mask_x_length)\n        self.mask_y_length = cast(tuple[int, int], mask_y_length)\n\n    def validate_mask_length(\n        self,\n        mask_length: tuple[int, int] | None,\n        dimension_size: int,\n        dimension_name: str,\n    ) -> None:\n        \"\"\"Validate the mask length against the corresponding image dimension size.\"\"\"\n        if mask_length is not None:\n            if isinstance(mask_length, (tuple, list)):\n                if mask_length[0] < 0 or mask_length[1] > dimension_size:\n                    raise ValueError(\n                        f\"{dimension_name} range {mask_length} is out of valid range [0, {dimension_size}]\",\n                    )\n            elif mask_length < 0 or mask_length > dimension_size:\n                raise ValueError(f\"{dimension_name} {mask_length} exceeds image {dimension_name} {dimension_size}\")\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        height, width = image_shape\n\n        self.validate_mask_length(self.mask_x_length, width, \"mask_x_length\")\n        self.validate_mask_length(self.mask_y_length, height, \"mask_y_length\")\n\n        masks_x = 
self.generate_masks(self.num_masks_x, image_shape, self.mask_x_length, axis=\"x\")\n        masks_y = self.generate_masks(self.num_masks_y, image_shape, self.mask_y_length, axis=\"y\")\n\n        holes = np.array(masks_x + masks_y)\n\n        return {\"holes\": holes, \"seed\": self.random_generator.integers(0, 2**32 - 1)}\n\n    def generate_mask_size(self, mask_length: tuple[int, int]) -> int:\n        return self.py_random.randint(*mask_length)\n\n    def generate_masks(\n        self,\n        num_masks: tuple[int, int],\n        image_shape: tuple[int, int],\n        max_length: tuple[int, int] | None,\n        axis: str,\n    ) -> list[tuple[int, int, int, int]]:\n        if max_length is None or max_length == 0 or (isinstance(num_masks, (int, float)) and num_masks == 0):\n            return []\n\n        masks = []\n        num_masks_integer = (\n            num_masks if isinstance(num_masks, int) else self.py_random.randint(num_masks[0], num_masks[1])\n        )\n\n        height, width = image_shape\n\n        for _ in range(num_masks_integer):\n            length = self.generate_mask_size(max_length)\n\n            if axis == \"x\":\n                x_min = self.py_random.randint(0, width - length)\n                y_min = 0\n                x_max, y_max = x_min + length, height\n            else:  # axis == 'y'\n                y_min = self.py_random.randint(0, height - length)\n                x_min = 0\n                x_max, y_max = width, y_min + length\n\n            masks.append((x_min, y_min, x_max, y_max))\n        return masks\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"num_masks_x\",\n            \"num_masks_y\",\n            \"mask_x_length\",\n            \"mask_y_length\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/","title":"Index","text":"
  • Geometric functional transforms (albumentations.augmentations.geometric.functional)
  • Resizing transforms (augmentations.geometric.resize)
  • Rotation transforms (augmentations.geometric.functional)
  • Geometric transforms (augmentations.geometric.transforms)
"},{"location":"api_reference/augmentations/geometric/functional/","title":"Geometric functional transforms (augmentations.geometric.functional)","text":""},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.adjust_padding_by_position","title":"def adjust_padding_by_position (h_top, h_bottom, w_left, w_right, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position.
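
A short illustration of how the total padding is redistributed (a sketch assuming the helper is imported from albumentations.augmentations.geometric.functional; py_random is only consulted for position="random"):

Python
>>> import random
>>> from albumentations.augmentations.geometric.functional import adjust_padding_by_position
>>> rng = random.Random(0)  # only used when position == "random"
>>> # "center" keeps the symmetric split: (h_top, h_bottom, w_left, w_right)
>>> adjust_padding_by_position(5, 5, 3, 3, "center", rng)
(5, 5, 3, 3)
>>> # "top_left" pushes all padding to the bottom and right edges
>>> adjust_padding_by_position(5, 5, 3, 3, "top_left", rng)
(0, 10, 0, 6)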

Source code in albumentations/augmentations/geometric/functional.py Python
def adjust_padding_by_position(\n    h_top: int,\n    h_bottom: int,\n    w_left: int,\n    w_right: int,\n    position: PositionType,\n    py_random: np.random.RandomState,\n) -> tuple[int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position.\"\"\"\n    if position == \"center\":\n        return h_top, h_bottom, w_left, w_right\n\n    if position == \"top_left\":\n        return 0, h_top + h_bottom, 0, w_left + w_right\n\n    if position == \"top_right\":\n        return 0, h_top + h_bottom, w_left + w_right, 0\n\n    if position == \"bottom_left\":\n        return h_top + h_bottom, 0, 0, w_left + w_right\n\n    if position == \"bottom_right\":\n        return h_top + h_bottom, 0, w_left + w_right, 0\n\n    if position == \"random\":\n        h_pad = h_top + h_bottom\n        w_pad = w_left + w_right\n        h_top = py_random.randint(0, h_pad)\n        h_bottom = h_pad - h_top\n        w_left = py_random.randint(0, w_pad)\n        w_right = w_pad - w_left\n        return h_top, h_bottom, w_left, w_right\n\n    raise ValueError(f\"Unknown position: {position}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.almost_equal_intervals","title":"def almost_equal_intervals (n, parts) [view source on GitHub]","text":"

Generates an array of nearly equal integer intervals that sum up to n.

This function divides n into the requested number of nearly equal parts. It ensures that the sum of all parts equals n and that the difference between any two parts is at most one. This is useful for distributing a total amount into nearly equal discrete parts.

Parameters:

Name Type Description n int

The total value to be split.

parts int

The number of parts to split into.

Returns:

Type Description np.ndarray

An array of integers where each integer represents the size of a part.

Examples:

Python
>>> almost_equal_intervals(20, 3)\narray([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n>>> almost_equal_intervals(16, 4)\narray([4, 4, 4, 4])  # Splits 16 into four equal parts\n
Source code in albumentations/augmentations/geometric/functional.py Python
def almost_equal_intervals(n: int, parts: int) -> np.ndarray:\n    \"\"\"Generates an array of nearly equal integer intervals that sum up to `n`.\n\n    This function divides the number `n` into `parts` nearly equal parts. It ensures that\n    the sum of all parts equals `n`, and the difference between any two parts is at most one.\n    This is useful for distributing a total amount into nearly equal discrete parts.\n\n    Args:\n        n (int): The total value to be split.\n        parts (int): The number of parts to split into.\n\n    Returns:\n        np.ndarray: An array of integers where each integer represents the size of a part.\n\n    Example:\n        >>> almost_equal_intervals(20, 3)\n        array([7, 7, 6])  # Splits 20 into three parts: 7, 7, and 6\n        >>> almost_equal_intervals(16, 4)\n        array([4, 4, 4, 4])  # Splits 16 into four equal parts\n    \"\"\"\n    part_size, remainder = divmod(n, parts)\n    # Create an array with the base part size and adjust the first `remainder` parts by adding 1\n    return np.array(\n        [part_size + 1 if i < remainder else part_size for i in range(parts)],\n    )\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.apply_affine_to_points","title":"def apply_affine_to_points (points, matrix) [view source on GitHub]","text":"

Apply affine transformation to a set of points.

This function handles potential division by zero by replacing zero values in the homogeneous coordinate with a small epsilon value.

Parameters:

Name Type Description points np.ndarray

Array of points with shape (N, 2).

matrix np.ndarray

3x3 affine transformation matrix.

Returns:

Type Description np.ndarray

Transformed points with shape (N, 2).
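
Examples:

A worked sketch (assuming the helper is imported from albumentations.augmentations.geometric.functional; the matrix below scales by 2 and translates by (5, 5)):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import apply_affine_to_points
>>> points = np.array([[0.0, 0.0], [10.0, 5.0]])
>>> matrix = np.array([[2.0, 0.0, 5.0],
...                    [0.0, 2.0, 5.0],
...                    [0.0, 0.0, 1.0]])
>>> apply_affine_to_points(points, matrix)
array([[ 5.,  5.],
       [25., 15.]])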

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"points\")\ndef apply_affine_to_points(points: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply affine transformation to a set of points.\n\n    This function handles potential division by zero by replacing zero values\n    in the homogeneous coordinate with a small epsilon value.\n\n    Args:\n        points (np.ndarray): Array of points with shape (N, 2).\n        matrix (np.ndarray): 3x3 affine transformation matrix.\n\n    Returns:\n        np.ndarray: Transformed points with shape (N, 2).\n    \"\"\"\n    homogeneous_points = np.column_stack([points, np.ones(points.shape[0])])\n    transformed_points = homogeneous_points @ matrix.T\n\n    # Handle potential division by zero\n    epsilon = np.finfo(transformed_points.dtype).eps\n    transformed_points[:, 2] = np.where(\n        np.abs(transformed_points[:, 2]) < epsilon,\n        np.sign(transformed_points[:, 2]) * epsilon,\n        transformed_points[:, 2],\n    )\n\n    return transformed_points[:, :2] / transformed_points[:, 2:]\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine","title":"def bboxes_affine (bboxes, matrix, rotate_method, image_shape, border_mode, output_shape) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes.

For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:

  1. Calculates necessary padding to avoid information loss
  2. Applies padding to the bounding boxes
  3. Adjusts the transformation matrix to account for padding
  4. Applies the affine transformation
  5. Validates the transformed bounding boxes

For other border modes, it directly applies the affine transformation without padding.

Parameters:

Name Type Description bboxes np.ndarray

Input bounding boxes

matrix np.ndarray

Affine transformation matrix

rotate_method str

Method for rotating bounding boxes ('largest_box' or 'ellipse')

image_shape Sequence[int]

Shape of the input image

border_mode int

OpenCV border mode

output_shape Sequence[int]

Shape of the output image

Returns:

Type Description np.ndarray

Transformed and normalized bounding boxes
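
Examples:

A minimal sketch (assuming the function is imported from albumentations.augmentations.geometric.functional; input boxes are in normalized [0, 1] coordinates, which the function denormalizes internally using image_shape):

Python
>>> import cv2
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_affine
>>> bboxes = np.array([[0.1, 0.1, 0.2, 0.2]])  # normalized (x_min, y_min, x_max, y_max)
>>> matrix = np.array([[1.0, 0.0, 10.0],       # translate by 10 px in x and y
...                    [0.0, 1.0, 10.0],
...                    [0.0, 0.0, 1.0]])
>>> out = bboxes_affine(
...     bboxes,
...     matrix,
...     rotate_method="largest_box",
...     image_shape=(100, 100),
...     border_mode=cv2.BORDER_CONSTANT,
...     output_shape=(100, 100),
... )
>>> # On a 100x100 image the box shifts by 0.1 in normalized units: roughly [[0.2, 0.2, 0.3, 0.3]]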

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine(\n    bboxes: np.ndarray,\n    matrix: np.ndarray,\n    rotate_method: Literal[\"largest_box\", \"ellipse\"],\n    image_shape: tuple[int, int],\n    border_mode: int,\n    output_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes.\n\n    For reflection border modes (cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT), this function:\n    1. Calculates necessary padding to avoid information loss\n    2. Applies padding to the bounding boxes\n    3. Adjusts the transformation matrix to account for padding\n    4. Applies the affine transformation\n    5. Validates the transformed bounding boxes\n\n    For other border modes, it directly applies the affine transformation without padding.\n\n    Args:\n        bboxes (np.ndarray): Input bounding boxes\n        matrix (np.ndarray): Affine transformation matrix\n        rotate_method (str): Method for rotating bounding boxes ('largest_box' or 'ellipse')\n        image_shape (Sequence[int]): Shape of the input image\n        border_mode (int): OpenCV border mode\n        output_shape (Sequence[int]): Shape of the output image\n\n    Returns:\n        np.ndarray: Transformed and normalized bounding boxes\n    \"\"\"\n    if is_identity_matrix(matrix):\n        return bboxes\n\n    bboxes = denormalize_bboxes(bboxes, image_shape)\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        bboxes = generate_reflected_bboxes(\n            bboxes,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Apply affine transform\n    if rotate_method == \"largest_box\":\n        transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n    elif rotate_method == \"ellipse\":\n        transformed_bboxes = bboxes_affine_ellipse(bboxes, matrix)\n    else:\n        raise ValueError(f\"Method {rotate_method} is not a valid rotation method.\")\n\n    # Validate and normalize bboxes\n    validated_bboxes = validate_bboxes(transformed_bboxes, output_shape)\n\n    return normalize_bboxes(validated_bboxes, output_shape)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine_ellipse","title":"def bboxes_affine_ellipse (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes using an ellipse approximation method.

This function transforms bounding boxes by approximating each box with an ellipse, transforming points along the ellipse's circumference, and then computing the new bounding box that encloses the transformed ellipse.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The ellipse approximation method can provide a tighter bounding box compared to the largest box method, especially for rotations.
  • 360 points are used to approximate each ellipse, which provides a good balance between accuracy and computational efficiency.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method may be more suitable for objects that are roughly elliptical in shape.
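
Examples:

A brief sketch contrasting the ellipse method with the largest-box method for a 45-degree rotation (assuming both helpers are imported from albumentations.augmentations.geometric.functional; boxes are in pixel coordinates, as used internally by bboxes_affine):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import (
...     bboxes_affine_ellipse,
...     bboxes_affine_largest_box,
... )
>>> theta = np.deg2rad(45)
>>> c, s = np.cos(theta), np.sin(theta)
>>> cx = cy = 50.0
>>> # 3x3 matrix for a 45-degree rotation about the point (50, 50)
>>> matrix = np.array([[c, -s, cx - c * cx + s * cy],
...                    [s,  c, cy - s * cx - c * cy],
...                    [0.0, 0.0, 1.0]])
>>> bboxes = np.array([[40.0, 40.0, 60.0, 60.0]])
>>> tight = bboxes_affine_ellipse(bboxes, matrix)      # stays close to [[40, 40, 60, 60]]
>>> loose = bboxes_affine_largest_box(bboxes, matrix)  # grows to roughly [[35.9, 35.9, 64.1, 64.1]]
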
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_ellipse(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes using an ellipse approximation method.\n\n    This function transforms bounding boxes by approximating each box with an ellipse,\n    transforming points along the ellipse's circumference, and then computing the\n    new bounding box that encloses the transformed ellipse.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The ellipse approximation method can provide a tighter bounding box compared to the\n          largest box method, especially for rotations.\n        - 360 points are used to approximate each ellipse, which provides a good balance between\n          accuracy and computational efficiency.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method may be more suitable for objects that are roughly elliptical in shape.\n    \"\"\"\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n    bbox_width = (x_max - x_min) / 2\n    bbox_height = (y_max - y_min) / 2\n    center_x = x_min + bbox_width\n    center_y = y_min + bbox_height\n\n    angles = np.arange(0, 360, dtype=np.float32)\n    cos_angles = np.cos(np.radians(angles))\n    sin_angles = np.sin(np.radians(angles))\n\n    # Generate points for all ellipses at once\n    x = bbox_width[:, np.newaxis] * sin_angles + center_x[:, np.newaxis]\n    y = bbox_height[:, np.newaxis] * cos_angles + center_y[:, np.newaxis]\n    points = np.stack([x, y], axis=-1).reshape(-1, 2)\n\n    # Transform all points at once using the helper function\n    transformed_points = apply_affine_to_points(points, matrix)\n\n    transformed_points = transformed_points.reshape(len(bboxes), -1, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_points[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_points[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_points[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_points[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_affine_largest_box","title":"def bboxes_affine_largest_box (bboxes, matrix) [view source on GitHub]","text":"

Apply an affine transformation to bounding boxes and return the largest enclosing boxes.

This function transforms each corner of every bounding box using the given affine transformation matrix, then computes the new bounding boxes that fully enclose the transformed corners.

Parameters:

Name Type Description bboxes np.ndarray

An array of bounding boxes with shape (N, 4+) where N is the number of bounding boxes. Each row should contain [x_min, y_min, x_max, y_max] followed by any additional attributes (e.g., class labels).

matrix np.ndarray

The 3x3 affine transformation matrix to apply.

Returns:

Type Description np.ndarray

An array of transformed bounding boxes with the same shape as the input. Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by any additional attributes from the input bounding boxes.

Note

  • This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].
  • The resulting bounding boxes are the smallest axis-aligned boxes that completely enclose the transformed original boxes. They may be larger than the minimal possible bounding box if the original box becomes rotated.
  • Any additional attributes beyond the first 4 coordinates are preserved unchanged.
  • This method is called \"largest box\" because it returns the largest axis-aligned box that encloses all corners of the transformed bounding box.

Examples:

Python
>>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n>>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n>>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n>>> print(transformed_bboxes)\n[[ 25.  25.  45.  45.   1.]\n [ 65.  65.  85.  85.   2.]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_affine_largest_box(bboxes: np.ndarray, matrix: np.ndarray) -> np.ndarray:\n    \"\"\"Apply an affine transformation to bounding boxes and return the largest enclosing boxes.\n\n    This function transforms each corner of every bounding box using the given affine transformation\n    matrix, then computes the new bounding boxes that fully enclose the transformed corners.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (N, 4+) where N is the number of\n                             bounding boxes. Each row should contain [x_min, y_min, x_max, y_max]\n                             followed by any additional attributes (e.g., class labels).\n        matrix (np.ndarray): The 3x3 affine transformation matrix to apply.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as the input.\n                    Each row contains [new_x_min, new_y_min, new_x_max, new_y_max] followed by\n                    any additional attributes from the input bounding boxes.\n\n    Note:\n        - This function assumes that the input bounding boxes are in the format [x_min, y_min, x_max, y_max].\n        - The resulting bounding boxes are the smallest axis-aligned boxes that completely\n          enclose the transformed original boxes. They may be larger than the minimal possible\n          bounding box if the original box becomes rotated.\n        - Any additional attributes beyond the first 4 coordinates are preserved unchanged.\n        - This method is called \"largest box\" because it returns the largest axis-aligned box\n          that encloses all corners of the transformed bounding box.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 20, 20, 1], [30, 30, 40, 40, 2]])  # Two boxes with class labels\n        >>> matrix = np.array([[2, 0, 5], [0, 2, 5], [0, 0, 1]])  # Scale by 2 and translate by (5, 5)\n        >>> transformed_bboxes = bboxes_affine_largest_box(bboxes, matrix)\n        >>> print(transformed_bboxes)\n        [[ 25.  25.  45.  45.   1.]\n         [ 65.  65.  85.  85.   2.]]\n    \"\"\"\n    # Extract corners of all bboxes\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    corners = (\n        np.array([[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]).transpose(2, 0, 1).reshape(-1, 2)\n    )\n\n    # Transform all corners at once\n    transformed_corners = apply_affine_to_points(corners, matrix).reshape(-1, 4, 2)\n\n    # Compute new bounding boxes\n    new_x_min = np.min(transformed_corners[:, :, 0], axis=1)\n    new_x_max = np.max(transformed_corners[:, :, 0], axis=1)\n    new_y_min = np.min(transformed_corners[:, :, 1], axis=1)\n    new_y_max = np.max(transformed_corners[:, :, 1], axis=1)\n\n    return np.column_stack([new_x_min, new_y_min, new_x_max, new_y_max, bboxes[:, 4:]])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_d4","title":"def bboxes_d4 (bboxes, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to bounding boxes.

The function transforms bounding boxes according to the specified group member of the D_4 group. These transformations include rotations and reflections and operate on bounding boxes given in normalized coordinates.

Parameters:

  • bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.

Returns:

  • np.ndarray: The transformed bounding boxes, with the same shape as the input.

Raises:

  • ValueError: If an invalid group member is specified.

Examples:

  • Applying a 90-degree rotation: bboxes_d4(bboxes, 'r90') rotates the bounding boxes 90 degrees counter-clockwise; see the runnable sketch below.
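
A runnable sketch (assuming the function is imported from albumentations.augmentations.geometric.functional; boxes are in normalized coordinates):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_d4
>>> bboxes = np.array([[0.1, 0.2, 0.3, 0.4]])
>>> bboxes_d4(bboxes, "r90")  # rotate 90 degrees counter-clockwise
array([[0.2, 0.7, 0.4, 0.9]])
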
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_d4(\n    bboxes: np.ndarray,\n    group_member: D4Type,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a bounding box.\n\n    The function transforms a bounding box according to the specified group member from the `D_4` group.\n    These transformations include rotations and reflections, specified to work on an image's bounding box given\n    its dimensions.\n\n    Parameters:\n    -  bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n    - group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.\n\n    Returns:\n    - BoxInternalType: The transformed bounding box.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Applying a 90-degree rotation:\n      `bbox_d4((10, 20, 110, 120), 'r90')`\n      This would rotate the bounding box 90 degrees within a 100x100 image.\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: bboxes_rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: bboxes_rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: bboxes_rot90(x, 3),  # Rotate 270 degrees\n        \"v\": lambda x: bboxes_vflip(x),  # Vertical flip\n        \"hvt\": lambda x: bboxes_transpose(\n            bboxes_rot90(x, 2),\n        ),  # Reflect over anti-diagonal\n        \"h\": lambda x: bboxes_hflip(x),  # Horizontal flip\n        \"t\": lambda x: bboxes_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](bboxes)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_grid_shuffle","title":"def bboxes_grid_shuffle (bboxes, tiles, mapping, image_shape, min_area, min_visibility) [view source on GitHub]","text":"

Apply grid shuffle transformation to bounding boxes.

This function transforms bounding boxes according to a grid shuffle operation. It handles cases where bounding boxes may be split into multiple components after shuffling and applies filtering based on minimum area and visibility requirements.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4+) where N is the number of boxes. Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents optional additional fields (e.g., class_id, score).

tiles np.ndarray

Array of tile coordinates with shape (M, 4) where M is the number of tiles. Each tile is in format [start_y, start_x, end_y, end_x].

mapping list[int]

List of indices defining how tiles should be rearranged. Each index i in the list contains the index of the tile that should be moved to position i.

image_shape tuple[int, int]

Shape of the image as (height, width).

min_area float

Minimum area threshold in pixels. If a component's area after shuffling is smaller than this value, it will be filtered out. If None, no area filtering is applied.

min_visibility float

Minimum visibility ratio threshold in range [0, 1]. Calculated as (component_area / original_area). If a component's visibility is lower than this value, it will be filtered out. If None, no visibility filtering is applied.

Returns:

Type Description np.ndarray

Array of transformed bounding boxes with shape (K, 4+) where K is the number of valid components after shuffling and filtering. The format of each box matches the input format, preserving any additional fields. If no valid components remain after filtering, returns an empty array with shape (0, C) where C matches the input column count.

Note

  • The function converts bboxes to masks before applying the transformation to handle cases where boxes may be split into multiple components.
  • After shuffling, each component is validated against min_area and min_visibility requirements independently.
  • Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and copied to all components derived from the same original bbox.
  • Empty input arrays are handled gracefully and return empty arrays of the appropriate shape.

Examples:

Python
>>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n>>> tiles = np.array([\n...     [0, 0, 50, 50],    # top-left tile\n...     [0, 50, 50, 100],  # top-right tile\n...     [50, 0, 100, 50],  # bottom-left tile\n...     [50, 50, 100, 100] # bottom-right tile\n... ])\n>>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n>>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n>>> # Result may contain multiple boxes if the original box was split\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_grid_shuffle(\n    bboxes: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int],\n    image_shape: tuple[int, int],\n    min_area: float,\n    min_visibility: float,\n) -> np.ndarray:\n    \"\"\"Apply grid shuffle transformation to bounding boxes.\n\n    This function transforms bounding boxes according to a grid shuffle operation. It handles cases\n    where bounding boxes may be split into multiple components after shuffling and applies\n    filtering based on minimum area and visibility requirements.\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4+) where N is the number of boxes.\n               Each box is in format [x_min, y_min, x_max, y_max, ...], where ... represents\n               optional additional fields (e.g., class_id, score).\n        tiles: Array of tile coordinates with shape (M, 4) where M is the number of tiles.\n               Each tile is in format [start_y, start_x, end_y, end_x].\n        mapping: List of indices defining how tiles should be rearranged. Each index i in the list\n                contains the index of the tile that should be moved to position i.\n        image_shape: Shape of the image as (height, width).\n        min_area: Minimum area threshold in pixels. If a component's area after shuffling is\n                 smaller than this value, it will be filtered out. If None, no area filtering\n                 is applied.\n        min_visibility: Minimum visibility ratio threshold in range [0, 1]. Calculated as\n                       (component_area / original_area). If a component's visibility is lower\n                       than this value, it will be filtered out. If None, no visibility\n                       filtering is applied.\n\n    Returns:\n        np.ndarray: Array of transformed bounding boxes with shape (K, 4+) where K is the\n                   number of valid components after shuffling and filtering. The format of\n                   each box matches the input format, preserving any additional fields.\n                   If no valid components remain after filtering, returns an empty array\n                   with shape (0, C) where C matches the input column count.\n\n    Note:\n        - The function converts bboxes to masks before applying the transformation to handle\n          cases where boxes may be split into multiple components.\n        - After shuffling, each component is validated against min_area and min_visibility\n          requirements independently.\n        - Additional bbox fields (beyond x_min, y_min, x_max, y_max) are preserved and\n          copied to all components derived from the same original bbox.\n        - Empty input arrays are handled gracefully and return empty arrays of the\n          appropriate shape.\n\n    Example:\n        >>> bboxes = np.array([[10, 10, 90, 90]])  # Single box crossing multiple tiles\n        >>> tiles = np.array([\n        ...     [0, 0, 50, 50],    # top-left tile\n        ...     [0, 50, 50, 100],  # top-right tile\n        ...     [50, 0, 100, 50],  # bottom-left tile\n        ...     [50, 50, 100, 100] # bottom-right tile\n        ... 
])\n        >>> mapping = [3, 2, 1, 0]  # Rotate tiles counter-clockwise\n        >>> result = bboxes_grid_shuffle(bboxes, tiles, mapping, (100, 100), 100, 0.2)\n        >>> # Result may contain multiple boxes if the original box was split\n    \"\"\"\n    # Convert bboxes to masks\n    masks = masks_from_bboxes(bboxes, image_shape)\n\n    # Apply grid shuffle to each mask and handle split components\n    all_component_masks = []\n    extra_bbox_data = []  # Store additional bbox data for each component\n\n    for idx, mask in enumerate(masks):\n        original_area = np.sum(mask)  # Get original mask area\n\n        # Shuffle the mask\n        shuffled_mask = swap_tiles_on_image(mask, tiles, mapping)\n\n        # Find connected components\n        num_components, components = cv2.connectedComponents(\n            shuffled_mask.astype(np.uint8),\n        )\n\n        # For each component, create a separate binary mask\n        for comp_idx in range(1, num_components):  # Skip background (0)\n            component_mask = (components == comp_idx).astype(np.uint8)\n\n            # Calculate area and visibility ratio\n            component_area = np.sum(component_mask)\n            # Check if component meets minimum requirements\n            if is_valid_component(\n                component_area,\n                original_area,\n                min_area,\n                min_visibility,\n            ):\n                all_component_masks.append(component_mask)\n                # Append additional bbox data for this component\n                if bboxes.shape[1] > NUM_BBOXES_COLUMNS_IN_ALBUMENTATIONS:\n                    extra_bbox_data.append(bboxes[idx, 4:])\n\n    # Convert all component masks to bboxes\n    if all_component_masks:\n        all_component_masks = np.array(all_component_masks)\n        shuffled_bboxes = bboxes_from_masks(all_component_masks)\n\n        # Add back additional bbox data if present\n        if extra_bbox_data:\n            extra_bbox_data = np.array(extra_bbox_data)\n            return np.column_stack([shuffled_bboxes, extra_bbox_data])\n    else:\n        # Handle case where no valid components were found\n        return np.zeros((0, bboxes.shape[1]), dtype=bboxes.dtype)\n\n    return shuffled_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_hflip","title":"def bboxes_hflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes horizontally around the y-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of horizontally flipped bounding boxes with the same shape as input.
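
Examples:

A quick sketch on normalized coordinates (assuming the helper is imported from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_hflip
>>> bboxes_hflip(np.array([[0.1, 0.2, 0.3, 0.4]]))
array([[0.7, 0.2, 0.9, 0.4]])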

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_hflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally around the y-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of horizontally flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 0] = 1 - bboxes[:, 2]  # new x_min = 1 - x_max\n    flipped_bboxes[:, 2] = 1 - bboxes[:, 0]  # new x_max = 1 - x_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_rot90","title":"def bboxes_rot90 (bboxes, factor) [view source on GitHub]","text":"

Rotates bounding boxes by 90 degrees CCW (see np.rot90)

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

factor Literal[0, 1, 2, 3]

Number of CCW rotations. Must be in the set {0, 1, 2, 3}. See np.rot90.

Returns:

Type Description np.ndarray

A numpy array of rotated bounding boxes with the same shape as input.
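
Examples:

A quick sketch on normalized coordinates (assuming the helper is imported from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_rot90
>>> bboxes_rot90(np.array([[0.1, 0.2, 0.3, 0.4]]), factor=3)  # 270 degrees CCW
array([[0.6, 0.1, 0.8, 0.3]])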

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_rot90(bboxes: np.ndarray, factor: Literal[0, 1, 2, 3]) -> np.ndarray:\n    \"\"\"Rotates bounding boxes by 90 degrees CCW (see np.rot90)\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n        factor: Number of CCW rotations. Must be in set {0, 1, 2, 3} See np.rot90.\n\n    Returns:\n        np.ndarray: A numpy array of rotated bounding boxes with the same shape as input.\n    \"\"\"\n    if factor == 0:\n        return bboxes\n\n    rotated_bboxes = bboxes.copy()\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    if factor == 1:\n        rotated_bboxes[:, 0] = y_min\n        rotated_bboxes[:, 1] = 1 - x_max\n        rotated_bboxes[:, 2] = y_max\n        rotated_bboxes[:, 3] = 1 - x_min\n    elif factor == ROT90_180_FACTOR:\n        rotated_bboxes[:, 0] = 1 - x_max\n        rotated_bboxes[:, 1] = 1 - y_max\n        rotated_bboxes[:, 2] = 1 - x_min\n        rotated_bboxes[:, 3] = 1 - y_min\n    elif factor == ROT90_270_FACTOR:\n        rotated_bboxes[:, 0] = 1 - y_max\n        rotated_bboxes[:, 1] = x_min\n        rotated_bboxes[:, 2] = 1 - y_min\n        rotated_bboxes[:, 3] = x_max\n\n    return rotated_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_transpose","title":"def bboxes_transpose (bboxes) [view source on GitHub]","text":"

Transpose bounding boxes by swapping x and y coordinates.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of transposed bounding boxes with the same shape as input.
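
Examples:

A quick sketch on normalized coordinates (assuming the helper is imported from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_transpose
>>> bboxes_transpose(np.array([[0.1, 0.2, 0.3, 0.4]]))
array([[0.2, 0.1, 0.4, 0.3]])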

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_transpose(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Transpose bounding boxes by swapping x and y coordinates.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of transposed bounding boxes with the same shape as input.\n    \"\"\"\n    transposed_bboxes = bboxes.copy()\n    transposed_bboxes[:, [0, 1, 2, 3]] = bboxes[:, [1, 0, 3, 2]]\n\n    return transposed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.bboxes_vflip","title":"def bboxes_vflip (bboxes) [view source on GitHub]","text":"

Flip bounding boxes vertically around the x-axis.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).

Returns:

Type Description np.ndarray

A numpy array of vertically flipped bounding boxes with the same shape as input.
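
Examples:

A quick sketch on normalized coordinates (assuming the helper is imported from albumentations.augmentations.geometric.functional):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import bboxes_vflip
>>> bboxes_vflip(np.array([[0.1, 0.2, 0.3, 0.4]]))
array([[0.1, 0.6, 0.3, 0.8]])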

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef bboxes_vflip(bboxes: np.ndarray) -> np.ndarray:\n    \"\"\"Flip bounding boxes vertically around the x-axis.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n                Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n\n    Returns:\n        np.ndarray: A numpy array of vertically flipped bounding boxes with the same shape as input.\n    \"\"\"\n    flipped_bboxes = bboxes.copy()\n    flipped_bboxes[:, 1] = 1 - bboxes[:, 3]  # new y_min = 1 - y_max\n    flipped_bboxes[:, 3] = 1 - bboxes[:, 1]  # new y_max = 1 - y_min\n\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.calculate_affine_transform_padding","title":"def calculate_affine_transform_padding (matrix, image_shape) [view source on GitHub]","text":"

Calculate the necessary padding for an affine transformation to avoid empty spaces.

Source code in albumentations/augmentations/geometric/functional.py Python
def calculate_affine_transform_padding(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate the necessary padding for an affine transformation to avoid empty spaces.\"\"\"\n    height, width = image_shape[:2]\n\n    # Check for identity transform\n    if is_identity_matrix(matrix):\n        return (0, 0, 0, 0)\n\n    # Original corners\n    corners = np.array([[0, 0], [width, 0], [width, height], [0, height]])\n\n    # Transform corners\n    transformed_corners = apply_affine_to_points(corners, matrix)\n\n    # Ensure transformed_corners is 2D\n    transformed_corners = transformed_corners.reshape(-1, 2)\n\n    # Find box that includes both original and transformed corners\n    all_corners = np.vstack((corners, transformed_corners))\n    min_x, min_y = all_corners.min(axis=0)\n    max_x, max_y = all_corners.max(axis=0)\n\n    # Compute the inverse transform\n    inverse_matrix = np.linalg.inv(matrix)\n\n    # Apply inverse transform to all corners of the bounding box\n    bbox_corners = np.array(\n        [[min_x, min_y], [max_x, min_y], [max_x, max_y], [min_x, max_y]],\n    )\n    inverse_corners = apply_affine_to_points(bbox_corners, inverse_matrix).reshape(\n        -1,\n        2,\n    )\n\n    min_x, min_y = inverse_corners.min(axis=0)\n    max_x, max_y = inverse_corners.max(axis=0)\n\n    pad_left = max(0, math.ceil(0 - min_x))\n    pad_right = max(0, math.ceil(max_x - width))\n    pad_top = max(0, math.ceil(0 - min_y))\n    pad_bottom = max(0, math.ceil(max_y - height))\n\n    return pad_left, pad_right, pad_top, pad_bottom\n
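An illustrative call with a hypothetical pure-translation matrix; the returned tuple is (pad_left, pad_right, pad_top, pad_bottom), and for a +50 px shift along x only the left side should need padding:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import (
...     calculate_affine_transform_padding,
... )
>>> matrix = np.array([[1.0, 0.0, 50.0],
...                    [0.0, 1.0, 0.0],
...                    [0.0, 0.0, 1.0]])  # shift content 50 px to the right
>>> calculate_affine_transform_padding(matrix, image_shape=(100, 100))
(50, 0, 0, 0)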
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.center","title":"def center (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image. Used by images, masks and keypoints.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates if image. Used by images, masks and keypoints.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2 - 0.5, height / 2 - 0.5\n
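A minimal sketch; note the half-pixel offset, which matches the pixel-grid convention used for images, masks and keypoints:

Python
>>> from albumentations.augmentations.geometric.functional import center
>>> center((100, 200))  # image_shape is (height, width); result is (center_x, center_y)
(99.5, 49.5)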
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.center_bbox","title":"def center_bbox (image_shape) [view source on GitHub]","text":"

Calculate the center coordinates of the image for bounding boxes.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image.

Returns:

Type Description tuple[float, float]

center_x, center_y

Source code in albumentations/augmentations/geometric/functional.py Python
def center_bbox(image_shape: tuple[int, int]) -> tuple[float, float]:\n    \"\"\"Calculate the center coordinates for of image for bounding boxes.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image.\n\n    Returns:\n        tuple[float, float]: center_x, center_y\n    \"\"\"\n    height, width = image_shape[:2]\n    return width / 2, height / 2\n
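A minimal sketch; unlike center, no half-pixel offset is applied, matching the continuous-coordinate convention used for bounding boxes:

Python
>>> from albumentations.augmentations.geometric.functional import center_bbox
>>> center_bbox((100, 200))
(100.0, 50.0)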
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.compute_tps_weights","title":"def compute_tps_weights (src_points, dst_points) [view source on GitHub]","text":"

Compute Thin Plate Spline weights.

Parameters:

Name Type Description src_points np.ndarray

Source control points with shape (num_points, 2)

dst_points np.ndarray

Destination control points with shape (num_points, 2)

Returns:

Type Description tuple of
  • nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)
  • affine_weights: Weights for affine transformation (3, 2) [constant term, x scale/shear, y scale/shear]

Note

The TPS interpolation is decomposed into: 1. Nonlinear part (controlled by kernel weights) 2. Affine part (global scaling, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_tps_weights(\n    src_points: np.ndarray,\n    dst_points: np.ndarray,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute Thin Plate Spline weights.\n\n    Args:\n        src_points: Source control points with shape (num_points, 2)\n        dst_points: Destination control points with shape (num_points, 2)\n\n    Returns:\n        tuple of:\n        - nonlinear_weights: TPS kernel weights for nonlinear deformation (num_points, 2)\n        - affine_weights: Weights for affine transformation (3, 2)\n            [constant term, x scale/shear, y scale/shear]\n\n    Note:\n        The TPS interpolation is decomposed into:\n        1. Nonlinear part (controlled by kernel weights)\n        2. Affine part (global scaling, rotation, translation)\n    \"\"\"\n    num_points = src_points.shape[0]\n\n    # Compute pairwise distances\n    distances = np.linalg.norm(src_points[:, None] - src_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    # Add small epsilon to avoid log(0)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Construct affine terms matrix [1, x, y]\n    affine_terms = np.ones((num_points, 3))\n    affine_terms[:, 1:] = src_points\n\n    # Build system matrix\n    system_matrix = np.zeros((num_points + 3, num_points + 3))\n    system_matrix[:num_points, :num_points] = kernel_matrix\n    system_matrix[:num_points, num_points:] = affine_terms\n    system_matrix[num_points:, :num_points] = affine_terms.T\n\n    # Right-hand side of the system\n    target_coords = np.zeros((num_points + 3, 2))\n    target_coords[:num_points] = dst_points\n\n    # Solve the system for both x and y coordinates\n    all_weights = np.linalg.solve(system_matrix, target_coords)\n\n    # Split weights into nonlinear and affine components\n    nonlinear_weights = all_weights[:num_points]\n    affine_weights = all_weights[num_points:]\n\n    return nonlinear_weights, affine_weights\n
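An illustrative call with four control points on the unit square and a hypothetical uniform shift of the destination points; only the returned shapes are shown:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import compute_tps_weights
>>> src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
>>> dst = src + 0.1  # move every control point by (0.1, 0.1)
>>> nonlinear_weights, affine_weights = compute_tps_weights(src, dst)
>>> nonlinear_weights.shape, affine_weights.shape
((4, 2), (3, 2))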
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.compute_transformed_image_bounds","title":"def compute_transformed_image_bounds (matrix, image_shape) [view source on GitHub]","text":"

Compute the bounds of an image after applying an affine transformation.

Parameters:

Name Type Description matrix np.ndarray

The 3x3 affine transformation matrix.

image_shape Tuple[int, int]

The shape of the image as (height, width).

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing: - min_coords: An array with the minimum x and y coordinates. - max_coords: An array with the maximum x and y coordinates.

Source code in albumentations/augmentations/geometric/functional.py Python
def compute_transformed_image_bounds(\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Compute the bounds of an image after applying an affine transformation.\n\n    Args:\n        matrix (np.ndarray): The 3x3 affine transformation matrix.\n        image_shape (Tuple[int, int]): The shape of the image as (height, width).\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing:\n            - min_coords: An array with the minimum x and y coordinates.\n            - max_coords: An array with the maximum x and y coordinates.\n    \"\"\"\n    height, width = image_shape[:2]\n\n    # Define the corners of the image\n    corners = np.array([[0, 0, 1], [width, 0, 1], [width, height, 1], [0, height, 1]])\n\n    # Transform the corners\n    transformed_corners = corners @ matrix.T\n    transformed_corners = transformed_corners[:, :2] / transformed_corners[:, 2:]\n\n    # Calculate the bounding box of the transformed corners\n    min_coords = np.floor(transformed_corners.min(axis=0)).astype(int)\n    max_coords = np.ceil(transformed_corners.max(axis=0)).astype(int)\n\n    return min_coords, max_coords\n
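An illustrative call with a hypothetical pure-translation matrix; the bounds simply shift by the translation amount:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import (
...     compute_transformed_image_bounds,
... )
>>> matrix = np.array([[1.0, 0.0, 50.0],
...                    [0.0, 1.0, 0.0],
...                    [0.0, 0.0, 1.0]])
>>> min_coords, max_coords = compute_transformed_image_bounds(matrix, (100, 200))
>>> min_coords.tolist(), max_coords.tolist()
([50, 0], [250, 100])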
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_affine_transformation_matrix","title":"def create_affine_transformation_matrix (translate, shear, scale, rotate, shift) [view source on GitHub]","text":"

Create an affine transformation matrix combining translation, shear, scale, and rotation.

Parameters:

Name Type Description translate dict[str, float]

Translation in x and y directions.

shear dict[str, float]

Shear in x and y directions (in degrees).

scale dict[str, float]

Scale factors for x and y directions.

rotate float

Rotation angle in degrees.

shift tuple[float, float]

Shift to apply before and after transformations.

Returns:

Type Description np.ndarray

The resulting 3x3 affine transformation matrix.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_affine_transformation_matrix(\n    translate: XYInt,\n    shear: XYFloat,\n    scale: XYFloat,\n    rotate: float,\n    shift: tuple[float, float],\n) -> np.ndarray:\n    \"\"\"Create an affine transformation matrix combining translation, shear, scale, and rotation.\n\n    Args:\n        translate (dict[str, float]): Translation in x and y directions.\n        shear (dict[str, float]): Shear in x and y directions (in degrees).\n        scale (dict[str, float]): Scale factors for x and y directions.\n        rotate (float): Rotation angle in degrees.\n        shift (tuple[float, float]): Shift to apply before and after transformations.\n\n    Returns:\n        np.ndarray: The resulting 3x3 affine transformation matrix.\n    \"\"\"\n    # Convert angles to radians\n    rotate_rad = np.deg2rad(rotate % 360)\n\n    shear_x_rad = np.deg2rad(shear[\"x\"])\n    shear_y_rad = np.deg2rad(shear[\"y\"])\n\n    # Create individual transformation matrices\n    # 1. Shift to top-left\n    m_shift_topleft = np.array([[1, 0, -shift[0]], [0, 1, -shift[1]], [0, 0, 1]])\n\n    # 2. Scale\n    m_scale = np.array([[scale[\"x\"], 0, 0], [0, scale[\"y\"], 0], [0, 0, 1]])\n\n    # 3. Rotation\n    m_rotate = np.array(\n        [\n            [np.cos(rotate_rad), np.sin(rotate_rad), 0],\n            [-np.sin(rotate_rad), np.cos(rotate_rad), 0],\n            [0, 0, 1],\n        ],\n    )\n\n    # 4. Shear\n    m_shear = np.array(\n        [[1, np.tan(shear_x_rad), 0], [np.tan(shear_y_rad), 1, 0], [0, 0, 1]],\n    )\n\n    # 5. Translation\n    m_translate = np.array([[1, 0, translate[\"x\"]], [0, 1, translate[\"y\"]], [0, 0, 1]])\n\n    # 6. Shift back to center\n    m_shift_center = np.array([[1, 0, shift[0]], [0, 1, shift[1]], [0, 0, 1]])\n\n    # Combine all transformations\n    # The order is important: transformations are applied from right to left\n    m = m_shift_center @ m_translate @ m_shear @ m_rotate @ m_scale @ m_shift_topleft\n\n    # Ensure the last row is exactly [0, 0, 1]\n    m[2] = [0, 0, 1]\n\n    return m\n
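A minimal sketch; the x/y components are passed as mappings with "x" and "y" keys, and with zero shear/rotation and unit scale the result is a plain translation matrix:

Python
>>> from albumentations.augmentations.geometric.functional import (
...     create_affine_transformation_matrix,
... )
>>> matrix = create_affine_transformation_matrix(
...     translate={"x": 10, "y": 5},
...     shear={"x": 0.0, "y": 0.0},
...     scale={"x": 1.0, "y": 1.0},
...     rotate=0.0,
...     shift=(0.0, 0.0),
... )
>>> matrix.shape  # a 3x3 homogeneous matrix; here it only translates by (10, 5)
(3, 3)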
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_piecewise_affine_maps","title":"def create_piecewise_affine_maps (image_shape, grid, scale, absolute_scale, random_generator) [view source on GitHub]","text":"

Create maps for piecewise affine transformation using OpenCV's remap function.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_piecewise_affine_maps(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    scale: float,\n    absolute_scale: bool,\n    random_generator: np.random.Generator,\n) -> tuple[np.ndarray | None, np.ndarray | None]:\n    \"\"\"Create maps for piecewise affine transformation using OpenCV's remap function.\"\"\"\n    height, width = image_shape[:2]\n    nb_rows, nb_cols = grid\n\n    # Input validation\n    if height <= 0 or width <= 0 or nb_rows <= 0 or nb_cols <= 0:\n        raise ValueError(\"Dimensions must be positive\")\n    if scale <= 0:\n        return None, None\n\n    # Create source points grid\n    y = np.linspace(0, height - 1, nb_rows, dtype=np.float32)\n    x = np.linspace(0, width - 1, nb_cols, dtype=np.float32)\n    xx_src, yy_src = np.meshgrid(x, y)\n\n    # Initialize destination maps at full resolution\n    map_x = np.zeros((height, width), dtype=np.float32)\n    map_y = np.zeros((height, width), dtype=np.float32)\n\n    # Generate jitter for control points\n    jitter_scale = scale / 3 if absolute_scale else scale * min(width, height) / 3\n\n    jitter = random_generator.normal(0, jitter_scale, (nb_rows, nb_cols, 2)).astype(\n        np.float32,\n    )\n\n    # Create control points with jitter\n    control_points = np.zeros((nb_rows * nb_cols, 4), dtype=np.float32)\n    for i in range(nb_rows):\n        for j in range(nb_cols):\n            idx = i * nb_cols + j\n            # Source points\n            control_points[idx, 0] = xx_src[i, j]\n            control_points[idx, 1] = yy_src[i, j]\n            # Destination points with jitter\n            control_points[idx, 2] = np.clip(\n                xx_src[i, j] + jitter[i, j, 1],\n                0,\n                width - 1,\n            )\n            control_points[idx, 3] = np.clip(\n                yy_src[i, j] + jitter[i, j, 0],\n                0,\n                height - 1,\n            )\n\n    # Create full resolution maps\n    for i in range(height):\n        for j in range(width):\n            # Find nearest control points and interpolate\n            dx = j - control_points[:, 0]\n            dy = i - control_points[:, 1]\n            dist = dx * dx + dy * dy\n            weights = 1 / (dist + 1e-8)\n            weights = weights / np.sum(weights)\n\n            map_x[i, j] = np.sum(weights * control_points[:, 2])\n            map_y[i, j] = np.sum(weights * control_points[:, 3])\n\n    # Ensure output is within bounds\n    map_x = np.clip(map_x, 0, width - 1, out=map_x)\n    map_y = np.clip(map_y, 0, height - 1, out=map_y)\n\n    return map_x, map_y\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.create_shape_groups","title":"def create_shape_groups (tiles) [view source on GitHub]","text":"

Groups tiles by their shape and stores the indices for each shape.

Source code in albumentations/augmentations/geometric/functional.py Python
def create_shape_groups(tiles: np.ndarray) -> dict[tuple[int, int], list[int]]:\n    \"\"\"Groups tiles by their shape and stores the indices for each shape.\"\"\"\n    shape_groups = defaultdict(list)\n    for index, (start_y, start_x, end_y, end_x) in enumerate(tiles):\n        shape = (end_y - start_y, end_x - start_x)\n        shape_groups[shape].append(index)\n    return shape_groups\n
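A minimal sketch; tiles are given as (start_y, start_x, end_y, end_x) rows, and tiles with identical height and width end up in the same group:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import create_shape_groups
>>> tiles = np.array([[0, 0, 32, 32],
...                   [0, 32, 32, 64],
...                   [32, 0, 64, 64]])
>>> groups = create_shape_groups(tiles)
>>> len(groups)  # tiles 0 and 1 are both 32x32; tile 2 is 32x64
2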
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.d4","title":"def d4 (img, group_member) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to an image array.

This function manipulates an image using transformations such as rotations and flips, corresponding to the D_4 dihedral group symmetry operations. Each transformation is identified by a unique group member code.

  • img (np.ndarray): The input image array to transform.
  • group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:
  • 'e': Identity (no transformation).
  • 'r90': Rotate 90 degrees counterclockwise.
  • 'r180': Rotate 180 degrees.
  • 'r270': Rotate 270 degrees counterclockwise.
  • 'v': Vertical flip.
  • 'hvt': Transpose over the second diagonal (anti-diagonal).
  • 'h': Horizontal flip.
  • 't': Transpose (reflect over the main diagonal).
  • np.ndarray: The transformed image array.
  • ValueError: If an invalid group member is specified.

Examples:

  • Rotating an image by 90 degrees: transformed_image = d4(original_image, 'r90')
  • Applying a horizontal flip to an image: transformed_image = d4(original_image, 'h')
Source code in albumentations/augmentations/geometric/functional.py Python
def d4(img: np.ndarray, group_member: D4Type) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to an image array.\n\n    This function manipulates an image using transformations such as rotations and flips,\n    corresponding to the `D_4` dihedral group symmetry operations.\n    Each transformation is identified by a unique group member code.\n\n    Parameters:\n    - img (np.ndarray): The input image array to transform.\n    - group_member (D4Type): A string identifier indicating the specific transformation to apply. Valid codes include:\n      - 'e': Identity (no transformation).\n      - 'r90': Rotate 90 degrees counterclockwise.\n      - 'r180': Rotate 180 degrees.\n      - 'r270': Rotate 270 degrees counterclockwise.\n      - 'v': Vertical flip.\n      - 'hvt': Transpose over second diagonal\n      - 'h': Horizontal flip.\n      - 't': Transpose (reflect over the main diagonal).\n\n    Returns:\n    - np.ndarray: The transformed image array.\n\n    Raises:\n    - ValueError: If an invalid group member is specified.\n\n    Examples:\n    - Rotating an image by 90 degrees:\n      `transformed_image = d4(original_image, 'r90')`\n    - Applying a horizontal flip to an image:\n      `transformed_image = d4(original_image, 'h')`\n    \"\"\"\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: rot90(x, 1),  # Rotate 90 degrees\n        \"r180\": lambda x: rot90(x, 2),  # Rotate 180 degrees\n        \"r270\": lambda x: rot90(x, 3),  # Rotate 270 degrees\n        \"v\": vflip,  # Vertical flip\n        \"hvt\": lambda x: transpose(rot90(x, 2)),  # Reflect over anti-diagonal\n        \"h\": hflip,  # Horizontal flip\n        \"t\": transpose,  # Transpose (reflect over main diagonal)\n    }\n\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](img)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.distort_image","title":"def distort_image (image, generated_mesh, interpolation) [view source on GitHub]","text":"

Apply perspective distortion to an image based on a generated mesh.

This function applies a perspective transformation to each cell of the image defined by the generated mesh. The distortion is applied using OpenCV's perspective transformation and blending techniques.

Parameters:

Name Type Description image np.ndarray

The input image to be distorted. Can be a 2D grayscale image or a 3D color image.

generated_mesh np.ndarray

A 2D array where each row represents a quadrilateral cell as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4]. The first four values define the source rectangle, and the last eight values define the destination quadrilateral.

interpolation int

Interpolation method to be used in the perspective transformation. Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).

Returns:

Type Description np.ndarray

The distorted image with the same shape and dtype as the input image.

Note

  • The function preserves the channel dimension of the input image.
  • Each cell of the generated mesh is transformed independently and then blended into the output image.
  • The distortion is applied using perspective transformation, which allows for more complex distortions compared to affine transformations.

Examples:

Python
>>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n>>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n>>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n>>> distorted.shape\n(100, 100, 3)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@preserve_channel_dim\ndef distort_image(\n    image: np.ndarray,\n    generated_mesh: np.ndarray,\n    interpolation: int,\n) -> np.ndarray:\n    \"\"\"Apply perspective distortion to an image based on a generated mesh.\n\n    This function applies a perspective transformation to each cell of the image defined by the\n    generated mesh. The distortion is applied using OpenCV's perspective transformation and\n    blending techniques.\n\n    Args:\n        image (np.ndarray): The input image to be distorted. Can be a 2D grayscale image or a\n                            3D color image.\n        generated_mesh (np.ndarray): A 2D array where each row represents a quadrilateral cell\n                                    as [x1, y1, x2, y2, dst_x1, dst_y1, dst_x2, dst_y2, dst_x3, dst_y3, dst_x4, dst_y4].\n                                    The first four values define the source rectangle, and the last eight values\n                                    define the destination quadrilateral.\n        interpolation (int): Interpolation method to be used in the perspective transformation.\n                             Should be one of the OpenCV interpolation flags (e.g., cv2.INTER_LINEAR).\n\n    Returns:\n        np.ndarray: The distorted image with the same shape and dtype as the input image.\n\n    Note:\n        - The function preserves the channel dimension of the input image.\n        - Each cell of the generated mesh is transformed independently and then blended into the output image.\n        - The distortion is applied using perspective transformation, which allows for more complex\n          distortions compared to affine transformations.\n\n    Example:\n        >>> image = np.random.randint(0, 255, (100, 100, 3), dtype=np.uint8)\n        >>> mesh = np.array([[0, 0, 50, 50, 5, 5, 45, 5, 45, 45, 5, 45]])\n        >>> distorted = distort_image(image, mesh, cv2.INTER_LINEAR)\n        >>> distorted.shape\n        (100, 100, 3)\n    \"\"\"\n    distorted_image = np.zeros_like(image)\n\n    for mesh in generated_mesh:\n        # Extract source rectangle and destination quadrilateral\n        x1, y1, x2, y2 = mesh[:4]  # Source rectangle\n        dst_quad = mesh[4:].reshape(4, 2)  # Destination quadrilateral\n\n        # Convert source rectangle to quadrilateral\n        src_quad = np.array(\n            [\n                [x1, y1],  # Top-left\n                [x2, y1],  # Top-right\n                [x2, y2],  # Bottom-right\n                [x1, y2],  # Bottom-left\n            ],\n            dtype=np.float32,\n        )\n\n        # Calculate Perspective transformation matrix\n        perspective_mat = cv2.getPerspectiveTransform(src_quad, dst_quad)\n\n        # Apply Perspective transformation\n        warped = cv2.warpPerspective(\n            image,\n            perspective_mat,\n            (image.shape[1], image.shape[0]),\n            flags=interpolation,\n        )\n\n        # Create mask for the transformed region\n        mask = np.zeros(image.shape[:2], dtype=np.uint8)\n        cv2.fillConvexPoly(mask, np.int32(dst_quad), 255)\n\n        # Copy only the warped quadrilateral area to the output image\n        distorted_image = cv2.copyTo(warped, mask, distorted_image)\n\n    return distorted_image\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.find_keypoint","title":"def find_keypoint (position, distance_map, threshold, inverted) [view source on GitHub]","text":"

Determine if a valid keypoint can be found at the given position.

Source code in albumentations/augmentations/geometric/functional.py Python
def find_keypoint(\n    position: tuple[int, int],\n    distance_map: np.ndarray,\n    threshold: float | None,\n    inverted: bool,\n) -> tuple[float, float] | None:\n    \"\"\"Determine if a valid keypoint can be found at the given position.\"\"\"\n    y, x = position\n    value = distance_map[y, x]\n    if not inverted and threshold is not None and value >= threshold:\n        return None\n    if inverted and threshold is not None and value <= threshold:\n        return None\n    return float(x), float(y)\n
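A minimal sketch on a tiny distance map; positions whose value fails the threshold test yield None:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import find_keypoint
>>> distance_map = np.array([[0.1, 0.9],
...                          [0.5, 0.3]])
>>> find_keypoint((0, 0), distance_map, threshold=0.4, inverted=False)
(0.0, 0.0)
>>> find_keypoint((0, 1), distance_map, threshold=0.4, inverted=False) is None
True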
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.flip_bboxes","title":"def flip_bboxes (bboxes, flip_horizontal=False, flip_vertical=False, image_shape=(0, 0)) [view source on GitHub]","text":"

Flip bounding boxes horizontally and/or vertically.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, m) where each row is [x_min, y_min, x_max, y_max, ...].

flip_horizontal bool

Whether to flip horizontally.

flip_vertical bool

Whether to flip vertically.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Flipped bounding boxes.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef flip_bboxes(\n    bboxes: np.ndarray,\n    flip_horizontal: bool = False,\n    flip_vertical: bool = False,\n    image_shape: tuple[int, int] = (0, 0),\n) -> np.ndarray:\n    \"\"\"Flip bounding boxes horizontally and/or vertically.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where each row is\n            [x_min, y_min, x_max, y_max, ...].\n        flip_horizontal (bool): Whether to flip horizontally.\n        flip_vertical (bool): Whether to flip vertically.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Flipped bounding boxes.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    flipped_bboxes = bboxes.copy()\n    if flip_horizontal:\n        flipped_bboxes[:, [0, 2]] = cols - flipped_bboxes[:, [2, 0]]\n    if flip_vertical:\n        flipped_bboxes[:, [1, 3]] = rows - flipped_bboxes[:, [3, 1]]\n    return flipped_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.from_distance_maps","title":"def from_distance_maps (distance_maps, inverted, if_not_found_coords=None, threshold=None) [view source on GitHub]","text":"

Convert distance maps back to keypoints coordinates.

This function is the inverse of to_distance_maps. It takes distance maps generated for a set of keypoints and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps, and can handle cases where keypoints are not found or fall outside a specified threshold.

Parameters:

Name Type Description distance_maps np.ndarray

A 3D numpy array of shape (height, width, nb_keypoints) containing distance maps for each keypoint. Each channel represents the distance map for one keypoint.

inverted bool

If True, treats the distance maps as inverted (where higher values indicate closer proximity to keypoints). If False, treats them as regular distance maps (where lower values indicate closer proximity).

if_not_found_coords Sequence[int] | dict[str, Any] | None

Coordinates to use for keypoints that are not found or fall outside the threshold. Can be: - None: Drop keypoints that are not found. - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints. - Dict with 'x' and 'y' keys: Use these values for not found keypoints. Defaults to None.

threshold float | None

A threshold value to determine valid keypoints. For inverted maps, values >= threshold are considered valid. For regular maps, values <= threshold are considered valid. If None, all keypoints are considered valid. Defaults to None.

Returns:

Type Description np.ndarray

A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates of the reconstructed keypoints. If drop_if_not_found is True (derived from if_not_found_coords), the output may have fewer rows than input keypoints.

Exceptions:

Type Description ValueError

If the input distance_maps is not a 3D array.

Notes

  • The function uses vectorized operations for improved performance, especially with large numbers of keypoints.
  • When threshold is None, all keypoints are considered valid, and if_not_found_coords is not used.
  • The function assumes that the input distance maps are properly normalized and scaled according to the original image dimensions.

Examples:

Python
>>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n>>> inverted = True\n>>> if_not_found_coords = [0, 0]\n>>> threshold = 0.5\n>>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n>>> print(keypoints.shape)\n(3, 2)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def from_distance_maps(\n    distance_maps: np.ndarray,\n    inverted: bool,\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None = None,\n    threshold: float | None = None,\n) -> np.ndarray:\n    \"\"\"Convert distance maps back to keypoints coordinates.\n\n    This function is the inverse of `to_distance_maps`. It takes distance maps generated for a set of keypoints\n    and reconstructs the original keypoint coordinates. The function supports both regular and inverted distance maps,\n    and can handle cases where keypoints are not found or fall outside a specified threshold.\n\n    Args:\n        distance_maps (np.ndarray): A 3D numpy array of shape (height, width, nb_keypoints) containing\n            distance maps for each keypoint. Each channel represents the distance map for one keypoint.\n        inverted (bool): If True, treats the distance maps as inverted (where higher values indicate\n            closer proximity to keypoints). If False, treats them as regular distance maps (where lower\n            values indicate closer proximity).\n        if_not_found_coords (Sequence[int] | dict[str, Any] | None, optional): Coordinates to use for\n            keypoints that are not found or fall outside the threshold. Can be:\n            - None: Drop keypoints that are not found.\n            - Sequence of two integers: Use these as (x, y) coordinates for not found keypoints.\n            - Dict with 'x' and 'y' keys: Use these values for not found keypoints.\n            Defaults to None.\n        threshold (float | None, optional): A threshold value to determine valid keypoints. For inverted\n            maps, values >= threshold are considered valid. For regular maps, values <= threshold are\n            considered valid. If None, all keypoints are considered valid. Defaults to None.\n\n    Returns:\n        np.ndarray: A 2D numpy array of shape (nb_keypoints, 2) containing the (x, y) coordinates\n        of the reconstructed keypoints. 
If `drop_if_not_found` is True (derived from if_not_found_coords),\n        the output may have fewer rows than input keypoints.\n\n    Raises:\n        ValueError: If the input `distance_maps` is not a 3D array.\n\n    Notes:\n        - The function uses vectorized operations for improved performance, especially with large numbers of keypoints.\n        - When `threshold` is None, all keypoints are considered valid, and `if_not_found_coords` is not used.\n        - The function assumes that the input distance maps are properly normalized and scaled according to the\n          original image dimensions.\n\n    Example:\n        >>> distance_maps = np.random.rand(100, 100, 3)  # 3 keypoints\n        >>> inverted = True\n        >>> if_not_found_coords = [0, 0]\n        >>> threshold = 0.5\n        >>> keypoints = from_distance_maps(distance_maps, inverted, if_not_found_coords, threshold)\n        >>> print(keypoints.shape)\n        (3, 2)\n    \"\"\"\n    if distance_maps.ndim != NUM_MULTI_CHANNEL_DIMENSIONS:\n        msg = f\"Expected three-dimensional input, got {distance_maps.ndim} dimensions and shape {distance_maps.shape}.\"\n        raise ValueError(msg)\n    height, width, nb_keypoints = distance_maps.shape\n\n    drop_if_not_found, if_not_found_x, if_not_found_y = validate_if_not_found_coords(\n        if_not_found_coords,\n    )\n\n    # Find the indices of max/min values for all keypoints at once\n    if inverted:\n        hitidx_flat = np.argmax(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n    else:\n        hitidx_flat = np.argmin(\n            distance_maps.reshape(height * width, nb_keypoints),\n            axis=0,\n        )\n\n    # Convert flat indices to 2D coordinates\n    hitidx_y, hitidx_x = np.unravel_index(hitidx_flat, (height, width))\n\n    # Create keypoints array\n    keypoints = np.column_stack((hitidx_x, hitidx_y)).astype(float)\n\n    if threshold is not None:\n        # Check threshold condition\n        if inverted:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] >= threshold\n        else:\n            valid_mask = distance_maps[hitidx_y, hitidx_x, np.arange(nb_keypoints)] <= threshold\n\n        if not drop_if_not_found:\n            # Replace invalid keypoints with if_not_found_coords\n            keypoints[~valid_mask] = [if_not_found_x, if_not_found_y]\n        else:\n            # Keep only valid keypoints\n            return keypoints[valid_mask]\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_displacement_fields","title":"def generate_displacement_fields (image_shape, alpha, sigma, same_dxdy, kernel_size, random_generator, noise_distribution) [view source on GitHub]","text":"

Generate displacement fields for elastic transform.

Parameters:

Name Type Description image_shape tuple[int, int]

Shape of the image (height, width)

alpha float

Scaling factor for displacement

sigma float

Standard deviation for Gaussian blur

same_dxdy bool

Whether to use same displacement field for both directions

kernel_size tuple[int, int]

Size of Gaussian blur kernel

random_generator np.random.Generator

NumPy random number generator

noise_distribution Literal['gaussian', 'uniform']

Type of noise distribution to use (\"gaussian\" or \"uniform\")

Returns:

Type Description tuple

(dx, dy) displacement fields

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_displacement_fields(\n    image_shape: tuple[int, int],\n    alpha: float,\n    sigma: float,\n    same_dxdy: bool,\n    kernel_size: tuple[int, int],\n    random_generator: np.random.Generator,\n    noise_distribution: Literal[\"gaussian\", \"uniform\"],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate displacement fields for elastic transform.\n\n    Args:\n        image_shape: Shape of the image (height, width)\n        alpha: Scaling factor for displacement\n        sigma: Standard deviation for Gaussian blur\n        same_dxdy: Whether to use same displacement field for both directions\n        kernel_size: Size of Gaussian blur kernel\n        random_generator: NumPy random number generator\n        noise_distribution: Type of noise distribution to use (\"gaussian\" or \"uniform\")\n\n    Returns:\n        tuple: (dx, dy) displacement fields\n    \"\"\"\n\n    def generate_noise_field() -> np.ndarray:\n        # Generate noise based on distribution type\n        if noise_distribution == \"gaussian\":\n            field = random_generator.standard_normal(size=image_shape[:2])\n        else:  # uniform\n            field = random_generator.uniform(low=-1, high=1, size=image_shape[:2])\n\n        # Common operations for both distributions\n        field = field.astype(np.float32)\n        cv2.GaussianBlur(field, kernel_size, sigma, dst=field)\n        return field * alpha\n\n    # Generate first displacement field\n    dx = generate_noise_field()\n\n    # Generate or copy second displacement field\n    dy = dx if same_dxdy else generate_noise_field()\n\n    return dx, dy\n
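An illustrative call with a seeded generator and hypothetical parameter values; both fields have the same spatial shape as the image:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import (
...     generate_displacement_fields,
... )
>>> rng = np.random.default_rng(0)
>>> dx, dy = generate_displacement_fields(
...     image_shape=(64, 64),
...     alpha=30.0,
...     sigma=5.0,
...     same_dxdy=False,
...     kernel_size=(17, 17),
...     random_generator=rng,
...     noise_distribution="gaussian",
... )
>>> dx.shape, dy.shape
((64, 64), (64, 64))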
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_distorted_grid_polygons","title":"def generate_distorted_grid_polygons (dimensions, magnitude, random_generator) [view source on GitHub]","text":"

Generate distorted grid polygons based on input dimensions and magnitude.

This function creates a grid of polygons and applies random distortions to the internal vertices, while keeping the boundary vertices fixed. The distortion is applied consistently across shared vertices to avoid gaps or overlaps in the resulting grid.

Parameters:

Name Type Description dimensions np.ndarray

A 3D array of shape (grid_height, grid_width, 4) where each element is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.

magnitude int

Maximum pixel-wise displacement for distortion. The actual displacement will be randomly chosen in the range [-magnitude, magnitude].

random_generator np.random.Generator

A random number generator.

Returns:

Type Description np.ndarray

A 2D array of shape (total_cells, 8) where each row represents a distorted polygon as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.

Note

  • Only internal grid points are distorted; boundary points remain fixed.
  • The function ensures consistent distortion across shared vertices of adjacent cells.
  • The distortion is applied to the following points of each internal cell:
    • Bottom-right of the cell above and to the left
    • Bottom-left of the cell above
    • Top-right of the cell to the left
    • Top-left of the current cell
  • Each square represents a cell, and the X marks indicate the coordinates where displacement occurs:
    +--+--+--+--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--X--X--X--+
    |  |  |  |  |
    +--+--+--+--+
  • For each X, the coordinates of the left, right, top, and bottom edges in the four adjacent cells are displaced.

Examples:

Python
>>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n...                        [[0, 50, 50, 100], [50, 50, 100, 100]]])\n>>> rng = np.random.default_rng(0)\n>>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10, random_generator=rng)\n>>> distorted.shape\n(4, 8)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_distorted_grid_polygons(\n    dimensions: np.ndarray,\n    magnitude: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate distorted grid polygons based on input dimensions and magnitude.\n\n    This function creates a grid of polygons and applies random distortions to the internal vertices,\n    while keeping the boundary vertices fixed. The distortion is applied consistently across shared\n    vertices to avoid gaps or overlaps in the resulting grid.\n\n    Args:\n        dimensions (np.ndarray): A 3D array of shape (grid_height, grid_width, 4) where each element\n                                 is [x_min, y_min, x_max, y_max] representing the dimensions of a grid cell.\n        magnitude (int): Maximum pixel-wise displacement for distortion. The actual displacement\n                         will be randomly chosen in the range [-magnitude, magnitude].\n        random_generator (np.random.Generator): A random number generator.\n\n    Returns:\n        np.ndarray: A 2D array of shape (total_cells, 8) where each row represents a distorted polygon\n                    as [x1, y1, x2, y1, x2, y2, x1, y2]. The total_cells is equal to grid_height * grid_width.\n\n    Note:\n        - Only internal grid points are distorted; boundary points remain fixed.\n        - The function ensures consistent distortion across shared vertices of adjacent cells.\n        - The distortion is applied to the following points of each internal cell:\n            * Bottom-right of the cell above and to the left\n            * Bottom-left of the cell above\n            * Top-right of the cell to the left\n            * Top-left of the current cell\n        - Each square represents a cell, and the X marks indicate the coordinates where displacement occurs.\n            +--+--+--+--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--X--X--X--+\n            |  |  |  |  |\n            +--+--+--+--+\n        - For each X, the coordinates of the left, right, top, and bottom edges\n          in the four adjacent cells are displaced.\n\n    Example:\n        >>> dimensions = np.array([[[0, 0, 50, 50], [50, 0, 100, 50]],\n        ...                        
[[0, 50, 50, 100], [50, 50, 100, 100]]])\n        >>> distorted = generate_distorted_grid_polygons(dimensions, magnitude=10)\n        >>> distorted.shape\n        (4, 8)\n    \"\"\"\n    grid_height, grid_width = dimensions.shape[:2]\n    total_cells = grid_height * grid_width\n\n    # Initialize polygons\n    polygons = np.zeros((total_cells, 8), dtype=np.float32)\n    polygons[:, 0:2] = dimensions.reshape(-1, 4)[:, [0, 1]]  # x1, y1\n    polygons[:, 2:4] = dimensions.reshape(-1, 4)[:, [2, 1]]  # x2, y1\n    polygons[:, 4:6] = dimensions.reshape(-1, 4)[:, [2, 3]]  # x2, y2\n    polygons[:, 6:8] = dimensions.reshape(-1, 4)[:, [0, 3]]  # x1, y2\n\n    # Generate displacements for internal grid points only\n    internal_points_height, internal_points_width = grid_height - 1, grid_width - 1\n    displacements = random_generator.integers(\n        -magnitude,\n        magnitude + 1,\n        size=(internal_points_height, internal_points_width, 2),\n    ).astype(np.float32)\n\n    # Apply displacements to internal polygon vertices\n    for i in range(1, grid_height):\n        for j in range(1, grid_width):\n            dx, dy = displacements[i - 1, j - 1]\n\n            # Bottom-right of cell (i-1, j-1)\n            polygons[(i - 1) * grid_width + (j - 1), 4:6] += [dx, dy]\n\n            # Bottom-left of cell (i-1, j)\n            polygons[(i - 1) * grid_width + j, 6:8] += [dx, dy]\n\n            # Top-right of cell (i, j-1)\n            polygons[i * grid_width + (j - 1), 2:4] += [dx, dy]\n\n            # Top-left of cell (i, j)\n            polygons[i * grid_width + j, 0:2] += [dx, dy]\n\n    return polygons\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_grid","title":"def generate_grid (image_shape, steps_x, steps_y, num_steps) [view source on GitHub]","text":"

Generate a distorted grid for image transformation based on given step sizes.

This function creates two 2D arrays (map_x and map_y) that represent a distorted version of the original image grid. These arrays can be used with OpenCV's remap function to apply grid distortion to an image.

Parameters:

Name Type Description image_shape tuple[int, int]

The shape of the image as (height, width).

steps_x list[float]

List of step sizes for the x-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the x direction.

steps_y list[float]

List of step sizes for the y-axis distortion. The length should be num_steps + 1. Each value represents the relative step size for a segment of the grid in the y direction.

num_steps int

The number of steps to divide each axis into. This determines the granularity of the distortion grid.

Returns:

Type Description tuple[np.ndarray, np.ndarray]

A tuple containing two 2D numpy arrays: - map_x: A 2D array of float32 values representing the x-coordinates of the distorted grid. - map_y: A 2D array of float32 values representing the y-coordinates of the distorted grid.

Note

  • The function generates a grid where each cell can be distorted independently.
  • The distortion is controlled by the steps_x and steps_y parameters, which determine how much each grid line is shifted.
  • The resulting map_x and map_y can be used directly with cv2.remap() to apply the distortion to an image.
  • The distortion is applied smoothly across each grid cell using linear interpolation.

Examples:

Python
>>> image_shape = (100, 100)\n>>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n>>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n>>> num_steps = 5\n>>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n>>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_grid(\n    image_shape: tuple[int, int],\n    steps_x: list[float],\n    steps_y: list[float],\n    num_steps: int,\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate a distorted grid for image transformation based on given step sizes.\n\n    This function creates two 2D arrays (map_x and map_y) that represent a distorted version\n    of the original image grid. These arrays can be used with OpenCV's remap function to\n    apply grid distortion to an image.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        steps_x (list[float]): List of step sizes for the x-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the x direction.\n        steps_y (list[float]): List of step sizes for the y-axis distortion. The length\n            should be num_steps + 1. Each value represents the relative step size for\n            a segment of the grid in the y direction.\n        num_steps (int): The number of steps to divide each axis into. This determines\n            the granularity of the distortion grid.\n\n    Returns:\n        tuple[np.ndarray, np.ndarray]: A tuple containing two 2D numpy arrays:\n            - map_x: A 2D array of float32 values representing the x-coordinates\n              of the distorted grid.\n            - map_y: A 2D array of float32 values representing the y-coordinates\n              of the distorted grid.\n\n    Note:\n        - The function generates a grid where each cell can be distorted independently.\n        - The distortion is controlled by the steps_x and steps_y parameters, which\n          determine how much each grid line is shifted.\n        - The resulting map_x and map_y can be used directly with cv2.remap() to\n          apply the distortion to an image.\n        - The distortion is applied smoothly across each grid cell using linear\n          interpolation.\n\n    Example:\n        >>> image_shape = (100, 100)\n        >>> steps_x = [1.1, 0.9, 1.0, 1.2, 0.95, 1.05]\n        >>> steps_y = [0.9, 1.1, 1.0, 1.1, 0.9, 1.0]\n        >>> num_steps = 5\n        >>> map_x, map_y = generate_grid(image_shape, steps_x, steps_y, num_steps)\n        >>> distorted_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR)\n    \"\"\"\n    height, width = image_shape[:2]\n    x_step = width // num_steps\n    xx = np.zeros(width, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_x):\n        x = idx * x_step\n        start = int(x)\n        end = min(int(x) + x_step, width)\n        cur = prev + x_step * step\n        xx[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    y_step = height // num_steps\n    yy = np.zeros(height, np.float32)\n    prev = 0.0\n    for idx, step in enumerate(steps_y):\n        y = idx * y_step\n        start = int(y)\n        end = min(int(y) + y_step, height)\n        cur = prev + y_step * step\n        yy[start:end] = np.linspace(prev, cur, end - start)\n        prev = cur\n\n    return np.meshgrid(xx, yy)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_reflected_bboxes","title":"def generate_reflected_bboxes (bboxes, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected bounding boxes for the entire reflection grid.

Parameters:

Name Type Description bboxes np.ndarray

Original bounding boxes.

grid_dims dict[str, tuple[int, int]]

Grid dimensions and original position.

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted bounding boxes for the entire grid.

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_bboxes(\n    bboxes: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected bounding boxes for the entire reflection grid.\n\n    Args:\n        bboxes (np.ndarray): Original bounding boxes.\n        grid_dims (dict[str, tuple[int, int]]): Grid dimensions and original position.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted bounding boxes for the entire grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of bboxes\n    bboxes_hflipped = flip_bboxes(bboxes, flip_horizontal=True, image_shape=image_shape)\n    bboxes_vflipped = flip_bboxes(bboxes, flip_vertical=True, image_shape=image_shape)\n    bboxes_hvflipped = flip_bboxes(\n        bboxes,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [\n            original_col * cols,\n            original_row * rows,\n            original_col * cols,\n            original_row * rows,\n        ],\n    )\n    bboxes = shift_bboxes(bboxes, shift_vector)\n    bboxes_hflipped = shift_bboxes(bboxes_hflipped, shift_vector)\n    bboxes_vflipped = shift_bboxes(bboxes_vflipped, shift_vector)\n    bboxes_hvflipped = shift_bboxes(bboxes_hvflipped, shift_vector)\n\n    new_bboxes = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of bboxes to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes\n            elif (grid_row - original_row) % 2 == 0:\n                current_bboxes = bboxes_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_bboxes = bboxes_vflipped\n            else:\n                current_bboxes = bboxes_hvflipped\n\n            # Shift to the current grid cell\n            cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                ],\n            )\n            shifted_bboxes = shift_bboxes(current_bboxes, cell_shift)\n\n            new_bboxes.append(shifted_bboxes)\n\n    result = np.vstack(new_bboxes)\n\n    return shift_bboxes(result, -shift_vector) if center_in_origin else result\n
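A minimal sketch; with a 2x2 reflection grid, each input box produces four boxes (one per grid cell). Boxes are given in pixel coordinates here:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric.functional import (
...     generate_reflected_bboxes,
... )
>>> bboxes = np.array([[10.0, 20.0, 40.0, 60.0]])
>>> grid_dims = {"grid_shape": (2, 2), "original_position": (0, 0)}
>>> reflected = generate_reflected_bboxes(bboxes, grid_dims, image_shape=(100, 200))
>>> reflected.shape
(4, 4)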
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_reflected_keypoints","title":"def generate_reflected_keypoints (keypoints, grid_dims, image_shape, center_in_origin=False) [view source on GitHub]","text":"

Generate reflected keypoints for the entire reflection grid.

This function creates a grid of keypoints by reflecting and shifting the original keypoints. It handles both centered and non-centered grids based on the center_in_origin parameter.

Parameters:

Name Type Description keypoints np.ndarray

Original keypoints array of shape (N, 4+), where N is the number of keypoints, and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).

grid_dims dict[str, tuple[int, int]]

A dictionary containing grid dimensions and original position. It should have the following keys: - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols) - \"original_position\": tuple[int, int] representing (original_row, original_col)

image_shape tuple[int, int]

Shape of the original image as (height, width).

center_in_origin bool

If True, center the grid at the origin. Default is False.

Returns:

Type Description np.ndarray

Array of reflected and shifted keypoints for the entire grid. The shape is (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.

Note

  • The function handles keypoint flipping and shifting to create a grid of reflected keypoints.
  • It preserves the angle and scale information of the keypoints during transformations.
  • The resulting grid can be either centered at the origin or positioned based on the original grid.
Source code in albumentations/augmentations/geometric/functional.py Python
def generate_reflected_keypoints(\n    keypoints: np.ndarray,\n    grid_dims: dict[str, tuple[int, int]],\n    image_shape: tuple[int, int],\n    center_in_origin: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate reflected keypoints for the entire reflection grid.\n\n    This function creates a grid of keypoints by reflecting and shifting the original keypoints.\n    It handles both centered and non-centered grids based on the `center_in_origin` parameter.\n\n    Args:\n        keypoints (np.ndarray): Original keypoints array of shape (N, 4+), where N is the number of keypoints,\n                                and each keypoint is represented by at least 4 values (x, y, angle, scale, ...).\n        grid_dims (dict[str, tuple[int, int]]): A dictionary containing grid dimensions and original position.\n            It should have the following keys:\n            - \"grid_shape\": tuple[int, int] representing (grid_rows, grid_cols)\n            - \"original_position\": tuple[int, int] representing (original_row, original_col)\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n        center_in_origin (bool, optional): If True, center the grid at the origin. Default is False.\n\n    Returns:\n        np.ndarray: Array of reflected and shifted keypoints for the entire grid. The shape is\n                    (N * grid_rows * grid_cols, 4+), where N is the number of original keypoints.\n\n    Note:\n        - The function handles keypoint flipping and shifting to create a grid of reflected keypoints.\n        - It preserves the angle and scale information of the keypoints during transformations.\n        - The resulting grid can be either centered at the origin or positioned based on the original grid.\n    \"\"\"\n    grid_rows, grid_cols = grid_dims[\"grid_shape\"]\n    original_row, original_col = grid_dims[\"original_position\"]\n\n    # Prepare flipped versions of keypoints\n    keypoints_hflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        image_shape=image_shape,\n    )\n    keypoints_vflipped = flip_keypoints(\n        keypoints,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n    keypoints_hvflipped = flip_keypoints(\n        keypoints,\n        flip_horizontal=True,\n        flip_vertical=True,\n        image_shape=image_shape,\n    )\n\n    rows, cols = image_shape[:2]\n\n    # Shift all versions to the original position\n    shift_vector = np.array(\n        [original_col * cols, original_row * rows, 0, 0, 0],\n    )  # Only shift x and y\n    keypoints = shift_keypoints(keypoints, shift_vector)\n    keypoints_hflipped = shift_keypoints(keypoints_hflipped, shift_vector)\n    keypoints_vflipped = shift_keypoints(keypoints_vflipped, shift_vector)\n    keypoints_hvflipped = shift_keypoints(keypoints_hvflipped, shift_vector)\n\n    new_keypoints = []\n\n    for grid_row in range(grid_rows):\n        for grid_col in range(grid_cols):\n            # Determine which version of keypoints to use based on grid position\n            if (grid_row - original_row) % 2 == 0 and (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints\n            elif (grid_row - original_row) % 2 == 0:\n                current_keypoints = keypoints_hflipped\n            elif (grid_col - original_col) % 2 == 0:\n                current_keypoints = keypoints_vflipped\n            else:\n                current_keypoints = keypoints_hvflipped\n\n            # Shift to the current grid cell\n      
      cell_shift = np.array(\n                [\n                    (grid_col - original_col) * cols,\n                    (grid_row - original_row) * rows,\n                    0,\n                    0,\n                    0,\n                ],\n            )\n            shifted_keypoints = shift_keypoints(current_keypoints, cell_shift)\n\n            new_keypoints.append(shifted_keypoints)\n\n    result = np.vstack(new_keypoints)\n\n    return shift_keypoints(result, -shift_vector) if center_in_origin else result\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.generate_shuffled_splits","title":"def generate_shuffled_splits (size, divisions, random_generator) [view source on GitHub]","text":"

Generate shuffled splits for a given dimension size and number of divisions.

Parameters:

  • size (int): Total size of the dimension (height or width).
  • divisions (int): Number of divisions (rows or columns).
  • random_generator (np.random.Generator | None): The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

  • np.ndarray: Cumulative edges of the shuffled intervals.
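A minimal usage sketch (added for illustration, assuming the function is importable from the albumentations.augmentations.geometric.functional module named in the source path below):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> rng = np.random.default_rng(0)
>>> edges = fgeometric.generate_shuffled_splits(size=10, divisions=3, random_generator=rng)
>>> # `edges` always has divisions + 1 values, starting at 0 and ending at `size`,
>>> # e.g. array([ 0,  3,  7, 10]); only the order of the near-equal intervals is shuffled.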

Source code in albumentations/augmentations/geometric/functional.py Python
def generate_shuffled_splits(\n    size: int,\n    divisions: int,\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Generate shuffled splits for a given dimension size and number of divisions.\n\n    Args:\n        size (int): Total size of the dimension (height or width).\n        divisions (int): Number of divisions (rows or columns).\n        random_generator (np.random.Generator | None): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: Cumulative edges of the shuffled intervals.\n    \"\"\"\n    intervals = almost_equal_intervals(size, divisions)\n    random_generator.shuffle(intervals)\n    return np.insert(np.cumsum(intervals), 0, 0)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_camera_matrix_distortion_maps","title":"def get_camera_matrix_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using camera matrix model.

Parameters:

  • image_shape (tuple[int, int]): Image shape.
  • k (float): Distortion coefficient.
  • center_xy (tuple[float, float]): Center of distortion.

Returns:

  • tuple[np.ndarray, np.ndarray]:
    • map_x: Horizontal displacement map
    • map_y: Vertical displacement map
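A hedged usage sketch (not part of the original documentation): the returned float32 maps can be fed directly to cv2.remap.

Python
>>> import cv2
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> image = np.zeros((480, 640, 3), dtype=np.uint8)
>>> map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(
...     image_shape=image.shape[:2],
...     k=0.2,                      # distortion coefficient
...     center_xy=(320.0, 240.0),   # distortion center in (x, y)
... )
>>> distorted = cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)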
Source code in albumentations/augmentations/geometric/functional.py Python
def get_camera_matrix_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using camera matrix model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n    camera_matrix = np.array(\n        [[width, 0, center_xy[0]], [0, height, center_xy[1]], [0, 0, 1]],\n        dtype=np.float32,\n    )\n    distortion = np.array([k, k, 0, 0, 0], dtype=np.float32)\n    return cv2.initUndistortRectifyMap(\n        camera_matrix,\n        distortion,\n        None,\n        None,\n        (width, height),\n        cv2.CV_32FC1,\n    )\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_dimension_padding","title":"def get_dimension_padding (current_size, min_size, divisor) [view source on GitHub]","text":"

Calculate padding for a single dimension.

Parameters:

  • current_size (int): Current size of the dimension.
  • min_size (int | None): Minimum size requirement, if any.
  • divisor (int | None): Divisor for padding to make size divisible, if any.

Returns:

  • tuple[int, int]: (pad_before, pad_after)
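For example (an illustrative sketch, assuming the import path from the source note below):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> # Pad a 100-pixel dimension up to a minimum size of 120 pixels.
>>> fgeometric.get_dimension_padding(current_size=100, min_size=120, divisor=None)
(10, 10)
>>> # Pad a 100-pixel dimension so that it becomes divisible by 32.
>>> fgeometric.get_dimension_padding(current_size=100, min_size=None, divisor=32)
(14, 14)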

Source code in albumentations/augmentations/geometric/functional.py Python
def get_dimension_padding(\n    current_size: int,\n    min_size: int | None,\n    divisor: int | None,\n) -> tuple[int, int]:\n    \"\"\"Calculate padding for a single dimension.\n\n    Args:\n        current_size: Current size of the dimension\n        min_size: Minimum size requirement, if any\n        divisor: Divisor for padding to make size divisible, if any\n\n    Returns:\n        tuple[int, int]: (pad_before, pad_after)\n    \"\"\"\n    if min_size is not None:\n        if current_size < min_size:\n            pad_before = int((min_size - current_size) / 2.0)\n            pad_after = min_size - current_size - pad_before\n            return pad_before, pad_after\n    elif divisor is not None:\n        remainder = current_size % divisor\n        if remainder > 0:\n            total_pad = divisor - remainder\n            pad_before = total_pad // 2\n            pad_after = total_pad - pad_before\n            return pad_before, pad_after\n\n    return 0, 0\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_fisheye_distortion_maps","title":"def get_fisheye_distortion_maps (image_shape, k, center_xy) [view source on GitHub]","text":"

Generate distortion maps using fisheye model.

Parameters:

  • image_shape (tuple[int, int]): Image shape.
  • k (float): Distortion coefficient.
  • center_xy (tuple[float, float]): Center of distortion.

Returns:

  • tuple[np.ndarray, np.ndarray]:
    • map_x: Horizontal displacement map
    • map_y: Vertical displacement map
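A short usage sketch (illustrative, not from the original docs); as with the camera-matrix variant, the maps plug into cv2.remap:

Python
>>> import cv2
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> image = np.zeros((480, 640), dtype=np.uint8)
>>> map_x, map_y = fgeometric.get_fisheye_distortion_maps(
...     image_shape=image.shape, k=0.3, center_xy=(320.0, 240.0),
... )
>>> # Cast to float32 explicitly to satisfy cv2.remap's map dtype requirements.
>>> warped = cv2.remap(
...     image, map_x.astype(np.float32), map_y.astype(np.float32),
...     interpolation=cv2.INTER_LINEAR,
... )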
Source code in albumentations/augmentations/geometric/functional.py Python
def get_fisheye_distortion_maps(\n    image_shape: tuple[int, int],\n    k: float,\n    center_xy: tuple[float, float],\n) -> tuple[np.ndarray, np.ndarray]:\n    \"\"\"Generate distortion maps using fisheye model.\n\n    Args:\n        image_shape: Image shape\n        k: Distortion coefficient\n        center_xy: Center of distortion\n    Returns:\n        tuple of:\n        - map_x: Horizontal displacement map\n        - map_y: Vertical displacement map\n    \"\"\"\n    height, width = image_shape[:2]\n\n    center_x, center_y = center_xy\n\n    # Create coordinate grid\n    y, x = np.mgrid[:height, :width].astype(np.float32)\n\n    x = x - center_x\n    y = y - center_y\n\n    # Calculate polar coordinates\n    r = np.sqrt(x * x + y * y)\n    theta = np.arctan2(y, x)\n\n    # Normalize radius by the maximum possible radius to keep distortion in check\n    max_radius = math.sqrt(max(center_x, width - center_x) ** 2 + max(center_y, height - center_y) ** 2)\n    r_norm = r / max_radius\n\n    # Apply fisheye distortion to normalized radius\n    r_dist = r * (1 + k * r_norm * r_norm)\n\n    # Convert back to cartesian coordinates\n    map_x = r_dist * np.cos(theta) + center_x\n    map_y = r_dist * np.sin(theta) + center_y\n\n    return map_x, map_y\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_pad_grid_dimensions","title":"def get_pad_grid_dimensions (pad_top, pad_bottom, pad_left, pad_right, image_shape) [view source on GitHub]","text":"

Calculate the dimensions of the grid needed for reflection padding and the position of the original image.

Parameters:

  • pad_top (int): Number of pixels to pad above the image.
  • pad_bottom (int): Number of pixels to pad below the image.
  • pad_left (int): Number of pixels to pad to the left of the image.
  • pad_right (int): Number of pixels to pad to the right of the image.
  • image_shape (tuple[int, int]): Shape of the original image as (height, width).

Returns:

  • dict[str, tuple[int, int]]: A dictionary containing:
    • 'grid_shape': a tuple (grid_rows, grid_cols), the number of times the image needs to be repeated vertically and horizontally.
    • 'original_position': a tuple (original_row, original_col), the row and column index of the original image in the grid.
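Illustrative example (not from the original docs), following the formulas in the source below:

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> # A 100x200 image padded by 150 px on top and 50 px on the left needs a 3x2 grid
>>> # of reflected copies, with the original image sitting at grid position (2, 1).
>>> fgeometric.get_pad_grid_dimensions(
...     pad_top=150, pad_bottom=0, pad_left=50, pad_right=0, image_shape=(100, 200),
... )
{'grid_shape': (3, 2), 'original_position': (2, 1)}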

Source code in albumentations/augmentations/geometric/functional.py Python
def get_pad_grid_dimensions(\n    pad_top: int,\n    pad_bottom: int,\n    pad_left: int,\n    pad_right: int,\n    image_shape: tuple[int, int],\n) -> dict[str, tuple[int, int]]:\n    \"\"\"Calculate the dimensions of the grid needed for reflection padding and the position of the original image.\n\n    Args:\n        pad_top (int): Number of pixels to pad above the image.\n        pad_bottom (int): Number of pixels to pad below the image.\n        pad_left (int): Number of pixels to pad to the left of the image.\n        pad_right (int): Number of pixels to pad to the right of the image.\n        image_shape (tuple[int, int]): Shape of the original image as (height, width).\n\n    Returns:\n        dict[str, tuple[int, int]]: A dictionary containing:\n            - 'grid_shape': A tuple (grid_rows, grid_cols) where:\n                - grid_rows (int): Number of times the image needs to be repeated vertically.\n                - grid_cols (int): Number of times the image needs to be repeated horizontally.\n            - 'original_position': A tuple (original_row, original_col) where:\n                - original_row (int): Row index of the original image in the grid.\n                - original_col (int): Column index of the original image in the grid.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    grid_rows = 1 + math.ceil(pad_top / rows) + math.ceil(pad_bottom / rows)\n    grid_cols = 1 + math.ceil(pad_left / cols) + math.ceil(pad_right / cols)\n    original_row = math.ceil(pad_top / rows)\n    original_col = math.ceil(pad_left / cols)\n\n    return {\n        \"grid_shape\": (grid_rows, grid_cols),\n        \"original_position\": (original_row, original_col),\n    }\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.get_padding_params","title":"def get_padding_params (image_shape, min_height, min_width, pad_height_divisor, pad_width_divisor) [view source on GitHub]","text":"

Calculate padding parameters based on target dimensions.

Parameters:

  • image_shape (tuple[int, int]): (height, width) of the image.
  • min_height (int | None): Minimum height requirement, if any.
  • min_width (int | None): Minimum width requirement, if any.
  • pad_height_divisor (int | None): Divisor for height padding, if any.
  • pad_width_divisor (int | None): Divisor for width padding, if any.

Returns:

  • tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)
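Illustrative sketch (assuming the import path from the source note below):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> # Pad a 100x150 image so that both sides become divisible by 64.
>>> fgeometric.get_padding_params(
...     image_shape=(100, 150),
...     min_height=None,
...     min_width=None,
...     pad_height_divisor=64,
...     pad_width_divisor=64,
... )
(14, 14, 21, 21)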

Source code in albumentations/augmentations/geometric/functional.py Python
def get_padding_params(\n    image_shape: tuple[int, int],\n    min_height: int | None,\n    min_width: int | None,\n    pad_height_divisor: int | None,\n    pad_width_divisor: int | None,\n) -> tuple[int, int, int, int]:\n    \"\"\"Calculate padding parameters based on target dimensions.\n\n    Args:\n        image_shape: (height, width) of the image\n        min_height: Minimum height requirement, if any\n        min_width: Minimum width requirement, if any\n        pad_height_divisor: Divisor for height padding, if any\n        pad_width_divisor: Divisor for width padding, if any\n\n    Returns:\n        tuple[int, int, int, int]: (pad_top, pad_bottom, pad_left, pad_right)\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    h_pad_top, h_pad_bottom = get_dimension_padding(\n        rows,\n        min_height,\n        pad_height_divisor,\n    )\n    w_pad_left, w_pad_right = get_dimension_padding(cols, min_width, pad_width_divisor)\n\n    return h_pad_top, h_pad_bottom, w_pad_left, w_pad_right\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.is_identity_matrix","title":"def is_identity_matrix (matrix) [view source on GitHub]","text":"

Check if the given matrix is an identity matrix.

Parameters:

  • matrix (np.ndarray): A 3x3 affine transformation matrix.

Returns:

  • bool: True if the matrix is an identity matrix, False otherwise.
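For example (illustrative only):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> fgeometric.is_identity_matrix(np.eye(3))
True
>>> fgeometric.is_identity_matrix(np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
False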

Source code in albumentations/augmentations/geometric/functional.py Python
def is_identity_matrix(matrix: np.ndarray) -> bool:\n    \"\"\"Check if the given matrix is an identity matrix.\n\n    Args:\n        matrix (np.ndarray): A 3x3 affine transformation matrix.\n\n    Returns:\n        bool: True if the matrix is an identity matrix, False otherwise.\n    \"\"\"\n    return np.allclose(matrix, np.eye(3, dtype=matrix.dtype))\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.is_valid_component","title":"def is_valid_component (component_area, original_area, min_area, min_visibility) [view source on GitHub]","text":"

Validate if a component meets the minimum requirements.
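For instance (an illustrative sketch; the areas here are assumed to be pixel counts):

Python
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> # A component that keeps 50 of its original 200 pixels has 25% visibility.
>>> fgeometric.is_valid_component(50.0, 200.0, min_area=30.0, min_visibility=0.2)
True
>>> fgeometric.is_valid_component(50.0, 200.0, min_area=100.0, min_visibility=None)
False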

Source code in albumentations/augmentations/geometric/functional.py Python
def is_valid_component(\n    component_area: float,\n    original_area: float,\n    min_area: float | None,\n    min_visibility: float | None,\n) -> bool:\n    \"\"\"Validate if a component meets the minimum requirements.\"\"\"\n    visibility = component_area / original_area\n    return (min_area is None or component_area >= min_area) and (min_visibility is None or visibility >= min_visibility)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_affine","title":"def keypoints_affine (keypoints, matrix, image_shape, scale, border_mode) [view source on GitHub]","text":"

Apply an affine transformation to keypoints.

This function transforms keypoints using the given affine transformation matrix. It handles reflection padding if necessary, updates coordinates, angles, and scales.

Parameters:

  • keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints. Each keypoint is represented as [x, y, angle, scale, ...].
  • matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.
  • image_shape (tuple[int, int]): Shape of the image (height, width).
  • scale (dict[str, float]): Dictionary containing scale factors for x and y directions. Expected keys are 'x' and 'y'.
  • border_mode (int): Border mode for handling keypoints near image edges. Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.

Returns:

  • np.ndarray: Transformed keypoints array with the same shape as input.

Notes

  • The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.
  • Coordinates (x, y) are transformed using the affine matrix.
  • Angles are adjusted based on the rotation component of the affine transformation.
  • Scales are multiplied by the maximum of x and y scale factors.
  • The @angle_2pi_range decorator ensures angles remain in the [0, 2π] range.

Examples:

Python
>>> keypoints = np.array([[100, 100, 0, 1]])\n>>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n>>> scale = {'x': 1.5, 'y': 1.2}\n>>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_affine(\n    keypoints: np.ndarray,\n    matrix: np.ndarray,\n    image_shape: tuple[int, int],\n    scale: XYFloat,\n    border_mode: int,\n) -> np.ndarray:\n    \"\"\"Apply an affine transformation to keypoints.\n\n    This function transforms keypoints using the given affine transformation matrix.\n    It handles reflection padding if necessary, updates coordinates, angles, and scales.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 4+) where N is the number of keypoints.\n                                Each keypoint is represented as [x, y, angle, scale, ...].\n        matrix (np.ndarray): The 2x3 or 3x3 affine transformation matrix.\n        image_shape (tuple[int, int]): Shape of the image (height, width).\n        scale (dict[str, float]): Dictionary containing scale factors for x and y directions.\n                                  Expected keys are 'x' and 'y'.\n        border_mode (int): Border mode for handling keypoints near image edges.\n                            Use cv2.BORDER_REFLECT_101, cv2.BORDER_REFLECT, etc.\n\n    Returns:\n        np.ndarray: Transformed keypoints array with the same shape as input.\n\n    Notes:\n        - The function applies reflection padding if the mode is in REFLECT_BORDER_MODES.\n        - Coordinates (x, y) are transformed using the affine matrix.\n        - Angles are adjusted based on the rotation component of the affine transformation.\n        - Scales are multiplied by the maximum of x and y scale factors.\n        - The @angle_2pi_range decorator ensures angles remain in the [0, 2\u03c0] range.\n\n    Example:\n        >>> keypoints = np.array([[100, 100, 0, 1]])\n        >>> matrix = np.array([[1.5, 0, 10], [0, 1.2, 20]])\n        >>> scale = {'x': 1.5, 'y': 1.2}\n        >>> transformed_keypoints = keypoints_affine(keypoints, matrix, (480, 640), scale, cv2.BORDER_REFLECT_101)\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    if is_identity_matrix(matrix):\n        return keypoints\n\n    if border_mode in REFLECT_BORDER_MODES:\n        # Step 1: Compute affine transform padding\n        pad_left, pad_right, pad_top, pad_bottom = calculate_affine_transform_padding(\n            matrix,\n            image_shape,\n        )\n        grid_dimensions = get_pad_grid_dimensions(\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            image_shape,\n        )\n        keypoints = generate_reflected_keypoints(\n            keypoints,\n            grid_dimensions,\n            image_shape,\n            center_in_origin=True,\n        )\n\n    # Extract x, y coordinates (z is preserved)\n    xy = keypoints[:, :2]\n\n    # Ensure matrix is 2x3\n    if matrix.shape == (3, 3):\n        matrix = matrix[:2]\n\n    # Transform x, y coordinates\n    xy_transformed = cv2.transform(xy.reshape(-1, 1, 2), matrix).squeeze()\n\n    # Calculate angle adjustment\n    angle_adjustment = rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=False)\n\n    # Update angles (now at index 3)\n    keypoints[:, 3] = keypoints[:, 3] + angle_adjustment\n\n    # Update scales (now at index 4)\n    max_scale = max(scale[\"x\"], scale[\"y\"])\n    keypoints[:, 4] *= max_scale\n\n    # Update x, y coordinates and preserve z\n    keypoints[:, :2] = xy_transformed\n\n    return keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_d4","title":"def keypoints_d4 (keypoints, group_member, image_shape, ** params) [view source on GitHub]","text":"

Applies a D_4 symmetry group transformation to a keypoint.

This function adjusts a keypoint's coordinates according to the specified D_4 group transformation, which includes rotations and reflections suitable for image processing tasks. These transformations account for the dimensions of the image to ensure the keypoint remains within its boundaries.

Parameters:

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
  • group_member (D4Type): A string identifier for the D_4 group transformation to apply. The implementation below accepts 'e', 'r90', 'r180', 'r270', 'v', 'hvt', 'h', 't'.
  • image_shape (tuple[int, int]): The shape of the image.
  • params (Any): Not used.

Returns:

  • np.ndarray: The transformed keypoints.

Exceptions:

  • ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.

Examples:

  • Rotating keypoints by 90 degrees counter-clockwise in a 100x100 image: keypoints_d4(keypoints, 'r90', (100, 100)). With this implementation, a keypoint at (50, 30) moves to (30, 49).
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_d4(\n    keypoints: np.ndarray,\n    group_member: D4Type,\n    image_shape: tuple[int, int],\n    **params: Any,\n) -> np.ndarray:\n    \"\"\"Applies a `D_4` symmetry group transformation to a keypoint.\n\n    This function adjusts a keypoint's coordinates according to the specified `D_4` group transformation,\n    which includes rotations and reflections suitable for image processing tasks. These transformations account\n    for the dimensions of the image to ensure the keypoint remains within its boundaries.\n\n    Parameters:\n    - keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n    -group_member (D4Type): A string identifier for the `D_4` group transformation to apply.\n        Valid values are 'e', 'r90', 'r180', 'r270', 'v', 'hv', 'h', 't'.\n    - image_shape (tuple[int, int]): The shape of the image.\n    - params (Any): Not used\n\n    Returns:\n    - KeypointInternalType: The transformed keypoint.\n\n    Raises:\n    - ValueError: If an invalid group member is specified, indicating that the specified transformation does not exist.\n\n    Examples:\n    - Rotating a keypoint by 90 degrees in a 100x100 image:\n      `keypoint_d4((50, 30), 'r90', 100, 100)`\n      This would move the keypoint from (50, 30) to (70, 50) assuming standard coordinate transformations.\n    \"\"\"\n    rows, cols = image_shape[:2]\n    transformations = {\n        \"e\": lambda x: x,  # Identity transformation\n        \"r90\": lambda x: keypoints_rot90(x, 1, image_shape),  # Rotate 90 degrees\n        \"r180\": lambda x: keypoints_rot90(x, 2, image_shape),  # Rotate 180 degrees\n        \"r270\": lambda x: keypoints_rot90(x, 3, image_shape),  # Rotate 270 degrees\n        \"v\": lambda x: keypoints_vflip(x, rows),  # Vertical flip\n        \"hvt\": lambda x: keypoints_transpose(\n            keypoints_rot90(x, 2, image_shape),\n        ),  # Reflect over anti diagonal\n        \"h\": lambda x: keypoints_hflip(x, cols),  # Horizontal flip\n        \"t\": lambda x: keypoints_transpose(x),  # Transpose (reflect over main diagonal)\n    }\n    # Execute the appropriate transformation\n    if group_member in transformations:\n        return transformations[group_member](keypoints)\n\n    raise ValueError(f\"Invalid group member: {group_member}\")\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_hflip","title":"def keypoints_hflip (keypoints, cols) [view source on GitHub]","text":"

Flip keypoints horizontally around the y-axis.

Parameters:

  • keypoints (np.ndarray): A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).
  • cols (int): Image width.

Returns:

  • np.ndarray: An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_hflip(keypoints: np.ndarray, cols: int) -> np.ndarray:\n    \"\"\"Flip keypoints horizontally around the y-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        cols: Image width.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip x-coordinates\n    flipped_keypoints[:, 0] = (cols - 1) - keypoints[:, 0]\n\n    # Adjust angles\n    flipped_keypoints[:, 3] = np.pi - keypoints[:, 3]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_rot90","title":"def keypoints_rot90 (keypoints, factor, image_shape) [view source on GitHub]","text":"

Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.

Parameters:

  • keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).
  • factor (int): The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].
  • image_shape (tuple[int, int]): The shape of the image (height, width).

Returns:

  • np.ndarray: The rotated keypoints with the same shape as the input.
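A small sketch (illustrative; the keypoint columns follow the internal (x, y, z, angle, scale) layout that the indexing in the source below assumes):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])   # (x, y, z, angle, scale)
>>> rotated = fgeometric.keypoints_rot90(keypoints, factor=1, image_shape=(100, 200))
>>> # x becomes the old y (20), y becomes width - 1 - old x (189),
>>> # and the angle is shifted by -pi/2 (then wrapped into [0, 2*pi)).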

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_rot90(\n    keypoints: np.ndarray,\n    factor: Literal[0, 1, 2, 3],\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Rotate keypoints by 90 degrees counter-clockwise (CCW) a specified number of times.\n\n    Args:\n        keypoints (np.ndarray): An array of keypoints with shape (N, 4+) in the format (x, y, angle, scale, ...).\n        factor (int): The number of 90 degree CCW rotations to apply. Must be in the range [0, 3].\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n\n    Returns:\n        np.ndarray: The rotated keypoints with the same shape as the input.\n    \"\"\"\n    if factor == 0:\n        return keypoints\n\n    height, width = image_shape[:2]\n    rotated_keypoints = keypoints.copy().astype(np.float32)\n\n    x, y, angle = keypoints[:, 0], keypoints[:, 1], keypoints[:, 3]\n\n    if factor == 1:\n        rotated_keypoints[:, 0] = y\n        rotated_keypoints[:, 1] = width - 1 - x\n        rotated_keypoints[:, 3] = angle - np.pi / 2\n    elif factor == ROT90_180_FACTOR:\n        rotated_keypoints[:, 0] = width - 1 - x\n        rotated_keypoints[:, 1] = height - 1 - y\n        rotated_keypoints[:, 3] = angle - np.pi\n    elif factor == ROT90_270_FACTOR:\n        rotated_keypoints[:, 0] = height - 1 - y\n        rotated_keypoints[:, 1] = x\n        rotated_keypoints[:, 3] = angle + np.pi / 2\n\n    return rotated_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_scale","title":"def keypoints_scale (keypoints, scale_x, scale_y) [view source on GitHub]","text":"

Scales keypoints by scale_x and scale_y.

Parameters:

  • keypoints (np.ndarray): A numpy array of keypoints with shape (N, 5+) in the format (x, y, z, angle, scale, ...).
  • scale_x (float): Scale coefficient for the x-axis.
  • scale_y (float): Scale coefficient for the y-axis.

Returns:

  • np.ndarray: A numpy array of scaled keypoints with the same shape as input. X and Y coordinates are scaled by their respective scale factors, the Z coordinate remains unchanged, and the keypoint scale is multiplied by max(scale_x, scale_y).
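Illustrative sketch:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> keypoints = np.array([[10.0, 20.0, 0.0, 0.0, 1.0]])   # (x, y, z, angle, scale)
>>> scaled = fgeometric.keypoints_scale(keypoints, scale_x=2.0, scale_y=0.5)
>>> # -> [[20., 10., 0., 0., 2.]]: x and y scale independently, z and angle are
>>> # untouched, and the keypoint scale takes max(scale_x, scale_y).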

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\ndef keypoints_scale(\n    keypoints: np.ndarray,\n    scale_x: float,\n    scale_y: float,\n) -> np.ndarray:\n    \"\"\"Scales keypoints by scale_x and scale_y.\n\n    Args:\n        keypoints: A numpy array of keypoints with shape (N, 5+) in the format\n                  (x, y, z, angle, scale, ...).\n        scale_x: Scale coefficient x-axis.\n        scale_y: Scale coefficient y-axis.\n\n    Returns:\n        A numpy array of scaled keypoints with the same shape as input.\n        X and Y coordinates are scaled by their respective scale factors,\n        Z coordinate remains unchanged, and the keypoint scale is multiplied\n        by max(scale_x, scale_y).\n    \"\"\"\n    # Extract x, y, z, angle, and scale\n    x, y, z, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n        keypoints[:, 4],\n    )\n\n    # Scale x and y\n    x_scaled = x * scale_x\n    y_scaled = y * scale_y\n\n    # Scale the keypoint scale by the maximum of scale_x and scale_y\n    scale_scaled = scale * max(scale_x, scale_y)\n\n    # Create the output array\n    scaled_keypoints = np.column_stack([x_scaled, y_scaled, z, angle, scale_scaled])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [scaled_keypoints, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]],\n        )\n\n    return scaled_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_transpose","title":"def keypoints_transpose (keypoints) [view source on GitHub]","text":"

Transposes keypoints along the main diagonal.

Parameters:

  • keypoints (np.ndarray): A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).

Returns:

  • np.ndarray: An array of transposed keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_transpose(keypoints: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes keypoints along the main diagonal.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n\n    Returns:\n        np.ndarray: An array of transposed keypoints with the same shape as the input.\n    \"\"\"\n    transposed_keypoints = keypoints.copy()\n\n    # Swap x and y coordinates\n    transposed_keypoints[:, [0, 1]] = keypoints[:, [1, 0]]\n\n    # Adjust angles to reflect the coordinate swap\n    angles = keypoints[:, 3]\n    transposed_keypoints[:, 3] = np.where(\n        angles <= np.pi,\n        np.pi / 2 - angles,\n        3 * np.pi / 2 - angles,\n    )\n\n    return transposed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.keypoints_vflip","title":"def keypoints_vflip (keypoints, rows) [view source on GitHub]","text":"

Flip keypoints vertically around the x-axis.

Parameters:

  • keypoints (np.ndarray): A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).
  • rows (int): Image height.

Returns:

  • np.ndarray: An array of flipped keypoints with the same shape as the input.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef keypoints_vflip(keypoints: np.ndarray, rows: int) -> np.ndarray:\n    \"\"\"Flip keypoints vertically around the x-axis.\n\n    Args:\n        keypoints: A numpy array of shape (N, 4+) where each row represents a keypoint (x, y, angle, scale, ...).\n        rows: Image height.\n\n    Returns:\n        np.ndarray: An array of flipped keypoints with the same shape as the input.\n    \"\"\"\n    flipped_keypoints = keypoints.copy().astype(np.float32)\n\n    # Flip y-coordinates\n    flipped_keypoints[:, 1] = (rows - 1) - keypoints[:, 1]\n\n    # Negate angles\n    flipped_keypoints[:, 3] = -keypoints[:, 3]\n\n    return flipped_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.perspective_bboxes","title":"def perspective_bboxes (bboxes, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Applies perspective transformation to bounding boxes.

This function transforms bounding boxes using the given perspective transformation matrix. It handles bounding boxes with additional attributes beyond the standard coordinates.

Parameters:

  • bboxes (np.ndarray): An array of bounding boxes with shape (num_bboxes, 4+). Each row represents a bounding box (x_min, y_min, x_max, y_max, ...). Additional columns beyond the first 4 are preserved unchanged.
  • image_shape (tuple[int, int]): The shape of the image (height, width).
  • matrix (np.ndarray): The perspective transformation matrix.
  • max_width (int): The maximum width of the output image.
  • max_height (int): The maximum height of the output image.
  • keep_size (bool): If True, maintains the original image size after transformation.

Returns:

  • np.ndarray: An array of transformed bounding boxes with the same shape as input. The first 4 columns contain the transformed coordinates, and any additional columns are preserved from the input.

Note

  • This function modifies only the coordinate columns (first 4) of the input bounding boxes.
  • Any additional attributes (columns beyond the first 4) are kept unchanged.
  • The function handles denormalization and renormalization of coordinates internally.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n>>> image_shape = (100, 100)\n>>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n>>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n
Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"bboxes\")\ndef perspective_bboxes(\n    bboxes: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Applies perspective transformation to bounding boxes.\n\n    This function transforms bounding boxes using the given perspective transformation matrix.\n    It handles bounding boxes with additional attributes beyond the standard coordinates.\n\n    Args:\n        bboxes (np.ndarray): An array of bounding boxes with shape (num_bboxes, 4+).\n                             Each row represents a bounding box (x_min, y_min, x_max, y_max, ...).\n                             Additional columns beyond the first 4 are preserved unchanged.\n        image_shape (tuple[int, int]): The shape of the image (height, width).\n        matrix (np.ndarray): The perspective transformation matrix.\n        max_width (int): The maximum width of the output image.\n        max_height (int): The maximum height of the output image.\n        keep_size (bool): If True, maintains the original image size after transformation.\n\n    Returns:\n        np.ndarray: An array of transformed bounding boxes with the same shape as input.\n                    The first 4 columns contain the transformed coordinates, and any\n                    additional columns are preserved from the input.\n\n    Note:\n        - This function modifies only the coordinate columns (first 4) of the input bounding boxes.\n        - Any additional attributes (columns beyond the first 4) are kept unchanged.\n        - The function handles denormalization and renormalization of coordinates internally.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.3, 0.3, 1], [0.5, 0.5, 0.8, 0.8, 2]])\n        >>> image_shape = (100, 100)\n        >>> matrix = np.array([[1.5, 0.2, -20], [-0.1, 1.3, -10], [0.002, 0.001, 1]])\n        >>> transformed_bboxes = perspective_bboxes(bboxes, image_shape, matrix, 150, 150, False)\n    \"\"\"\n    height, width = image_shape[:2]\n    transformed_bboxes = bboxes.copy()\n    denormalized_coords = denormalize_bboxes(bboxes[:, :4], image_shape)\n\n    x_min, y_min, x_max, y_max = denormalized_coords.T\n    points = np.array(\n        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],\n    ).transpose(2, 0, 1)\n    points_reshaped = points.reshape(-1, 1, 2)\n\n    transformed_points = cv2.perspectiveTransform(\n        points_reshaped.astype(np.float32),\n        matrix,\n    )\n    transformed_points = transformed_points.reshape(-1, 4, 2)\n\n    new_coords = np.array(\n        [[np.min(box[:, 0]), np.min(box[:, 1]), np.max(box[:, 0]), np.max(box[:, 1])] for box in transformed_points],\n    )\n\n    if keep_size:\n        scale_x, scale_y = width / max_width, height / max_height\n        new_coords[:, [0, 2]] *= scale_x\n        new_coords[:, [1, 3]] *= scale_y\n        output_shape = image_shape\n    else:\n        output_shape = (max_height, max_width)\n\n    normalized_coords = normalize_bboxes(new_coords, output_shape)\n    transformed_bboxes[:, :4] = normalized_coords\n\n    return transformed_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.perspective_keypoints","title":"def perspective_keypoints (keypoints, image_shape, matrix, max_width, max_height, keep_size) [view source on GitHub]","text":"

Apply perspective transformation to keypoints.

Parameters:

  • keypoints (np.ndarray): Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].
  • image_shape (tuple[int, int]): Original image shape (height, width).
  • matrix (np.ndarray): 3x3 perspective transformation matrix.
  • max_width (int): Maximum width after transformation.
  • max_height (int): Maximum height after transformation.
  • keep_size (bool): Whether to keep the original size.

Returns:

  • np.ndarray: Transformed keypoints array with the same shape as input. The Z coordinate remains unchanged through the transformation.

Source code in albumentations/augmentations/geometric/functional.py Python
@handle_empty_array(\"keypoints\")\n@angle_2pi_range\ndef perspective_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    matrix: np.ndarray,\n    max_width: int,\n    max_height: int,\n    keep_size: bool,\n) -> np.ndarray:\n    \"\"\"Apply perspective transformation to keypoints.\n\n    Args:\n        keypoints: Array of shape (N, 5+) in format [x, y, z, angle, scale, ...].\n        image_shape: Original image shape (height, width).\n        matrix: 3x3 perspective transformation matrix.\n        max_width: Maximum width after transformation.\n        max_height: Maximum height after transformation.\n        keep_size: Whether to keep original size.\n\n    Returns:\n        Transformed keypoints array with same shape as input.\n        Z coordinate remains unchanged through the transformation.\n    \"\"\"\n    keypoints = keypoints.copy().astype(np.float32)\n\n    height, width = image_shape[:2]\n\n    x, y, z, angle, scale = (\n        keypoints[:, 0],\n        keypoints[:, 1],\n        keypoints[:, 2],\n        keypoints[:, 3],\n        keypoints[:, 4],\n    )\n\n    # Reshape keypoints for perspective transform\n    keypoint_vector = np.column_stack((x, y)).astype(np.float32).reshape(-1, 1, 2)\n\n    # Apply perspective transform\n    transformed_points = cv2.perspectiveTransform(keypoint_vector, matrix).squeeze()\n\n    # Unsqueeze if we have a single keypoint\n    if transformed_points.ndim == 1:\n        transformed_points = transformed_points[np.newaxis, :]\n\n    x, y = transformed_points[:, 0], transformed_points[:, 1]\n\n    # Update angles\n    angle += rotation2d_matrix_to_euler_angles(matrix[:2, :2], y_up=True)\n\n    # Calculate scale factors\n    scale_x = np.sign(matrix[0, 0]) * np.sqrt(matrix[0, 0] ** 2 + matrix[0, 1] ** 2)\n    scale_y = np.sign(matrix[1, 1]) * np.sqrt(matrix[1, 0] ** 2 + matrix[1, 1] ** 2)\n    scale *= max(scale_x, scale_y)\n\n    if keep_size:\n        scale_x = width / max_width\n        scale_y = height / max_height\n        x *= scale_x\n        y *= scale_y\n        scale *= max(scale_x, scale_y)\n\n    # Create the output array with unchanged z coordinate\n    transformed_keypoints = np.column_stack([x, y, z, angle, scale])\n\n    # If there are additional columns, preserve them\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack(\n            [\n                transformed_keypoints,\n                keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:],\n            ],\n        )\n\n    return transformed_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.rotation2d_matrix_to_euler_angles","title":"def rotation2d_matrix_to_euler_angles (matrix, y_up) [view source on GitHub]","text":"

Extract the rotation angle (in radians) from a 2D rotation matrix.

Parameters:

  • matrix (np.ndarray): Rotation matrix.
  • y_up (bool): Whether the Y axis points up (True) or down (False).

Returns:

  • float: The rotation angle in radians.

Source code in albumentations/augmentations/geometric/functional.py Python
def rotation2d_matrix_to_euler_angles(matrix: np.ndarray, y_up: bool) -> float:\n    \"\"\"Args:\n    matrix (np.ndarray): Rotation matrix\n    y_up (bool): is Y axis looks up or down\n\n    \"\"\"\n    if y_up:\n        return np.arctan2(matrix[1, 0], matrix[0, 0])\n    return np.arctan2(-matrix[1, 0], matrix[0, 0])\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.shift_bboxes","title":"def shift_bboxes (bboxes, shift_vector) [view source on GitHub]","text":"

Shift bounding boxes by a given vector.

Parameters:

  • bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where n is the number of bboxes and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].
  • shift_vector (np.ndarray): Vector to shift the bounding boxes by, with shape (4,) for [shift_x, shift_y, shift_x, shift_y].

Returns:

  • np.ndarray: Shifted bounding boxes with the same shape as input.
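Illustrative sketch (not from the original docs):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> bboxes = np.array([[10.0, 20.0, 50.0, 60.0, 1.0]])   # the trailing label column is preserved
>>> shift = np.array([5.0, -5.0, 5.0, -5.0])             # [shift_x, shift_y, shift_x, shift_y]
>>> shifted = fgeometric.shift_bboxes(bboxes, shift)
>>> # -> [[15., 15., 55., 55., 1.]]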

Source code in albumentations/augmentations/geometric/functional.py Python
def shift_bboxes(bboxes: np.ndarray, shift_vector: np.ndarray) -> np.ndarray:\n    \"\"\"Shift bounding boxes by a given vector.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, m) where n is the number of bboxes\n                             and m >= 4. The first 4 columns are [x_min, y_min, x_max, y_max].\n        shift_vector (np.ndarray): Vector to shift the bounding boxes by, with shape (4,) for\n                                   [shift_x, shift_y, shift_x, shift_y].\n\n    Returns:\n        np.ndarray: Shifted bounding boxes with the same shape as input.\n    \"\"\"\n    # Create a copy of the input array to avoid modifying it in-place\n    shifted_bboxes = bboxes.copy()\n\n    # Add the shift vector to the first 4 columns\n    shifted_bboxes[:, :4] += shift_vector\n\n    return shifted_bboxes\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.shuffle_tiles_within_shape_groups","title":"def shuffle_tiles_within_shape_groups (shape_groups, random_generator) [view source on GitHub]","text":"

Shuffles indices within each group of similar shapes and creates a list where each index points to the index of the tile it should be mapped to.

Parameters:

  • shape_groups (dict[tuple[int, int], list[int]]): Groups of tile indices categorized by shape.
  • random_generator (np.random.Generator): The random generator to use for shuffling the indices. If None, a new random generator will be used.

Returns:

  • list[int]: A list where each index is mapped to the new index of the tile after shuffling.
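A minimal sketch (illustrative; tile indices and shapes are made up for the example):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> rng = np.random.default_rng(0)
>>> # Tiles 0-3 share one shape; tile 4 has a different shape and can only map to itself.
>>> shape_groups = {(50, 50): [0, 1, 2, 3], (50, 20): [4]}
>>> mapping = fgeometric.shuffle_tiles_within_shape_groups(shape_groups, rng)
>>> # `mapping` is a permutation of [0, 1, 2, 3, 4] in which position 4 stays 4.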

Source code in albumentations/augmentations/geometric/functional.py Python
def shuffle_tiles_within_shape_groups(\n    shape_groups: dict[tuple[int, int], list[int]],\n    random_generator: np.random.Generator,\n) -> list[int]:\n    \"\"\"Shuffles indices within each group of similar shapes and creates a list where each\n    index points to the index of the tile it should be mapped to.\n\n    Args:\n        shape_groups (dict[tuple[int, int], list[int]]): Groups of tile indices categorized by shape.\n        random_generator (np.random.Generator): The random generator to use for shuffling the indices.\n            If None, a new random generator will be used.\n\n    Returns:\n        list[int]: A list where each index is mapped to the new index of the tile after shuffling.\n    \"\"\"\n    # Initialize the output list with the same size as the total number of tiles, filled with -1\n    num_tiles = sum(len(indices) for indices in shape_groups.values())\n    mapping = [-1] * num_tiles\n\n    # Prepare the random number generator\n\n    for indices in shape_groups.values():\n        shuffled_indices = indices.copy()\n        random_generator.shuffle(shuffled_indices)\n\n        for old, new in zip(indices, shuffled_indices):\n            mapping[old] = new\n\n    return mapping\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.split_uniform_grid","title":"def split_uniform_grid (image_shape, grid, random_generator) [view source on GitHub]","text":"

Splits an image shape into a uniform grid specified by the grid dimensions.

Parameters:

  • image_shape (tuple[int, int]): The shape of the image as (height, width).
  • grid (tuple[int, int]): The grid size as (rows, columns).
  • random_generator (np.random.Generator): The random generator to use for shuffling the splits. If None, the splits are not shuffled.

Returns:

  • np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).

Note

The function uses generate_shuffled_splits to generate the splits for the height and width of the image. The splits are then used to calculate the coordinates of the tiles.
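A short usage sketch (illustrative, assuming the module import path from the source note below):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> rng = np.random.default_rng(0)
>>> tiles = fgeometric.split_uniform_grid(image_shape=(100, 200), grid=(2, 2), random_generator=rng)
>>> # `tiles` has shape (4, 4); each row is (start_y, start_x, end_y, end_x) and the
>>> # four tiles exactly cover the 100x200 image.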

Source code in albumentations/augmentations/geometric/functional.py Python
def split_uniform_grid(\n    image_shape: tuple[int, int],\n    grid: tuple[int, int],\n    random_generator: np.random.Generator,\n) -> np.ndarray:\n    \"\"\"Splits an image shape into a uniform grid specified by the grid dimensions.\n\n    Args:\n        image_shape (tuple[int, int]): The shape of the image as (height, width).\n        grid (tuple[int, int]): The grid size as (rows, columns).\n        random_generator (np.random.Generator): The random generator to use for shuffling the splits.\n            If None, the splits are not shuffled.\n\n    Returns:\n        np.ndarray: An array containing the tiles' coordinates in the format (start_y, start_x, end_y, end_x).\n\n    Note:\n        The function uses `generate_shuffled_splits` to generate the splits for the height and width of the image.\n        The splits are then used to calculate the coordinates of the tiles.\n    \"\"\"\n    n_rows, n_cols = grid\n\n    height_splits = generate_shuffled_splits(\n        image_shape[0],\n        grid[0],\n        random_generator=random_generator,\n    )\n    width_splits = generate_shuffled_splits(\n        image_shape[1],\n        grid[1],\n        random_generator=random_generator,\n    )\n\n    # Calculate tiles coordinates\n    tiles = [\n        (height_splits[i], width_splits[j], height_splits[i + 1], width_splits[j + 1])\n        for i in range(n_rows)\n        for j in range(n_cols)\n    ]\n\n    return np.array(tiles, dtype=np.int16)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.swap_tiles_on_image","title":"def swap_tiles_on_image (image, tiles, mapping=None) [view source on GitHub]","text":"

Swap tiles on the image according to the new format.

Parameters:

  • image (np.ndarray): Input image.
  • tiles (np.ndarray): Array of tiles with each tile as [start_y, start_x, end_y, end_x].
  • mapping (list[int] | None): List of new tile indices.

Returns:

  • np.ndarray: Output image with tiles swapped according to the mapping. If no tiles or no mapping are provided, a copy of the original image is returned.
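A small sketch (illustrative; the tiles and mapping are hand-written here rather than produced by the grid helpers above):

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> image = np.arange(16, dtype=np.uint8).reshape(4, 4)
>>> # Four 2x2 tiles given as (start_y, start_x, end_y, end_x).
>>> tiles = np.array([[0, 0, 2, 2], [0, 2, 2, 4], [2, 0, 4, 2], [2, 2, 4, 4]])
>>> mapping = [1, 0, 3, 2]   # swap the two tiles in each row of the grid
>>> swapped = fgeometric.swap_tiles_on_image(image, tiles, mapping)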

Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_image(\n    image: np.ndarray,\n    tiles: np.ndarray,\n    mapping: list[int] | None = None,\n) -> np.ndarray:\n    \"\"\"Swap tiles on the image according to the new format.\n\n    Args:\n        image: Input image.\n        tiles: Array of tiles with each tile as [start_y, start_x, end_y, end_x].\n        mapping: list of new tile indices.\n\n    Returns:\n        np.ndarray: Output image with tiles swapped according to the random shuffle.\n    \"\"\"\n    # If no tiles are provided, return a copy of the original image\n    if tiles.size == 0 or mapping is None:\n        return image.copy()\n\n    # Create a copy of the image to retain original for reference\n    new_image = np.empty_like(image)\n    for num, new_index in enumerate(mapping):\n        start_y, start_x, end_y, end_x = tiles[new_index]\n        start_y_orig, start_x_orig, end_y_orig, end_x_orig = tiles[num]\n        # Assign the corresponding tile from the original image to the new image\n        new_image[start_y:end_y, start_x:end_x] = image[\n            start_y_orig:end_y_orig,\n            start_x_orig:end_x_orig,\n        ]\n\n    return new_image\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.swap_tiles_on_keypoints","title":"def swap_tiles_on_keypoints (keypoints, tiles, mapping) [view source on GitHub]","text":"

Swap the positions of keypoints based on a tile mapping.

This function takes a set of keypoints and repositions them according to a mapping of tile swaps. Keypoints are moved from their original tiles to new positions in the swapped tiles.

Parameters:

  • keypoints (np.ndarray): A 2D numpy array of shape (N, 2) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.
  • tiles (np.ndarray): A 2D numpy array of shape (M, 4) where M is the number of tiles. Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.
  • mapping (np.ndarray): A 1D numpy array of shape (M,) where M is the number of tiles. Each element i contains the index of the tile that tile i should be swapped with.

Returns:

  • np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions of the keypoints after the tile swap.

Exceptions:

  • RuntimeWarning: If any keypoint is not found within any tile.

Notes

  • Keypoints that do not fall within any tile will remain unchanged.
  • The function assumes that the tiles do not overlap and cover the entire image space.
Source code in albumentations/augmentations/geometric/functional.py Python
def swap_tiles_on_keypoints(\n    keypoints: np.ndarray,\n    tiles: np.ndarray,\n    mapping: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Swap the positions of keypoints based on a tile mapping.\n\n    This function takes a set of keypoints and repositions them according to a mapping of tile swaps.\n    Keypoints are moved from their original tiles to new positions in the swapped tiles.\n\n    Args:\n        keypoints (np.ndarray): A 2D numpy array of shape (N, 2) where N is the number of keypoints.\n                                Each row represents a keypoint's (x, y) coordinates.\n        tiles (np.ndarray): A 2D numpy array of shape (M, 4) where M is the number of tiles.\n                            Each row represents a tile's (start_y, start_x, end_y, end_x) coordinates.\n        mapping (np.ndarray): A 1D numpy array of shape (M,) where M is the number of tiles.\n                              Each element i contains the index of the tile that tile i should be swapped with.\n\n    Returns:\n        np.ndarray: A 2D numpy array of the same shape as the input keypoints, containing the new positions\n                    of the keypoints after the tile swap.\n\n    Raises:\n        RuntimeWarning: If any keypoint is not found within any tile.\n\n    Notes:\n        - Keypoints that do not fall within any tile will remain unchanged.\n        - The function assumes that the tiles do not overlap and cover the entire image space.\n    \"\"\"\n    if not keypoints.size:\n        return keypoints\n\n    # Broadcast keypoints and tiles for vectorized comparison\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    start_y, start_x, end_y, end_x = tiles.T  # Each shape: (num_tiles,)\n\n    # Check if each keypoint is inside each tile\n    in_tile = (kp_y >= start_y) & (kp_y < end_y) & (kp_x >= start_x) & (kp_x < end_x)\n\n    # Find which tile each keypoint belongs to\n    tile_indices = np.argmax(in_tile, axis=1)\n\n    # Check if any keypoint is not in any tile\n    not_in_any_tile = ~np.any(in_tile, axis=1)\n    if np.any(not_in_any_tile):\n        warn(\n            \"Some keypoints are not in any tile. They will be returned unchanged. This is unexpected and should be \"\n            \"investigated.\",\n            RuntimeWarning,\n            stacklevel=2,\n        )\n\n    # Get the new tile indices\n    new_tile_indices = np.array(mapping)[tile_indices]\n\n    # Calculate the offsets\n    old_start_x = tiles[tile_indices, 1]\n    old_start_y = tiles[tile_indices, 0]\n    new_start_x = tiles[new_tile_indices, 1]\n    new_start_y = tiles[new_tile_indices, 0]\n\n    # Apply the transformation\n    new_keypoints = keypoints.copy()\n    new_keypoints[:, 0] = (keypoints[:, 0] - old_start_x) + new_start_x\n    new_keypoints[:, 1] = (keypoints[:, 1] - old_start_y) + new_start_y\n\n    # Keep original coordinates for keypoints not in any tile\n    new_keypoints[not_in_any_tile] = keypoints[not_in_any_tile]\n\n    return new_keypoints\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.to_distance_maps","title":"def to_distance_maps (keypoints, image_shape, inverted=False) [view source on GitHub]","text":"

Generate a (H,W,N) array of distance maps for N keypoints.

The n-th distance map contains at every location (y, x) the euclidean distance to the n-th keypoint.

This function can be used as a helper when augmenting keypoints with a method that only supports the augmentation of images.

Parameters:

  • keypoints (np.ndarray): A numpy array of shape (N, 2+) where N is the number of keypoints. Each row represents a keypoint's (x, y) coordinates.
  • image_shape (tuple[int, int]): Shape of the image (height, width).
  • inverted (bool): If True, inverted distance maps are returned where each distance value d is replaced by d/(d+1), i.e. the distance maps have values in the range (0.0, 1.0] with 1.0 denoting exactly the position of the respective keypoint.

Returns:

  • np.ndarray: A float32 array of shape (H, W, N) containing N distance maps for N keypoints. Each location (y, x, n) in the array denotes the euclidean distance at (y, x) to the n-th keypoint. If inverted is True, the distance d is replaced by d/(d+1). The height and width of the array match the height and width in image_shape.
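Illustrative sketch:

Python
>>> import numpy as np
>>> from albumentations.augmentations.geometric import functional as fgeometric
>>> keypoints = np.array([[3.0, 2.0], [7.0, 5.0]])   # (x, y) pairs
>>> maps = fgeometric.to_distance_maps(keypoints, image_shape=(10, 10), inverted=True)
>>> maps.shape
(10, 10, 2)
>>> # maps[2, 3, 0] == 1.0: the inverted map peaks at the first keypoint's (y, x) location.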

Source code in albumentations/augmentations/geometric/functional.py Python
def to_distance_maps(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n    inverted: bool = False,\n) -> np.ndarray:\n    \"\"\"Generate a ``(H,W,N)`` array of distance maps for ``N`` keypoints.\n\n    The ``n``-th distance map contains at every location ``(y, x)`` the\n    euclidean distance to the ``n``-th keypoint.\n\n    This function can be used as a helper when augmenting keypoints with a\n    method that only supports the augmentation of images.\n\n    Args:\n        keypoints: A numpy array of shape (N, 2+) where N is the number of keypoints.\n                   Each row represents a keypoint's (x, y) coordinates.\n        image_shape: tuple[int, int] shape of the image (height, width)\n        inverted (bool): If ``True``, inverted distance maps are returned where each\n            distance value d is replaced by ``d/(d+1)``, i.e. the distance\n            maps have values in the range ``(0.0, 1.0]`` with ``1.0`` denoting\n            exactly the position of the respective keypoint.\n\n    Returns:\n        np.ndarray: A ``float32`` array of shape (H, W, N) containing ``N`` distance maps for ``N``\n            keypoints. Each location ``(y, x, n)`` in the array denotes the\n            euclidean distance at ``(y, x)`` to the ``n``-th keypoint.\n            If `inverted` is ``True``, the distance ``d`` is replaced\n            by ``d/(d+1)``. The height and width of the array match the\n            height and width in ``image_shape``.\n    \"\"\"\n    height, width = image_shape[:2]\n    if len(keypoints) == 0:\n        return np.zeros((height, width, 0), dtype=np.float32)\n\n    # Create coordinate grids\n    yy, xx = np.mgrid[:height, :width]\n\n    # Convert keypoints to numpy array\n    keypoints_array = np.array(keypoints)\n\n    # Compute distances for all keypoints at once\n    distances = np.sqrt(\n        (xx[..., np.newaxis] - keypoints_array[:, 0]) ** 2 + (yy[..., np.newaxis] - keypoints_array[:, 1]) ** 2,\n    )\n\n    if inverted:\n        return (1 / (distances + 1)).astype(np.float32)\n    return distances.astype(np.float32)\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.tps_transform","title":"def tps_transform (target_points, control_points, nonlinear_weights, affine_weights) [view source on GitHub]","text":"

Apply Thin Plate Spline transformation to points.

Parameters:

Name Type Description target_points np.ndarray

Points to transform with shape (num_targets, 2)

control_points np.ndarray

Original control points with shape (num_controls, 2)

nonlinear_weights np.ndarray

TPS kernel weights with shape (num_controls, 2)

affine_weights np.ndarray

Affine transformation weights with shape (3, 2)

Returns:

Type Description np.ndarray

Transformed points with shape (num_targets, 2)

Note

The transformation combines: 1. Nonlinear warping based on distances to control points 2. Global affine transformation (scale, rotation, translation)

Source code in albumentations/augmentations/geometric/functional.py Python
def tps_transform(\n    target_points: np.ndarray,\n    control_points: np.ndarray,\n    nonlinear_weights: np.ndarray,\n    affine_weights: np.ndarray,\n) -> np.ndarray:\n    \"\"\"Apply Thin Plate Spline transformation to points.\n\n    Args:\n        target_points: Points to transform with shape (num_targets, 2)\n        control_points: Original control points with shape (num_controls, 2)\n        nonlinear_weights: TPS kernel weights with shape (num_controls, 2)\n        affine_weights: Affine transformation weights with shape (3, 2)\n\n    Returns:\n        Transformed points with shape (num_targets, 2)\n\n    Note:\n        The transformation combines:\n        1. Nonlinear warping based on distances to control points\n        2. Global affine transformation (scale, rotation, translation)\n    \"\"\"\n    # Compute all pairwise distances at once: (num_targets, num_controls)\n    distances = np.linalg.norm(target_points[:, None] - control_points, axis=2)\n\n    # Apply TPS kernel function: U(r) = r\u00b2 log(r)\n    kernel_matrix = np.where(\n        distances > 0,\n        distances * distances * np.log(distances + 1e-6),\n        0,\n    )\n\n    # Prepare affine terms [1, x, y] for each point\n    affine_terms = np.c_[np.ones(len(target_points)), target_points]\n\n    # Combine nonlinear and affine transformations\n    return kernel_matrix @ nonlinear_weights + affine_terms @ affine_weights\n
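As a sanity-check sketch (assumed inputs, not a full TPS fit): with zero nonlinear weights and an affine block that maps [1, x, y] to (x, y), the transform reduces to the identity, which makes the roles of the four arguments easy to see.

Python
import numpy as np

from albumentations.augmentations.geometric.functional import tps_transform

target_points = np.array([[0.0, 0.0], [5.0, 2.0], [3.0, 7.0]])
control_points = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])

nonlinear_weights = np.zeros((len(control_points), 2))           # no warping contribution
affine_weights = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # [1, x, y] @ A = (x, y)

out = tps_transform(target_points, control_points, nonlinear_weights, affine_weights)
print(np.allclose(out, target_points))  # True: the transform is the identity here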
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.transpose","title":"def transpose (img) [view source on GitHub]","text":"

Transposes the first two dimensions of an array of any dimensionality. Retains the order of any additional dimensions.

Parameters:

Name Type Description img np.ndarray

Input array.

Returns:

Type Description np.ndarray

Transposed array.

Source code in albumentations/augmentations/geometric/functional.py Python
def transpose(img: np.ndarray) -> np.ndarray:\n    \"\"\"Transposes the first two dimensions of an array of any dimensionality.\n    Retains the order of any additional dimensions.\n\n    Args:\n        img (np.ndarray): Input array.\n\n    Returns:\n        np.ndarray: Transposed array.\n    \"\"\"\n    # Generate the new axes order\n    new_axes = list(range(img.ndim))\n    new_axes[0], new_axes[1] = 1, 0  # Swap the first two dimensions\n\n    # Transpose the array using the new axes order\n    return img.transpose(new_axes)\n
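A one-line sketch of the behaviour: only the first two axes are swapped, while trailing axes (e.g. channels) keep their order.

Python
import numpy as np

from albumentations.augmentations.geometric.functional import transpose

img = np.zeros((100, 200, 3), dtype=np.uint8)
print(transpose(img).shape)  # (200, 100, 3): height and width swapped, channels untouched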
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_bboxes","title":"def validate_bboxes (bboxes, image_shape) [view source on GitHub]","text":"

Validate bounding boxes and remove invalid ones.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid bounding boxes, potentially with fewer boxes than the input.

Examples:

Python
>>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n>>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n>>> print(valid_bboxes)\n[[10 20 30 40]]\n
Source code in albumentations/augmentations/geometric/functional.py Python
def validate_bboxes(bboxes: np.ndarray, image_shape: Sequence[int]) -> np.ndarray:\n    \"\"\"Validate bounding boxes and remove invalid ones.\n\n    Args:\n        bboxes (np.ndarray): Array of bounding boxes with shape (n, 4) where each row is [x_min, y_min, x_max, y_max].\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid bounding boxes, potentially with fewer boxes than the input.\n\n    Example:\n        >>> bboxes = np.array([[10, 20, 30, 40], [-10, -10, 5, 5], [100, 100, 120, 120]])\n        >>> valid_bboxes = validate_bboxes(bboxes, (100, 100))\n        >>> print(valid_bboxes)\n        [[10 20 30 40]]\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]\n\n    valid_indices = (x_max > 0) & (y_max > 0) & (x_min < cols) & (y_min < rows)\n\n    return bboxes[valid_indices]\n
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_if_not_found_coords","title":"def validate_if_not_found_coords (if_not_found_coords) [view source on GitHub]","text":"

Validate and process if_not_found_coords parameter.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_if_not_found_coords(\n    if_not_found_coords: Sequence[int] | dict[str, Any] | None,\n) -> tuple[bool, float, float]:\n    \"\"\"Validate and process `if_not_found_coords` parameter.\"\"\"\n    if if_not_found_coords is None:\n        return True, -1, -1\n    if isinstance(if_not_found_coords, (tuple, list)):\n        if len(if_not_found_coords) != PAIR:\n            msg = \"Expected tuple/list 'if_not_found_coords' to contain exactly two entries.\"\n            raise ValueError(msg)\n        return False, if_not_found_coords[0], if_not_found_coords[1]\n    if isinstance(if_not_found_coords, dict):\n        return False, if_not_found_coords[\"x\"], if_not_found_coords[\"y\"]\n\n    msg = \"Expected if_not_found_coords to be None, tuple, list, or dict.\"\n    raise ValueError(msg)\n
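A short sketch of the three accepted forms, based on the source above (the leading boolean in the returned tuple marks the "no placeholder coordinates" case):

Python
from albumentations.augmentations.geometric.functional import validate_if_not_found_coords

print(validate_if_not_found_coords(None))                # (True, -1, -1)
print(validate_if_not_found_coords((32, 64)))            # (False, 32, 64)
print(validate_if_not_found_coords({"x": 32, "y": 64}))  # (False, 32, 64)
# A tuple/list with a length other than 2, or any other type, raises ValueError.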
"},{"location":"api_reference/augmentations/geometric/functional/#albumentations.augmentations.geometric.functional.validate_keypoints","title":"def validate_keypoints (keypoints, image_shape) [view source on GitHub]","text":"

Validate keypoints and remove those that fall outside the image boundaries.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, M) where N is the number of keypoints and M >= 2. The first two columns represent x and y coordinates.

image_shape tuple[int, int]

Shape of the image as (height, width).

Returns:

Type Description np.ndarray

Array of valid keypoints that fall within the image boundaries.

Note

This function only checks the x and y coordinates (first two columns) of the keypoints. Any additional columns (e.g., angle, scale) are preserved for valid keypoints.

Source code in albumentations/augmentations/geometric/functional.py Python
def validate_keypoints(\n    keypoints: np.ndarray,\n    image_shape: tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Validate keypoints and remove those that fall outside the image boundaries.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, M) where N is the number of keypoints\n                                and M >= 2. The first two columns represent x and y coordinates.\n        image_shape (tuple[int, int]): Shape of the image as (height, width).\n\n    Returns:\n        np.ndarray: Array of valid keypoints that fall within the image boundaries.\n\n    Note:\n        This function only checks the x and y coordinates (first two columns) of the keypoints.\n        Any additional columns (e.g., angle, scale) are preserved for valid keypoints.\n    \"\"\"\n    rows, cols = image_shape[:2]\n\n    x, y = keypoints[:, 0], keypoints[:, 1]\n\n    valid_indices = (x >= 0) & (x < cols) & (y >= 0) & (y < rows)\n\n    return keypoints[valid_indices]\n
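A minimal sketch of the filtering behaviour: points with x or y outside [0, width) x [0, height) are dropped, and any extra columns are carried through unchanged.

Python
import numpy as np

from albumentations.augmentations.geometric.functional import validate_keypoints

keypoints = np.array(
    [
        [10.0, 20.0, 0.0, 1.0],   # inside a 100x100 image -> kept
        [-5.0, 50.0, 0.0, 1.0],   # x < 0 -> dropped
        [40.0, 120.0, 0.0, 1.0],  # y >= height -> dropped
    ],
)
print(validate_keypoints(keypoints, image_shape=(100, 100)))
# [[10. 20.  0.  1.]]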
"},{"location":"api_reference/augmentations/geometric/resize/","title":"Resizing transforms (augmentations.geometric.resize)","text":""},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.LongestMaxSize","title":"class LongestMaxSize [view source on GitHub]","text":"

Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

Name Type Description max_size int, Sequence[int]

Maximum size of the longest side after the transformation. When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.

max_size_hw tuple[int | None, int | None]

Maximum (height, width) constraints. Supports: - (height, width): Both dimensions must fit within these bounds - (height, None): Only height is constrained, width scales proportionally - (None, width): Only width is constrained, height scales proportionally If specified, max_size must be None. Default: None.

interpolation OpenCV flag

interpolation method. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the longest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / max(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = min(H_target/H, W_target/W)\n       This ensures both dimensions fit within the specified bounds.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import albumentations as A\n>>> import cv2\n>>> # Using max_size\n>>> transform1 = A.LongestMaxSize(max_size=1024)\n>>> # Input image (1500, 800) -> Output (1024, 546)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>> # Input (800, 1500) -> Output (546, 1024)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n>>> # Input (1500, 800) -> Output (800, 427)\n>>>\n>>> # Common use case with padding\n>>> transform4 = A.Compose([\n...     A.LongestMaxSize(max_size=1024),\n...     A.PadIfNeeded(min_height=1024, min_width=1024),\n... ])\n
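The scaling rule above can be checked end to end with a short sketch (the exact output size may differ by a pixel because of rounding inside the transform):

Python
import numpy as np
import albumentations as A

image = np.zeros((1500, 800, 3), dtype=np.uint8)  # (H, W) = (1500, 800)

out = A.LongestMaxSize(max_size=1024, p=1.0)(image=image)["image"]
print(out.shape[:2])  # (1024, 546): s = 1024 / 1500, so W' = round(800 * s) = 546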

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class LongestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that the longest side is equal to max_size or sides meet max_size_hw constraints,\n        keeping the aspect ratio.\n\n    Args:\n        max_size (int, Sequence[int], optional): Maximum size of the longest side after the transformation.\n            When using a list or tuple, the max size will be randomly selected from the values provided. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must fit within these bounds\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): interpolation method. Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the longest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be smaller than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / max(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = min(H_target/H, W_target/W)\n               This ensures both dimensions fit within the specified bounds.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import albumentations as A\n        >>> import cv2\n        >>> # Using max_size\n        >>> transform1 = A.LongestMaxSize(max_size=1024)\n        >>> # Input image (1500, 800) -> Output (1024, 546)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.LongestMaxSize(max_size_hw=(800, 1024))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>> # Input (800, 1500) -> Output (546, 1024)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.LongestMaxSize(max_size_hw=(800, None))\n        >>> # Input (1500, 800) -> Output (800, 427)\n        >>>\n        >>> # Common use case with padding\n        >>> transform4 = A.Compose([\n        ...     A.LongestMaxSize(max_size=1024),\n        ...     
A.PadIfNeeded(min_height=1024, min_width=1024),\n        ... ])\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / max(img_h, img_w)\n        elif self.max_size_hw is not None:\n            # We know max_size_hw is not None here due to model validator\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on longest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = min(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.MaxSizeTransform","title":"class MaxSizeTransform (max_size=1024, max_size_hw=None, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Base class for transforms that resize based on maximum size constraints.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class MaxSizeTransform(DualTransform):\n    \"\"\"Base class for transforms that resize based on maximum size constraints.\"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        max_size: int | list[int] | None\n        max_size_hw: tuple[int | None, int | None] | None\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def validate_size_parameters(self) -> Self:\n            if self.max_size is None and self.max_size_hw is None:\n                raise ValueError(\"Either max_size or max_size_hw must be specified\")\n            if self.max_size is not None and self.max_size_hw is not None:\n                raise ValueError(\"Only one of max_size or max_size_hw should be specified\")\n            return self\n\n    def __init__(\n        self,\n        max_size: int | Sequence[int] | None = 1024,\n        max_size_hw: tuple[int | None, int | None] | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.max_size = max_size\n        self.max_size_hw = max_size_hw\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = img.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(img, (new_height, new_width), interpolation=self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        height, width = mask.shape[:2]\n        new_height, new_width = max(1, round(height * scale)), max(1, round(width * scale))\n        return fgeometric.resize(mask, (new_height, new_width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(images, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volume, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(volumes, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, 
*args: Any, **params: Any) -> np.ndarray:\n        return self.apply_to_mask(masks3d, *args, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"max_size\", \"max_size_hw\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.RandomScale","title":"class RandomScale (scale_limit=(-0.1, 0.1), interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly resize the input. Output image size is different from the input image size.

Parameters:

Name Type Description scale_limit float or tuple[float, float]

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1).

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The output image size is different from the input image size.
  • A single scale factor is sampled per call and applied to both width and height, so the aspect ratio is preserved.
  • Bounding box coordinates are scaled accordingly.
  • Keypoint coordinates are scaled accordingly.

Mathematical formulation: Let (W, H) be the original image dimensions and (W', H') be the output dimensions. The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]]. Then, W' = W * s and H' = H * s.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n>>> result = transform(image=image)\n>>> scaled_image = result['image']\n# scaled_image will have dimensions in the range [90, 110] x [90, 110]\n# (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n
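A short sketch of the "+1 bias" described above: scale_limit=(-0.1, 0.1) samples a factor from [0.9, 1.1], so each side of a 100x100 input ends up between roughly 90 and 110 pixels.

Python
import numpy as np
import albumentations as A

image = np.zeros((100, 100, 3), dtype=np.uint8)
transform = A.RandomScale(scale_limit=(-0.1, 0.1), p=1.0)

out = transform(image=image)["image"]
print(out.shape[:2])  # both sides fall in roughly [90, 110], depending on the sampled factor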

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class RandomScale(DualTransform):\n    \"\"\"Randomly resize the input. Output image size is different from the input image size.\n\n    Args:\n        scale_limit (float or tuple[float, float]): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The output image size is different from the input image size.\n        - Scale factor is sampled independently per image side (width and height).\n        - Bounding box coordinates are scaled accordingly.\n        - Keypoint coordinates are scaled accordingly.\n\n    Mathematical formulation:\n        Let (W, H) be the original image dimensions and (W', H') be the output dimensions.\n        The scale factor s is sampled from the range [1 + scale_limit[0], 1 + scale_limit[1]].\n        Then, W' = W * s and H' = H * s.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.RandomScale(scale_limit=0.1, p=1.0)\n        >>> result = transform(image=image)\n        >>> scaled_image = result['image']\n        # scaled_image will have dimensions in the range [90, 110] x [90, 110]\n        # (assuming the scale_limit of 0.1 results in a scaling factor between 0.9 and 1.1)\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale_limit: ScaleFloatType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(cls, v: ScaleFloatType) -> tuple[float, float]:\n            return to_tuple(v, bias=1.0)\n\n    def __init__(\n        self,\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def get_params(self) -> dict[str, float]:\n        return {\"scale\": self.py_random.uniform(*self.scale_limit)}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(img, scale, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n  
      scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.scale(mask, scale, self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        scale: float,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_scale(keypoints, scale, scale)\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"interpolation\": self.interpolation,\n            \"mask_interpolation\": self.mask_interpolation,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n        }\n
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.Resize","title":"class Resize (height, width, interpolation=1, mask_interpolation=0, p=1, always_apply=None) [view source on GitHub]","text":"

Resize the input to the given height and width.

Parameters:

Name Type Description height int

desired height of the output.

width int

desired width of the output.

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class Resize(DualTransform):\n    \"\"\"Resize the input to the given height and width.\n\n    Args:\n        height (int): desired height of the output.\n        width (int): desired width of the output.\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        height: int = Field(ge=1)\n        width: int = Field(ge=1)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        height: int,\n        width: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.height = height\n        self.width = width\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(img, (self.height, self.width), interpolation=self.interpolation)\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.resize(mask, (self.height, self.width), interpolation=self.mask_interpolation)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        # Bounding box coordinates are scale invariant\n        return bboxes\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        height, width = params[\"shape\"][:2]\n        scale_x = self.width / width\n        scale_y = self.height / height\n        return fgeometric.keypoints_scale(keypoints, scale_x, scale_y)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"height\", \"width\", \"interpolation\", \"mask_interpolation\"\n
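A minimal usage sketch (hypothetical sizes): the output spatial size is fixed regardless of the input, and keypoints are rescaled by width/W and height/H as in apply_to_keypoints above.

Python
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.Resize(height=256, width=512, p=1.0)],
    keypoint_params=A.KeypointParams(format="xy"),
)

image = np.zeros((100, 200, 3), dtype=np.uint8)
out = transform(image=image, keypoints=[(100, 50)])

print(out["image"].shape)  # (256, 512, 3)
print(out["keypoints"])    # x scaled by 512/200 -> 256, y scaled by 256/100 -> 128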
"},{"location":"api_reference/augmentations/geometric/resize/#albumentations.augmentations.geometric.resize.SmallestMaxSize","title":"class SmallestMaxSize [view source on GitHub]","text":"

Rescale an image so that the smallest side is equal to max_size or sides meet max_size_hw constraints, keeping the aspect ratio.

Parameters:

Name Type Description max_size int, list of int

Maximum size of smallest side of the image after the transformation. When using a list, max size will be randomly selected from the values in the list. Default: 1024.

max_size_hw tuple[int | None, int | None]

Maximum (height, width) constraints. Supports: - (height, width): Both dimensions must be at least these values - (height, None): Only height is constrained, width scales proportionally - (None, width): Only width is constrained, height scales proportionally If specified, max_size must be None. Default: None.

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 1.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • If the smallest side of the image is already equal to max_size, the image will not be resized.
  • This transform will not crop the image. The resulting image may be larger than specified in both dimensions.
  • For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.
  • Bounding boxes and keypoints are scaled accordingly.

Mathematical Details: Let (W, H) be the original width and height of the image.

When using max_size:\n    1. The scaling factor s is calculated as:\n       s = max_size / min(W, H)\n    2. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n\nWhen using max_size_hw=(H_target, W_target):\n    1. For both dimensions specified:\n       s = max(H_target/H, W_target/W)\n       This ensures both dimensions are at least as large as specified.\n\n    2. For height only (W_target=None):\n       s = H_target/H\n       Width will scale proportionally.\n\n    3. For width only (H_target=None):\n       s = W_target/W\n       Height will scale proportionally.\n\n    4. The new dimensions (W', H') are:\n       W' = W * s\n       H' = H * s\n

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> # Using max_size\n>>> transform1 = A.SmallestMaxSize(max_size=120)\n>>> # Input image (100, 150) -> Output (120, 180)\n>>>\n>>> # Using max_size_hw with both dimensions\n>>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n>>> # Input (80, 160) -> Output (100, 200)\n>>> # Input (160, 80) -> Output (400, 200)\n>>>\n>>> # Using max_size_hw with only height\n>>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n>>> # Input (80, 160) -> Output (100, 200)\n
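The rule s = max_size / min(H, W) can be verified with a short sketch that mirrors the (100, 150) -> (120, 180) example above:

Python
import numpy as np
import albumentations as A

image = np.zeros((100, 150, 3), dtype=np.uint8)  # smallest side is 100

out = A.SmallestMaxSize(max_size=120, p=1.0)(image=image)["image"]
print(out.shape[:2])  # (120, 180): s = 120 / 100 = 1.2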

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/resize.py Python
class SmallestMaxSize(MaxSizeTransform):\n    \"\"\"Rescale an image so that minimum side is equal to max_size or sides meet max_size_hw constraints,\n    keeping the aspect ratio.\n\n    Args:\n        max_size (int, list of int, optional): Maximum size of smallest side of the image after the transformation.\n            When using a list, max size will be randomly selected from the values in the list. Default: 1024.\n        max_size_hw (tuple[int | None, int | None], optional): Maximum (height, width) constraints. Supports:\n            - (height, width): Both dimensions must be at least these values\n            - (height, None): Only height is constrained, width scales proportionally\n            - (None, width): Only width is constrained, height scales proportionally\n            If specified, max_size must be None. Default: None.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 1.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - If the smallest side of the image is already equal to max_size, the image will not be resized.\n        - This transform will not crop the image. The resulting image may be larger than specified in both dimensions.\n        - For non-square images, both sides will be scaled proportionally to maintain the aspect ratio.\n        - Bounding boxes and keypoints are scaled accordingly.\n\n    Mathematical Details:\n        Let (W, H) be the original width and height of the image.\n\n        When using max_size:\n            1. The scaling factor s is calculated as:\n               s = max_size / min(W, H)\n            2. The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n        When using max_size_hw=(H_target, W_target):\n            1. For both dimensions specified:\n               s = max(H_target/H, W_target/W)\n               This ensures both dimensions are at least as large as specified.\n\n            2. For height only (W_target=None):\n               s = H_target/H\n               Width will scale proportionally.\n\n            3. For width only (H_target=None):\n               s = W_target/W\n               Height will scale proportionally.\n\n            4. 
The new dimensions (W', H') are:\n               W' = W * s\n               H' = H * s\n\n    Examples:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> # Using max_size\n        >>> transform1 = A.SmallestMaxSize(max_size=120)\n        >>> # Input image (100, 150) -> Output (120, 180)\n        >>>\n        >>> # Using max_size_hw with both dimensions\n        >>> transform2 = A.SmallestMaxSize(max_size_hw=(100, 200))\n        >>> # Input (80, 160) -> Output (100, 200)\n        >>> # Input (160, 80) -> Output (400, 200)\n        >>>\n        >>> # Using max_size_hw with only height\n        >>> transform3 = A.SmallestMaxSize(max_size_hw=(100, None))\n        >>> # Input (80, 160) -> Output (100, 200)\n    \"\"\"\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        img_h, img_w = params[\"shape\"][:2]\n\n        if self.max_size is not None:\n            if isinstance(self.max_size, (list, tuple)):\n                max_size = self.py_random.choice(self.max_size)\n            else:\n                max_size = self.max_size\n            scale = max_size / min(img_h, img_w)\n        elif self.max_size_hw is not None:\n            max_h, max_w = self.max_size_hw\n            if max_h is not None and max_w is not None:\n                # Scale based on smallest side to maintain aspect ratio\n                h_scale = max_h / img_h\n                w_scale = max_w / img_w\n                scale = max(h_scale, w_scale)\n            elif max_h is not None:\n                # Only height specified\n                scale = max_h / img_h\n            else:\n                # Only width specified\n                scale = max_w / img_w\n\n        return {\"scale\": scale}\n
"},{"location":"api_reference/augmentations/geometric/rotate/","title":"Rotation transforms (augmentations.geometric.functional)","text":""},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.RandomRotate90","title":"class RandomRotate90 [view source on GitHub]","text":"

Randomly rotate the input by 90 degrees zero or more times.

Parameters:

Name Type Description p

probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/rotate.py Python
class RandomRotate90(DualTransform):\n    \"\"\"Randomly rotate the input by 90 degrees zero or more times.\n\n    Args:\n        p: probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, factor: Literal[0, 1, 2, 3], **params: Any) -> np.ndarray:\n        return fgeometric.rot90(img, factor)\n\n    def get_params(self) -> dict[str, int]:\n        # Random int in the range [0, 3]\n        return {\"factor\": self.py_random.randint(0, 3)}\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        factor: Literal[0, 1, 2, 3],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_rot90(bboxes, factor)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        factor: Literal[0, 1, 2, 3],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_rot90(keypoints, factor, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.Rotate","title":"class Rotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', crop_border=False, mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

Name Type Description limit float | tuple[float, float]

Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

rotate_method str

Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

crop_border bool

Whether to crop border after rotation. If True, the output image size might differ from the input. Default: False

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation angle is randomly selected for each execution within the range specified by 'limit'.
  • When 'crop_border' is False, the output image will have the same size as the input, potentially introducing black triangles in the corners.
  • When 'crop_border' is True, the output image is cropped to remove black triangles, which may result in a smaller image.
  • Bounding boxes are rotated and may change size or shape.
  • Keypoints are rotated around the center of the image.

Mathematical Details:

  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
         R = [cos(θ)  -sin(θ)]
             [sin(θ)   cos(θ)]
  4. Each point (x, y) in the image is transformed to (x', y') by:
         [x']   [cos(θ)  -sin(θ)] [x - cx]   [cx]
         [y'] = [sin(θ)   cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
  5. If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Rotate(limit=45, p=1.0)\n>>> result = transform(image=image)\n>>> rotated_image = result['image']\n# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n
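A sketch contrasting the two crop_border modes described in the note above (the cropped size depends on the angle sampled at call time):

Python
import numpy as np
import albumentations as A

image = np.zeros((100, 100, 3), dtype=np.uint8)

same_size = A.Rotate(limit=45, crop_border=False, p=1.0)(image=image)["image"]
cropped = A.Rotate(limit=45, crop_border=True, p=1.0)(image=image)["image"]

print(same_size.shape)    # (100, 100, 3): size preserved, corners filled according to border_mode
print(cropped.shape[:2])  # smaller than (100, 100) unless the sampled angle is close to 0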

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/rotate.py Python
class Rotate(DualTransform):\n    \"\"\"Rotate the input by an angle selected randomly from the uniform distribution.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        rotate_method (str): Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'.\n            Default: 'largest_box'\n        crop_border (bool): Whether to crop border after rotation. If True, the output image size might differ\n            from the input. Default: False\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation angle is randomly selected for each execution within the range specified by 'limit'.\n        - When 'crop_border' is False, the output image will have the same size as the input, potentially\n          introducing black triangles in the corners.\n        - When 'crop_border' is True, the output image is cropped to remove black triangles, which may result\n          in a smaller image.\n        - Bounding boxes are rotated and may change size or shape.\n        - Keypoints are rotated around the center of the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [cos(\u03b8)  -sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [sin(\u03b8)   cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n        5. 
If 'crop_border' is True, the image is cropped to the largest rectangle that fits inside the rotated image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Rotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        crop_border: bool\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        @model_validator(mode=\"after\")\n        def validate_value(self) -> Self:\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        crop_border: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.crop_border = crop_border\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            img,\n            matrix,\n            self.interpolation,\n            self.fill,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        img_out = fgeometric.warp_affine(\n            mask,\n            matrix,\n            self.mask_interpolation,\n            self.fill_mask,\n            self.border_mode,\n            params[\"shape\"][:2],\n        )\n        if self.crop_border:\n            return fcrops.crop(img_out, x_min, y_min, x_max, y_max)\n        return img_out\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        x_min: 
int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_out = fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            image_shape,\n            self.border_mode,\n            image_shape,\n        )\n        if self.crop_border:\n            return fcrops.crop_bboxes_by_coords(\n                bboxes_out,\n                (x_min, y_min, x_max, y_max),\n                image_shape,\n            )\n        return bboxes_out\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        x_min: int,\n        x_max: int,\n        y_min: int,\n        y_max: int,\n        **params: Any,\n    ) -> np.ndarray:\n        keypoints_out = fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"][:2],\n            scale={\"x\": 1, \"y\": 1},\n            border_mode=self.border_mode,\n        )\n        if self.crop_border:\n            return fcrops.crop_keypoints_by_coords(\n                keypoints_out,\n                (x_min, y_min, x_max, y_max),\n            )\n        return keypoints_out\n\n    @staticmethod\n    def _rotated_rect_with_max_area(\n        height: int,\n        width: int,\n        angle: float,\n    ) -> dict[str, int]:\n        \"\"\"Given a rectangle of size wxh that has been rotated by 'angle' (in\n        degrees), computes the width and height of the largest possible\n        axis-aligned rectangle (maximal area) within the rotated rectangle.\n\n        Reference:\n            https://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders\n        \"\"\"\n        angle = math.radians(angle)\n        width_is_longer = width >= height\n        side_long, side_short = (width, height) if width_is_longer else (height, width)\n\n        # since the solutions for angle, -angle and 180-angle are all the same,\n        # it is sufficient to look at the first quadrant and the absolute values of sin,cos:\n        sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))\n        if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < SMALL_NUMBER:\n            # half constrained case: two crop corners touch the longer side,\n            # the other two corners are on the mid-line parallel to the longer line\n            x = 0.5 * side_short\n            wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)\n        else:\n            # fully constrained case: crop touches all 4 sides\n            cos_2a = cos_a * cos_a - sin_a * sin_a\n            wr, hr = (\n                (width * cos_a - height * sin_a) / cos_2a,\n                (height * cos_a - width * sin_a) / cos_2a,\n            )\n\n        return {\n            \"x_min\": max(0, int(width / 2 - wr / 2)),\n            \"x_max\": min(width, int(width / 2 + wr / 2)),\n            \"y_min\": max(0, int(height / 2 - hr / 2)),\n            \"y_max\": min(height, int(height / 2 + hr / 2)),\n        }\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        angle = self.py_random.uniform(*self.limit)\n\n        if self.crop_border:\n            height, width = params[\"shape\"][:2]\n            out_params = self._rotated_rect_with_max_area(height, width, angle)\n        else:\n            
out_params = {\"x_min\": -1, \"x_max\": -1, \"y_min\": -1, \"y_max\": -1}\n\n        center = fgeometric.center(params[\"shape\"][:2])\n        bbox_center = fgeometric.center_bbox(params[\"shape\"][:2])\n\n        translate: fgeometric.XYInt = {\"x\": 0, \"y\": 0}\n        shear: fgeometric.XYFloat = {\"x\": 0, \"y\": 0}\n        scale: fgeometric.XYFloat = {\"x\": 1, \"y\": 1}\n        rotate = angle\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            center,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_center,\n        )\n        out_params[\"matrix\"] = matrix\n        out_params[\"bbox_matrix\"] = bbox_matrix\n\n        return out_params\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"crop_border\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.RotateInitSchema","title":"class RotateInitSchema ","text":"

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/augmentations/geometric/rotate.py Python
class RotateInitSchema(BaseTransformInitSchema):\n    limit: SymmetricRangeType\n\n    interpolation: InterpolationType\n    mask_interpolation: InterpolationType\n\n    border_mode: BorderModeType\n\n    fill: ColorType | None\n    fill_mask: ColorType | None\n
"},{"location":"api_reference/augmentations/geometric/rotate/#albumentations.augmentations.geometric.rotate.SafeRotate","title":"class SafeRotate (limit=(-90, 90), interpolation=1, border_mode=4, value=None, mask_value=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.

This transformation ensures that the entire rotated image fits within the original frame by scaling it down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the rotation and scaling process.

Parameters:

Name Type Description limit float | tuple[float, float]

Range from which a random angle is picked. If limit is a single float, an angle is picked from (-limit, limit). Default: (-90, 90)

interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

Flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

Padding value for masks if border_mode is cv2.BORDER_CONSTANT.

rotate_method Literal[\"largest_box\", \"ellipse\"]

Method to rotate bounding boxes. Should be 'largest_box' or 'ellipse'. Default: 'largest_box'

mask_interpolation OpenCV flag

flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The rotation is performed around the center of the image.
  • After rotation, the image is scaled to fit within the original frame, which may cause some distortion.
  • The output image will always have the same dimensions as the input image.
  • Bounding boxes and keypoints are transformed along with the image.

Mathematical Details:

  1. An angle θ is randomly sampled from the range specified by 'limit'.
  2. The image is rotated around its center by θ degrees.
  3. The rotation matrix R is:
         R = [cos(θ)  -sin(θ)]
             [sin(θ)   cos(θ)]
  4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:
         s = min(width / (width * |cos(θ)| + height * |sin(θ)|),
                 height / (width * |sin(θ)| + height * |cos(θ)|))
  5. The combined transformation matrix T is:
         T = [s*cos(θ)  -s*sin(θ)  tx]
             [s*sin(θ)   s*cos(θ)  ty]
     where tx and ty are translation factors to keep the image centered.
  6. Each point (x, y) in the image is transformed to (x', y') by:
         [x']   [ s*cos(θ)  s*sin(θ)] [x - cx]   [cx]
         [y'] = [-s*sin(θ)  s*cos(θ)] [y - cy] + [cy]
     where (cx, cy) is the center of the image.
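
To make step 4 concrete, here is a small worked check of the scaling formula (a sketch with illustrative numbers, not code taken from the library):

Python
import math

# Scale factor that keeps a rotated 100x100 image inside its original frame
# for a 45-degree rotation (step 4 above), using illustrative values.
height, width = 100, 100
theta = math.radians(45)

s = min(
    width / (width * abs(math.cos(theta)) + height * abs(math.sin(theta))),
    height / (width * abs(math.sin(theta)) + height * abs(math.cos(theta))),
)
print(round(s, 3))  # ~0.707: the rotated square must shrink to about 71% to fit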

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.SafeRotate(limit=45, p=1.0)\n>>> result = transform(image=image)\n>>> rotated_image = result['image']\n# rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n# scaled to fit within the original 100x100 frame\n

Source code in albumentations/augmentations/geometric/rotate.py Python
class SafeRotate(Affine):\n    \"\"\"Rotate the input inside the input's frame by an angle selected randomly from the uniform distribution.\n\n    This transformation ensures that the entire rotated image fits within the original frame by scaling it\n    down if necessary. The resulting image maintains its original dimensions but may contain artifacts due to the\n    rotation and scaling process.\n\n    Args:\n        limit (float | tuple[float, float]): Range from which a random angle is picked. If limit is a single float,\n            an angle is picked from (-limit, limit). Default: (-90, 90)\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): Flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT applied\n            for masks.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): Method to rotate bounding boxes.\n            Should be 'largest_box' or 'ellipse'. Default: 'largest_box'\n        mask_interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The rotation is performed around the center of the image.\n        - After rotation, the image is scaled to fit within the original frame, which may cause some distortion.\n        - The output image will always have the same dimensions as the input image.\n        - Bounding boxes and keypoints are transformed along with the image.\n\n    Mathematical Details:\n        1. An angle \u03b8 is randomly sampled from the range specified by 'limit'.\n        2. The image is rotated around its center by \u03b8 degrees.\n        3. The rotation matrix R is:\n           R = [cos(\u03b8)  -sin(\u03b8)]\n               [sin(\u03b8)   cos(\u03b8)]\n        4. The scaling factor s is calculated to ensure the rotated image fits within the original frame:\n           s = min(width / (width * |cos(\u03b8)| + height * |sin(\u03b8)|),\n                   height / (width * |sin(\u03b8)| + height * |cos(\u03b8)|))\n        5. The combined transformation matrix T is:\n           T = [s*cos(\u03b8)  -s*sin(\u03b8)  tx]\n               [s*sin(\u03b8)   s*cos(\u03b8)  ty]\n           where tx and ty are translation factors to keep the image centered.\n        6. 
Each point (x, y) in the image is transformed to (x', y') by:\n           [x']   [s*cos(\u03b8)   s*sin(\u03b8)] [x - cx]   [cx]\n           [y'] = [-s*sin(\u03b8)  s*cos(\u03b8)] [y - cy] + [cy]\n           where (cx, cy) is the center of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.SafeRotate(limit=45, p=1.0)\n        >>> result = transform(image=image)\n        >>> rotated_image = result['image']\n        # rotated_image will be the input image rotated by a random angle between -45 and 45 degrees,\n        # scaled to fit within the original 100x100 frame\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(RotateInitSchema):\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n\n    def __init__(\n        self,\n        limit: ScaleFloatType = (-90, 90),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            rotate=limit,\n            interpolation=interpolation,\n            border_mode=border_mode,\n            fill=fill,\n            fill_mask=fill_mask,\n            rotate_method=rotate_method,\n            fit_output=True,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.limit = cast(tuple[float, float], limit)\n        self.interpolation = interpolation\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.rotate_method = rotate_method\n        self.mask_interpolation = mask_interpolation\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"limit\",\n            \"interpolation\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"rotate_method\",\n            \"mask_interpolation\",\n        )\n\n    def _create_safe_rotate_matrix(\n        self,\n        angle: float,\n        center: tuple[float, float],\n        image_shape: tuple[int, int],\n    ) -> tuple[np.ndarray, dict[str, float]]:\n        height, width = image_shape[:2]\n        rotation_mat = cv2.getRotationMatrix2D(center, angle, 1.0)\n\n        # Calculate new image size\n        abs_cos = abs(rotation_mat[0, 0])\n        abs_sin = abs(rotation_mat[0, 1])\n        new_w = int(height * abs_sin + width * abs_cos)\n        new_h = int(height * abs_cos + width * abs_sin)\n\n        # Adjust the rotation matrix to take into account the new size\n        rotation_mat[0, 2] += new_w / 2 - center[0]\n        rotation_mat[1, 2] += new_h / 2 - center[1]\n\n        # Calculate scaling factors\n        scale_x = width / new_w\n        scale_y = height / new_h\n\n        # Create scaling matrix\n        scale_mat = np.array([[scale_x, 0, 0], [0, scale_y, 0], [0, 0, 1]])\n\n        # Combine rotation and scaling\n        matrix = scale_mat @ np.vstack([rotation_mat, [0, 0, 1]])\n\n        return matrix, {\"x\": scale_x, \"y\": scale_y}\n\n    def get_params_dependent_on_data(\n        self,\n        params: 
dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        angle = self.py_random.uniform(*self.limit)\n\n        # Calculate centers for image and bbox\n        image_center = fgeometric.center(image_shape)\n        bbox_center = fgeometric.center_bbox(image_shape)\n\n        # Create matrices for image and bbox\n        matrix, scale = self._create_safe_rotate_matrix(\n            angle,\n            image_center,\n            image_shape,\n        )\n        bbox_matrix, _ = self._create_safe_rotate_matrix(\n            angle,\n            bbox_center,\n            image_shape,\n        )\n\n        return {\n            \"rotate\": angle,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": image_shape,\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/","title":"Geometric transforms (augmentations.geometric.transforms)","text":""},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Affine","title":"class Affine (scale=1, translate_percent=None, translate_px=None, rotate=0, shear=0, interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, fit_output=False, keep_ratio=False, rotate_method='largest_box', balanced_scale=False, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Augmentation to apply affine transformations to images.

Affine transformations involve:

  • Translation (\"move\" image on the x-/y-axis)
  • Rotation
  • Scaling (\"zoom\" in/out)
  • Shear (move one side of the image, turning a square into a trapezoid)

All such transformations can create \"new\" pixels in the image without a defined content, e.g. if the image is translated to the left, pixels are created on the right. A method has to be defined to deal with these pixel values. The parameters fill and fill_mask of this class deal with this.

Some transformations involve interpolation between several pixels of the input image to generate output pixel values. The parameters interpolation and mask_interpolation deal with the method of interpolation used for this.

Parameters:

Name Type Description scale number, tuple of number or dict

Scaling factor to use, where 1.0 denotes \"no change\" and 0.5 zooms out to 50 percent of the original size.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. The same range will be used for both the x- and y-axis. To keep the aspect ratio, set keep_ratio=True; the same value will then be used for both axes.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows you to set different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes. Note that when keep_ratio=True, the x- and y-axis ranges should be the same.

translate_percent None, number, tuple of number or dict

Translation as a fraction of the image height/width (x-translation, y-translation), where 0 denotes \"no change\" and 0.5 denotes \"half of the axis size\".
  • If None, then equivalent to 0.0 unless translate_px has a value other than None.
  • If a single number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b]. The sampled fraction value will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows you to set different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

translate_px None, int, tuple of int or dict

Translation in pixels.
  • If None, then equivalent to 0 unless translate_percent has a value other than None.
  • If a single int, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the discrete interval [a..b]. That number will be used identically for both the x- and y-axis.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows you to set different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

rotate number or tuple of number

Rotation in degrees (NOT radians), i.e. expected value range is around [-360, 360]. Rotation happens around the center of the image, not the top left corner as in some other frameworks.
  • If a number, then that value will be used for all images.
  • If a tuple (a, b), then a value will be uniformly sampled per image from the interval [a, b] and used as the rotation value.

shear number, tuple of number or dict

Shear in degrees (NOT radians), i.e. expected value range is around [-360, 360], with reasonable values being in the range of [-45, 45].
  • If a number, then that value will be used for all images as the shear on the x-axis (no shear on the y-axis will be done).
  • If a tuple (a, b), then two values will be uniformly sampled per image from the interval [a, b] and used as the x- and y-shear values.
  • If a dictionary, then it is expected to have the keys x and/or y. Each of these keys can have the same values as described above. Using a dictionary allows you to set different values for the two axes, and sampling will then happen independently per axis, resulting in samples that differ between the axes.

interpolation int

OpenCV interpolation flag.

mask_interpolation int

OpenCV interpolation flag.

fill ColorType

The constant value to use when filling in newly created pixels. (E.g. translating by 1px to the right will create a new 1px-wide column of pixels on the left of the image). The value is only used when border_mode=cv2.BORDER_CONSTANT. The expected value range is [0, 255] for uint8 images.

fill_mask ColorType

Same as fill but only for masks.

border_mode int

OpenCV border flag.

fit_output bool

If True, the image plane size and position will be adjusted to tightly capture the whole image after affine transformation (translate_percent and translate_px are ignored). Otherwise (False), parts of the transformed image may end up outside the image plane. Fitting the output shape can be useful to avoid corners of the image being outside the image plane after applying rotations. Default: False

keep_ratio bool

When True, the original aspect ratio will be kept when the random scale is applied. Default: False.

rotate_method Literal[\"largest_box\", \"ellipse\"]

rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"

balanced_scale bool

When True, scaling factors are chosen to be either entirely below or above 1, ensuring balanced scaling. Default: False.

This is important because without it, scaling tends to lean towards upscaling. For example, if we want the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly from [0.5, 2]. With balanced_scale, the function ensures that half the time the scaling factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in), which makes zooming in and out more balanced (see the sampling sketch after this parameter list).

p float

probability of applying the transform. Default: 0.5.
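
The balanced_scale behaviour described above can be illustrated with a short sampling sketch (the helper name sample_scale is illustrative, not part of the library, and assumes a scale range that straddles 1.0):

Python
import random

def sample_scale(scale_range=(0.5, 2.0), balanced_scale=True, rng=random.Random(0)):
    low, high = scale_range
    if not balanced_scale:
        # Direct sampling: values above 1 dominate because [1, 2] is wider than [0.5, 1].
        return rng.uniform(low, high)
    # Pick the zoom-out interval [low, 1] or the zoom-in interval [1, high] with equal probability.
    interval = rng.choice([(low, 1.0), (1.0, high)])
    return rng.uniform(*interval)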

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Reference

[1] https://arxiv.org/abs/2109.13488
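
A minimal usage sketch (with illustrative parameter values) showing the dictionary form of scale, translate_percent, and shear described above:

Python
import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

transform = A.Compose([
    A.Affine(
        scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},                # sampled independently per axis
        translate_percent={"x": (-0.1, 0.1), "y": (-0.1, 0.1)},
        rotate=(-15, 15),
        shear={"x": (-10, 10), "y": (-10, 10)},
        p=1.0,
    ),
])
transformed = transform(image=image)
transformed_image = transformed["image"]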

Source code in albumentations/augmentations/geometric/transforms.py Python
class Affine(DualTransform):\n    \"\"\"Augmentation to apply affine transformations to images.\n\n    Affine transformations involve:\n\n        - Translation (\"move\" image on the x-/y-axis)\n        - Rotation\n        - Scaling (\"zoom\" in/out)\n        - Shear (move one side of the image, turning a square into a trapezoid)\n\n    All such transformations can create \"new\" pixels in the image without a defined content, e.g.\n    if the image is translated to the left, pixels are created on the right.\n    A method has to be defined to deal with these pixel values.\n    The parameters `fill` and `fill_mask` of this class deal with this.\n\n    Some transformations involve interpolations between several pixels\n    of the input image to generate output pixel values. The parameters `interpolation` and\n    `mask_interpolation` deals with the method of interpolation used for this.\n\n    Args:\n        scale (number, tuple of number or dict): Scaling factor to use, where ``1.0`` denotes \"no change\" and\n            ``0.5`` is zoomed out to ``50`` percent of the original size.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That the same range will be used for both x- and y-axis. To keep the aspect ratio, set\n                  ``keep_ratio=True``, then the same value will be used for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes. Note that when\n                  the ``keep_ratio=True``, the x- and y-axis ranges should be the same.\n        translate_percent (None, number, tuple of number or dict): Translation as a fraction of the image height/width\n            (x-translation, y-translation), where ``0`` denotes \"no change\"\n            and ``0.5`` denotes \"half of the axis size\".\n                * If ``None`` then equivalent to ``0.0`` unless `translate_px` has a value other than ``None``.\n                * If a single number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``.\n                  That sampled fraction value will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        translate_px (None, int, tuple of int or dict): Translation in pixels.\n                * If ``None`` then equivalent to ``0`` unless `translate_percent` has a value other than ``None``.\n                * If a single int, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from\n                  the discrete interval ``[a..b]``. 
That number will be used identically for both x- and y-axis.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        rotate (number or tuple of number): Rotation in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``. Rotation happens around the *center* of the image,\n            not the top left corner as in some other frameworks.\n                * If a number, then that value will be used for all images.\n                * If a tuple ``(a, b)``, then a value will be uniformly sampled per image from the interval ``[a, b]``\n                  and used as the rotation value.\n        shear (number, tuple of number or dict): Shear in degrees (**NOT** radians), i.e. expected value range is\n            around ``[-360, 360]``, with reasonable values being in the range of ``[-45, 45]``.\n                * If a number, then that value will be used for all images as\n                  the shear on the x-axis (no shear on the y-axis will be done).\n                * If a tuple ``(a, b)``, then two value will be uniformly sampled per image\n                  from the interval ``[a, b]`` and be used as the x- and y-shear value.\n                * If a dictionary, then it is expected to have the keys ``x`` and/or ``y``.\n                  Each of these keys can have the same values as described above.\n                  Using a dictionary allows to set different values for the two axis and sampling will then happen\n                  *independently* per axis, resulting in samples that differ between the axes.\n        interpolation (int): OpenCV interpolation flag.\n        mask_interpolation (int): OpenCV interpolation flag.\n        fill (ColorType): The constant value to use when filling in newly created pixels.\n            (E.g. translating by 1px to the right will create a new 1px-wide column of pixels\n            on the left of the image).\n            The value is only used when `mode=constant`. The expected value range is ``[0, 255]`` for ``uint8`` images.\n        fill_mask (ColorType): Same as fill but only for masks.\n        border_mode (int): OpenCV border flag.\n        fit_output (bool): If True, the image plane size and position will be adjusted to tightly capture\n            the whole image after affine transformation (`translate_percent` and `translate_px` are ignored).\n            Otherwise (``False``),  parts of the transformed image may end up outside the image plane.\n            Fitting the output shape can be useful to avoid corners of the image being outside the image plane\n            after applying rotations. Default: False\n        keep_ratio (bool): When True, the original aspect ratio will be kept when the random scale is applied.\n            Default: False.\n        rotate_method (Literal[\"largest_box\", \"ellipse\"]): rotation method used for the bounding boxes.\n            Should be one of \"largest_box\" or \"ellipse\"[1]. Default: \"largest_box\"\n        balanced_scale (bool): When True, scaling factors are chosen to be either entirely below or above 1,\n            ensuring balanced scaling. 
Default: False.\n\n            This is important because without it, scaling tends to lean towards upscaling. For example, if we want\n            the image to zoom in and out by 2x, we may pick an interval [0.5, 2]. Since the interval [0.5, 1] is\n            three times smaller than [1, 2], values above 1 are picked three times more often if sampled directly\n            from [0.5, 2]. With `balanced_scale`, the  function ensures that half the time, the scaling\n            factor is picked from below 1 (zooming out), and the other half from above 1 (zooming in).\n            This makes the zooming in and out process more balanced.\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        [1] https://arxiv.org/abs/2109.13488\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: ScaleFloatType | fgeometric.XYFloatScale\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None\n        rotate: ScaleFloatType\n        shear: ScaleFloatType | fgeometric.XYFloatScale\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n        cval: ColorType | None\n        cval_mask: ColorType | None\n        mode: BorderModeType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        fit_output: bool\n        keep_ratio: bool\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        balanced_scale: bool\n\n        @field_validator(\"shear\", \"scale\")\n        @classmethod\n        def process_shear(\n            cls,\n            value: ScaleFloatType | fgeometric.XYFloatScale,\n            info: ValidationInfo,\n        ) -> fgeometric.XYFloatDict:\n            return cast(\n                fgeometric.XYFloatDict,\n                cls._handle_dict_arg(value, info.field_name),\n            )\n\n        @field_validator(\"rotate\")\n        @classmethod\n        def process_rotate(\n            cls,\n            value: ScaleFloatType,\n        ) -> tuple[float, float]:\n            return to_tuple(value, value)\n\n        @model_validator(mode=\"after\")\n        def handle_translate(self) -> Self:\n            if self.translate_percent is None and self.translate_px is None:\n                self.translate_px = 0\n\n            if self.translate_percent is not None and self.translate_px is not None:\n                msg = \"Expected either translate_percent or translate_px to be provided, but both were provided.\"\n                raise ValueError(msg)\n\n            if self.translate_percent is not None:\n                self.translate_percent = self._handle_dict_arg(\n                    self.translate_percent,\n                    \"translate_percent\",\n                    default=0.0,\n                )  # type: ignore[assignment]\n\n            if self.translate_px is not None:\n                self.translate_px = self._handle_dict_arg(\n                    self.translate_px,\n                    \"translate_px\",\n                    default=0,\n                )  # type: ignore[assignment]\n\n            return self\n\n        @staticmethod\n        def _handle_dict_arg(\n            val: ScaleType | fgeometric.XYFloatScale | fgeometric.XYIntScale,\n            name: str | None,\n            
default: float = 1.0,\n        ) -> dict[str, Any]:\n            if isinstance(val, dict):\n                if \"x\" not in val and \"y\" not in val:\n                    raise ValueError(\n                        f'Expected {name} dictionary to contain at least key \"x\" or key \"y\". Found neither of them.',\n                    )\n                x = val.get(\"x\", default)\n                y = val.get(\"y\", default)\n                return {\"x\": to_tuple(x, x), \"y\": to_tuple(y, y)}  # type: ignore[arg-type]\n            return {\"x\": to_tuple(val, val), \"y\": to_tuple(val, val)}\n\n        @model_validator(mode=\"after\")\n        def validate_fill_types(self) -> Self:\n            if self.cval is not None:\n                self.fill = self.cval\n                warn(\"cval is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n            if self.cval_mask is not None:\n                self.fill_mask = self.cval_mask\n                warn(\"cval_mask is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n            if self.mode is not None:\n                self.border_mode = self.mode\n                warn(\"mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType | fgeometric.XYFloatScale = 1,\n        translate_percent: ScaleFloatType | fgeometric.XYFloatScale | None = None,\n        translate_px: ScaleIntType | fgeometric.XYIntScale | None = None,\n        rotate: ScaleFloatType = 0,\n        shear: ScaleFloatType | fgeometric.XYFloatScale = 0,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: ColorType | None = None,\n        cval_mask: ColorType | None = None,\n        mode: int | None = None,\n        fit_output: bool = False,\n        keep_ratio: bool = False,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        balanced_scale: bool = False,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n        self.scale = cast(fgeometric.XYFloatDict, scale)\n        self.translate_percent = cast(fgeometric.XYFloatDict, translate_percent)\n        self.translate_px = cast(fgeometric.XYIntDict, translate_px)\n        self.rotate = cast(tuple[float, float], rotate)\n        self.fit_output = fit_output\n        self.shear = cast(fgeometric.XYFloatDict, shear)\n        self.keep_ratio = keep_ratio\n        self.rotate_method = rotate_method\n        self.balanced_scale = balanced_scale\n\n        if self.keep_ratio and self.scale[\"x\"] != self.scale[\"y\"]:\n            raise ValueError(\n                f\"When keep_ratio is True, the x and y scale range should be identical. 
got {self.scale}\",\n            )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"fill\",\n            \"border_mode\",\n            \"scale\",\n            \"translate_percent\",\n            \"translate_px\",\n            \"rotate\",\n            \"fit_output\",\n            \"shear\",\n            \"fill_mask\",\n            \"keep_ratio\",\n            \"rotate_method\",\n            \"balanced_scale\",\n        )\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            img,\n            matrix,\n            interpolation=self.interpolation,\n            fill=self.fill,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.warp_affine(\n            mask,\n            matrix,\n            interpolation=self.mask_interpolation,\n            fill=self.fill_mask,\n            border_mode=self.border_mode,\n            output_shape=output_shape,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        bbox_matrix: np.ndarray,\n        output_shape: tuple[int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_affine(\n            bboxes,\n            bbox_matrix,\n            self.rotate_method,\n            params[\"shape\"][:2],\n            self.border_mode,\n            output_shape,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        scale: fgeometric.XYFloat,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_affine(\n            keypoints,\n            matrix,\n            params[\"shape\"],\n            scale,\n            self.border_mode,\n        )\n\n    @staticmethod\n    def get_scale(\n        scale: fgeometric.XYFloatDict,\n        keep_ratio: bool,\n        balanced_scale: bool,\n        random_state: random.Random,\n    ) -> fgeometric.XYFloat:\n        result_scale = {}\n        for key, value in scale.items():\n            if isinstance(value, (int, float)):\n                result_scale[key] = float(value)\n            elif isinstance(value, tuple):\n                if balanced_scale:\n                    lower_interval = (value[0], 1.0) if value[0] < 1 else None\n                    upper_interval = (1.0, value[1]) if value[1] > 1 else None\n\n                    if lower_interval is not None and upper_interval is not None:\n                        selected_interval = random_state.choice(\n                            [lower_interval, upper_interval],\n                        )\n                    elif lower_interval is not None:\n                        selected_interval = lower_interval\n                    elif upper_interval is not None:\n                        selected_interval = upper_interval\n                    else:\n                        result_scale[key] = 1.0\n                        continue\n\n                    result_scale[key] = random_state.uniform(*selected_interval)\n                else:\n                    result_scale[key] = 
random_state.uniform(*value)\n            else:\n                raise TypeError(\n                    f\"Invalid scale value for key {key}: {value}. Expected a float or a tuple of two floats.\",\n                )\n\n        if keep_ratio:\n            result_scale[\"y\"] = result_scale[\"x\"]\n\n        return cast(fgeometric.XYFloat, result_scale)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        translate = self._get_translate_params(image_shape)\n        shear = self._get_shear_params()\n        scale = self.get_scale(\n            self.scale,\n            self.keep_ratio,\n            self.balanced_scale,\n            self.py_random,\n        )\n        rotate = self.py_random.uniform(*self.rotate)\n\n        image_shift = fgeometric.center(image_shape)\n        bbox_shift = fgeometric.center_bbox(image_shape)\n\n        matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            image_shift,\n        )\n        bbox_matrix = fgeometric.create_affine_transformation_matrix(\n            translate,\n            shear,\n            scale,\n            rotate,\n            bbox_shift,\n        )\n\n        if self.fit_output:\n            matrix, output_shape = fgeometric.compute_affine_warp_output_shape(\n                matrix,\n                image_shape,\n            )\n            bbox_matrix, _ = fgeometric.compute_affine_warp_output_shape(\n                bbox_matrix,\n                image_shape,\n            )\n        else:\n            output_shape = image_shape\n\n        return {\n            \"rotate\": rotate,\n            \"scale\": scale,\n            \"matrix\": matrix,\n            \"bbox_matrix\": bbox_matrix,\n            \"output_shape\": output_shape,\n        }\n\n    def _get_translate_params(self, image_shape: tuple[int, int]) -> fgeometric.XYInt:\n        height, width = image_shape[:2]\n        if self.translate_px is not None:\n            return {\n                \"x\": self.py_random.randint(*self.translate_px[\"x\"]),\n                \"y\": self.py_random.randint(*self.translate_px[\"y\"]),\n            }\n        if self.translate_percent is not None:\n            translate = {key: self.py_random.uniform(*value) for key, value in self.translate_percent.items()}\n            return cast(\n                fgeometric.XYInt,\n                {\"x\": int(translate[\"x\"] * width), \"y\": int(translate[\"y\"] * height)},\n            )\n        return cast(fgeometric.XYInt, {\"x\": 0, \"y\": 0})\n\n    def _get_shear_params(self) -> fgeometric.XYFloat:\n        return {\n            \"x\": -self.py_random.uniform(*self.shear[\"x\"]),\n            \"y\": -self.py_random.uniform(*self.shear[\"y\"]),\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.BaseDistortion","title":"class BaseDistortion (interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Base class for distortion-based transformations.

This class provides a foundation for implementing various types of image distortions, such as optical distortions, grid distortions, and elastic transformations. It handles the common operations of applying distortions to images, masks, bounding boxes, and keypoints.

Parameters:

Name Type Description interpolation int

Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR, cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR

mask_interpolation int

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This is an abstract base class and should not be used directly.
  • Subclasses should implement the get_params_dependent_on_data method to generate the distortion maps (map_x and map_y).
  • The distortion is applied consistently across all targets (image, mask, bboxes, keypoints) to maintain coherence in the augmented data.

Example of a subclass: class CustomDistortion(BaseDistortion): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) # Add custom parameters here

    def get_params_dependent_on_data(self, params, data):\n        # Generate and return map_x and map_y based on the distortion logic\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self):\n        return super().get_transform_init_args_names() + (\"custom_param1\", \"custom_param2\")\n
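
Building on the skeleton above, the following is an illustrative concrete subclass (the class name SineWaveDistortion and its amplitude/frequency parameters are hypothetical, not part of Albumentations; depending on the library version you may also need to extend InitSchema for the extra parameters):

Python
import numpy as np
from albumentations.augmentations.geometric.transforms import BaseDistortion

class SineWaveDistortion(BaseDistortion):
    """Hypothetical distortion that shifts each row horizontally by a sine of its y coordinate."""

    def __init__(self, amplitude=5.0, frequency=0.05, **kwargs):
        super().__init__(**kwargs)
        self.amplitude = amplitude
        self.frequency = frequency

    def get_params_dependent_on_data(self, params, data):
        height, width = params["shape"][:2]
        x, y = np.meshgrid(np.arange(width), np.arange(height))
        # map_x/map_y tell the remap step where each output pixel samples from.
        map_x = (x + self.amplitude * np.sin(2 * np.pi * self.frequency * y)).astype(np.float32)
        map_y = y.astype(np.float32)
        return {"map_x": map_x, "map_y": map_y}

    def get_transform_init_args_names(self):
        return (*super().get_transform_init_args_names(), "amplitude", "frequency")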

Source code in albumentations/augmentations/geometric/transforms.py Python
class BaseDistortion(DualTransform):\n    \"\"\"Base class for distortion-based transformations.\n\n    This class provides a foundation for implementing various types of image distortions,\n    such as optical distortions, grid distortions, and elastic transformations. It handles\n    the common operations of applying distortions to images, masks, bounding boxes, and keypoints.\n\n    Args:\n        interpolation (int): Interpolation method to be used for image transformation.\n            Should be one of the OpenCV interpolation types (e.g., cv2.INTER_LINEAR,\n            cv2.INTER_CUBIC). Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This is an abstract base class and should not be used directly.\n        - Subclasses should implement the `get_params_dependent_on_data` method to generate\n          the distortion maps (map_x and map_y).\n        - The distortion is applied consistently across all targets (image, mask, bboxes, keypoints)\n          to maintain coherence in the augmented data.\n\n    Example of a subclass:\n        class CustomDistortion(BaseDistortion):\n            def __init__(self, *args, **kwargs):\n                super().__init__(*args, **kwargs)\n                # Add custom parameters here\n\n            def get_params_dependent_on_data(self, params, data):\n                # Generate and return map_x and map_y based on the distortion logic\n                return {\"map_x\": map_x, \"map_y\": map_y}\n\n            def get_transform_init_args_names(self):\n                return super().get_transform_init_args_names() + (\"custom_param1\", \"custom_param2\")\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            img,\n            map_x,\n            map_y,\n            self.interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap(\n            mask,\n            map_x,\n            map_y,\n            self.mask_interpolation,\n            cv2.BORDER_CONSTANT,\n            0,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n      
  bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        bboxes_returned = fgeometric.remap_bboxes(\n            bboxes_denorm,\n            map_x,\n            map_y,\n            image_shape,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        map_x: np.ndarray,\n        map_y: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.remap_keypoints(keypoints, map_x, map_y, params[\"shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.D4","title":"class D4 (p=1, always_apply=None) [view source on GitHub]","text":"

Applies one of the eight possible D4 dihedral group transformations to a square-shaped input, maintaining the square shape. These transformations correspond to the symmetries of a square, including rotations and reflections.

The D4 group transformations include:
  • 'e' (identity): No transformation is applied.
  • 'r90' (rotation by 90 degrees counterclockwise)
  • 'r180' (rotation by 180 degrees)
  • 'r270' (rotation by 270 degrees counterclockwise)
  • 'v' (reflection across the vertical midline)
  • 'hvt' (reflection across the anti-diagonal)
  • 'h' (reflection across the horizontal midline)
  • 't' (reflection across the main diagonal)

Even if the probability (p) of applying the transform is set to 1, the identity transformation 'e' may still be selected, which means that, on average, the input will remain unchanged in one out of eight cases.
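
One possible NumPy realization of these eight symmetries for an (H, W) or (H, W, C) array is sketched below; the mapping of group-element names to operations is an illustrative assumption, not taken from the library source:

Python
import numpy as np

D4_OPS = {
    "e": lambda img: img,                                    # identity
    "r90": lambda img: np.rot90(img, 1),                     # 90 degrees counterclockwise
    "r180": lambda img: np.rot90(img, 2),
    "r270": lambda img: np.rot90(img, 3),
    "v": lambda img: img[:, ::-1],                           # reflect across the vertical midline
    "h": lambda img: img[::-1, :],                           # reflect across the horizontal midline
    "t": lambda img: np.swapaxes(img, 0, 1),                 # reflect across the main diagonal
    "hvt": lambda img: np.swapaxes(np.rot90(img, 2), 0, 1),  # reflect across the anti-diagonal
}

image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
views = {name: op(image) for name, op in D4_OPS.items()}     # all eight D4 views of the square image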

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform is particularly useful for augmenting data that does not have a clear orientation, such as top-view satellite or drone imagery, or certain types of medical images.
  • The input image should be square-shaped for optimal results. Non-square inputs may lead to unexpected behavior or distortions.
  • When applied to bounding boxes or keypoints, their coordinates will be adjusted according to the selected transformation.
  • This transform preserves the aspect ratio and size of the input.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.D4(p=1.0),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed['image']\n# The resulting image will be one of the 8 possible D4 transformations of the input\n

Source code in albumentations/augmentations/geometric/transforms.py Python
class D4(DualTransform):\n    \"\"\"Applies one of the eight possible D4 dihedral group transformations to a square-shaped input,\n    maintaining the square shape. These transformations correspond to the symmetries of a square,\n    including rotations and reflections.\n\n    The D4 group transformations include:\n    - 'e' (identity): No transformation is applied.\n    - 'r90' (rotation by 90 degrees counterclockwise)\n    - 'r180' (rotation by 180 degrees)\n    - 'r270' (rotation by 270 degrees counterclockwise)\n    - 'v' (reflection across the vertical midline)\n    - 'hvt' (reflection across the anti-diagonal)\n    - 'h' (reflection across the horizontal midline)\n    - 't' (reflection across the main diagonal)\n\n    Even if the probability (`p`) of applying the transform is set to 1, the identity transformation\n    'e' may still occur, which means the input will remain unchanged in one out of eight cases.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for augmenting data that does not have a clear orientation,\n          such as top-view satellite or drone imagery, or certain types of medical images.\n        - The input image should be square-shaped for optimal results. Non-square inputs may lead to\n          unexpected behavior or distortions.\n        - When applied to bounding boxes or keypoints, their coordinates will be adjusted according\n          to the selected transformation.\n        - This transform preserves the aspect ratio and size of the input.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.D4(p=1.0),\n        ... ])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed['image']\n        # The resulting image will be one of the 8 possible D4 transformations of the input\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(\n        self,\n        p: float = 1,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def apply(\n        self,\n        img: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.d4(img, group_element)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.bboxes_d4(bboxes, group_element)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        group_element: D4Type,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.keypoints_d4(keypoints, group_element, params[\"shape\"])\n\n    def get_params(self) -> dict[str, D4Type]:\n        return {\n            \"group_element\": self.random_generator.choice(d4_group_elements),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ElasticTransform","title":"class ElasticTransform (alpha=1, sigma=50, interpolation=1, border_mode=4, value=None, mask_value=None, approximate=False, same_dxdy=False, mask_interpolation=0, noise_distribution='gaussian', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformation to images, masks, bounding boxes, and keypoints.

This transformation introduces random elastic distortions to the input data. It's particularly useful for data augmentation in training deep learning models, especially for tasks like image segmentation or object detection where you want to maintain the relative positions of features while introducing realistic deformations.

The transform works by generating random displacement fields and applying them to the input. These fields are smoothed using a Gaussian filter to create more natural-looking distortions.
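
The displacement-field idea can be sketched in a few lines with OpenCV (a standalone illustration, not the library's internal implementation, which lives in fgeometric.generate_displacement_fields):

Python
import cv2
import numpy as np

def elastic_sketch(image, alpha=1.0, sigma=50.0, rng=np.random.default_rng(0)):
    height, width = image.shape[:2]
    # Random per-pixel displacement in [-1, 1], smoothed with a Gaussian filter and scaled by alpha.
    dx = cv2.GaussianBlur(rng.uniform(-1, 1, (height, width)).astype(np.float32), (0, 0), sigma) * alpha
    dy = cv2.GaussianBlur(rng.uniform(-1, 1, (height, width)).astype(np.float32), (0, 0), sigma) * alpha
    # Remap each output pixel to its displaced source coordinate.
    x, y = np.meshgrid(np.arange(width), np.arange(height))
    map_x = (x + dx).astype(np.float32)
    map_y = (y + dy).astype(np.float32)
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)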

Parameters:

Name Type Description alpha float

Scaling factor for the random displacement fields. Higher values result in more pronounced distortions. Default: 1.0

sigma float

Standard deviation of the Gaussian filter used to smooth the displacement fields. Higher values result in smoother, more global distortions. Default: 50.0

interpolation int

Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR

approximate bool

Whether to use an approximate version of the elastic transform. If True, uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially less accurate for large sigma values. Default: False

same_dxdy bool

Whether to use the same random displacement field for both x and y directions. Can speed up the transform at the cost of less diverse distortions. Default: False

mask_interpolation int

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

noise_distribution Literal[\"gaussian\", \"uniform\"]

Distribution used to generate the displacement fields. \"gaussian\" generates fields using normal distribution (more natural deformations). \"uniform\" generates fields using uniform distribution (more mechanical deformations). Default: \"gaussian\".

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints) by using the same displacement fields for all.
  • The 'approximate' parameter determines whether to use a precise or approximate method for generating displacement fields. The approximate method can be faster but may be less accurate for large sigma values.
  • Bounding boxes that end up outside the image after transformation will be removed.
  • Keypoints that end up outside the image after transformation will be removed.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n

Source code in albumentations/augmentations/geometric/transforms.py Python
class ElasticTransform(BaseDistortion):\n    \"\"\"Apply elastic deformation to images, masks, bounding boxes, and keypoints.\n\n    This transformation introduces random elastic distortions to the input data. It's particularly\n    useful for data augmentation in training deep learning models, especially for tasks like\n    image segmentation or object detection where you want to maintain the relative positions of\n    features while introducing realistic deformations.\n\n    The transform works by generating random displacement fields and applying them to the input.\n    These fields are smoothed using a Gaussian filter to create more natural-looking distortions.\n\n    Args:\n        alpha (float): Scaling factor for the random displacement fields. Higher values result in\n            more pronounced distortions. Default: 1.0\n        sigma (float): Standard deviation of the Gaussian filter used to smooth the displacement\n            fields. Higher values result in smoother, more global distortions. Default: 50.0\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        approximate (bool): Whether to use an approximate version of the elastic transform. If True,\n            uses a fixed kernel size for Gaussian smoothing, which can be faster but potentially\n            less accurate for large sigma values. Default: False\n        same_dxdy (bool): Whether to use the same random displacement field for both x and y\n            directions. Can speed up the transform at the cost of less diverse distortions. Default: False\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        noise_distribution (Literal[\"gaussian\", \"uniform\"]): Distribution used to generate the displacement fields.\n            \"gaussian\" generates fields using normal distribution (more natural deformations).\n            \"uniform\" generates fields using uniform distribution (more mechanical deformations).\n            Default: \"gaussian\".\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints)\n          by using the same displacement fields for all.\n        - The 'approximate' parameter determines whether to use a precise or approximate method for\n          generating displacement fields. The approximate method can be faster but may be less\n          accurate for large sigma values.\n        - Bounding boxes that end up outside the image after transformation will be removed.\n        - Keypoints that end up outside the image after transformation will be removed.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.ElasticTransform(alpha=1, sigma=50, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        alpha: Annotated[float, Field(ge=0)]\n        sigma: Annotated[float, Field(ge=1)]\n        approximate: bool\n        same_dxdy: bool\n        noise_distribution: Literal[\"gaussian\", \"uniform\"]\n        border_mode: BorderModeType = Field(deprecated=\"Deprecated\")\n        value: ColorType | None = Field(deprecated=\"Deprecated\")\n        mask_value: ColorType | None = Field(deprecated=\"Deprecated\")\n\n    def __init__(\n        self,\n        alpha: float = 1,\n        sigma: float = 50,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        approximate: bool = False,\n        same_dxdy: bool = False,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        noise_distribution: Literal[\"gaussian\", \"uniform\"] = \"gaussian\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.alpha = alpha\n        self.sigma = sigma\n        self.approximate = approximate\n        self.same_dxdy = same_dxdy\n        self.noise_distribution = noise_distribution\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n        kernel_size = (0, 0) if self.approximate else (17, 17)\n\n        # Generate displacement fields\n        dx, dy = fgeometric.generate_displacement_fields(\n            (height, width),\n            self.alpha,\n            self.sigma,\n            same_dxdy=self.same_dxdy,\n            kernel_size=kernel_size,\n            random_generator=self.random_generator,\n            noise_distribution=self.noise_distribution,\n        )\n\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        map_x = np.float32(x + dx)\n        map_y = np.float32(y + dy)\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"alpha\",\n            \"sigma\",\n            \"approximate\",\n            \"same_dxdy\",\n            \"noise_distribution\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.GridDistortion","title":"class GridDistortion (num_steps=5, distort_limit=(-0.3, 0.3), interpolation=1, border_mode=4, value=None, mask_value=None, normalized=True, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply grid distortion to images, masks, bounding boxes, and keypoints.

This transformation divides the image into a grid and randomly distorts each cell, creating localized warping effects. It's particularly useful for data augmentation in tasks like medical image analysis, OCR, and other domains where local geometric variations are meaningful.

Parameters:

num_steps (int): Number of grid cells on each side of the image. Higher values create more granular distortions. Must be at least 1. Default: 5.

distort_limit (float or tuple[float, float]): Range of distortion. If a single float is provided, the range will be (-distort_limit, distort_limit). Higher values create stronger distortions. Should be in the range of -1 to 1. Default: (-0.3, 0.3).

interpolation (int): OpenCV interpolation method used for image transformation. Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.

normalized (bool): If True, ensures that the distortion does not move pixels outside the image boundaries. This can result in less extreme distortions but guarantees that no information is lost. Default: True.

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The same distortion is applied to all targets (image, mask, bboxes, keypoints) to maintain consistency.
  • When normalized=True, the distortion is adjusted to ensure all pixels remain within the image boundaries.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n
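
To illustrate the normalized flag described above, here is a minimal sketch of the unnormalized variant; the specific num_steps and distort_limit values and the random input image are illustrative assumptions, not recommendations from the original docs:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # normalized=False allows distortions that may push pixels past the image border
>>> transform = A.Compose([
...     A.GridDistortion(num_steps=8, distort_limit=(-0.4, 0.4), normalized=False, p=1.0),
... ])
>>> transformed_image = transform(image=image)['image']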


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridDistortion(BaseDistortion):\n    \"\"\"Apply grid distortion to images, masks, bounding boxes, and keypoints.\n\n    This transformation divides the image into a grid and randomly distorts each cell,\n    creating localized warping effects. It's particularly useful for data augmentation\n    in tasks like medical image analysis, OCR, and other domains where local geometric\n    variations are meaningful.\n\n    Args:\n        num_steps (int): Number of grid cells on each side of the image. Higher values\n            create more granular distortions. Must be at least 1. Default: 5.\n        distort_limit (float or tuple[float, float]): Range of distortion. If a single float\n            is provided, the range will be (-distort_limit, distort_limit). Higher values\n            create stronger distortions. Should be in the range of -1 to 1.\n            Default: (-0.3, 0.3).\n        interpolation (int): OpenCV interpolation method used for image transformation.\n            Options include cv2.INTER_LINEAR, cv2.INTER_CUBIC, etc. Default: cv2.INTER_LINEAR.\n        normalized (bool): If True, ensures that the distortion does not move pixels\n            outside the image boundaries. This can result in less extreme distortions\n            but guarantees that no information is lost. Default: True.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The same distortion is applied to all targets (image, mask, bboxes, keypoints)\n          to maintain consistency.\n        - When normalized=True, the distortion is adjusted to ensure all pixels remain\n          within the image boundaries.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.GridDistortion(num_steps=5, distort_limit=0.3, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        num_steps: Annotated[int, Field(ge=1)]\n        distort_limit: SymmetricRangeType\n        normalized: bool\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int = Field(deprecated=\"Deprecated. 
Does not have any effect.\")\n\n        @field_validator(\"distort_limit\")\n        @classmethod\n        def check_limits(\n            cls,\n            v: tuple[float, float],\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = -1, 1\n            result = to_tuple(v)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        num_steps: int = 5,\n        distort_limit: ScaleFloatType = (-0.3, 0.3),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        normalized: bool = True,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.num_steps = num_steps\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.normalized = normalized\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        steps_x = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n        steps_y = [1 + self.py_random.uniform(*self.distort_limit) for _ in range(self.num_steps + 1)]\n\n        if self.normalized:\n            normalized_params = fgeometric.normalize_grid_distortion_steps(\n                image_shape,\n                self.num_steps,\n                steps_x,\n                steps_y,\n            )\n            steps_x, steps_y = (\n                normalized_params[\"steps_x\"],\n                normalized_params[\"steps_y\"],\n            )\n\n        map_x, map_y = fgeometric.generate_grid(\n            image_shape,\n            steps_x,\n            steps_y,\n            self.num_steps,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            *super().get_transform_init_args_names(),\n            \"num_steps\",\n            \"distort_limit\",\n            \"normalized\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.GridElasticDeform","title":"class GridElasticDeform (num_grid_xy, magnitude, interpolation=1, mask_interpolation=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.

This transformation overlays a grid on the input and applies random displacements to the grid points, resulting in local elastic distortions. The granularity and intensity of the distortions can be controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.

Parameters:

num_grid_xy (tuple[int, int]): Number of grid cells along the width and height. Specified as (grid_width, grid_height). Each value must be greater than 1.

magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.

interpolation (int): Interpolation method to be used for the image transformation. Default: cv2.INTER_LINEAR.

mask_interpolation (int): Interpolation method to be used for mask transformation. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Examples:

Python
>>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n>>> result = transform(image=image, mask=mask)\n>>> transformed_image, transformed_mask = result['image'], result['mask']\n
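
The example above only covers image and mask. Since bounding boxes and keypoints are also supported targets, a hedged sketch that adds them follows; the bbox/keypoint formats, the label_fields name, and the sample values are illustrative assumptions:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.zeros((100, 100), dtype=np.uint8)
>>> bboxes = [(10, 10, 50, 50)]   # pascal_voc format: (x_min, y_min, x_max, y_max)
>>> labels = [1]
>>> keypoints = [(20, 30)]        # xy format
>>> transform = A.Compose(
...     [A.GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)],
...     bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels']),
...     keypoint_params=A.KeypointParams(format='xy'),
... )
>>> result = transform(image=image, mask=mask, bboxes=bboxes, labels=labels, keypoints=keypoints)
>>> transformed_bboxes, transformed_keypoints = result['bboxes'], result['keypoints']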

Note

This transformation is particularly useful for data augmentation in medical imaging and other domains where elastic deformations can simulate realistic variations.


Source code in albumentations/augmentations/geometric/transforms.py Python
class GridElasticDeform(DualTransform):\n    \"\"\"Apply elastic deformations to images, masks, bounding boxes, and keypoints using a grid-based approach.\n\n    This transformation overlays a grid on the input and applies random displacements to the grid points,\n    resulting in local elastic distortions. The granularity and intensity of the distortions can be\n    controlled using the dimensions of the overlaying distortion grid and the magnitude parameter.\n\n\n    Args:\n        num_grid_xy (tuple[int, int]): Number of grid cells along the width and height.\n            Specified as (grid_width, grid_height). Each value must be greater than 1.\n        magnitude (int): Maximum pixel-wise displacement for distortion. Must be greater than 0.\n        interpolation (int): Interpolation method to be used for the image transformation.\n            Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Interpolation method to be used for mask transformation.\n            Default: cv2.INTER_NEAREST\n        p (float): Probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Example:\n        >>> transform = GridElasticDeform(num_grid_xy=(4, 4), magnitude=10, p=1.0)\n        >>> result = transform(image=image, mask=mask)\n        >>> transformed_image, transformed_mask = result['image'], result['mask']\n\n    Note:\n        This transformation is particularly useful for data augmentation in medical imaging\n        and other domains where elastic deformations can simulate realistic variations.\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        num_grid_xy: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n        magnitude: int = Field(gt=0)\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n\n    def __init__(\n        self,\n        num_grid_xy: tuple[int, int],\n        magnitude: int,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_grid_xy = num_grid_xy\n        self.magnitude = magnitude\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    @staticmethod\n    def generate_mesh(polygons: np.ndarray, dimensions: np.ndarray) -> np.ndarray:\n        return np.hstack((dimensions.reshape(-1, 4), polygons))\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        # Replace calculate_grid_dimensions with split_uniform_grid\n        tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.num_grid_xy,\n            self.random_generator,\n        )\n\n        # Convert tiles to the format expected by generate_distorted_grid_polygons\n        dimensions = np.array(\n            [\n                [\n                    tile[1],\n                    tile[0],\n                    tile[3],\n                    tile[2],\n                ]  # Reorder to [x_min, y_min, x_max, y_max]\n                for tile in tiles\n            ],\n        ).reshape(\n            self.num_grid_xy[::-1] + (4,),\n        )  # Reshape to (grid_height, grid_width, 4)\n\n 
       polygons = fgeometric.generate_distorted_grid_polygons(\n            dimensions,\n            self.magnitude,\n            self.random_generator,\n        )\n\n        generated_mesh = self.generate_mesh(polygons, dimensions)\n\n        return {\"generated_mesh\": generated_mesh}\n\n    def apply(\n        self,\n        img: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(img, generated_mesh, self.interpolation)\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image(mask, generated_mesh, self.mask_interpolation)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        bboxes_denorm = denormalize_bboxes(bboxes, params[\"shape\"][:2])\n        return normalize_bboxes(\n            fgeometric.bbox_distort_image(\n                bboxes_denorm,\n                generated_mesh,\n                params[\"shape\"][:2],\n            ),\n            params[\"shape\"][:2],\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        generated_mesh: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.distort_image_keypoints(\n            keypoints,\n            generated_mesh,\n            params[\"shape\"][:2],\n        )\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"num_grid_xy\", \"magnitude\", \"interpolation\", \"mask_interpolation\"\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.HorizontalFlip","title":"class HorizontalFlip [view source on GitHub]","text":"

Flip the input horizontally around the y-axis.

Parameters:

p (float): probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32
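
This entry ships without a usage example, so here is a minimal sketch; the random input image is an illustrative assumption:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([A.HorizontalFlip(p=0.5)])
>>> flipped_image = transform(image=image)['image']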


Source code in albumentations/augmentations/geometric/transforms.py Python
class HorizontalFlip(DualTransform):\n    \"\"\"Flip the input horizontally around the y-axis.\n\n    Args:\n        p (float): probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return hflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_hflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_hflip(keypoints, params[\"shape\"][1])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.OpticalDistortion","title":"class OpticalDistortion (distort_limit=(-0.05, 0.05), shift_limit=None, interpolation=1, border_mode=None, value=None, mask_value=None, mask_interpolation=0, mode='camera', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply optical distortion to images, masks, bounding boxes, and keypoints.

Supports two distortion models:

  1. Camera matrix model (original): Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients
  2. Fisheye model: Direct radial distortion: r_dist = r * (1 + gamma * r²)

Parameters:

distort_limit (float | tuple[float, float]): Range of distortion coefficient. For camera model: recommended range (-0.05, 0.05). For fisheye model: recommended range (-0.3, 0.3). Default: (-0.05, 0.05).

mode (Literal['camera', 'fisheye']): Distortion model to use: 'camera' - original camera matrix model; 'fisheye' - fisheye lens model. Default: 'camera'.

interpolation (OpenCV flag): Interpolation method used for image transformation. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.
  • The distortion coefficient (k) is randomly sampled from the distort_limit range.
  • Both models distort the image around its center; the legacy shift_limit parameter is deprecated and has no effect.
  • Bounding boxes and keypoints are transformed along with the image to maintain consistency.
  • The fisheye model directly applies radial distortion.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> transformed_image = transformed['image']\n>>> transformed_mask = transformed['mask']\n>>> transformed_bboxes = transformed['bboxes']\n>>> transformed_keypoints = transformed['keypoints']\n
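
As a complement, a sketch of the fisheye model using the recommended range from the parameter list above; the random input image is an illustrative assumption:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.OpticalDistortion(distort_limit=(-0.3, 0.3), mode='fisheye', p=1.0),
... ])
>>> transformed_image = transform(image=image)['image']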


Source code in albumentations/augmentations/geometric/transforms.py Python
class OpticalDistortion(BaseDistortion):\n    \"\"\"Apply optical distortion to images, masks, bounding boxes, and keypoints.\n\n    Supports two distortion models:\n    1. Camera matrix model (original):\n       Uses OpenCV's camera calibration model with k1=k2=k distortion coefficients\n\n    2. Fisheye model:\n       Direct radial distortion: r_dist = r * (1 + gamma * r\u00b2)\n\n    Args:\n        distort_limit (float | tuple[float, float]): Range of distortion coefficient.\n            For camera model: recommended range (-0.05, 0.05)\n            For fisheye model: recommended range (-0.3, 0.3)\n            Default: (-0.05, 0.05)\n\n        mode (Literal['camera', 'fisheye']): Distortion model to use:\n            - 'camera': Original camera matrix model\n            - 'fisheye': Fisheye lens model\n            Default: 'camera'\n\n        interpolation (OpenCV flag): Interpolation method used for image transformation.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC,\n            cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.\n\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The distortion is applied using OpenCV's initUndistortRectifyMap and remap functions.\n        - The distortion coefficient (k) is randomly sampled from the distort_limit range.\n        - The image center is shifted by dx and dy, randomly sampled from the shift_limit range.\n        - Bounding boxes and keypoints are transformed along with the image to maintain consistency.\n        - Fisheye model directly applies radial distortion\n        - Both models use shift_limit to control distortion center\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.OpticalDistortion(distort_limit=0.1, p=1.0),\n        ... ])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> transformed_image = transformed['image']\n        >>> transformed_mask = transformed['mask']\n        >>> transformed_bboxes = transformed['bboxes']\n        >>> transformed_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        distort_limit: SymmetricRangeType\n        mode: Literal[\"camera\", \"fisheye\"]\n        shift_limit: SymmetricRangeType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        mask_value: ColorType | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n        border_mode: int | None = Field(\n            deprecated=\"Deprecated. 
Does not have any effect.\",\n        )\n\n    def __init__(\n        self,\n        distort_limit: ScaleFloatType = (-0.05, 0.05),\n        shift_limit: ScaleFloatType | None = None,\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int | None = None,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        mode: Literal[\"camera\", \"fisheye\"] = \"camera\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.distort_limit = cast(tuple[float, float], distort_limit)\n        self.mode = mode\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n        height, width = image_shape\n\n        # Get distortion coefficient\n        k = self.py_random.uniform(*self.distort_limit)\n\n        # Calculate center shift\n        center_xy = fgeometric.center(image_shape)\n\n        # Get distortion maps based on mode\n        if self.mode == \"camera\":\n            map_x, map_y = fgeometric.get_camera_matrix_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n        else:  # fisheye\n            map_x, map_y = fgeometric.get_fisheye_distortion_maps(\n                image_shape,\n                k,\n                center_xy,\n            )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"distort_limit\",\n            \"mode\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Pad","title":"class Pad (padding=0, fill=0, fill_mask=0, border_mode=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of an image by the specified number of pixels.

Parameters:

padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:
  * int - pad all sides by this value
  * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y
  * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT.

border_mode (OpenCV flag): OpenCV border mode.

p (float): probability of applying the transform. Default: 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

References

  • https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html
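
No usage example is given for this transform, so below is a minimal sketch of the three padding forms listed above; the input image and the padding values are illustrative assumptions:

Python
>>> import numpy as np
>>> import cv2
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> # pad all sides by 10 pixels
>>> padded = A.Pad(padding=10, p=1.0)(image=image)['image']
>>> # pad left/right by 10 and top/bottom by 20
>>> padded = A.Pad(padding=(10, 20), p=1.0)(image=image)['image']
>>> # explicit (left, top, right, bottom) padding with a constant black border
>>> padded = A.Pad(padding=(5, 10, 15, 20), border_mode=cv2.BORDER_CONSTANT, fill=0, p=1.0)(image=image)['image']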


Source code in albumentations/augmentations/geometric/transforms.py Python
class Pad(DualTransform):\n    \"\"\"Pad the sides of an image by specified number of pixels.\n\n    Args:\n        padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y\n            * tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT\n        fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT\n        border_mode (OpenCV flag): OpenCV border mode\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    References:\n        - https://pytorch.org/vision/main/generated/torchvision.transforms.v2.Pad.html\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        padding: int | tuple[int, int] | tuple[int, int, int, int]\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int] | tuple[int, int, int, int] = 0,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        border_mode: BorderModeType = cv2.BORDER_CONSTANT,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.border_mode = border_mode\n\n    def apply(\n        self,\n        img: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            img,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_with_params(\n            mask,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            border_mode=self.border_mode,\n            value=self.fill_mask,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_np = denormalize_bboxes(bboxes, params[\"shape\"])\n\n        result = fgeometric.pad_bboxes(\n            bboxes_np,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=image_shape,\n        )\n\n        rows, cols = params[\"shape\"][:2]\n        return normalize_bboxes(\n            result,\n            (rows + pad_top + pad_bottom, cols + pad_left + pad_right),\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        pad_top: int,\n        pad_bottom: int,\n        pad_left: int,\n        pad_right: int,\n 
       **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.pad_keypoints(\n            keypoints,\n            pad_top,\n            pad_bottom,\n            pad_left,\n            pad_right,\n            self.border_mode,\n            image_shape=params[\"shape\"][:2],\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        if isinstance(self.padding, Real):\n            pad_top = pad_bottom = pad_left = pad_right = self.padding\n        elif isinstance(self.padding, (tuple, list)):\n            if len(self.padding) == NUM_PADS_XY:\n                pad_left = pad_right = self.padding[0]\n                pad_top = pad_bottom = self.padding[1]\n            elif len(self.padding) == NUM_PADS_ALL_SIDES:\n                pad_left, pad_top, pad_right, pad_bottom = self.padding  # type: ignore[misc]\n            else:\n                raise TypeError(\n                    \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n                )\n        else:\n            raise TypeError(\n                \"Padding must be a single number, a pair of numbers, or a quadruple of numbers\",\n            )\n\n        return {\n            \"pad_top\": pad_top,\n            \"pad_bottom\": pad_bottom,\n            \"pad_left\": pad_left,\n            \"pad_right\": pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"padding\",\n            \"fill\",\n            \"fill_mask\",\n            \"border_mode\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PadIfNeeded","title":"class PadIfNeeded (min_height=1024, min_width=1024, pad_height_divisor=None, pad_width_divisor=None, position='center', border_mode=4, value=None, mask_value=None, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of an image if the image dimensions are less than the specified minimum dimensions. If the pad_height_divisor or pad_width_divisor is specified, the function additionally ensures that the image dimensions are divisible by these values.

Parameters:

min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value. If not specified, pad_height_divisor must be provided.

min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value. If not specified, pad_width_divisor must be provided.

pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value. If not specified, min_height must be provided.

pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value. If not specified, min_width must be provided.

position (Literal["center", "top_left", "top_right", "bottom_left", "bottom_right", "random"]): Position where the image is to be placed after padding. Default is 'center'.

border_mode (int): Specifies the border mode to use if padding is required. The default is cv2.BORDER_REFLECT_101.

fill (ColorType | None): Value to fill the border pixels if the border mode is cv2.BORDER_CONSTANT. Default is None.

fill_mask (ColorType | None): Similar to fill but used for padding masks. Default is None.

p (float): Probability of applying the transform. Default is 1.0.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • Either min_height or pad_height_divisor must be set, but not both.
  • Either min_width or pad_width_divisor must be set, but not both.
  • If border_mode is set to cv2.BORDER_CONSTANT, fill must be provided (the value parameter is deprecated).
  • The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).
  • For bounding boxes, the coordinates will be adjusted to account for the padding.
  • For keypoints, their positions will be shifted according to the padding.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n... ])\n>>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n>>> padded_image = transformed['image']\n>>> padded_mask = transformed['mask']\n>>> adjusted_bboxes = transformed['bboxes']\n>>> adjusted_keypoints = transformed['keypoints']\n
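
For the divisor-based mode mentioned in the Note above, a sketch under the assumption that only the divisors are set; min_height and min_width are passed as None because only one option of each pair may be set, and the input image is an illustrative assumption:

Python
>>> import numpy as np
>>> import cv2
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.PadIfNeeded(min_height=None, min_width=None,
...                   pad_height_divisor=32, pad_width_divisor=32,
...                   border_mode=cv2.BORDER_CONSTANT, fill=0),
... ])
>>> padded_image = transform(image=image)['image']  # height and width become multiples of 32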


Source code in albumentations/augmentations/geometric/transforms.py Python
class PadIfNeeded(Pad):\n    \"\"\"Pads the sides of an image if the image dimensions are less than the specified minimum dimensions.\n    If the `pad_height_divisor` or `pad_width_divisor` is specified, the function additionally ensures\n    that the image dimensions are divisible by these values.\n\n    Args:\n        min_height (int | None): Minimum desired height of the image. Ensures image height is at least this value.\n            If not specified, pad_height_divisor must be provided.\n        min_width (int | None): Minimum desired width of the image. Ensures image width is at least this value.\n            If not specified, pad_width_divisor must be provided.\n        pad_height_divisor (int | None): If set, pads the image height to make it divisible by this value.\n            If not specified, min_height must be provided.\n        pad_width_divisor (int | None): If set, pads the image width to make it divisible by this value.\n            If not specified, min_width must be provided.\n        position (Literal[\"center\", \"top_left\", \"top_right\", \"bottom_left\", \"bottom_right\", \"random\"]):\n            Position where the image is to be placed after padding. Default is 'center'.\n        border_mode (int): Specifies the border mode to use if padding is required.\n            The default is `cv2.BORDER_REFLECT_101`.\n        fill (ColorType | None): Value to fill the border pixels if the border mode is `cv2.BORDER_CONSTANT`.\n            Default is None.\n        fill_mask (ColorType | None): Similar to `fill` but used for padding masks. Default is None.\n        p (float): Probability of applying the transform. Default is 1.0.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - Either `min_height` or `pad_height_divisor` must be set, but not both.\n        - Either `min_width` or `pad_width_divisor` must be set, but not both.\n        - If `border_mode` is set to `cv2.BORDER_CONSTANT`, `value` must be provided.\n        - The transform will maintain consistency across all targets (image, mask, bboxes, keypoints, volume).\n        - For bounding boxes, the coordinates will be adjusted to account for the padding.\n        - For keypoints, their positions will be shifted according to the padding.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.PadIfNeeded(min_height=1024, min_width=1024, border_mode=cv2.BORDER_CONSTANT, fill=0),\n        ... 
])\n        >>> transformed = transform(image=image, mask=mask, bboxes=bboxes, keypoints=keypoints)\n        >>> padded_image = transformed['image']\n        >>> padded_mask = transformed['mask']\n        >>> adjusted_bboxes = transformed['bboxes']\n        >>> adjusted_keypoints = transformed['keypoints']\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        min_height: int | None = Field(ge=1)\n        min_width: int | None = Field(ge=1)\n        pad_height_divisor: int | None = Field(ge=1)\n        pad_width_divisor: int | None = Field(ge=1)\n        position: PositionType\n        border_mode: BorderModeType\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType\n        fill_mask: ColorType\n\n        @model_validator(mode=\"after\")\n        def validate_divisibility(self) -> Self:\n            if (self.min_height is None) == (self.pad_height_divisor is None):\n                msg = \"Only one of 'min_height' and 'pad_height_divisor' parameters must be set\"\n                raise ValueError(msg)\n            if (self.min_width is None) == (self.pad_width_divisor is None):\n                msg = \"Only one of 'min_width' and 'pad_width_divisor' parameters must be set\"\n                raise ValueError(msg)\n\n            if self.border_mode == cv2.BORDER_CONSTANT and self.fill is None:\n                msg = \"If 'border_mode' is set to 'BORDER_CONSTANT', 'fill' must be provided.\"\n                raise ValueError(msg)\n\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n\n            return self\n\n    def __init__(\n        self,\n        min_height: int | None = 1024,\n        min_width: int | None = 1024,\n        pad_height_divisor: int | None = None,\n        pad_width_divisor: int | None = None,\n        position: PositionType = \"center\",\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        # Initialize with dummy padding that will be calculated later\n        super().__init__(\n            padding=0,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            p=p,\n        )\n        self.min_height = min_height\n        self.min_width = min_width\n        self.pad_height_divisor = pad_height_divisor\n        self.pad_width_divisor = pad_width_divisor\n        self.position = position\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        h_pad_top, h_pad_bottom, w_pad_left, w_pad_right = fgeometric.get_padding_params(\n            image_shape=params[\"shape\"][:2],\n            min_height=self.min_height,\n            min_width=self.min_width,\n            pad_height_divisor=self.pad_height_divisor,\n            pad_width_divisor=self.pad_width_divisor,\n        )\n\n        h_pad_top, h_pad_bottom, 
w_pad_left, w_pad_right = fgeometric.adjust_padding_by_position(\n            h_top=h_pad_top,\n            h_bottom=h_pad_bottom,\n            w_left=w_pad_left,\n            w_right=w_pad_right,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\n            \"pad_top\": h_pad_top,\n            \"pad_bottom\": h_pad_bottom,\n            \"pad_left\": w_pad_left,\n            \"pad_right\": w_pad_right,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_height\",\n            \"min_width\",\n            \"pad_height_divisor\",\n            \"pad_width_divisor\",\n            \"position\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Perspective","title":"class Perspective (scale=(0.05, 0.1), keep_size=True, pad_mode=None, pad_val=None, mask_pad_val=None, fit_output=False, interpolation=1, mask_interpolation=0, border_mode=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply random four point perspective transformation to the input.

Parameters:

scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Default: (0.05, 0.1).

keep_size (bool): Whether to resize image back to its original size after applying the perspective transform. If set to False, the resulting images may end up having different shapes. Default: True.

border_mode (OpenCV flag): OpenCV border mode used for padding. Default: cv2.BORDER_CONSTANT.

fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fill_mask (ColorType): Padding value for mask if border_mode is cv2.BORDER_CONSTANT. Default: 0.

fit_output (bool): If True, the image plane size and position will be adjusted to still capture the whole image after perspective transformation. This is followed by image resizing if keep_size is set to True. If False, parts of the transformed image may be outside of the image plane. This setting should not be set to True when using large scale values as it could lead to very large images. Default: False.

interpolation (int): Interpolation method to be used for image transformation. Should be one of the OpenCV interpolation types. Default: cv2.INTER_LINEAR.

mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

This transformation creates a perspective effect by randomly moving the four corners of the image. The amount of movement is controlled by the 'scale' parameter.

When 'keep_size' is True, the output image will have the same size as the input image, which may cause some parts of the transformed image to be cut off or padded.

When 'fit_output' is True, the transformation ensures that the entire transformed image is visible, which may result in a larger output image if keep_size is False.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n... ])\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n
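
To illustrate the fit_output behaviour described in the Note above, a sketch with keep_size=False so the whole warped image stays visible; the parameter values are illustrative assumptions, and the output size may differ from the input:

Python
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> transform = A.Compose([
...     A.Perspective(scale=(0.05, 0.1), keep_size=False, fit_output=True, p=1.0),
... ])
>>> transformed_image = transform(image=image)['image']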


Source code in albumentations/augmentations/geometric/transforms.py Python
class Perspective(DualTransform):\n    \"\"\"Apply random four point perspective transformation to the input.\n\n    Args:\n        scale (float or tuple of float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Default: (0.05, 0.1).\n        keep_size (bool): Whether to resize image back to its original size after applying the perspective transform.\n            If set to False, the resulting images may end up having different shapes.\n            Default: True.\n        border_mode (OpenCV flag): OpenCV border mode used for padding.\n            Default: cv2.BORDER_CONSTANT.\n        fill (ColorType): Padding value if border_mode is cv2.BORDER_CONSTANT.\n            Default: 0.\n        fill_mask (ColorType): Padding value for mask if border_mode is\n            cv2.BORDER_CONSTANT. Default: 0.\n        fit_output (bool): If True, the image plane size and position will be adjusted to still capture\n            the whole image after perspective transformation. This is followed by image resizing if keep_size is set\n            to True. If False, parts of the transformed image may be outside of the image plane.\n            This setting should not be set to True when using large scale values as it could lead to very large images.\n            Default: False.\n        interpolation (int): Interpolation method to be used for image transformation. Should be one\n            of the OpenCV interpolation types. Default: cv2.INTER_LINEAR\n        mask_interpolation (int): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        This transformation creates a perspective effect by randomly moving the four corners of the image.\n        The amount of movement is controlled by the 'scale' parameter.\n\n        When 'keep_size' is True, the output image will have the same size as the input image,\n        which may cause some parts of the transformed image to be cut off or padded.\n\n        When 'fit_output' is True, the transformation ensures that the entire transformed image is visible,\n        which may result in a larger output image if keep_size is False.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.Perspective(scale=(0.05, 0.1), keep_size=True, always_apply=False, p=0.5),\n        ... 
])\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        keep_size: bool\n        pad_mode: BorderModeType | None\n        pad_val: ColorType | None\n        mask_pad_val: ColorType | None\n        fit_output: bool\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        fill: ColorType\n        fill_mask: ColorType\n        border_mode: BorderModeType\n\n        @model_validator(mode=\"after\")\n        def validate_deprecated_fields(self) -> Self:\n            if self.pad_mode is not None:\n                warn(\"pad_mode is deprecated, use border_mode instead\", DeprecationWarning, stacklevel=2)\n                self.border_mode = self.pad_mode\n            if self.pad_val is not None:\n                warn(\"pad_val is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.pad_val\n            if self.mask_pad_val is not None:\n                warn(\"mask_pad_val is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_pad_val\n            return self\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.05, 0.1),\n        keep_size: bool = True,\n        pad_mode: int | None = None,\n        pad_val: ColorType | None = None,\n        mask_pad_val: ColorType | None = None,\n        fit_output: bool = False,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        border_mode: int = cv2.BORDER_CONSTANT,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p, always_apply=always_apply)\n        self.scale = cast(tuple[float, float], scale)\n        self.keep_size = keep_size\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.fit_output = fit_output\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n\n    def apply(\n        self,\n        img: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            img,\n            matrix,\n            max_width,\n            max_height,\n            self.fill,\n            self.border_mode,\n            self.keep_size,\n            self.interpolation,\n        )\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective(\n            mask,\n            matrix,\n            max_width,\n            max_height,\n            self.fill_mask,\n            self.border_mode,\n            self.keep_size,\n            self.mask_interpolation,\n        )\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        matrix_bbox: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_bboxes(\n            bboxes,\n            params[\"shape\"],\n            matrix_bbox,\n            max_width,\n            max_height,\n            
self.keep_size,\n        )\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        matrix: np.ndarray,\n        max_height: int,\n        max_width: int,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.perspective_keypoints(\n            keypoints,\n            params[\"shape\"],\n            matrix,\n            max_width,\n            max_height,\n            self.keep_size,\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        scale = self.py_random.uniform(*self.scale)\n\n        points = fgeometric.generate_perspective_points(\n            image_shape,\n            scale,\n            self.random_generator,\n        )\n        points = fgeometric.order_points(points)\n\n        matrix, max_width, max_height = fgeometric.compute_perspective_params(\n            points,\n            image_shape,\n        )\n\n        if self.fit_output:\n            matrix, max_width, max_height = fgeometric.expand_transform(\n                matrix,\n                image_shape,\n            )\n\n        return {\n            \"matrix\": matrix,\n            \"max_height\": max_height,\n            \"max_width\": max_width,\n            \"matrix_bbox\": matrix,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"keep_size\",\n            \"border_mode\",\n            \"fill\",\n            \"fill_mask\",\n            \"fit_output\",\n            \"interpolation\",\n            \"mask_interpolation\",\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.PiecewiseAffine","title":"class PiecewiseAffine (scale=(0.03, 0.05), nb_rows=(4, 4), nb_cols=(4, 4), interpolation=1, mask_interpolation=0, cval=None, cval_mask=None, mode=None, absolute_scale=False, p=0.5, always_apply=None, keypoints_threshold=0.01) [view source on GitHub]","text":"

Apply piecewise affine transformations to the input image.

This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points around via affine transformations. This leads to local distortions in the image.

Parameters:

scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample the random distances of the subimage's corners from the full image's corners. If scale is a single float value, the range will be (0, scale). Recommended values are in the range (0.01, 0.05) for small distortions, and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).

nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of rows. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have. Must be at least 2. For large images, you might want to pick a higher value than 4. If a single int, then that value will always be used as the number of columns. If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image. Default: 4.

interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute pixel value. If set to False, it will be treated as a fraction of the image height and width. Default: False.

p (float): Probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This augmentation is very slow. Consider using ElasticTransform instead, which is at least 10x faster.
  • The augmentation may not always produce visible effects, especially with small scale values.
  • For keypoints and bounding boxes, the transformation might move them outside the image boundaries. In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n>>> transform = A.Compose([\n...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n... ])\n>>> transformed = transform(image=image)\n>>> transformed_image = transformed[\"image\"]\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class PiecewiseAffine(BaseDistortion):\n    \"\"\"Apply piecewise affine transformations to the input image.\n\n    This augmentation places a regular grid of points on an image and randomly moves the neighborhood of these points\n    around via affine transformations. This leads to local distortions in the image.\n\n    Args:\n        scale (tuple[float, float] | float): Standard deviation of the normal distributions. These are used to sample\n            the random distances of the subimage's corners from the full image's corners.\n            If scale is a single float value, the range will be (0, scale).\n            Recommended values are in the range (0.01, 0.05) for small distortions,\n            and (0.05, 0.1) for larger distortions. Default: (0.03, 0.05).\n        nb_rows (tuple[int, int] | int): Number of rows of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of rows.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        nb_cols (tuple[int, int] | int): Number of columns of points that the regular grid should have.\n            Must be at least 2. For large images, you might want to pick a higher value than 4.\n            If a single int, then that value will always be used as the number of columns.\n            If a tuple (a, b), then a value from the discrete interval [a..b] will be uniformly sampled per image.\n            Default: 4.\n        interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        absolute_scale (bool): If set to True, the value of the scale parameter will be treated as an absolute\n            pixel value. If set to False, it will be treated as a fraction of the image height and width.\n            Default: False.\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This augmentation is very slow. Consider using `ElasticTransform` instead, which is at least 10x faster.\n        - The augmentation may not always produce visible effects, especially with small scale values.\n        - For keypoints and bounding boxes, the transformation might move them outside the image boundaries.\n          In such cases, the keypoints will be set to (-1, -1) and the bounding boxes will be removed.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)\n        >>> transform = A.Compose([\n        ...     A.PiecewiseAffine(scale=(0.03, 0.05), nb_rows=4, nb_cols=4, p=0.5),\n        ... 
])\n        >>> transformed = transform(image=image)\n        >>> transformed_image = transformed[\"image\"]\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        scale: NonNegativeFloatRangeType\n        nb_rows: ScaleIntType\n        nb_cols: ScaleIntType\n        interpolation: InterpolationType\n        mask_interpolation: InterpolationType\n        cval: int | None = Field(deprecated=\"Deprecated. Does not have any effect.\")\n        cval_mask: int | None = Field(\n            deprecated=\"Deprecated. Does not have any effect.\",\n        )\n\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = Field(\n            deprecated=\"Deprecated. Does not have any effects.\",\n        )\n\n        absolute_scale: bool\n        keypoints_threshold: float = Field(\n            deprecated=\"This parameter is not used anymore\",\n        )\n\n        @field_validator(\"nb_rows\", \"nb_cols\")\n        @classmethod\n        def process_range(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> tuple[float, float]:\n            bounds = 2, BIG_INTEGER\n            result = to_tuple(value, value)\n            check_range(result, *bounds, info.field_name)\n            return result\n\n    def __init__(\n        self,\n        scale: ScaleFloatType = (0.03, 0.05),\n        nb_rows: ScaleIntType = (4, 4),\n        nb_cols: ScaleIntType = (4, 4),\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        cval: int | None = None,\n        cval_mask: int | None = None,\n        mode: Literal[\"constant\", \"edge\", \"symmetric\", \"reflect\", \"wrap\"] | None = None,\n        absolute_scale: bool = False,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n        keypoints_threshold: float = 0.01,\n    ):\n        super().__init__(\n            p=p,\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n        )\n\n        warn(\n            \"This augmenter is very slow. Try to use ``ElasticTransform`` instead, which is at least 10x faster.\",\n            stacklevel=2,\n        )\n\n        self.scale = cast(tuple[float, float], scale)\n        self.nb_rows = cast(tuple[int, int], nb_rows)\n        self.nb_cols = cast(tuple[int, int], nb_cols)\n        self.interpolation = interpolation\n        self.mask_interpolation = mask_interpolation\n        self.absolute_scale = absolute_scale\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale\",\n            \"nb_rows\",\n            \"nb_cols\",\n            \"interpolation\",\n            \"mask_interpolation\",\n            \"absolute_scale\",\n        )\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        image_shape = params[\"shape\"][:2]\n\n        nb_rows = np.clip(self.py_random.randint(*self.nb_rows), 2, None)\n        nb_cols = np.clip(self.py_random.randint(*self.nb_cols), 2, None)\n        scale = self.py_random.uniform(*self.scale)\n\n        map_x, map_y = fgeometric.create_piecewise_affine_maps(\n            image_shape=image_shape,\n            grid=(nb_rows, nb_cols),\n            scale=scale,\n            absolute_scale=self.absolute_scale,\n            random_generator=self.random_generator,\n        )\n\n        return {\"map_x\": map_x, \"map_y\": map_y}\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.RandomGridShuffle","title":"class RandomGridShuffle (grid=(3, 3), p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly shuffles the grid's cells on an image, mask, or keypoints, effectively rearranging patches within the image. This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.

Parameters:

Name Type Description grid tuple[int, int]

Size of the grid for splitting the image into cells. Each cell is shuffled randomly. For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled. Default: (3, 3)

p float

Probability that the transform will be applied. Should be in the range [0, 1]. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • This transform maintains consistency across all targets. If applied to an image and its corresponding mask or keypoints, the same shuffling will be applied to all.
  • The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2)) for the transform to have any effect.
  • Keypoints are moved along with their corresponding grid cell.
  • This transform could be useful when only micro features are important for the model, and memorizing the global structure could be harmful. For example:
  • Identifying the type of cell phone used to take a picture based on micro artifacts generated by phone post-processing algorithms, rather than the semantic features of the photo. See more at https://ieeexplore.ieee.org/abstract/document/8622031
  • Identifying stress, glucose, hydration levels based on skin images.

Mathematical Formulation:

  1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.
  2. A random permutation P of integers from 0 to (m*n - 1) is generated.
  3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.
  4. The cells are then rearranged according to the permutation P.
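This formulation can be reproduced outside the library with a few lines of NumPy. The following is a minimal sketch of the cell-permutation idea only, assuming the image dimensions are exact multiples of the grid size and using a plain NumPy permutation rather than Albumentations' internal tile helpers:

Python
import numpy as np

rng = np.random.default_rng(0)

image = np.arange(36).reshape(6, 6)     # toy 6x6 "image"
m, n = 2, 3                             # grid of 2 rows x 3 columns of cells
cell_h, cell_w = image.shape[0] // m, image.shape[1] // n

# Steps 1 and 3: split into cells numbered 0..m*n-1 in row-major order
cells = [
    image[r * cell_h:(r + 1) * cell_h, c * cell_w:(c + 1) * cell_w]
    for r in range(m) for c in range(n)
]

# Step 2: random permutation P of 0..m*n-1
perm = rng.permutation(m * n)

# Step 4: rearrange the cells according to P
rows = [
    np.hstack([cells[perm[r * n + c]] for c in range(n)])
    for r in range(m)
]
shuffled = np.vstack(rows)
print(shuffled.shape)  # (6, 6): same shape, cells rearranged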

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [1, 1, 1, 2, 2, 2],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4],\n...     [3, 3, 3, 4, 4, 4]\n... ])\n>>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n>>> result = transform(image=image)\n>>> transformed_image = result['image']\n# The resulting image might look like this (one possible outcome):\n# [[4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [4, 4, 4, 2, 2, 2],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1],\n#  [3, 3, 3, 1, 1, 1]]\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class RandomGridShuffle(DualTransform):\n    \"\"\"Randomly shuffles the grid's cells on an image, mask, or keypoints,\n    effectively rearranging patches within the image.\n    This transformation divides the image into a grid and then permutes these grid cells based on a random mapping.\n\n    Args:\n        grid (tuple[int, int]): Size of the grid for splitting the image into cells. Each cell is shuffled randomly.\n            For example, (3, 3) will divide the image into a 3x3 grid, resulting in 9 cells to be shuffled.\n            Default: (3, 3)\n        p (float): Probability that the transform will be applied. Should be in the range [0, 1].\n            Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform maintains consistency across all targets. If applied to an image and its corresponding\n          mask or keypoints, the same shuffling will be applied to all.\n        - The number of cells in the grid should be at least 2 (i.e., grid should be at least (1, 2), (2, 1), or (2, 2))\n          for the transform to have any effect.\n        - Keypoints are moved along with their corresponding grid cell.\n        - This transform could be useful when only micro features are important for the model, and memorizing\n          the global structure could be harmful. For example:\n          - Identifying the type of cell phone used to take a picture based on micro artifacts generated by\n            phone post-processing algorithms, rather than the semantic features of the photo.\n            See more at https://ieeexplore.ieee.org/abstract/document/8622031\n          - Identifying stress, glucose, hydration levels based on skin images.\n\n    Mathematical Formulation:\n        1. The image is divided into a grid of size (m, n) as specified by the 'grid' parameter.\n        2. A random permutation P of integers from 0 to (m*n - 1) is generated.\n        3. Each cell in the grid is assigned a number from 0 to (m*n - 1) in row-major order.\n        4. The cells are then rearranged according to the permutation P.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [1, 1, 1, 2, 2, 2],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4],\n        ...     [3, 3, 3, 4, 4, 4]\n        ... 
])\n        >>> transform = A.RandomGridShuffle(grid=(2, 2), p=1.0)\n        >>> result = transform(image=image)\n        >>> transformed_image = result['image']\n        # The resulting image might look like this (one possible outcome):\n        # [[4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [4, 4, 4, 2, 2, 2],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1],\n        #  [3, 3, 3, 1, 1, 1]]\n\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        grid: Annotated[tuple[int, int], AfterValidator(check_range_bounds(1, None))]\n\n    _targets = ALL_TARGETS\n\n    def __init__(\n        self,\n        grid: tuple[int, int] = (3, 3),\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.grid = grid\n\n    def apply(\n        self,\n        img: np.ndarray,\n        tiles: np.ndarray,\n        mapping: list[int],\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_image(img, tiles, mapping)\n\n    def apply_to_bboxes(\n        self,\n        bboxes: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        image_shape = params[\"shape\"][:2]\n        bboxes_denorm = denormalize_bboxes(bboxes, image_shape)\n        processor = cast(BboxProcessor, self.get_processor(\"bboxes\"))\n        if processor is None:\n            return bboxes\n        bboxes_returned = fgeometric.bboxes_grid_shuffle(\n            bboxes_denorm,\n            tiles,\n            mapping,\n            image_shape,\n            min_area=processor.params.min_area,\n            min_visibility=processor.params.min_visibility,\n        )\n        return normalize_bboxes(bboxes_returned, image_shape)\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        tiles: np.ndarray,\n        mapping: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        return fgeometric.swap_tiles_on_keypoints(keypoints, tiles, mapping)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, np.ndarray]:\n        image_shape = params[\"shape\"][:2]\n\n        original_tiles = fgeometric.split_uniform_grid(\n            image_shape,\n            self.grid,\n            self.random_generator,\n        )\n        shape_groups = fgeometric.create_shape_groups(original_tiles)\n        mapping = fgeometric.shuffle_tiles_within_shape_groups(\n            shape_groups,\n            self.random_generator,\n        )\n\n        return {\"tiles\": original_tiles, \"mapping\": mapping}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"grid\",)\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ShiftScaleRotate","title":"class ShiftScaleRotate (shift_limit=(-0.0625, 0.0625), scale_limit=(-0.1, 0.1), rotate_limit=(-45, 45), interpolation=1, border_mode=4, value=None, mask_value=None, shift_limit_x=None, shift_limit_y=None, rotate_method='largest_box', mask_interpolation=0, fill=0, fill_mask=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Randomly apply affine transforms: translate, scale and rotate the input. Note that this transform is deprecated; the implementation raises a DeprecationWarning and recommends using Affine instead.

Parameters:

Name Type Description shift_limit (float, float) or float

shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).

scale_limit (float, float) or float

scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1. If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high). Default: (-0.1, 0.1). A short sketch of this sampling rule is shown after the parameter list below.

rotate_limit (int, int) or int

rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: (-45, 45).

interpolation OpenCV flag

flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

border_mode OpenCV flag

flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101

fill ColorType

padding value if border_mode is cv2.BORDER_CONSTANT.

fill_mask ColorType

padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.

shift_limit_x (float, float) or float

shift factor range for width. If it is set then this value instead of shift_limit will be used for shifting width. If shift_limit_x is a single float value, the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

shift_limit_y (float, float) or float

shift factor range for height. If it is set then this value instead of shift_limit will be used for shifting height. If shift_limit_y is a single float value, the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie in the range [-1, 1]. Default: None.

rotate_method str

rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\". Default: \"largest_box\"

mask_interpolation OpenCV flag

Flag that is used to specify the interpolation algorithm for mask. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_NEAREST.

p float

probability of applying the transform. Default: 0.5.

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32
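The scale_limit sketch referenced in the parameter list above: a minimal illustration of how the sampled scale factor relates to scale_limit, using plain random.uniform rather than the transform's internal sampler (the helper name sample_scale is made up for this sketch):

Python
import random

def sample_scale(scale_limit):
    """Mimic how scale_limit is interpreted: a single float f means (-f, f),
    and the sampled factor is biased by 1, i.e. drawn from (1 + low, 1 + high)."""
    if isinstance(scale_limit, (int, float)):
        low, high = -scale_limit, scale_limit
    else:
        low, high = scale_limit
    return random.uniform(1 + low, 1 + high)

print(sample_scale(0.1))          # factor in (0.9, 1.1)
print(sample_scale((-0.2, 0.3)))  # factor in (0.8, 1.3)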


Source code in albumentations/augmentations/geometric/transforms.py Python
class ShiftScaleRotate(Affine):\n    \"\"\"Randomly apply affine transforms: translate, scale and rotate the input.\n\n    Args:\n        shift_limit ((float, float) or float): shift factor range for both height and width. If shift_limit\n            is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and\n            upper bounds should lie in range [-1, 1]. Default: (-0.0625, 0.0625).\n        scale_limit ((float, float) or float): scaling factor range. If scale_limit is a single float value, the\n            range will be (-scale_limit, scale_limit). Note that the scale_limit will be biased by 1.\n            If scale_limit is a tuple, like (low, high), sampling will be done from the range (1 + low, 1 + high).\n            Default: (-0.1, 0.1).\n        rotate_limit ((int, int) or int): rotation range. If rotate_limit is a single int value, the\n            range will be (-rotate_limit, rotate_limit). Default: (-45, 45).\n        interpolation (OpenCV flag): flag that is used to specify the interpolation algorithm. Should be one of:\n            cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_LINEAR.\n        border_mode (OpenCV flag): flag that is used to specify the pixel extrapolation method. Should be one of:\n            cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101.\n            Default: cv2.BORDER_REFLECT_101\n        fill (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT.\n        fill_mask (ColorType): padding value if border_mode is cv2.BORDER_CONSTANT applied for masks.\n        shift_limit_x ((float, float) or float): shift factor range for width. If it is set then this value\n            instead of shift_limit will be used for shifting width.  If shift_limit_x is a single float value,\n            the range will be (-shift_limit_x, shift_limit_x). Absolute values for lower and upper bounds should lie in\n            the range [-1, 1]. Default: None.\n        shift_limit_y ((float, float) or float): shift factor range for height. If it is set then this value\n            instead of shift_limit will be used for shifting height.  If shift_limit_y is a single float value,\n            the range will be (-shift_limit_y, shift_limit_y). Absolute values for lower and upper bounds should lie\n            in the range [-, 1]. Default: None.\n        rotate_method (str): rotation method used for the bounding boxes. Should be one of \"largest_box\" or \"ellipse\".\n            Default: \"largest_box\"\n        mask_interpolation (OpenCV flag): Flag that is used to specify the interpolation algorithm for mask.\n            Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4.\n            Default: cv2.INTER_NEAREST.\n        p (float): probability of applying the transform. 
Default: 0.5.\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    class InitSchema(BaseTransformInitSchema):\n        shift_limit: SymmetricRangeType\n        scale_limit: SymmetricRangeType\n        rotate_limit: SymmetricRangeType\n        interpolation: InterpolationType\n        border_mode: BorderModeType\n\n        value: ColorType | None\n        mask_value: ColorType | None\n\n        fill: ColorType = 0\n        fill_mask: ColorType = 0\n\n        shift_limit_x: ScaleFloatType | None\n        shift_limit_y: ScaleFloatType | None\n        rotate_method: Literal[\"largest_box\", \"ellipse\"]\n        mask_interpolation: InterpolationType\n\n        @model_validator(mode=\"after\")\n        def check_shift_limit(self) -> Self:\n            bounds = -1, 1\n            self.shift_limit_x = to_tuple(\n                self.shift_limit_x if self.shift_limit_x is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_x, *bounds, \"shift_limit_x\")\n            self.shift_limit_y = to_tuple(\n                self.shift_limit_y if self.shift_limit_y is not None else self.shift_limit,\n            )\n            check_range(self.shift_limit_y, *bounds, \"shift_limit_y\")\n\n            if self.value is not None:\n                warn(\"value is deprecated, use fill instead\", DeprecationWarning, stacklevel=2)\n                self.fill = self.value\n            if self.mask_value is not None:\n                warn(\"mask_value is deprecated, use fill_mask instead\", DeprecationWarning, stacklevel=2)\n                self.fill_mask = self.mask_value\n            return self\n\n        @field_validator(\"scale_limit\")\n        @classmethod\n        def check_scale_limit(\n            cls,\n            value: ScaleFloatType,\n            info: ValidationInfo,\n        ) -> ScaleFloatType:\n            bounds = 0, float(\"inf\")\n            result = to_tuple(value, bias=1.0)\n            check_range(result, *bounds, str(info.field_name))\n            return result\n\n    def __init__(\n        self,\n        shift_limit: ScaleFloatType = (-0.0625, 0.0625),\n        scale_limit: ScaleFloatType = (-0.1, 0.1),\n        rotate_limit: ScaleFloatType = (-45, 45),\n        interpolation: int = cv2.INTER_LINEAR,\n        border_mode: int = cv2.BORDER_REFLECT_101,\n        value: ColorType | None = None,\n        mask_value: ColorType | None = None,\n        shift_limit_x: ScaleFloatType | None = None,\n        shift_limit_y: ScaleFloatType | None = None,\n        rotate_method: Literal[\"largest_box\", \"ellipse\"] = \"largest_box\",\n        mask_interpolation: InterpolationType = cv2.INTER_NEAREST,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        shift_limit_x = cast(tuple[float, float], shift_limit_x)\n        shift_limit_y = cast(tuple[float, float], shift_limit_y)\n        super().__init__(\n            scale=scale_limit,\n            translate_percent={\"x\": shift_limit_x, \"y\": shift_limit_y},\n            rotate=rotate_limit,\n            shear=(0, 0),\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            fill=fill,\n            fill_mask=fill_mask,\n            border_mode=border_mode,\n            fit_output=False,\n            keep_ratio=False,\n            rotate_method=rotate_method,\n      
      always_apply=always_apply,\n            p=p,\n        )\n        warn(\n            \"ShiftScaleRotate is deprecated. Please use Affine transform instead.\",\n            DeprecationWarning,\n            stacklevel=2,\n        )\n        self.shift_limit_x = shift_limit_x\n        self.shift_limit_y = shift_limit_y\n\n        self.scale_limit = cast(tuple[float, float], scale_limit)\n        self.rotate_limit = cast(tuple[int, int], rotate_limit)\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        return {\n            \"shift_limit_x\": self.shift_limit_x,\n            \"shift_limit_y\": self.shift_limit_y,\n            \"scale_limit\": to_tuple(self.scale_limit, bias=-1.0),\n            \"rotate_limit\": self.rotate_limit,\n            \"interpolation\": self.interpolation,\n            \"border_mode\": self.border_mode,\n            \"fill\": self.fill,\n            \"fill_mask\": self.fill_mask,\n            \"rotate_method\": self.rotate_method,\n            \"mask_interpolation\": self.mask_interpolation,\n        }\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.ThinPlateSpline","title":"class ThinPlateSpline (scale_range=(0.2, 0.4), num_control_points=4, interpolation=1, mask_interpolation=0, p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.

Imagine the image printed on a thin metal plate that can be bent and warped smoothly:

  • Control points act like pins pushing or pulling the plate
  • The plate resists sharp bending, creating smooth deformations
  • The transformation maintains continuity (no tears or folds)
  • Areas between control points are interpolated naturally

The transform works by:

  1. Creating a regular grid of control points (like pins in the plate)
  2. Randomly displacing these points (like pushing/pulling the pins)
  3. Computing a smooth interpolation (like the plate bending)
  4. Applying the resulting deformation to the image
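Steps 1 and 2 can be sketched in a few lines of NumPy. This mirrors the control-point setup in the source code further below, but is only an illustration; the smooth TPS interpolation of steps 3 and 4 is left to the library's helpers:

Python
import numpy as np

rng = np.random.default_rng(0)

num_control_points = 4
scale = 0.3 / 10  # a value from scale_range, divided by 10 as in the source code below

# Step 1: regular grid of control points in normalized [0, 1] coordinates
x = np.linspace(0, 1, num_control_points)
y = np.linspace(0, 1, num_control_points)
src_points = np.stack(np.meshgrid(x, y), axis=-1).reshape(-1, 2)

# Step 2: random displacement of the destination points
dst_points = src_points + rng.normal(0, scale, src_points.shape)

print(src_points.shape, dst_points.shape)  # (16, 2) (16, 2)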

Parameters:

Name Type Description scale_range tuple[float, float]

Range for random displacement of control points. Values should be in [0.0, 1.0]:

  • 0.0: No displacement (identity transform)
  • 0.1: Subtle warping
  • 0.2-0.4: Moderate deformation (recommended range)
  • 0.5+: Strong warping

Default: (0.2, 0.4)

num_control_points int

Number of control points per side. Creates a grid of num_control_points x num_control_points points:

  • 2: Minimal deformation (affine-like)
  • 3-4: Moderate flexibility (recommended)
  • 5+: More local deformation control

Must be >= 2. Default: 4

interpolation int

OpenCV interpolation flag. Used for image sampling. See also: cv2.INTER_* Default: cv2.INTER_LINEAR

p float

Probability of applying the transform. Default: 0.5

Targets

image, mask, keypoints, bboxes, volume, mask3d

Image types: uint8, float32

Note

  • The transformation preserves smoothness and continuity
  • Stronger scale values may create more extreme deformations
  • Higher number of control points allows more local deformations
  • The same deformation is applied consistently to all targets

Examples:

Python
>>> import albumentations as A\n>>> # Basic usage\n>>> transform = A.ThinPlateSpline()\n>>>\n>>> # Subtle deformation\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.1, 0.2),\n...     num_control_points=3\n... )\n>>>\n>>> # Strong warping with fine control\n>>> transform = A.ThinPlateSpline(\n...     scale_range=(0.3, 0.5),\n...     num_control_points=5,\n... )\n

References

  • \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\" by F.L. Bookstein https://doi.org/10.1109/34.24792

  • Thin Plate Splines in Computer Vision: https://en.wikipedia.org/wiki/Thin_plate_spline

  • Similar implementation in Kornia: https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline

See Also:

  • ElasticTransform: For a different type of non-rigid deformation
  • GridDistortion: For grid-based warping
  • OpticalDistortion: For lens-like distortions


Source code in albumentations/augmentations/geometric/transforms.py Python
class ThinPlateSpline(BaseDistortion):\n    r\"\"\"Apply Thin Plate Spline (TPS) transformation to create smooth, non-rigid deformations.\n\n    Imagine the image printed on a thin metal plate that can be bent and warped smoothly:\n    - Control points act like pins pushing or pulling the plate\n    - The plate resists sharp bending, creating smooth deformations\n    - The transformation maintains continuity (no tears or folds)\n    - Areas between control points are interpolated naturally\n\n    The transform works by:\n    1. Creating a regular grid of control points (like pins in the plate)\n    2. Randomly displacing these points (like pushing/pulling the pins)\n    3. Computing a smooth interpolation (like the plate bending)\n    4. Applying the resulting deformation to the image\n\n\n    Args:\n        scale_range (tuple[float, float]): Range for random displacement of control points.\n            Values should be in [0.0, 1.0]:\n            - 0.0: No displacement (identity transform)\n            - 0.1: Subtle warping\n            - 0.2-0.4: Moderate deformation (recommended range)\n            - 0.5+: Strong warping\n            Default: (0.2, 0.4)\n\n        num_control_points (int): Number of control points per side.\n            Creates a grid of num_control_points x num_control_points points.\n            - 2: Minimal deformation (affine-like)\n            - 3-4: Moderate flexibility (recommended)\n            - 5+: More local deformation control\n            Must be >= 2. Default: 4\n\n        interpolation (int): OpenCV interpolation flag. Used for image sampling.\n            See also: cv2.INTER_*\n            Default: cv2.INTER_LINEAR\n\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        image, mask, keypoints, bboxes, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The transformation preserves smoothness and continuity\n        - Stronger scale values may create more extreme deformations\n        - Higher number of control points allows more local deformations\n        - The same deformation is applied consistently to all targets\n\n    Example:\n        >>> import albumentations as A\n        >>> # Basic usage\n        >>> transform = A.ThinPlateSpline()\n        >>>\n        >>> # Subtle deformation\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.1, 0.2),\n        ...     num_control_points=3\n        ... )\n        >>>\n        >>> # Strong warping with fine control\n        >>> transform = A.ThinPlateSpline(\n        ...     scale_range=(0.3, 0.5),\n        ...     num_control_points=5,\n        ... )\n\n    References:\n        - \"Principal Warps: Thin-Plate Splines and the Decomposition of Deformations\"\n          by F.L. 
Bookstein\n          https://doi.org/10.1109/34.24792\n\n        - Thin Plate Splines in Computer Vision:\n          https://en.wikipedia.org/wiki/Thin_plate_spline\n\n        - Similar implementation in Kornia:\n          https://kornia.readthedocs.io/en/latest/augmentation.html#kornia.augmentation.RandomThinPlateSpline\n\n    See Also:\n        - ElasticTransform: For different type of non-rigid deformation\n        - GridDistortion: For grid-based warping\n        - OpticalDistortion: For lens-like distortions\n    \"\"\"\n\n    class InitSchema(BaseDistortion.InitSchema):\n        scale_range: Annotated[tuple[float, float], AfterValidator(check_range_bounds(0, 1))]\n        num_control_points: int = Field(ge=2)\n\n    def __init__(\n        self,\n        scale_range: tuple[float, float] = (0.2, 0.4),\n        num_control_points: int = 4,\n        interpolation: int = cv2.INTER_LINEAR,\n        mask_interpolation: int = cv2.INTER_NEAREST,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            interpolation=interpolation,\n            mask_interpolation=mask_interpolation,\n            p=p,\n        )\n        self.scale_range = scale_range\n        self.num_control_points = num_control_points\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        height, width = params[\"shape\"][:2]\n\n        # Create regular grid of control points\n        grid_size = self.num_control_points\n        x = np.linspace(0, 1, grid_size)\n        y = np.linspace(0, 1, grid_size)\n        src_points = np.stack(np.meshgrid(x, y), axis=-1).reshape(-1, 2)\n\n        # Add random displacement to destination points\n        scale = self.py_random.uniform(*self.scale_range) / 10\n        dst_points = src_points + self.random_generator.normal(\n            0,\n            scale,\n            src_points.shape,\n        )\n\n        # Compute TPS weights\n        weights, affine = fgeometric.compute_tps_weights(src_points, dst_points)\n\n        # Create grid of points\n        x, y = np.meshgrid(np.arange(width), np.arange(height))\n        points = np.stack([x.flatten(), y.flatten()], axis=1).astype(np.float32)\n\n        # Transform points\n        transformed = fgeometric.tps_transform(\n            points / [width, height],\n            src_points,\n            weights,\n            affine,\n        )\n        transformed *= [width, height]\n\n        return {\n            \"map_x\": transformed[:, 0].reshape(height, width).astype(np.float32),\n            \"map_y\": transformed[:, 1].reshape(height, width).astype(np.float32),\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"scale_range\",\n            \"num_control_points\",\n            *super().get_transform_init_args_names(),\n        )\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.Transpose","title":"class Transpose [view source on GitHub]","text":"

Transpose the input by swapping its rows and columns.

This transform flips the image over its main diagonal, effectively switching its width and height. It's equivalent to a 90-degree rotation followed by a horizontal flip.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • The dimensions of the output will be swapped compared to the input. For example, an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).
  • This transform is its own inverse. Applying it twice will return the original input.
  • For multi-channel images (like RGB), the channels are preserved in their original order.
  • Bounding boxes will have their coordinates adjusted to match the new image dimensions.
  • Keypoints will have their x and y coordinates swapped.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (y_min, x_min, y_max, x_max)
  3. For keypoints with coordinates (x, y): new_keypoint = (y, x)
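The image relation in point 1 is simply a swap of the first two axes, which can be checked directly with NumPy (a minimal sketch, independent of the library):

Python
import numpy as np

H, W, C = 2, 3, 3
image = np.arange(H * W * C).reshape(H, W, C)

# O[i, j, k] = I[j, i, k]: swap the first two axes
transposed = image.transpose(1, 0, 2)
print(transposed.shape)  # (3, 2, 3)

# Keypoint (x, y) -> (y, x)
x, y = 2, 1
print((y, x))  # the transposed keypoint
assert image[y, x, 0] == transposed[x, y, 0]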

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.Transpose(p=1.0)\n>>> result = transform(image=image)\n>>> transposed_image = result['image']\n>>> print(transposed_image)\n[[[ 1  2  3]\n  [ 7  8  9]]\n [[ 4  5  6]\n  [10 11 12]]]\n# The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class Transpose(DualTransform):\n    \"\"\"Transpose the input by swapping its rows and columns.\n\n    This transform flips the image over its main diagonal, effectively switching its width and height.\n    It's equivalent to a 90-degree rotation followed by a horizontal flip.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The dimensions of the output will be swapped compared to the input. For example,\n          an input image of shape (100, 200, 3) will result in an output of shape (200, 100, 3).\n        - This transform is its own inverse. Applying it twice will return the original input.\n        - For multi-channel images (like RGB), the channels are preserved in their original order.\n        - Bounding boxes will have their coordinates adjusted to match the new image dimensions.\n        - Keypoints will have their x and y coordinates swapped.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[j, i, k] for all i in [0, W-1], j in [0, H-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (y_min, x_min, y_max, x_max)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (y, x)\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.Transpose(p=1.0)\n        >>> result = transform(image=image)\n        >>> transposed_image = result['image']\n        >>> print(transposed_image)\n        [[[ 1  2  3]\n          [ 7  8  9]]\n         [[ 4  5  6]\n          [10 11 12]]]\n        # The original 2x2x3 image is now 2x2x3, with rows and columns swapped\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.transpose(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_transpose(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_transpose(keypoints)\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/geometric/transforms/#albumentations.augmentations.geometric.transforms.VerticalFlip","title":"class VerticalFlip [view source on GitHub]","text":"

Flip the input vertically around the x-axis.

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 0.5.

Targets

image, mask, bboxes, keypoints, volume, mask3d

Image types: uint8, float32

Note

  • This transform flips the image upside down. The top of the image becomes the bottom and vice versa.
  • The dimensions of the image remain unchanged.
  • For multi-channel images (like RGB), each channel is flipped independently.
  • Bounding boxes are adjusted to match their new positions in the flipped image.
  • Keypoints are moved to their new positions in the flipped image.

Mathematical Details:

  1. For an input image I of shape (H, W, C), the output O is: O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]
  2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max): new_bbox = (x_min, H-y_max, x_max, H-y_min)
  3. For keypoints with coordinates (x, y): new_keypoint = (x, H-y)

where H is the height of the image.
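The bounding-box rule in point 2 can be sanity-checked with plain NumPy. A minimal sketch using pixel coordinates (the library itself operates on normalized boxes; vflip_bbox is a throwaway helper for this illustration):

Python
import numpy as np

H = 100

def vflip_bbox(bbox, height):
    """Apply new_bbox = (x_min, H - y_max, x_max, H - y_min)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min, height - y_max, x_max, height - y_min)

image = np.zeros((H, 10), dtype=np.uint8)
image[20:40, 2:5] = 1                  # a small blob
flipped = image[::-1]                  # O[i, j] = I[H-1-i, j]

print(vflip_bbox((2, 20, 5, 40), H))   # (2, 60, 5, 80)
print(flipped[60:80, 2:5].all())       # True: the blob lands where the formula says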

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> image = np.array([\n...     [[1, 2, 3], [4, 5, 6]],\n...     [[7, 8, 9], [10, 11, 12]]\n... ])\n>>> transform = A.VerticalFlip(p=1.0)\n>>> result = transform(image=image)\n>>> flipped_image = result['image']\n>>> print(flipped_image)\n[[[ 7  8  9]\n  [10 11 12]]\n [[ 1  2  3]\n  [ 4  5  6]]]\n# The original image is flipped vertically, with rows reversed\n


Source code in albumentations/augmentations/geometric/transforms.py Python
class VerticalFlip(DualTransform):\n    \"\"\"Flip the input vertically around the x-axis.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 0.5.\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform flips the image upside down. The top of the image becomes the bottom and vice versa.\n        - The dimensions of the image remain unchanged.\n        - For multi-channel images (like RGB), each channel is flipped independently.\n        - Bounding boxes are adjusted to match their new positions in the flipped image.\n        - Keypoints are moved to their new positions in the flipped image.\n\n    Mathematical Details:\n        1. For an input image I of shape (H, W, C), the output O is:\n           O[i, j, k] = I[H-1-i, j, k] for all i in [0, H-1], j in [0, W-1], k in [0, C-1]\n        2. For bounding boxes with coordinates (x_min, y_min, x_max, y_max):\n           new_bbox = (x_min, H-y_max, x_max, H-y_min)\n        3. For keypoints with coordinates (x, y):\n           new_keypoint = (x, H-y)\n        where H is the height of the image.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> image = np.array([\n        ...     [[1, 2, 3], [4, 5, 6]],\n        ...     [[7, 8, 9], [10, 11, 12]]\n        ... ])\n        >>> transform = A.VerticalFlip(p=1.0)\n        >>> result = transform(image=image)\n        >>> flipped_image = result['image']\n        >>> print(flipped_image)\n        [[[ 7  8  9]\n          [10 11 12]]\n         [[ 1  2  3]\n          [ 4  5  6]]]\n        # The original image is flipped vertically, with rows reversed\n\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return vflip(img)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.bboxes_vflip(bboxes)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return fgeometric.keypoints_vflip(keypoints, params[\"shape\"][0])\n\n    def get_transform_init_args_names(self) -> tuple[()]:\n        return ()\n
"},{"location":"api_reference/augmentations/mixing/","title":"Index","text":"
  • Mixing transforms (augmentations.mixing.transforms)
  • Mixing functional transforms (albumentations.augmentations.mixing.functional)
"},{"location":"api_reference/augmentations/mixing/functional/","title":"Mixing transforms (augmentations.mixing.functional)","text":""},{"location":"api_reference/augmentations/mixing/transforms/","title":"Mixing transforms (augmentations.mixing.transforms)","text":""},{"location":"api_reference/augmentations/mixing/transforms/#albumentations.augmentations.mixing.transforms.OverlayElements","title":"class OverlayElements (metadata_key='overlay_metadata', p=0.5, always_apply=None) [view source on GitHub]","text":"

Apply overlay elements such as images and masks onto an input image. This transformation can be used to add various objects (e.g., stickers, logos) to images with optional masks and bounding boxes for better placement control.

Parameters:

Name Type Description metadata_key str

Additional target key for metadata. Default overlay_metadata.

p float

Probability of applying the transformation. Default: 0.5.

Possible Metadata Fields:

  • image (np.ndarray): The overlay image to be applied. This is a required field.
  • bbox (list[float]): The bounding box specifying the region where the overlay should be applied. It should contain four floats: [x_min, y_min, x_max, y_max] in Albumentations format, that is, normalized Pascal VOC format [x_min / width, y_min / height, x_max / width, y_max / height]. If label_id is provided, it should be appended as the fifth element in the bbox.
  • mask (np.ndarray): An optional mask that defines the non-rectangular region of the overlay image. If not provided, the entire overlay image is used.
  • mask_id (int): An optional identifier for the mask. If provided, the regions specified by the mask will be labeled with this identifier in the output mask.
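A minimal usage sketch, with made-up array sizes and metadata values; the metadata dictionary is passed to the transform call under the configured metadata_key (overlay_metadata by default):

Python
import numpy as np
import albumentations as A

base = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
sticker = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)

transform = A.OverlayElements(p=1.0)

metadata = {
    "image": sticker,                # required overlay image
    "bbox": [0.25, 0.25, 0.5, 0.5],  # normalized [x_min, y_min, x_max, y_max]
    "mask_id": 1,                    # label the overlaid region in the output mask
}

result = transform(
    image=base,
    mask=np.zeros((256, 256), dtype=np.uint8),
    overlay_metadata=metadata,       # key matches metadata_key
)
out_image, out_mask = result["image"], result["mask"]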

Targets

image, mask

Image types: uint8, float32

Reference

https://github.com/danaaubakirova/doc-augmentation


Source code in albumentations/augmentations/mixing/transforms.py Python
class OverlayElements(DualTransform):\n    \"\"\"Apply overlay elements such as images and masks onto an input image. This transformation can be used to add\n    various objects (e.g., stickers, logos) to images with optional masks and bounding boxes for better placement\n    control.\n\n    Args:\n        metadata_key (str): Additional target key for metadata. Default `overlay_metadata`.\n        p (float): Probability of applying the transformation. Default: 0.5.\n\n    Possible Metadata Fields:\n        - image (np.ndarray): The overlay image to be applied. This is a required field.\n        - bbox (list[int]): The bounding box specifying the region where the overlay should be applied. It should\n                            contain four floats: [y_min, x_min, y_max, x_max]. If `label_id` is provided, it should\n                            be appended as the fifth element in the bbox. BBox should be in Albumentations format,\n                            that is the same as normalized Pascal VOC format\n                            [x_min / width, y_min / height, x_max / width, y_max / height]\n        - mask (np.ndarray): An optional mask that defines the non-rectangular region of the overlay image. If not\n                             provided, the entire overlay image is used.\n        - mask_id (int): An optional identifier for the mask. If provided, the regions specified by the mask will\n                         be labeled with this identifier in the output mask.\n\n    Targets:\n        image, mask\n\n    Image types:\n        uint8, float32\n\n    Reference:\n        https://github.com/danaaubakirova/doc-augmentation\n\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    class InitSchema(BaseTransformInitSchema):\n        metadata_key: str\n\n    def __init__(\n        self,\n        metadata_key: str = \"overlay_metadata\",\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.metadata_key = metadata_key\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        return [self.metadata_key]\n\n    @staticmethod\n    def preprocess_metadata(\n        metadata: dict[str, Any],\n        img_shape: tuple[int, int],\n        random_state: random.Random,\n    ) -> dict[str, Any]:\n        overlay_image = metadata[\"image\"]\n        overlay_height, overlay_width = overlay_image.shape[:2]\n        image_height, image_width = img_shape[:2]\n\n        if \"bbox\" in metadata:\n            bbox = metadata[\"bbox\"]\n            bbox_np = np.array([bbox])\n            check_bboxes(bbox_np)\n            denormalized_bbox = denormalize_bboxes(bbox_np, img_shape[:2])[0]\n\n            x_min, y_min, x_max, y_max = (int(x) for x in denormalized_bbox[:4])\n\n            if \"mask\" in metadata:\n                mask = metadata[\"mask\"]\n                mask = cv2.resize(mask, (x_max - x_min, y_max - y_min), interpolation=cv2.INTER_NEAREST)\n            else:\n                mask = np.ones((y_max - y_min, x_max - x_min), dtype=np.uint8)\n\n            overlay_image = cv2.resize(overlay_image, (x_max - x_min, y_max - y_min), interpolation=cv2.INTER_AREA)\n            offset = (y_min, x_min)\n\n            if len(bbox) == LENGTH_RAW_BBOX and \"bbox_id\" in metadata:\n                bbox = [x_min, y_min, x_max, y_max, metadata[\"bbox_id\"]]\n            else:\n                bbox = (x_min, y_min, x_max, y_max, *bbox[4:])\n        else:\n            if image_height < 
overlay_height or image_width < overlay_width:\n                overlay_image = cv2.resize(overlay_image, (image_width, image_height), interpolation=cv2.INTER_AREA)\n                overlay_height, overlay_width = overlay_image.shape[:2]\n\n            mask = metadata[\"mask\"] if \"mask\" in metadata else np.ones_like(overlay_image, dtype=np.uint8)\n\n            max_x_offset = image_width - overlay_width\n            max_y_offset = image_height - overlay_height\n\n            offset_x = random_state.randint(0, max_x_offset)\n            offset_y = random_state.randint(0, max_y_offset)\n\n            offset = (offset_y, offset_x)\n\n            bbox = [\n                offset_x,\n                offset_y,\n                offset_x + overlay_width,\n                offset_y + overlay_height,\n            ]\n\n            if \"bbox_id\" in metadata:\n                bbox = [*bbox, metadata[\"bbox_id\"]]\n\n        result = {\n            \"overlay_image\": overlay_image,\n            \"overlay_mask\": mask,\n            \"offset\": offset,\n            \"bbox\": bbox,\n        }\n\n        if \"mask_id\" in metadata:\n            result[\"mask_id\"] = metadata[\"mask_id\"]\n\n        return result\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        metadata = data[self.metadata_key]\n        img_shape = params[\"shape\"]\n\n        if isinstance(metadata, list):\n            overlay_data = [self.preprocess_metadata(md, img_shape, self.py_random) for md in metadata]\n        else:\n            overlay_data = [self.preprocess_metadata(metadata, img_shape, self.py_random)]\n\n        return {\n            \"overlay_data\": overlay_data,\n        }\n\n    def apply(\n        self,\n        img: np.ndarray,\n        overlay_data: list[dict[str, Any]],\n        **params: Any,\n    ) -> np.ndarray:\n        for data in overlay_data:\n            overlay_image = data[\"overlay_image\"]\n            overlay_mask = data[\"overlay_mask\"]\n            offset = data[\"offset\"]\n            img = fmixing.copy_and_paste_blend(img, overlay_image, overlay_mask, offset=offset)\n        return img\n\n    def apply_to_mask(\n        self,\n        mask: np.ndarray,\n        overlay_data: list[dict[str, Any]],\n        **params: Any,\n    ) -> np.ndarray:\n        for data in overlay_data:\n            if \"mask_id\" in data and data[\"mask_id\"] is not None:\n                overlay_mask = data[\"overlay_mask\"]\n                offset = data[\"offset\"]\n                mask_id = data[\"mask_id\"]\n\n                y_min, x_min = offset\n                y_max = y_min + overlay_mask.shape[0]\n                x_max = x_min + overlay_mask.shape[1]\n\n                mask_section = mask[y_min:y_max, x_min:x_max]\n                mask_section[overlay_mask > 0] = mask_id\n\n        return mask\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"metadata_key\",)\n
"},{"location":"api_reference/augmentations/transforms3d/","title":"Index","text":"
  • 3D (Volumetric) transforms (augmentations.transforms3d.transforms)
  • 3D (Volumetric) functional transforms (albumentations.augmentations.transforms3d.functional)
"},{"location":"api_reference/augmentations/transforms3d/functional/","title":"3D (Volumetric) functional transforms (augmentations.transforms3d.functional)","text":""},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.adjust_padding_by_position3d","title":"def adjust_padding_by_position3d (paddings, position, py_random) [view source on GitHub]","text":"

Adjust padding values based on desired position for 3D data.

Parameters:

Name Type Description paddings list[tuple[int, int]]

List of tuples containing padding pairs for each dimension: [(d_front, d_back), (h_top, h_bottom), (w_left, w_right)]

position Literal['center', 'random']

Position of the image after padding. Either 'center' or 'random'

py_random Random

Random number generator

Returns:

Type Description tuple[int, int, int, int, int, int]

Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)
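A quick usage sketch with arbitrary example padding pairs (this assumes the function is imported from the module path shown below):

Python
import random

from albumentations.augmentations.transforms3d.functional import adjust_padding_by_position3d

# (front, back), (top, bottom), (left, right) padding pairs
paddings = [(2, 2), (3, 1), (0, 4)]

# 'center' keeps the pairs exactly as given
print(adjust_padding_by_position3d(paddings, "center", random.Random(0)))
# (2, 2, 3, 1, 0, 4)

# 'random' redraws each side from the dimension's total padding
print(adjust_padding_by_position3d(paddings, "random", random.Random(0)))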

Source code in albumentations/augmentations/transforms3d/functional.py Python
def adjust_padding_by_position3d(\n    paddings: list[tuple[int, int]],  # [(front, back), (top, bottom), (left, right)]\n    position: Literal[\"center\", \"random\"],\n    py_random: random.Random,\n) -> tuple[int, int, int, int, int, int]:\n    \"\"\"Adjust padding values based on desired position for 3D data.\n\n    Args:\n        paddings: List of tuples containing padding pairs for each dimension [(d_pad), (h_pad), (w_pad)]\n        position: Position of the image after padding. Either 'center' or 'random'\n        py_random: Random number generator\n\n    Returns:\n        tuple[int, int, int, int, int, int]: Final padding values (d_front, d_back, h_top, h_bottom, w_left, w_right)\n    \"\"\"\n    if position == \"center\":\n        return (\n            paddings[0][0],  # d_front\n            paddings[0][1],  # d_back\n            paddings[1][0],  # h_top\n            paddings[1][1],  # h_bottom\n            paddings[2][0],  # w_left\n            paddings[2][1],  # w_right\n        )\n\n    # For random position, redistribute padding for each dimension\n    d_pad = sum(paddings[0])\n    h_pad = sum(paddings[1])\n    w_pad = sum(paddings[2])\n\n    return (\n        py_random.randint(0, d_pad),  # d_front\n        d_pad - py_random.randint(0, d_pad),  # d_back\n        py_random.randint(0, h_pad),  # h_top\n        h_pad - py_random.randint(0, h_pad),  # h_bottom\n        py_random.randint(0, w_pad),  # w_left\n        w_pad - py_random.randint(0, w_pad),  # w_right\n    )\n
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.crop3d","title":"def crop3d (volume, crop_coords) [view source on GitHub]","text":"

Crop 3D volume using coordinates.

Parameters:

Name Type Description volume ndarray

Input volume with shape (z, y, x) or (z, y, x, channels)

crop_coords tuple[int, int, int, int, int, int]

Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping

Returns:

Type Description ndarray

Cropped volume with same number of dimensions as input

Source code in albumentations/augmentations/transforms3d/functional.py Python
def crop3d(\n    volume: np.ndarray,\n    crop_coords: tuple[int, int, int, int, int, int],\n) -> np.ndarray:\n    \"\"\"Crop 3D volume using coordinates.\n\n    Args:\n        volume: Input volume with shape (z, y, x) or (z, y, x, channels)\n        crop_coords: Tuple of (z_min, z_max, y_min, y_max, x_min, x_max) coordinates for cropping\n\n    Returns:\n        Cropped volume with same number of dimensions as input\n    \"\"\"\n    z_min, z_max, y_min, y_max, x_min, x_max = crop_coords\n\n    return volume[z_min:z_max, y_min:y_max, x_min:x_max]\n
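A minimal sketch of calling crop3d on a synthetic volume; the crop coordinates follow the (z_min, z_max, y_min, y_max, x_min, x_max) order documented above.

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import crop3d

volume = np.arange(4 * 5 * 6, dtype=np.uint8).reshape(4, 5, 6)  # (z, y, x)

# Keep slices z in [1, 3), rows y in [0, 4), columns x in [2, 6).
cropped = crop3d(volume, crop_coords=(1, 3, 0, 4, 2, 6))
print(cropped.shape)  # (2, 4, 4)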
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.cutout3d","title":"def cutout3d (volume, holes, fill_value) [view source on GitHub]","text":"

Cut out holes in 3D volume and fill them with a given value.

Source code in albumentations/augmentations/transforms3d/functional.py Python
def cutout3d(volume: np.ndarray, holes: np.ndarray, fill_value: ColorType) -> np.ndarray:\n    \"\"\"Cut out holes in 3D volume and fill them with a given value.\"\"\"\n    volume = volume.copy()\n    for z1, y1, x1, z2, y2, x2 in holes:\n        volume[z1:z2, y1:y2, x1:x2] = fill_value\n    return volume\n
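A minimal sketch of cutout3d on a synthetic volume; each hole is given in (z1, y1, x1, z2, y2, x2) order, matching the loop in the source above.

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import cutout3d

volume = np.full((4, 8, 8), 255, dtype=np.uint8)

# Two cuboid holes in (z1, y1, x1, z2, y2, x2) order.
holes = np.array([[0, 0, 0, 2, 4, 4], [2, 4, 4, 4, 8, 8]])

dropped = cutout3d(volume, holes, fill_value=0)
print(dropped[0, 0, 0], dropped[3, 7, 7], dropped[0, 7, 7])  # 0 0 255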
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.filter_keypoints_in_holes3d","title":"def filter_keypoints_in_holes3d (keypoints, holes) [view source on GitHub]","text":"

Filter out keypoints that are inside any of the 3D holes.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (num_keypoints, 3+). The first three columns are x, y, z coordinates.

holes np.ndarray

Array of holes with shape (num_holes, 6). Each hole is represented as [z1, y1, x1, z2, y2, x2].

Returns:

Type Description np.ndarray

Array of keypoints that are not inside any hole.

Source code in albumentations/augmentations/transforms3d/functional.py Python
@handle_empty_array(\"keypoints\")\ndef filter_keypoints_in_holes3d(keypoints: np.ndarray, holes: np.ndarray) -> np.ndarray:\n    \"\"\"Filter out keypoints that are inside any of the 3D holes.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (num_keypoints, 3+).\n                               The first three columns are x, y, z coordinates.\n        holes (np.ndarray): Array of holes with shape (num_holes, 6).\n                           Each hole is represented as [z1, y1, x1, z2, y2, x2].\n\n    Returns:\n        np.ndarray: Array of keypoints that are not inside any hole.\n    \"\"\"\n    if holes.size == 0:\n        return keypoints\n\n    # Broadcast keypoints and holes for vectorized comparison\n    # Convert keypoints from XYZ to ZYX for comparison with holes\n    kp_z = keypoints[:, 2][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_y = keypoints[:, 1][:, np.newaxis]  # Shape: (num_keypoints, 1)\n    kp_x = keypoints[:, 0][:, np.newaxis]  # Shape: (num_keypoints, 1)\n\n    # Extract hole coordinates (in ZYX order)\n    hole_z1 = holes[:, 0]  # Shape: (num_holes,)\n    hole_y1 = holes[:, 1]\n    hole_x1 = holes[:, 2]\n    hole_z2 = holes[:, 3]\n    hole_y2 = holes[:, 4]\n    hole_x2 = holes[:, 5]\n\n    # Check if each keypoint is inside each hole\n    inside_hole = (\n        (kp_z >= hole_z1)\n        & (kp_z < hole_z2)\n        & (kp_y >= hole_y1)\n        & (kp_y < hole_y2)\n        & (kp_x >= hole_x1)\n        & (kp_x < hole_x2)\n    )\n\n    # A keypoint is valid if it's not inside any hole\n    valid_keypoints = ~np.any(inside_hole, axis=1)\n\n    # Return filtered keypoints with same dtype as input\n    result = keypoints[valid_keypoints]\n    if len(result) == 0:\n        # Ensure empty result has correct shape and dtype\n        return np.array([], dtype=keypoints.dtype).reshape(0, keypoints.shape[1])\n    return result\n
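A minimal sketch illustrating the coordinate conventions above: keypoints are given as (x, y, z), while holes are given as (z1, y1, x1, z2, y2, x2).

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import filter_keypoints_in_holes3d

keypoints = np.array(
    [
        [5.0, 5.0, 1.0],    # (x, y, z) - falls inside the hole below
        [20.0, 20.0, 1.0],  # outside the hole
    ],
)
holes = np.array([[0, 0, 0, 2, 10, 10]])  # (z1, y1, x1, z2, y2, x2)

print(filter_keypoints_in_holes3d(keypoints, holes))  # only the second keypoint remains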
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.pad_3d_with_params","title":"def pad_3d_with_params (volume, padding, value) [view source on GitHub]","text":"

Pad 3D volume with given parameters.

Parameters:

Name Type Description volume ndarray

Input volume with shape (depth, height, width) or (depth, height, width, channels)

padding tuple[int, int, int, int, int, int]

Padding values in format (depth_front, depth_back, height_top, height_bottom, width_left, width_right), where:

  • depth_front/back: padding at start/end of depth axis (z)
  • height_top/bottom: padding at start/end of height axis (y)
  • width_left/right: padding at start/end of width axis (x)

value Union[float, collections.abc.Sequence[float]]

Value to fill the padding

Returns:

Type Description ndarray

Padded volume with same number of dimensions as input

Note

The padding order matches the volume dimensions (depth, height, width). For each dimension, the first value is padding at the start (smaller indices), and the second value is padding at the end (larger indices).

Source code in albumentations/augmentations/transforms3d/functional.py Python
def pad_3d_with_params(\n    volume: np.ndarray,\n    padding: tuple[int, int, int, int, int, int],\n    value: ColorType,\n) -> np.ndarray:\n    \"\"\"Pad 3D volume with given parameters.\n\n    Args:\n        volume: Input volume with shape (depth, height, width) or (depth, height, width, channels)\n        padding: Padding values in format:\n            (depth_front, depth_back, height_top, height_bottom, width_left, width_right)\n            where:\n            - depth_front/back: padding at start/end of depth axis (z)\n            - height_top/bottom: padding at start/end of height axis (y)\n            - width_left/right: padding at start/end of width axis (x)\n        value: Value to fill the padding\n\n    Returns:\n        Padded volume with same number of dimensions as input\n\n    Note:\n        The padding order matches the volume dimensions (depth, height, width).\n        For each dimension, the first value is padding at the start (smaller indices),\n        and the second value is padding at the end (larger indices).\n    \"\"\"\n    depth_front, depth_back, height_top, height_bottom, width_left, width_right = padding\n\n    # Skip if no padding is needed\n    if all(p == 0 for p in padding):\n        return volume\n\n    # Handle both 3D and 4D arrays\n    pad_width = [\n        (depth_front, depth_back),  # depth (z) padding\n        (height_top, height_bottom),  # height (y) padding\n        (width_left, width_right),  # width (x) padding\n    ]\n\n    # Add channel padding if 4D array\n    if volume.ndim == NUM_VOLUME_DIMENSIONS:\n        pad_width.append((0, 0))  # no padding for channels\n\n    return np.pad(\n        volume,\n        pad_width=pad_width,\n        mode=\"constant\",\n        constant_values=value,\n    )\n
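A minimal sketch of padding a small volume with explicit per-side values; the shapes in the comments follow from the padding order described above.

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import pad_3d_with_params

volume = np.zeros((2, 3, 4), dtype=np.uint8)  # (depth, height, width)

# (depth_front, depth_back, height_top, height_bottom, width_left, width_right)
padded = pad_3d_with_params(volume, padding=(1, 1, 2, 0, 0, 3), value=7)
print(padded.shape)     # (4, 5, 7)
print(padded[0, 0, 0])  # 7  (padded region)
print(padded[1, 2, 0])  # 0  (original voxel region)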
"},{"location":"api_reference/augmentations/transforms3d/functional/#albumentations.augmentations.transforms3d.functional.transform_cube","title":"def transform_cube (cube, index) [view source on GitHub]","text":"

Transform cube by index (0-47)

Parameters:

Name Type Description cube ndarray

Input array with shape (D, H, W) or (D, H, W, C)

index int

Integer from 0 to 47 specifying which transformation to apply

Returns:

Type Description ndarray

Transformed cube with same shape as input

Source code in albumentations/augmentations/transforms3d/functional.py Python
def transform_cube(cube: np.ndarray, index: int) -> np.ndarray:\n    \"\"\"Transform cube by index (0-47)\n\n    Args:\n        cube: Input array with shape (D, H, W) or (D, H, W, C)\n        index: Integer from 0 to 47 specifying which transformation to apply\n    Returns:\n        Transformed cube with same shape as input\n    \"\"\"\n    if not (0 <= index < 48):\n        raise ValueError(\"Index must be between 0 and 47\")\n\n    # First determine if we need reflection (indices 24-47)\n    needs_reflection = index >= 24\n    working_cube = cube[:, :, ::-1].copy() if needs_reflection else cube.copy()\n    rotation_index = index % 24\n\n    # Map rotation_index (0-23) to specific rotations\n    if rotation_index < 4:\n        # First 4: rotate around axis 0\n        return np.rot90(working_cube, rotation_index, axes=(1, 2))\n\n    if rotation_index < 8:\n        # Next 4: flip 180\u00b0 about axis 1, then rotate around axis 0\n        temp = np.rot90(working_cube, 2, axes=(0, 2))\n        return np.rot90(temp, rotation_index - 4, axes=(1, 2))\n\n    if rotation_index < 16:\n        # Next 8: split between 90\u00b0 and 270\u00b0 about axis 1, then rotate around axis 2\n        if rotation_index < 12:\n            temp = np.rot90(working_cube, axes=(0, 2))\n            return np.rot90(temp, rotation_index - 8, axes=(0, 1))\n        temp = np.rot90(working_cube, -1, axes=(0, 2))\n        return np.rot90(temp, rotation_index - 12, axes=(0, 1))\n\n    # Final 8: split between rotations about axis 2, then rotate around axis 1\n    if rotation_index < 20:\n        temp = np.rot90(working_cube, axes=(0, 1))\n        return np.rot90(temp, rotation_index - 16, axes=(0, 2))\n    temp = np.rot90(working_cube, -1, axes=(0, 1))\n    return np.rot90(temp, rotation_index - 20, axes=(0, 2))\n
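A minimal sketch illustrating two properties that follow from the implementation above: index 0 is the identity, and every index only permutes voxels (no interpolation).

Python
import numpy as np

from albumentations.augmentations.transforms3d.functional import transform_cube

cube = np.random.randint(0, 256, (8, 8, 8), dtype=np.uint8)

# Index 0 applies no rotation or reflection.
assert np.array_equal(transform_cube(cube, 0), cube)

# Any of the 48 indices only rearranges voxels, so the voxel multiset is preserved.
rotated = transform_cube(cube, 17)
print(np.array_equal(np.sort(rotated, axis=None), np.sort(cube, axis=None)))  # True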
"},{"location":"api_reference/augmentations/transforms3d/transforms/","title":"3D (Volumetric) transforms (augmentations.transforms3d.transforms)","text":""},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.BaseCropAndPad3D","title":"class BaseCropAndPad3D (pad_if_needed, fill, fill_mask, pad_position, p=1.0, always_apply=None) [view source on GitHub]","text":"

Base class for 3D transforms that need both cropping and padding.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class BaseCropAndPad3D(Transform3D):\n    \"\"\"Base class for 3D transforms that need both cropping and padding.\"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)\n\n    class InitSchema(Transform3D.InitSchema):\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n        pad_position: Literal[\"center\", \"random\"]\n\n    def __init__(\n        self,\n        pad_if_needed: bool,\n        fill: ColorType,\n        fill_mask: ColorType,\n        pad_position: Literal[\"center\", \"random\"],\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.pad_if_needed = pad_if_needed\n        self.fill = fill\n        self.fill_mask = fill_mask\n        self.pad_position = pad_position\n\n    def _random_pad(self, pad: int) -> tuple[int, int]:\n        \"\"\"Helper function to calculate random padding for one dimension.\"\"\"\n        if pad > 0:\n            pad_start = self.py_random.randint(0, pad)\n            pad_end = pad - pad_start\n        else:\n            pad_start = pad_end = 0\n        return pad_start, pad_end\n\n    def _center_pad(self, pad: int) -> tuple[int, int]:\n        \"\"\"Helper function to calculate center padding for one dimension.\"\"\"\n        pad_start = pad // 2\n        pad_end = pad - pad_start\n        return pad_start, pad_end\n\n    def _get_pad_params(\n        self,\n        image_shape: tuple[int, int, int],\n        target_shape: tuple[int, int, int],\n    ) -> dict[str, Any] | None:\n        \"\"\"Calculate padding parameters if needed for 3D volumes.\"\"\"\n        if not self.pad_if_needed:\n            return None\n\n        z, h, w = image_shape\n        target_z, target_h, target_w = target_shape\n\n        # Calculate total padding needed for each dimension\n        z_pad = max(0, target_z - z)\n        h_pad = max(0, target_h - h)\n        w_pad = max(0, target_w - w)\n\n        if z_pad == 0 and h_pad == 0 and w_pad == 0:\n            return None\n\n        # For center padding, split equally\n        if self.pad_position == \"center\":\n            z_front, z_back = self._center_pad(z_pad)\n            h_top, h_bottom = self._center_pad(h_pad)\n            w_left, w_right = self._center_pad(w_pad)\n        # For random padding, randomly distribute the padding\n        else:  # random\n            z_front, z_back = self._random_pad(z_pad)\n            h_top, h_bottom = self._random_pad(h_pad)\n            w_left, w_right = self._random_pad(w_pad)\n\n        return {\n            \"pad_front\": z_front,\n            \"pad_back\": z_back,\n            \"pad_top\": h_top,\n            \"pad_bottom\": h_bottom,\n            \"pad_left\": w_left,\n            \"pad_right\": w_right,\n        }\n\n    def apply_to_volume(\n        self,\n        volume: np.ndarray,\n        crop_coords: tuple[int, int, int, int, int, int],\n        pad_params: dict[str, int] | None,\n        **params: Any,\n    ) -> np.ndarray:\n        # First crop\n        cropped = f3d.crop3d(volume, crop_coords)\n\n        # Then pad if needed\n        if pad_params is not None:\n            padding = (\n                pad_params[\"pad_front\"],\n                pad_params[\"pad_back\"],\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n            )\n            return f3d.pad_3d_with_params(\n                
cropped,\n                padding=padding,\n                value=cast(ColorType, self.fill),\n            )\n\n        return cropped\n\n    def apply_to_mask3d(\n        self,\n        mask3d: np.ndarray,\n        crop_coords: tuple[int, int, int, int, int, int],\n        pad_params: dict[str, int] | None,\n        **params: Any,\n    ) -> np.ndarray:\n        # First crop\n        cropped = f3d.crop3d(mask3d, crop_coords)\n\n        # Then pad if needed\n        if pad_params is not None:\n            padding = (\n                pad_params[\"pad_front\"],\n                pad_params[\"pad_back\"],\n                pad_params[\"pad_top\"],\n                pad_params[\"pad_bottom\"],\n                pad_params[\"pad_left\"],\n                pad_params[\"pad_right\"],\n            )\n            return f3d.pad_3d_with_params(\n                cropped,\n                padding=padding,\n                value=cast(ColorType, self.fill_mask),\n            )\n\n        return cropped\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        crop_coords: tuple[int, int, int, int, int, int],\n        pad_params: dict[str, int] | None,\n        **params: Any,\n    ) -> np.ndarray:\n        # Extract crop start coordinates (z1,y1,x1)\n        crop_z1, _, crop_y1, _, crop_x1, _ = crop_coords\n\n        # Initialize shift vector with negative crop coordinates\n        shift = np.array(\n            [\n                -crop_x1,  # X shift\n                -crop_y1,  # Y shift\n                -crop_z1,  # Z shift\n            ],\n        )\n\n        # Add padding shift if needed\n        if pad_params is not None:\n            shift += np.array(\n                [\n                    pad_params[\"pad_left\"],  # X shift\n                    pad_params[\"pad_top\"],  # Y shift\n                    pad_params[\"pad_front\"],  # Z shift\n                ],\n            )\n\n        # Apply combined shift\n        return fgeometric.shift_keypoints(keypoints, shift)\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.BasePad3D","title":"class BasePad3D (fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Base class for 3D padding transforms.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class BasePad3D(Transform3D):\n    \"\"\"Base class for 3D padding transforms.\"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)\n\n    class InitSchema(Transform3D.InitSchema):\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def apply_to_volume(\n        self,\n        volume: np.ndarray,\n        padding: tuple[int, int, int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        if padding == (0, 0, 0, 0, 0, 0):\n            return volume\n        return f3d.pad_3d_with_params(\n            volume=volume,\n            padding=padding,\n            value=cast(ColorType, self.fill),\n        )\n\n    def apply_to_mask3d(\n        self,\n        mask3d: np.ndarray,\n        padding: tuple[int, int, int, int, int, int],\n        **params: Any,\n    ) -> np.ndarray:\n        if padding == (0, 0, 0, 0, 0, 0):\n            return mask3d\n        return f3d.pad_3d_with_params(\n            volume=mask3d,\n            padding=padding,\n            value=cast(ColorType, self.fill_mask),\n        )\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        padding = params[\"padding\"]\n        shift_vector = np.array([padding[4], padding[2], padding[0]])\n        return fgeometric.shift_keypoints(keypoints, shift_vector)\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CenterCrop3D","title":"class CenterCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop the center of a 3D volume.

Parameters:

Name Type Description size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

If you want to perform cropping only in the XY plane while preserving all slices along the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CenterCrop3D(BaseCropAndPad3D):\n    \"\"\"Crop the center of 3D volume.\n\n    Args:\n        size (tuple[int, int, int]): Desired output size of the crop in format (depth, height, width)\n        pad_if_needed (bool): Whether to pad if the volume is smaller than desired crop size. Default: False\n        fill (ColorType): Padding value for image if pad_if_needed is True. Default: 0\n        fill_mask (ColorType): Padding value for mask if pad_if_needed is True. Default: 0\n        p (float): probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If you want to perform cropping only in the XY plane while preserving all slices along\n        the Z axis, consider using CenterCrop instead. CenterCrop will apply the same XY crop\n        to each slice independently, maintaining the full depth of the volume.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        size: Annotated[tuple[int, int, int], AfterValidator(check_range_bounds(1, None))]\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        size: tuple[int, int, int],\n        pad_if_needed: bool = False,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=\"center\",  # Center crop always uses center padding\n            p=p,\n        )\n        self.size = size\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        volume = data[\"volume\"]\n        z, h, w = volume.shape[:3]\n        target_z, target_h, target_w = self.size\n\n        # Get padding params if needed\n        pad_params = self._get_pad_params(\n            image_shape=(z, h, w),\n            target_shape=self.size,\n        )\n\n        # Update dimensions if padding is applied\n        if pad_params is not None:\n            z = z + pad_params[\"pad_front\"] + pad_params[\"pad_back\"]\n            h = h + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            w = w + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n        # Validate dimensions after padding\n        if z < target_z or h < target_h or w < target_w:\n            msg = (\n                f\"Crop size {self.size} is larger than padded image size ({z}, {h}, {w}). \"\n                f\"This should not happen - please report this as a bug.\"\n            )\n            raise ValueError(msg)\n\n        # For CenterCrop3D:\n        z_start = (z - target_z) // 2\n        h_start = (h - target_h) // 2\n        w_start = (w - target_w) // 2\n\n        crop_coords = (\n            z_start,\n            z_start + target_z,\n            h_start,\n            h_start + target_h,\n            w_start,\n            w_start + target_w,\n        )\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"pad_if_needed\", \"fill\", \"fill_mask\"\n
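A usage sketch modeled on the other examples in this reference (same call convention as CoarseDropout3D below); the output shape in the comment assumes the padding and cropping rules described above.

Python
import numpy as np
import albumentations as A

volume = np.random.randint(0, 256, (16, 64, 64), dtype=np.uint8)  # (D, H, W)
mask3d = np.random.randint(0, 2, (16, 64, 64), dtype=np.uint8)

# Depth is padded from 16 to 20 (pad_if_needed=True); height/width are center-cropped to 32.
aug = A.CenterCrop3D(size=(20, 32, 32), pad_if_needed=True, fill=0, fill_mask=0, p=1.0)
out = aug(volume=volume, mask3d=mask3d)
print(out["volume"].shape, out["mask3d"].shape)  # (20, 32, 32) (20, 32, 32)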
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CoarseDropout3D","title":"class CoarseDropout3D (num_holes_range=(1, 1), hole_depth_range=(0.1, 0.2), hole_height_range=(0.1, 0.2), hole_width_range=(0.1, 0.2), fill=0, fill_mask=None, p=0.5, always_apply=None) [view source on GitHub]","text":"

CoarseDropout3D randomly drops out cuboid regions from a 3D volume and, optionally, the corresponding regions in an associated 3D mask, to simulate the occlusion and varied object sizes found in real-world volumetric data.

Parameters:

Name Type Description num_holes_range tuple[int, int]

Range (min, max) for the number of cuboid regions to drop out. Default: (1, 1)

hole_depth_range tuple[float, float]

Range (min, max) for the depth of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)

hole_height_range tuple[float, float]

Range (min, max) for the height of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)

hole_width_range tuple[float, float]

Range (min, max) for the width of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)

fill ColorType

Value for the dropped voxels. Can be:

  • int or float: all channels are filled with this value
  • tuple: tuple of values for each channel

Default: 0

fill_mask ColorType | None

Fill value for dropout regions in the 3D mask. If None, mask regions corresponding to volume dropouts are unchanged. Default: None

p float

Probability of applying the transform. Default: 0.5

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

  • The actual number and size of dropout regions are randomly chosen within the specified ranges.
  • All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.
  • If you want to apply dropout only in the XY plane while preserving the full depth dimension, consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout to each slice independently, effectively creating cylindrical dropout regions that extend through the entire depth of the volume.

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n>>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n>>> aug = A.CoarseDropout3D(\n...     num_holes_range=(3, 6),\n...     hole_depth_range=(0.1, 0.2),\n...     hole_height_range=(0.1, 0.2),\n...     hole_width_range=(0.1, 0.2),\n...     fill=0,\n...     p=1.0\n... )\n>>> transformed = aug(volume=volume, mask3d=mask3d)\n>>> transformed_volume, transformed_mask3d = transformed[\"volume\"], transformed[\"mask3d\"]\n

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CoarseDropout3D(Transform3D):\n    \"\"\"CoarseDropout3D randomly drops out cuboid regions from a 3D volume and optionally,\n    the corresponding regions in an associated 3D mask, to simulate occlusion and\n    varied object sizes found in real-world volumetric data.\n\n    Args:\n        num_holes_range (tuple[int, int]): Range (min, max) for the number of cuboid\n            regions to drop out. Default: (1, 1)\n        hole_depth_range (tuple[float, float]): Range (min, max) for the depth\n            of dropout regions as a fraction of the volume depth (between 0 and 1). Default: (0.1, 0.2)\n        hole_height_range (tuple[float, float]): Range (min, max) for the height\n            of dropout regions as a fraction of the volume height (between 0 and 1). Default: (0.1, 0.2)\n        hole_width_range (tuple[float, float]): Range (min, max) for the width\n            of dropout regions as a fraction of the volume width (between 0 and 1). Default: (0.1, 0.2)\n        fill (ColorType): Value for the dropped voxels. Can be:\n            - int or float: all channels are filled with this value\n            - tuple: tuple of values for each channel\n            Default: 0\n        fill_mask (ColorType | None): Fill value for dropout regions in the 3D mask.\n            If None, mask regions corresponding to volume dropouts are unchanged. Default: None\n        p (float): Probability of applying the transform. Default: 0.5\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - The actual number and size of dropout regions are randomly chosen within the specified ranges.\n        - All values in hole_depth_range, hole_height_range and hole_width_range must be between 0 and 1.\n        - If you want to apply dropout only in the XY plane while preserving the full depth dimension,\n          consider using CoarseDropout instead. CoarseDropout will apply the same rectangular dropout\n          to each slice independently, effectively creating cylindrical dropout regions that extend\n          through the entire depth of the volume.\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n        >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n        >>> aug = A.CoarseDropout3D(\n        ...     num_holes_range=(3, 6),\n        ...     hole_depth_range=(0.1, 0.2),\n        ...     hole_height_range=(0.1, 0.2),\n        ...     hole_width_range=(0.1, 0.2),\n        ...     fill=0,\n        ...     p=1.0\n        ... 
)\n        >>> transformed = aug(volume=volume, mask3d=mask3d)\n        >>> transformed_volume, transformed_mask3d = transformed[\"volume\"], transformed[\"mask3d\"]\n    \"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)\n\n    class InitSchema(Transform3D.InitSchema):\n        num_holes_range: Annotated[\n            tuple[int, int],\n            AfterValidator(check_range_bounds(0, None)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_depth_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_height_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        hole_width_range: Annotated[\n            tuple[float, float],\n            AfterValidator(check_range_bounds(0, 1)),\n            AfterValidator(nondecreasing),\n        ]\n        fill: ColorType\n        fill_mask: ColorType | None\n\n        @staticmethod\n        def validate_range(range_value: tuple[float, float], range_name: str) -> None:\n            if not 0 <= range_value[0] <= range_value[1] <= 1:\n                raise ValueError(\n                    f\"All values in {range_name} should be in [0, 1] range and first value \"\n                    f\"should be less or equal than the second value. Got: {range_value}\",\n                )\n\n        @model_validator(mode=\"after\")\n        def check_ranges(self) -> Self:\n            self.validate_range(self.hole_depth_range, \"hole_depth_range\")\n            self.validate_range(self.hole_height_range, \"hole_height_range\")\n            self.validate_range(self.hole_width_range, \"hole_width_range\")\n            return self\n\n    def __init__(\n        self,\n        num_holes_range: tuple[int, int] = (1, 1),\n        hole_depth_range: tuple[float, float] = (0.1, 0.2),\n        hole_height_range: tuple[float, float] = (0.1, 0.2),\n        hole_width_range: tuple[float, float] = (0.1, 0.2),\n        fill: ColorType = 0,\n        fill_mask: ColorType | None = None,\n        p: float = 0.5,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n        self.num_holes_range = num_holes_range\n        self.hole_depth_range = hole_depth_range\n        self.hole_height_range = hole_height_range\n        self.hole_width_range = hole_width_range\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def calculate_hole_dimensions(\n        self,\n        volume_shape: tuple[int, int, int],\n        depth_range: tuple[float, float],\n        height_range: tuple[float, float],\n        width_range: tuple[float, float],\n        size: int,\n    ) -> tuple[np.ndarray, np.ndarray, np.ndarray]:\n        \"\"\"Calculate random hole dimensions based on the provided ranges.\"\"\"\n        depth, height, width = volume_shape[:3]\n\n        hole_depths = np.maximum(1, np.ceil(depth * self.random_generator.uniform(*depth_range, size=size))).astype(int)\n        hole_heights = np.maximum(1, np.ceil(height * self.random_generator.uniform(*height_range, size=size))).astype(\n            int,\n        )\n        hole_widths = np.maximum(1, np.ceil(width * self.random_generator.uniform(*width_range, size=size))).astype(int)\n\n        return hole_depths, hole_heights, hole_widths\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], 
data: dict[str, Any]) -> dict[str, Any]:\n        volume_shape = data[\"volume\"].shape[:3]\n\n        num_holes = self.py_random.randint(*self.num_holes_range)\n\n        hole_depths, hole_heights, hole_widths = self.calculate_hole_dimensions(\n            volume_shape,\n            self.hole_depth_range,\n            self.hole_height_range,\n            self.hole_width_range,\n            size=num_holes,\n        )\n\n        depth, height, width = volume_shape[:3]\n\n        z_min = self.random_generator.integers(0, depth - hole_depths + 1, size=num_holes)\n        y_min = self.random_generator.integers(0, height - hole_heights + 1, size=num_holes)\n        x_min = self.random_generator.integers(0, width - hole_widths + 1, size=num_holes)\n        z_max = z_min + hole_depths\n        y_max = y_min + hole_heights\n        x_max = x_min + hole_widths\n\n        holes = np.stack([z_min, y_min, x_min, z_max, y_max, x_max], axis=-1)\n\n        return {\"holes\": holes}\n\n    def apply_to_volume(self, volume: np.ndarray, holes: np.ndarray, **params: Any) -> np.ndarray:\n        if holes.size == 0:\n            return volume\n\n        return f3d.cutout3d(volume, holes, cast(ColorType, self.fill))\n\n    def apply_to_mask(self, mask: np.ndarray, holes: np.ndarray, **params: Any) -> np.ndarray:\n        if self.fill_mask is None or holes.size == 0:\n            return mask\n\n        return f3d.cutout3d(mask, holes, cast(ColorType, self.fill_mask))\n\n    def apply_to_keypoints(\n        self,\n        keypoints: np.ndarray,\n        holes: np.ndarray,\n        **params: Any,\n    ) -> np.ndarray:\n        \"\"\"Remove keypoints that fall within dropout regions.\"\"\"\n        if holes.size == 0:\n            return keypoints\n        processor = cast(KeypointsProcessor, self.get_processor(\"keypoints\"))\n\n        if processor is None or not processor.params.remove_invisible:\n            return keypoints\n        return f3d.filter_keypoints_in_holes3d(keypoints, holes)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"num_holes_range\",\n            \"hole_depth_range\",\n            \"hole_height_range\",\n            \"hole_width_range\",\n            \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.CubicSymmetry","title":"class CubicSymmetry (p=1.0, always_apply=None) [view source on GitHub]","text":"

Applies a random cubic symmetry transformation to a 3D volume.

This transform is a 3D extension of D4. While D4 handles the 8 symmetries of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube. Like D4, this transform does not create any interpolation artifacts as it only remaps voxels from one position to another without any interpolation.

The 48 transformations consist of:

  • 24 rotations (orientation-preserving): 4 rotations around each face diagonal (6 face diagonals x 4 rotations = 24)
  • 24 rotoreflections (orientation-reversing): reflection through a plane followed by any of the 24 rotations

For a cube, these transformations preserve:

  • All face centers (6)
  • All vertex positions (8)
  • All edge centers (12)

Works with 3D volumes and masks of shape (D, H, W) or (D, H, W, C).

Parameters:

Name Type Description p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

  • This transform is particularly useful for data augmentation in 3D medical imaging, crystallography, and voxel-based 3D modeling where the object's orientation is arbitrary.
  • All transformations preserve the object's chirality (handedness) when using pure rotations (indices 0-23) and invert it when using rotoreflections (indices 24-47).

Examples:

Python
>>> import numpy as np\n>>> import albumentations as A\n>>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n>>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n>>> transform = A.CubicSymmetry(p=1.0)\n>>> transformed = transform(volume=volume, mask3d=mask3d)\n>>> transformed_volume = transformed[\"volume\"]\n>>> transformed_mask3d = transformed[\"mask3d\"]\n

See Also: D4, the 2D version that handles the 8 symmetries of a square.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class CubicSymmetry(Transform3D):\n    \"\"\"Applies a random cubic symmetry transformation to a 3D volume.\n\n    This transform is a 3D extension of D4. While D4 handles the 8 symmetries\n    of a square (4 rotations x 2 reflections), CubicSymmetry handles all 48 symmetries of a cube.\n    Like D4, this transform does not create any interpolation artifacts as it only remaps voxels\n    from one position to another without any interpolation.\n\n    The 48 transformations consist of:\n    - 24 rotations (orientation-preserving):\n        * 4 rotations around each face diagonal (6 face diagonals x 4 rotations = 24)\n    - 24 rotoreflections (orientation-reversing):\n        * Reflection through a plane followed by any of the 24 rotations\n\n    For a cube, these transformations preserve:\n    - All face centers (6)\n    - All vertex positions (8)\n    - All edge centers (12)\n\n    works with 3D volumes and masks of the shape (D, H, W) or (D, H, W, C)\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        - This transform is particularly useful for data augmentation in 3D medical imaging,\n          crystallography, and voxel-based 3D modeling where the object's orientation\n          is arbitrary.\n        - All transformations preserve the object's chirality (handedness) when using\n          pure rotations (indices 0-23) and invert it when using rotoreflections\n          (indices 24-47).\n\n    Example:\n        >>> import numpy as np\n        >>> import albumentations as A\n        >>> volume = np.random.randint(0, 256, (10, 100, 100), dtype=np.uint8)  # (D, H, W)\n        >>> mask3d = np.random.randint(0, 2, (10, 100, 100), dtype=np.uint8)    # (D, H, W)\n        >>> transform = A.CubicSymmetry(p=1.0)\n        >>> transformed = transform(volume=volume, mask3d=mask3d)\n        >>> transformed_volume = transformed[\"volume\"]\n        >>> transformed_mask3d = transformed[\"mask3d\"]\n\n    See Also:\n        - D4: The 2D version that handles the 8 symmetries of a square\n    \"\"\"\n\n    _targets = (Targets.VOLUME, Targets.MASK3D, Targets.KEYPOINTS)\n\n    def __init__(\n        self,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(p=p, always_apply=always_apply)\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        # Randomly select one of 48 possible transformations\n\n        volume_shape = data[\"volume\"].shape\n        return {\"index\": self.py_random.randint(0, 47), \"volume_shape\": volume_shape}\n\n    def apply_to_volume(self, volume: np.ndarray, index: int, **params: Any) -> np.ndarray:\n        return f3d.transform_cube(volume, index)\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, index: int, **params: Any) -> np.ndarray:\n        return f3d.transform_cube_keypoints(keypoints, index, volume_shape=params[\"volume_shape\"])\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.Pad3D","title":"class Pad3D (padding, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pad the sides of a 3D volume by a specified number of voxels.

Parameters:

Name Type Description padding int, tuple[int, int, int] or tuple[int, int, int, int, int, int]

Padding values. Can be:

  • int: pad all sides by this value
  • tuple[int, int, int]: symmetric padding (depth, height, width) where each value is applied to both sides of the corresponding dimension
  • tuple[int, int, int, int, int, int]: explicit padding per side in order (depth_front, depth_back, height_top, height_bottom, width_left, width_right)

fill ColorType

Padding value for image

fill_mask ColorType

Padding value for mask

p float

probability of applying the transform. Default: 1.0.

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class Pad3D(BasePad3D):\n    \"\"\"Pad the sides of a 3D volume by specified number of voxels.\n\n    Args:\n        padding (int, tuple[int, int, int] or tuple[int, int, int, int, int, int]): Padding values. Can be:\n            * int - pad all sides by this value\n            * tuple[int, int, int] - symmetric padding (depth, height, width) where each value\n              is applied to both sides of the corresponding dimension\n            * tuple[int, int, int, int, int, int] - explicit padding per side in order:\n              (depth_front, depth_back, height_top, height_bottom, width_left, width_right)\n\n        fill (ColorType): Padding value for image\n        fill_mask (ColorType): Padding value for mask\n        p (float): probability of applying the transform. Default: 1.0.\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),\n        with optional channel dimension as the last axis.\n    \"\"\"\n\n    class InitSchema(BasePad3D.InitSchema):\n        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int]\n\n        @field_validator(\"padding\")\n        @classmethod\n        def validate_padding(\n            cls,\n            v: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],\n        ) -> int | tuple[int, int, int] | tuple[int, int, int, int, int, int]:\n            if isinstance(v, int) and v < 0:\n                raise ValueError(\"Padding value must be non-negative\")\n            if isinstance(v, tuple) and not all(isinstance(i, int) and i >= 0 for i in v):\n                raise ValueError(\"Padding tuple must contain non-negative integers\")\n\n            return v\n\n    def __init__(\n        self,\n        padding: int | tuple[int, int, int] | tuple[int, int, int, int, int, int],\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.padding = padding\n        self.fill = fill\n        self.fill_mask = fill_mask\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        if isinstance(self.padding, int):\n            pad_d = pad_h = pad_w = self.padding\n            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)\n        elif len(self.padding) == NUM_DIMENSIONS:\n            pad_d, pad_h, pad_w = self.padding  # type: ignore[misc]\n            padding = (pad_d, pad_d, pad_h, pad_h, pad_w, pad_w)\n        else:\n            padding = self.padding  # type: ignore[assignment]\n\n        return {\"padding\": padding}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"padding\", \"fill\", \"fill_mask\"\n
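A usage sketch contrasting the symmetric three-value form with the explicit six-value form; the shapes in the comments are derived from the padding rules listed above.

Python
import numpy as np
import albumentations as A

volume = np.zeros((4, 10, 10), dtype=np.uint8)  # (D, H, W)

# Symmetric padding: 1 voxel on both depth sides, 2 on both height sides, 3 on both width sides.
aug = A.Pad3D(padding=(1, 2, 3), fill=0, fill_mask=0, p=1.0)
print(aug(volume=volume)["volume"].shape)  # (6, 14, 16)

# Explicit per-side padding (depth_front, depth_back, height_top, height_bottom, width_left, width_right).
aug = A.Pad3D(padding=(0, 1, 2, 0, 0, 3), p=1.0)
print(aug(volume=volume)["volume"].shape)  # (5, 12, 13)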
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.PadIfNeeded3D","title":"class PadIfNeeded3D (min_zyx=None, pad_divisor_zyx=None, position='center', fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Pads the sides of a 3D volume if its dimensions are less than the specified minimum dimensions. If pad_divisor_zyx is specified, the transform additionally ensures that the volume dimensions are divisible by these values.

Parameters:

Name Type Description min_zyx tuple[int, int, int] | None

Minimum desired size as (depth, height, width). Ensures volume dimensions are at least these values. If not specified, pad_divisor_zyx must be provided.

pad_divisor_zyx tuple[int, int, int] | None

If set, pads each dimension to make it divisible by corresponding value in format (depth_div, height_div, width_div). If not specified, min_zyx must be provided.

position Literal[\"center\", \"random\"]

Position where the volume is to be placed after padding. Default is 'center'.

fill ColorType

Value to fill the border voxels for volume. Default: 0

fill_mask ColorType

Value to fill the border voxels for masks. Default: 0

p float

Probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width), with optional channel dimension as the last axis.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class PadIfNeeded3D(BasePad3D):\n    \"\"\"Pads the sides of a 3D volume if its dimensions are less than specified minimum dimensions.\n    If the pad_divisor_zyx is specified, the function additionally ensures that the volume\n    dimensions are divisible by these values.\n\n    Args:\n        min_zyx (tuple[int, int, int] | None): Minimum desired size as (depth, height, width).\n            Ensures volume dimensions are at least these values.\n            If not specified, pad_divisor_zyx must be provided.\n        pad_divisor_zyx (tuple[int, int, int] | None): If set, pads each dimension to make it\n            divisible by corresponding value in format (depth_div, height_div, width_div).\n            If not specified, min_zyx must be provided.\n        position (Literal[\"center\", \"random\"]): Position where the volume is to be placed after padding.\n            Default is 'center'.\n        fill (ColorType): Value to fill the border voxels for volume. Default: 0\n        fill_mask (ColorType): Value to fill the border voxels for masks. Default: 0\n        p (float): Probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        Input volume should be a numpy array with dimensions ordered as (z, y, x) or (depth, height, width),\n        with optional channel dimension as the last axis.\n    \"\"\"\n\n    class InitSchema(BasePad3D.InitSchema):\n        min_zyx: Annotated[tuple[int, int, int] | None, AfterValidator(check_range_bounds(0, None))]\n        pad_divisor_zyx: Annotated[tuple[int, int, int] | None, AfterValidator(check_range_bounds(1, None))]\n        position: Literal[\"center\", \"random\"]\n\n        @model_validator(mode=\"after\")\n        def validate_params(self) -> Self:\n            if self.min_zyx is None and self.pad_divisor_zyx is None:\n                msg = \"At least one of min_zyx or pad_divisor_zyx must be set\"\n                raise ValueError(msg)\n            return self\n\n    def __init__(\n        self,\n        min_zyx: tuple[int, int, int] | None = None,\n        pad_divisor_zyx: tuple[int, int, int] | None = None,\n        position: Literal[\"center\", \"random\"] = \"center\",\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(fill=fill, fill_mask=fill_mask, p=p)\n        self.min_zyx = min_zyx\n        self.pad_divisor_zyx = pad_divisor_zyx\n        self.position = position\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        depth, height, width = data[\"volume\"].shape[:3]\n        sizes = (depth, height, width)\n\n        paddings = [\n            fgeometric.get_dimension_padding(\n                current_size=size,\n                min_size=self.min_zyx[i] if self.min_zyx else None,\n                divisor=self.pad_divisor_zyx[i] if self.pad_divisor_zyx else None,\n            )\n            for i, size in enumerate(sizes)\n        ]\n\n        padding = f3d.adjust_padding_by_position3d(\n            paddings=paddings,\n            position=self.position,\n            py_random=self.py_random,\n        )\n\n        return {\"padding\": padding}\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\n            \"min_zyx\",\n            \"pad_divisor_zyx\",\n            \"position\",\n          
  \"fill\",\n            \"fill_mask\",\n        )\n
"},{"location":"api_reference/augmentations/transforms3d/transforms/#albumentations.augmentations.transforms3d.transforms.RandomCrop3D","title":"class RandomCrop3D (size, pad_if_needed=False, fill=0, fill_mask=0, p=1.0, always_apply=None) [view source on GitHub]","text":"

Crop a random part of a 3D volume.

Parameters:

Name Type Description size tuple[int, int, int]

Desired output size of the crop in format (depth, height, width)

pad_if_needed bool

Whether to pad if the volume is smaller than desired crop size. Default: False

fill ColorType

Padding value for image if pad_if_needed is True. Default: 0

fill_mask ColorType

Padding value for mask if pad_if_needed is True. Default: 0

p float

probability of applying the transform. Default: 1.0

Targets

volume, mask3d, keypoints

Image types: uint8, float32

Note

If you want to perform random cropping only in the XY plane while preserving all slices along the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop to each slice independently, maintaining the full depth of the volume.

Source code in albumentations/augmentations/transforms3d/transforms.py Python
class RandomCrop3D(BaseCropAndPad3D):\n    \"\"\"Crop random part of 3D volume.\n\n    Args:\n        size (tuple[int, int, int]): Desired output size of the crop in format (depth, height, width)\n        pad_if_needed (bool): Whether to pad if the volume is smaller than desired crop size. Default: False\n        fill (ColorType): Padding value for image if pad_if_needed is True. Default: 0\n        fill_mask (ColorType): Padding value for mask if pad_if_needed is True. Default: 0\n        p (float): probability of applying the transform. Default: 1.0\n\n    Targets:\n        volume, mask3d, keypoints\n\n    Image types:\n        uint8, float32\n\n    Note:\n        If you want to perform random cropping only in the XY plane while preserving all slices along\n        the Z axis, consider using RandomCrop instead. RandomCrop will apply the same XY crop\n        to each slice independently, maintaining the full depth of the volume.\n    \"\"\"\n\n    class InitSchema(BaseTransformInitSchema):\n        size: Annotated[tuple[int, int, int], AfterValidator(check_range_bounds(1, None))]\n        pad_if_needed: bool\n        fill: ColorType\n        fill_mask: ColorType\n\n    def __init__(\n        self,\n        size: tuple[int, int, int],\n        pad_if_needed: bool = False,\n        fill: ColorType = 0,\n        fill_mask: ColorType = 0,\n        p: float = 1.0,\n        always_apply: bool | None = None,\n    ):\n        super().__init__(\n            pad_if_needed=pad_if_needed,\n            fill=fill,\n            fill_mask=fill_mask,\n            pad_position=\"random\",  # Random crop uses random padding position\n            p=p,\n        )\n        self.size = size\n\n    def get_params_dependent_on_data(\n        self,\n        params: dict[str, Any],\n        data: dict[str, Any],\n    ) -> dict[str, Any]:\n        volume = data[\"volume\"]\n        z, h, w = volume.shape[:3]\n        target_z, target_h, target_w = self.size\n\n        # Get padding params if needed\n        pad_params = self._get_pad_params(\n            image_shape=(z, h, w),\n            target_shape=self.size,\n        )\n\n        # Update dimensions if padding is applied\n        if pad_params is not None:\n            z = z + pad_params[\"pad_front\"] + pad_params[\"pad_back\"]\n            h = h + pad_params[\"pad_top\"] + pad_params[\"pad_bottom\"]\n            w = w + pad_params[\"pad_left\"] + pad_params[\"pad_right\"]\n\n        # Calculate random crop coordinates\n        z_start = self.py_random.randint(0, max(0, z - target_z))\n        h_start = self.py_random.randint(0, max(0, h - target_h))\n        w_start = self.py_random.randint(0, max(0, w - target_w))\n\n        crop_coords = (\n            z_start,\n            z_start + target_z,\n            h_start,\n            h_start + target_h,\n            w_start,\n            w_start + target_w,\n        )\n\n        return {\n            \"crop_coords\": crop_coords,\n            \"pad_params\": pad_params,\n        }\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return \"size\", \"pad_if_needed\", \"fill\", \"fill_mask\"\n
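A usage sketch modeled on the other 3D examples in this reference.

Python
import numpy as np
import albumentations as A

volume = np.random.randint(0, 256, (16, 64, 64), dtype=np.uint8)  # (D, H, W)
mask3d = np.random.randint(0, 2, (16, 64, 64), dtype=np.uint8)

aug = A.RandomCrop3D(size=(8, 32, 32), pad_if_needed=False, p=1.0)
out = aug(volume=volume, mask3d=mask3d)
print(out["volume"].shape, out["mask3d"].shape)  # (8, 32, 32) (8, 32, 32)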
"},{"location":"api_reference/core/","title":"Index","text":"
  • Composition API (albumentations.core.composition)
  • Serialization API (albumentations.core.serialization)
  • Transforms Interface (albumentations.core.transforms_interface)
  • Helper functions for working with bounding boxes (albumentations.core.bbox_utils)
  • Helper functions for working with keypoints (albumentations.core.keypoints_utils)
"},{"location":"api_reference/core/bbox_utils/","title":"Helper functions for working with bounding boxes (augmentations.core.bbox_utils)","text":""},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.BboxParams","title":"class BboxParams (format, label_fields=None, min_area=0.0, min_visibility=0.0, min_width=0.0, min_height=0.0, check_each_transform=True, clip=False) [view source on GitHub]","text":"

Parameters of bounding boxes

Parameters:

Name Type Description format Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"]

format of bounding boxes.

  • The coco format is [x_min, y_min, width, height], e.g. [97, 12, 150, 200].
  • The pascal_voc format is [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].
  • The albumentations format is like pascal_voc, but normalized: [x_min, y_min, x_max, y_max], e.g. [0.2, 0.3, 0.4, 0.5].
  • The yolo format is [x, y, width, height], e.g. [0.1, 0.2, 0.3, 0.4], where x, y are the normalized bbox center and width, height are the normalized bbox width and height.

label_fields list

List of fields joined with boxes, e.g., labels.

min_area float

Minimum area of a bounding box in pixels or normalized units. Bounding boxes with an area less than this value will be removed. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain in the list. Bounding boxes with a visible area less than this fraction will be removed. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels or normalized units. Bounding boxes with a width less than this value will be removed. Default: 0.0.

min_height float

Minimum height of a bounding box in pixels or normalized units. Bounding boxes with a height less than this value will be removed. Default: 0.0.

check_each_transform bool

If True, bounding boxes will be checked after each dual transform. Default: True.

clip bool

If True, bounding boxes will be clipped to the image borders before applying any transform. Default: False.

Source code in albumentations/core/bbox_utils.py Python
class BboxParams(Params):\n    \"\"\"Parameters of bounding boxes\n\n    Args:\n        format Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"]: format of bounding boxes.\n\n            The `coco` format\n                `[x_min, y_min, width, height]`, e.g. [97, 12, 150, 200].\n            The `pascal_voc` format\n                `[x_min, y_min, x_max, y_max]`, e.g. [97, 12, 247, 212].\n            The `albumentations` format\n                is like `pascal_voc`, but normalized,\n                in other words: `[x_min, y_min, x_max, y_max]`, e.g. [0.2, 0.3, 0.4, 0.5].\n            The `yolo` format\n                `[x, y, width, height]`, e.g. [0.1, 0.2, 0.3, 0.4];\n                `x`, `y` - normalized bbox center; `width`, `height` - normalized bbox width and height.\n\n        label_fields (list): List of fields joined with boxes, e.g., labels.\n        min_area (float): Minimum area of a bounding box in pixels or normalized units.\n            Bounding boxes with an area less than this value will be removed. Default: 0.0.\n        min_visibility (float): Minimum fraction of area for a bounding box to remain in the list.\n            Bounding boxes with a visible area less than this fraction will be removed. Default: 0.0.\n        min_width (float): Minimum width of a bounding box in pixels or normalized units.\n            Bounding boxes with a width less than this value will be removed. Default: 0.0.\n        min_height (float): Minimum height of a bounding box in pixels or normalized units.\n            Bounding boxes with a height less than this value will be removed. Default: 0.0.\n        check_each_transform (bool): If True, bounding boxes will be checked after each dual transform. Default: True.\n        clip (bool): If True, bounding boxes will be clipped to the image borders before applying any transform.\n            Default: False.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        format: Literal[\"coco\", \"pascal_voc\", \"albumentations\", \"yolo\"],  # noqa: A002\n        label_fields: Sequence[Any] | None = None,\n        min_area: float = 0.0,\n        min_visibility: float = 0.0,\n        min_width: float = 0.0,\n        min_height: float = 0.0,\n        check_each_transform: bool = True,\n        clip: bool = False,\n    ):\n        super().__init__(format, label_fields)\n        self.min_area = min_area\n        self.min_visibility = min_visibility\n        self.min_width = min_width\n        self.min_height = min_height\n        self.check_each_transform = check_each_transform\n        self.clip = clip\n\n    def to_dict_private(self) -> dict[str, Any]:\n        data = super().to_dict_private()\n        data.update(\n            {\n                \"min_area\": self.min_area,\n                \"min_visibility\": self.min_visibility,\n                \"min_width\": self.min_width,\n                \"min_height\": self.min_height,\n                \"check_each_transform\": self.check_each_transform,\n                \"clip\": self.clip,\n            },\n        )\n        return data\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return \"BboxParams\"\n\n    def __repr__(self) -> str:\n        return (\n            f\"BboxParams(format={self.format}, label_fields={self.label_fields}, min_area={self.min_area},\"\n            f\" min_visibility={self.min_visibility}, min_width={self.min_width}, min_height={self.min_height},\"\n            
f\" check_each_transform={self.check_each_transform}, clip={self.clip})\"\n        )\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.bboxes_from_masks","title":"def bboxes_from_masks (masks) [view source on GitHub]","text":"

Create bounding boxes from binary masks (fast version)

Parameters:

Name Type Description masks np.ndarray

Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask.

Returns:

Type Description np.ndarray

An array of bounding boxes with shape (N, 4), where each row is (x_min, y_min, x_max, y_max).

Source code in albumentations/core/bbox_utils.py Python
def bboxes_from_masks(masks: np.ndarray) -> np.ndarray:\n    \"\"\"Create bounding boxes from binary masks (fast version)\n\n    Args:\n        masks (np.ndarray): Binary masks of shape (H, W) or (N, H, W) where N is the number of masks,\n                           and H, W are the height and width of each mask.\n\n    Returns:\n        np.ndarray: An array of bounding boxes with shape (N, 4), where each row is\n                   (x_min, y_min, x_max, y_max).\n    \"\"\"\n    # Handle single mask case by adding batch dimension\n    if len(masks.shape) == MONO_CHANNEL_DIMENSIONS:\n        masks = masks[np.newaxis, ...]\n\n    rows = np.any(masks, axis=2)\n    cols = np.any(masks, axis=1)\n\n    bboxes = np.zeros((masks.shape[0], 4), dtype=np.int32)\n\n    for i, (row, col) in enumerate(zip(rows, cols)):\n        if not np.any(row) or not np.any(col):\n            bboxes[i] = [-1, -1, -1, -1]\n        else:\n            y_min, y_max = np.where(row)[0][[0, -1]]\n            x_min, x_max = np.where(col)[0][[0, -1]]\n            bboxes[i] = [x_min, y_min, x_max + 1, y_max + 1]\n\n    return bboxes\n
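
A minimal usage sketch with synthetic masks, assuming the function is imported from the module shown above:

Python
import numpy as np
from albumentations.core.bbox_utils import bboxes_from_masks

masks = np.zeros((2, 100, 100), dtype=np.uint8)
masks[0, 10:20, 30:40] = 1  # first mask contains a 10x10 object
# second mask stays empty and is reported as [-1, -1, -1, -1]

print(bboxes_from_masks(masks))
# [[30 10 40 20]
#  [-1 -1 -1 -1]]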
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.calculate_bbox_areas_in_pixels","title":"def calculate_bbox_areas_in_pixels (bboxes, shape) [view source on GitHub]","text":"

Calculate areas for multiple bounding boxes.

This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored.

shape ShapeType

A tuple containing the height and width of the image (height, width).

Returns:

Type Description np.ndarray

A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels. Returns an empty array if the input bboxes is empty.

Note

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.

Examples:

Python
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])\n>>> image_shape = {\"height\": 100, \"width\": 100}\n>>> areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)\n>>> print(areas)\n[1600. 3600.]\n
Source code in albumentations/core/bbox_utils.py Python
def calculate_bbox_areas_in_pixels(bboxes: np.ndarray, shape: ShapeType) -> np.ndarray:\n    \"\"\"Calculate areas for multiple bounding boxes.\n\n    This function computes the areas of bounding boxes given their normalized coordinates\n    and the dimensions of the image they belong to. The bounding boxes are expected to be\n    in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).\n\n    Args:\n        bboxes (np.ndarray): A numpy array of shape (N, 4+) where N is the number of bounding boxes.\n                             Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates.\n                             Additional columns beyond the first 4 are ignored.\n        shape (ShapeType): A tuple containing the height and width of the image (height, width).\n\n    Returns:\n        np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.\n                    Returns an empty array if the input `bboxes` is empty.\n\n    Note:\n        - The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min).\n          Invalid bounding boxes may result in negative areas.\n        - The function preserves the input array and creates a copy for internal calculations.\n        - The returned areas are in pixel units, not normalized.\n\n    Example:\n        >>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])\n        >>> image_shape = (100, 100)\n        >>> areas = calculate_bbox_areas(bboxes, image_shape)\n        >>> print(areas)\n        [1600. 3600.]\n    \"\"\"\n    if len(bboxes) == 0:\n        return np.array([], dtype=np.float32)\n\n    height, width = shape[\"height\"], shape[\"width\"]\n    bboxes_denorm = bboxes.copy()\n    bboxes_denorm[:, [0, 2]] *= width\n    bboxes_denorm[:, [1, 3]] *= height\n    return (bboxes_denorm[:, 2] - bboxes_denorm[:, 0]) * (bboxes_denorm[:, 3] - bboxes_denorm[:, 1])\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.check_bboxes","title":"def check_bboxes (bboxes) [view source on GitHub]","text":"

Check that bbox coordinates are in the range [0, 1] and that minimum coordinates are less than maximum coordinates.

Parameters:

Name Type Description bboxes np.ndarray

numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.

Exceptions:

Type Description ValueError

If any bbox is invalid.

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef check_bboxes(bboxes: np.ndarray) -> None:\n    \"\"\"Check if bboxes boundaries are in range 0, 1 and minimums are lesser than maximums.\n\n    Args:\n        bboxes: numpy array of shape (num_bboxes, 4+) where first 4 coordinates are x_min, y_min, x_max, y_max.\n\n    Raises:\n        ValueError: If any bbox is invalid.\n    \"\"\"\n    # Check if all values are in range [0, 1]\n    in_range = (bboxes[:, :4] >= 0) & (bboxes[:, :4] <= 1)\n    close_to_zero = np.isclose(bboxes[:, :4], 0)\n    close_to_one = np.isclose(bboxes[:, :4], 1)\n    valid_range = in_range | close_to_zero | close_to_one\n\n    if not np.all(valid_range):\n        invalid_idx = np.where(~np.all(valid_range, axis=1))[0][0]\n        invalid_bbox = bboxes[invalid_idx]\n        invalid_coord = [\"x_min\", \"y_min\", \"x_max\", \"y_max\"][np.where(~valid_range[invalid_idx])[0][0]]\n        invalid_value = invalid_bbox[np.where(~valid_range[invalid_idx])[0][0]]\n        raise ValueError(\n            f\"Expected {invalid_coord} for bbox {invalid_bbox} to be in the range [0.0, 1.0], got {invalid_value}.\",\n        )\n\n    # Check if x_max > x_min and y_max > y_min\n    valid_order = (bboxes[:, 2] > bboxes[:, 0]) & (bboxes[:, 3] > bboxes[:, 1])\n\n    if not np.all(valid_order):\n        invalid_idx = np.where(~valid_order)[0][0]\n        invalid_bbox = bboxes[invalid_idx]\n        if invalid_bbox[2] <= invalid_bbox[0]:\n            raise ValueError(f\"x_max is less than or equal to x_min for bbox {invalid_bbox}.\")\n\n        raise ValueError(f\"y_max is less than or equal to y_min for bbox {invalid_bbox}.\")\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.clip_bboxes","title":"def clip_bboxes (bboxes, shape) [view source on GitHub]","text":"

Clips the bounding box coordinates to ensure they fit within the boundaries of an image.

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (num_boxes, 4+) in normalized format. The first 4 columns are [x_min, y_min, x_max, y_max].

shape ShapeType

Image shape (height, width).

Returns:

Type Description np.ndarray

The clipped bounding boxes, normalized to the image dimensions.

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef clip_bboxes(bboxes: np.ndarray, shape: ShapeType) -> np.ndarray:\n    \"\"\"Clips the bounding box coordinates to ensure they fit within the boundaries of an image.\n\n    Parameters:\n        bboxes (np.ndarray): Array of bounding boxes with shape (num_boxes, 4+) in normalized format.\n                             The first 4 columns are [x_min, y_min, x_max, y_max].\n        image_shape (Tuple[int, int]): Image shape (height, width).\n\n    Returns:\n        np.ndarray: The clipped bounding boxes, normalized to the image dimensions.\n\n    \"\"\"\n    height, width = shape[\"height\"], shape[\"width\"]\n\n    # Denormalize bboxes\n    denorm_bboxes = denormalize_bboxes(bboxes, shape)\n\n    ## Note:\n    # It could be tempting to use cols - 1 and rows - 1 as the upper bounds for the clipping\n\n    # But this would cause the bounding box to be clipped to the image dimensions - 1 which is not what we want.\n    # Bounding box lives not in the middle of pixels but between them.\n\n    # Example: for image with height 100, width 100, the pixel values are in the range [0, 99]\n    # but if we want bounding box to be 1 pixel width and height and lie on the boundary of the image\n    # it will be described as [99, 99, 100, 100] => clip by image_size - 1 will lead to [99, 99, 99, 99]\n    # which is incorrect\n\n    # It could be also tempting to clip `x_min`` to `cols - 1`` and `y_min` to `rows - 1`, but this also leads\n    # to another error. If image fully lies outside of the visible area and min_area is set to 0, then\n    # the bounding box will be clipped to the image size - 1 and will be 1 pixel in size and fully visible,\n    # but it should be completely removed.\n\n    # Clip coordinates\n    denorm_bboxes[:, [0, 2]] = np.clip(denorm_bboxes[:, [0, 2]], 0, width, out=denorm_bboxes[:, [0, 2]])\n    denorm_bboxes[:, [1, 3]] = np.clip(denorm_bboxes[:, [1, 3]], 0, height, out=denorm_bboxes[:, [1, 3]])\n\n    # Normalize clipped bboxes\n    return normalize_bboxes(denorm_bboxes, shape)\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.convert_bboxes_from_albumentations","title":"def convert_bboxes_from_albumentations (bboxes, target_format, shape, check_validity=False) [view source on GitHub]","text":"

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

target_format Literal['coco', 'pascal_voc', 'yolo']

Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.

shape ShapeType

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description np.ndarray

An array of bounding boxes in the target format with shape (num_bboxes, 4+).

Exceptions:

Type Description ValueError

If target_format is not 'coco', 'pascal_voc' or 'yolo'.

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef convert_bboxes_from_albumentations(\n    bboxes: np.ndarray,\n    target_format: Literal[\"coco\", \"pascal_voc\", \"yolo\"],\n    shape: ShapeType,\n    check_validity: bool = False,\n) -> np.ndarray:\n    \"\"\"Convert bounding boxes from the format used by albumentations to a specified format.\n\n    Args:\n        bboxes: A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+).\n                The first 4 columns are [x_min, y_min, x_max, y_max].\n        target_format: Required format of the output bounding boxes. Should be 'coco', 'pascal_voc' or 'yolo'.\n        shape: Image shape (height, width).\n        check_validity: Check if all boxes are valid boxes.\n\n    Returns:\n        np.ndarray: An array of bounding boxes in the target format with shape (num_bboxes, 4+).\n\n    Raises:\n        ValueError: If `target_format` is not 'coco', 'pascal_voc' or 'yolo'.\n    \"\"\"\n    if target_format not in {\"coco\", \"pascal_voc\", \"yolo\"}:\n        raise ValueError(\n            f\"Unknown target_format {target_format}. Supported formats are: 'coco', 'pascal_voc' and 'yolo'\",\n        )\n\n    if check_validity:\n        check_bboxes(bboxes)\n\n    converted_bboxes = np.zeros_like(bboxes)\n    converted_bboxes[:, 4:] = bboxes[:, 4:]  # Preserve additional columns\n\n    denormalized_bboxes = denormalize_bboxes(bboxes[:, :4], shape) if target_format != \"yolo\" else bboxes[:, :4]\n\n    if target_format == \"coco\":\n        converted_bboxes[:, 0] = denormalized_bboxes[:, 0]  # x_min\n        converted_bboxes[:, 1] = denormalized_bboxes[:, 1]  # y_min\n        converted_bboxes[:, 2] = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]  # width\n        converted_bboxes[:, 3] = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]  # height\n    elif target_format == \"yolo\":\n        converted_bboxes[:, 0] = (denormalized_bboxes[:, 0] + denormalized_bboxes[:, 2]) / 2  # x_center\n        converted_bboxes[:, 1] = (denormalized_bboxes[:, 1] + denormalized_bboxes[:, 3]) / 2  # y_center\n        converted_bboxes[:, 2] = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]  # width\n        converted_bboxes[:, 3] = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]  # height\n    else:  # pascal_voc\n        converted_bboxes[:, :4] = denormalized_bboxes\n\n    return converted_bboxes\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.convert_bboxes_to_albumentations","title":"def convert_bboxes_to_albumentations (bboxes, source_format, shape, check_validity=False) [view source on GitHub]","text":"

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of (x_min, y_min, x_max, y_max) e.g. (0.15, 0.27, 0.67, 0.5).

Parameters:

Name Type Description bboxes np.ndarray

A numpy array of bounding boxes with shape (num_bboxes, 4+).

source_format Literal['coco', 'pascal_voc', 'yolo']

Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.

shape ShapeType

Image shape (height, width).

check_validity bool

Check if all boxes are valid boxes.

Returns:

Type Description np.ndarray

An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

Exceptions:

Type Description ValueError

If source_format is not 'coco', 'pascal_voc', or 'yolo'.

ValueError

If the source format is YOLO and any coordinate lies outside the range (0, 1].

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef convert_bboxes_to_albumentations(\n    bboxes: np.ndarray,\n    source_format: Literal[\"coco\", \"pascal_voc\", \"yolo\"],\n    shape: ShapeType,\n    check_validity: bool = False,\n) -> np.ndarray:\n    \"\"\"Convert bounding boxes from a specified format to the format used by albumentations:\n    normalized coordinates of top-left and bottom-right corners of the bounding box in the form of\n    `(x_min, y_min, x_max, y_max)` e.g. `(0.15, 0.27, 0.67, 0.5)`.\n\n    Args:\n        bboxes: A numpy array of bounding boxes with shape (num_bboxes, 4+).\n        source_format: Format of the input bounding boxes. Should be 'coco', 'pascal_voc', or 'yolo'.\n        shape: Image shape (height, width).\n        check_validity: Check if all boxes are valid boxes.\n\n    Returns:\n        np.ndarray: An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).\n\n    Raises:\n        ValueError: If `source_format` is not 'coco', 'pascal_voc', or 'yolo'.\n        ValueError: If in YOLO format, any coordinates are not in the range (0, 1].\n    \"\"\"\n    if source_format not in {\"coco\", \"pascal_voc\", \"yolo\"}:\n        raise ValueError(\n            f\"Unknown source_format {source_format}. Supported formats are: 'coco', 'pascal_voc' and 'yolo'\",\n        )\n\n    bboxes = bboxes.copy().astype(np.float32)\n    converted_bboxes = np.zeros_like(bboxes)\n    converted_bboxes[:, 4:] = bboxes[:, 4:]  # Preserve additional columns\n\n    if source_format == \"coco\":\n        converted_bboxes[:, 0] = bboxes[:, 0]  # x_min\n        converted_bboxes[:, 1] = bboxes[:, 1]  # y_min\n        converted_bboxes[:, 2] = bboxes[:, 0] + bboxes[:, 2]  # x_max\n        converted_bboxes[:, 3] = bboxes[:, 1] + bboxes[:, 3]  # y_max\n    elif source_format == \"yolo\":\n        if check_validity and np.any((bboxes[:, :4] <= 0) | (bboxes[:, :4] > 1)):\n            raise ValueError(f\"In YOLO format all coordinates must be float and in range (0, 1], got {bboxes}\")\n\n        w_half, h_half = bboxes[:, 2] / 2, bboxes[:, 3] / 2\n        converted_bboxes[:, 0] = bboxes[:, 0] - w_half  # x_min\n        converted_bboxes[:, 1] = bboxes[:, 1] - h_half  # y_min\n        converted_bboxes[:, 2] = bboxes[:, 0] + w_half  # x_max\n        converted_bboxes[:, 3] = bboxes[:, 1] + h_half  # y_max\n    else:  # pascal_voc\n        converted_bboxes[:, :4] = bboxes[:, :4]\n\n    if source_format != \"yolo\":\n        converted_bboxes[:, :4] = normalize_bboxes(converted_bboxes[:, :4], shape)\n\n    if check_validity:\n        check_bboxes(converted_bboxes)\n\n    return converted_bboxes\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.denormalize_bboxes","title":"def denormalize_bboxes (bboxes, shape) [view source on GitHub]","text":"

Denormalize array of bounding boxes.

Parameters:

Name Type Description bboxes np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

shape ShapeType | tuple[int, int]

Image shape (height, width).

Returns:

Type Description np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef denormalize_bboxes(\n    bboxes: np.ndarray,\n    shape: ShapeType | tuple[int, int],\n) -> np.ndarray:\n    \"\"\"Denormalize  array of bounding boxes.\n\n    Args:\n        bboxes: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n        shape: Image shape `(height, width)`.\n\n    Returns:\n        Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n\n    \"\"\"\n    if isinstance(shape, tuple):\n        rows, cols = shape[:2]\n    else:\n        rows, cols = shape[\"height\"], shape[\"width\"]\n\n    denormalized = bboxes.copy().astype(float)\n    denormalized[:, [0, 2]] *= cols\n    denormalized[:, [1, 3]] *= rows\n    return denormalized\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.filter_bboxes","title":"def filter_bboxes (bboxes, shape, min_area=0.0, min_visibility=0.0, min_width=1.0, min_height=1.0) [view source on GitHub]","text":"

Remove bounding boxes whose visible fraction of area is below min_visibility or whose area in pixels is below the min_area threshold. Also clips boxes to the final image size.

Parameters:

Name Type Description bboxes np.ndarray

numpy array of bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].

shape dict[str, int]

The shape of the image/volume: for 2D, {'height': int, 'width': int}; for 3D, {'height': int, 'width': int, 'depth': int}.

min_area float

Minimum area of a bounding box in pixels. Default: 0.0.

min_visibility float

Minimum fraction of area for a bounding box to remain. Default: 0.0.

min_width float

Minimum width of a bounding box in pixels. Default: 1.0.

min_height float

Minimum height of a bounding box in pixels. Default: 1.0.

Returns:

Type Description np.ndarray

numpy array of filtered bounding boxes.

Source code in albumentations/core/bbox_utils.py Python
def filter_bboxes(\n    bboxes: np.ndarray,\n    shape: ShapeType,\n    min_area: float = 0.0,\n    min_visibility: float = 0.0,\n    min_width: float = 1.0,\n    min_height: float = 1.0,\n) -> np.ndarray:\n    \"\"\"Remove bounding boxes that either lie outside of the visible area by more than min_visibility\n    or whose area in pixels is under the threshold set by `min_area`. Also crops boxes to final image size.\n\n    Args:\n        bboxes: numpy array of bounding boxes with shape (num_bboxes, 4+).\n                The first 4 columns are [x_min, y_min, x_max, y_max].\n        shape (dict[str, int]): The shape of the image/volume:\n                               - For 2D: {'height': int, 'width': int}\n                               - For 3D: {'height': int, 'width': int, 'depth': int}\n\n        min_area: Minimum area of a bounding box in pixels. Default: 0.0.\n        min_visibility: Minimum fraction of area for a bounding box to remain. Default: 0.0.\n        min_width: Minimum width of a bounding box in pixels. Default: 0.0.\n        min_height: Minimum height of a bounding box in pixels. Default: 0.0.\n\n    Returns:\n        numpy array of filtered bounding boxes.\n    \"\"\"\n    epsilon = 1e-7\n\n    if len(bboxes) == 0:\n        return np.array([], dtype=np.float32).reshape(0, 4)\n\n    # Calculate areas of bounding boxes before clipping in pixels\n    denormalized_box_areas = calculate_bbox_areas_in_pixels(bboxes, shape)\n\n    # Clip bounding boxes in ratio\n    clipped_bboxes = clip_bboxes(bboxes, shape)\n\n    # Calculate areas of clipped bounding boxes in pixels\n    clipped_box_areas = calculate_bbox_areas_in_pixels(clipped_bboxes, shape)\n\n    # Calculate width and height of the clipped bounding boxes\n    denormalized_bboxes = denormalize_bboxes(clipped_bboxes[:, :4], shape)\n\n    clipped_widths = denormalized_bboxes[:, 2] - denormalized_bboxes[:, 0]\n    clipped_heights = denormalized_bboxes[:, 3] - denormalized_bboxes[:, 1]\n\n    # Create a mask for bboxes that meet all criteria\n    mask = (\n        (denormalized_box_areas >= epsilon)\n        & (clipped_box_areas >= min_area - epsilon)\n        & (clipped_box_areas / denormalized_box_areas >= min_visibility - epsilon)\n        & (clipped_widths >= min_width - epsilon)\n        & (clipped_heights >= min_height - epsilon)\n    )\n\n    # Apply the mask to get the filtered bboxes\n    filtered_bboxes = clipped_bboxes[mask]\n\n    return np.array([], dtype=np.float32).reshape(0, 4) if len(filtered_bboxes) == 0 else filtered_bboxes\n
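
A minimal filtering sketch with synthetic boxes, one of which lies mostly outside the image:

Python
import numpy as np
from albumentations.core.bbox_utils import filter_bboxes

shape = {"height": 100, "width": 100}
bboxes = np.array(
    [
        [0.1, 0.1, 0.5, 0.5],  # 40x40 px, fully visible
        [0.9, 0.9, 1.4, 1.4],  # only ~4% of its area remains after clipping
    ]
)
kept = filter_bboxes(bboxes, shape, min_area=25, min_visibility=0.1)
# only the first box survives: the second fails the min_visibility check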
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.masks_from_bboxes","title":"def masks_from_bboxes (bboxes, shape) [view source on GitHub]","text":"

Create binary masks from multiple bounding boxes

Parameters:

Name Type Description bboxes np.ndarray

Array of bounding boxes with shape (N, 4), where N is the number of boxes

shape ShapeType | tuple[int, int]

{\"height\": int, \"width\": int} or tuple[int, int]

Returns:

Type Description masks

Array of binary masks with shape (N, height, width)

Source code in albumentations/core/bbox_utils.py Python
def masks_from_bboxes(bboxes: np.ndarray, shape: ShapeType | tuple[int, int]) -> np.ndarray:\n    \"\"\"Create binary masks from multiple bounding boxes\n\n    Args:\n        bboxes: Array of bounding boxes with shape (N, 4), where N is the number of boxes\n        shape: {\"height\": int, \"width\": int} or tuple[int, int]\n\n    Returns:\n        masks: Array of binary masks with shape (N, height, width)\n\n    \"\"\"\n    if isinstance(shape, dict):\n        height, width = shape[\"height\"], shape[\"width\"]\n    else:\n        height, width = shape[:2]\n\n    masks = np.zeros((len(bboxes), height, width), dtype=np.uint8)\n    y, x = np.ogrid[:height, :width]\n\n    for i, (x_min, y_min, x_max, y_max) in enumerate(bboxes[:, :4].astype(int)):\n        masks[i] = (x_min <= x) & (x < x_max) & (y_min <= y) & (y < y_max)\n\n    return masks\n
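
A minimal sketch; note that the boxes are cast to int inside the function, so pixel coordinates are assumed here:

Python
import numpy as np
from albumentations.core.bbox_utils import masks_from_bboxes

bboxes = np.array([[30, 10, 40, 20]])  # pixel [x_min, y_min, x_max, y_max]
masks = masks_from_bboxes(bboxes, {"height": 100, "width": 100})
print(masks.shape)     # (1, 100, 100)
print(masks[0].sum())  # 100 -> a filled 10x10 rectangle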
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.normalize_bboxes","title":"def normalize_bboxes (bboxes, shape) [view source on GitHub]","text":"

Normalize array of bounding boxes.

Parameters:

Name Type Description bboxes np.ndarray

Denormalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

shape ShapeType | tuple[int, int]

Image shape (height, width).

Returns:

Type Description np.ndarray

Normalized bounding boxes [(x_min, y_min, x_max, y_max, ...)].

Source code in albumentations/core/bbox_utils.py Python
@handle_empty_array(\"bboxes\")\ndef normalize_bboxes(bboxes: np.ndarray, shape: ShapeType | tuple[int, int]) -> np.ndarray:\n    \"\"\"Normalize array of bounding boxes.\n\n    Args:\n        bboxes: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n        shape: Image shape `(height, width)`.\n\n    Returns:\n        Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.\n\n    \"\"\"\n    if isinstance(shape, tuple):\n        rows, cols = shape[:2]\n    else:\n        rows, cols = shape[\"height\"], shape[\"width\"]\n\n    normalized = bboxes.copy().astype(float)\n    normalized[:, [0, 2]] /= cols\n    normalized[:, [1, 3]] /= rows\n    return normalized\n
"},{"location":"api_reference/core/bbox_utils/#albumentations.core.bbox_utils.union_of_bboxes","title":"def union_of_bboxes (bboxes, erosion_rate) [view source on GitHub]","text":"

Calculate the union of bounding boxes. Boxes can be in albumentations or Pascal VOC format.

Parameters:

Name Type Description bboxes np.ndarray

List of bounding boxes

erosion_rate float

How much each bounding box can be shrunk, useful for erosive cropping. Set this in the range [0, 1]: 0 applies no erosion, while 1.0 can make any bbox lose its area entirely.

Returns:

Type Description np.ndarray | None

A bounding box (x_min, y_min, x_max, y_max) or None if no bboxes are given or if the bounding boxes become invalid after erosion.

Source code in albumentations/core/bbox_utils.py Python
def union_of_bboxes(bboxes: np.ndarray, erosion_rate: float) -> np.ndarray | None:\n    \"\"\"Calculate union of bounding boxes. Boxes could be in albumentations or Pascal Voc format.\n\n    Args:\n        bboxes (np.ndarray): List of bounding boxes\n        erosion_rate (float): How much each bounding box can be shrunk, useful for erosive cropping.\n            Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume.\n\n    Returns:\n        np.ndarray | None: A bounding box `(x_min, y_min, x_max, y_max)` or None if no bboxes are given or if\n                    the bounding boxes become invalid after erosion.\n    \"\"\"\n    if not bboxes.size:\n        return None\n\n    if erosion_rate == 1:\n        return None\n\n    if bboxes.shape[0] == 1:\n        return bboxes[0][:4]\n\n    epsilon = 1e-6\n\n    x_min, y_min = np.min(bboxes[:, :2], axis=0)\n    x_max, y_max = np.max(bboxes[:, 2:4], axis=0)\n\n    width = x_max - x_min\n    height = y_max - y_min\n\n    erosion_x = width * erosion_rate * 0.5\n    erosion_y = height * erosion_rate * 0.5\n\n    x_min += erosion_x\n    y_min += erosion_y\n    x_max -= erosion_x\n    y_max -= erosion_y\n\n    if abs(x_max - x_min) < epsilon or abs(y_max - y_min) < epsilon:\n        return None\n\n    return np.array([x_min, y_min, x_max, y_max], dtype=np.float32)\n
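
A short sketch of both outcomes, a merged box and None under full erosion (values are synthetic):

Python
import numpy as np
from albumentations.core.bbox_utils import union_of_bboxes

bboxes = np.array([[0.1, 0.1, 0.4, 0.4], [0.3, 0.3, 0.8, 0.6]])
print(union_of_bboxes(bboxes, erosion_rate=0.0))  # approx. [0.1, 0.1, 0.8, 0.6]
print(union_of_bboxes(bboxes, erosion_rate=1.0))  # None -- full erosion removes the union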
"},{"location":"api_reference/core/composition/","title":"Composition API (core.composition)","text":""},{"location":"api_reference/core/composition/#albumentations.core.composition.BaseCompose","title":"class BaseCompose (transforms, p, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub]","text":"

Base class for composing multiple transforms together.

This class serves as a foundation for creating compositions of transforms in the Albumentations library. It provides basic functionality for managing a sequence of transforms and applying them to data.

Attributes:

Name Type Description transforms List[TransformType]

A list of transforms to be applied.

p float

Probability of applying the compose. Should be in the range [0, 1].

replay_mode bool

If True, the compose is in replay mode.

_additional_targets Dict[str, str]

Additional targets for transforms.

_available_keys Set[str]

Set of available keys for data.

processors Dict[str, Union[BboxProcessor, KeypointsProcessor]]

Processors for specific data types.

Parameters:

Name Type Description transforms TransformsSeqType

A sequence of transforms to compose.

p float

Probability of applying the compose.

Exceptions:

Type Description ValueError

If an invalid additional target is specified.

Note

  • Subclasses should implement the __call__ method to define how the composition is applied to data.
  • The class supports serialization and deserialization of transforms.
  • It provides methods for adding targets, setting deterministic behavior, and checking data validity post-transform.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class BaseCompose(Serializable):\n    \"\"\"Base class for composing multiple transforms together.\n\n    This class serves as a foundation for creating compositions of transforms\n    in the Albumentations library. It provides basic functionality for\n    managing a sequence of transforms and applying them to data.\n\n    Attributes:\n        transforms (List[TransformType]): A list of transforms to be applied.\n        p (float): Probability of applying the compose. Should be in the range [0, 1].\n        replay_mode (bool): If True, the compose is in replay mode.\n        _additional_targets (Dict[str, str]): Additional targets for transforms.\n        _available_keys (Set[str]): Set of available keys for data.\n        processors (Dict[str, Union[BboxProcessor, KeypointsProcessor]]): Processors for specific data types.\n\n    Args:\n        transforms (TransformsSeqType): A sequence of transforms to compose.\n        p (float): Probability of applying the compose.\n\n    Raises:\n        ValueError: If an invalid additional target is specified.\n\n    Note:\n        - Subclasses should implement the __call__ method to define how\n          the composition is applied to data.\n        - The class supports serialization and deserialization of transforms.\n        - It provides methods for adding targets, setting deterministic behavior,\n          and checking data validity post-transform.\n    \"\"\"\n\n    _transforms_dict: dict[int, BasicTransform] | None = None\n    check_each_transform: tuple[DataProcessor, ...] | None = None\n    main_compose: bool = True\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        p: float,\n        mask_interpolation: int | None = None,\n        seed: int | None = None,\n        save_applied_params: bool = False,\n    ):\n        if isinstance(transforms, (BaseCompose, BasicTransform)):\n            warnings.warn(\n                \"transforms is single transform, but a sequence is expected! 
Transform will be wrapped into list.\",\n                stacklevel=2,\n            )\n            transforms = [transforms]\n\n        self.transforms = transforms\n        self.p = p\n\n        self.replay_mode = False\n        self._additional_targets: dict[str, str] = {}\n        self._available_keys: set[str] = set()\n        self.processors: dict[str, BboxProcessor | KeypointsProcessor] = {}\n        self._set_keys()\n        self.set_mask_interpolation(mask_interpolation)\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n        self.set_random_seed(seed)\n        self.save_applied_params = save_applied_params\n\n    def _track_transform_params(self, transform: TransformType, data: dict[str, Any]) -> None:\n        \"\"\"Track transform parameters if tracking is enabled.\"\"\"\n        if \"applied_transforms\" in data and hasattr(transform, \"params\") and transform.params:\n            data[\"applied_transforms\"].append((transform.__class__.__name__, transform.params.copy()))\n\n    def set_random_state(\n        self,\n        random_generator: np.random.Generator,\n        py_random: random.Random,\n    ) -> None:\n        \"\"\"Set random state directly from generators.\n\n        Args:\n            random_generator: numpy random generator to use\n            py_random: python random generator to use\n        \"\"\"\n        self.random_generator = random_generator\n        self.py_random = py_random\n\n        # Propagate both random states to all transforms\n        for transform in self.transforms:\n            if isinstance(transform, (BasicTransform, BaseCompose)):\n                transform.set_random_state(random_generator, py_random)\n\n    def set_random_seed(self, seed: int | None) -> None:\n        \"\"\"Set random state from seed.\n\n        Args:\n            seed: Random seed to use\n        \"\"\"\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n\n        # Propagate seed to all transforms\n        for transform in self.transforms:\n            if isinstance(transform, (BasicTransform, BaseCompose)):\n                transform.set_random_seed(seed)\n\n    def set_mask_interpolation(self, mask_interpolation: int | None) -> None:\n        self.mask_interpolation = mask_interpolation\n        self._set_mask_interpolation_recursive(self.transforms)\n\n    def _set_mask_interpolation_recursive(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, BasicTransform):\n                if hasattr(transform, \"mask_interpolation\") and self.mask_interpolation is not None:\n                    transform.mask_interpolation = self.mask_interpolation\n            elif isinstance(transform, BaseCompose):\n                transform.set_mask_interpolation(self.mask_interpolation)\n\n    def __iter__(self) -> Iterator[TransformType]:\n        return iter(self.transforms)\n\n    def __len__(self) -> int:\n        return len(self.transforms)\n\n    def __call__(self, *args: Any, **data: Any) -> dict[str, Any]:\n        raise NotImplementedError\n\n    def __getitem__(self, item: int) -> TransformType:\n        return self.transforms[item]\n\n    def __repr__(self) -> str:\n        return self.indented_repr()\n\n    @property\n    def additional_targets(self) -> dict[str, str]:\n        return self._additional_targets\n\n    @property\n    def 
available_keys(self) -> set[str]:\n        return self._available_keys\n\n    def indented_repr(self, indent: int = REPR_INDENT_STEP) -> str:\n        args = {k: v for k, v in self.to_dict_private().items() if not (k.startswith(\"__\") or k == \"transforms\")}\n        repr_string = self.__class__.__name__ + \"([\"\n        for t in self.transforms:\n            repr_string += \"\\n\"\n            t_repr = t.indented_repr(indent + REPR_INDENT_STEP) if hasattr(t, \"indented_repr\") else repr(t)\n            repr_string += \" \" * indent + t_repr + \",\"\n        repr_string += \"\\n\" + \" \" * (indent - REPR_INDENT_STEP) + f\"], {format_args(args)})\"\n        return repr_string\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    def to_dict_private(self) -> dict[str, Any]:\n        return {\n            \"__class_fullname__\": self.get_class_fullname(),\n            \"p\": self.p,\n            \"transforms\": [t.to_dict_private() for t in self.transforms],\n        }\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        return {\n            \"__class_fullname__\": self.get_class_fullname(),\n            \"id\": id(self),\n            \"params\": None,\n            \"transforms\": [t.get_dict_with_id() for t in self.transforms],\n        }\n\n    def add_targets(self, additional_targets: dict[str, str] | None) -> None:\n        if additional_targets:\n            for k, v in additional_targets.items():\n                if k in self._additional_targets and v != self._additional_targets[k]:\n                    raise ValueError(\n                        f\"Trying to overwrite existed additional targets. 
\"\n                        f\"Key={k} Exists={self._additional_targets[k]} New value: {v}\",\n                    )\n            self._additional_targets.update(additional_targets)\n            for t in self.transforms:\n                t.add_targets(additional_targets)\n            for proc in self.processors.values():\n                proc.add_targets(additional_targets)\n        self._set_keys()\n\n    def _set_keys(self) -> None:\n        \"\"\"Set _available_keys\"\"\"\n        self._available_keys.update(self._additional_targets.keys())\n        for t in self.transforms:\n            self._available_keys.update(t.available_keys)\n            if hasattr(t, \"targets_as_params\"):\n                self._available_keys.update(t.targets_as_params)\n        if self.processors:\n            self._available_keys.update([\"labels\"])\n            for proc in self.processors.values():\n                if proc.default_data_name not in self._available_keys:  # if no transform to process this data\n                    warnings.warn(\n                        f\"Got processor for {proc.default_data_name}, but no transform to process it.\",\n                        stacklevel=2,\n                    )\n                self._available_keys.update(proc.data_fields)\n                if proc.params.label_fields:\n                    self._available_keys.update(proc.params.label_fields)\n\n    def set_deterministic(self, flag: bool, save_key: str = \"replay\") -> None:\n        for t in self.transforms:\n            t.set_deterministic(flag, save_key)\n\n    def check_data_post_transform(self, data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Check and filter data after transformation.\n\n        Args:\n            data: Dictionary containing transformed data\n\n        Returns:\n            Filtered data dictionary\n        \"\"\"\n        if self.check_each_transform:\n            shape = get_shape(data)\n\n            for proc in self.check_each_transform:\n                for data_name, data_value in data.items():\n                    if data_name in proc.data_fields or (\n                        data_name in self._additional_targets\n                        and self._additional_targets[data_name] in proc.data_fields\n                    ):\n                        data[data_name] = proc.filter(data_value, shape)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.Compose","title":"class Compose (transforms, bbox_params=None, keypoint_params=None, additional_targets=None, p=1.0, is_check_shapes=True, strict=True, mask_interpolation=None, seed=None, save_applied_params=False) [view source on GitHub]","text":"

Compose multiple transforms together and apply them sequentially to input data.

This class allows you to chain multiple image augmentation transforms and apply them in a specified order. It also handles bounding box and keypoint transformations if the appropriate parameters are provided.

Parameters:

Name Type Description transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to apply.

bbox_params Union[dict, BboxParams, None]

Parameters for bounding box transforms. Can be a dict of params or a BboxParams object. Default is None.

keypoint_params Union[dict, KeypointParams, None]

Parameters for keypoint transforms. Can be a dict of params or a KeypointParams object. Default is None.

additional_targets Dict[str, str]

A dictionary mapping additional target names to their types. For example, {'image2': 'image'}. Default is None.

p float

Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.

is_check_shapes bool

If True, checks consistency of shapes for image/mask/masks on each call. Disable only if you are sure about your data consistency. Default is True.

strict bool

If True, raises an error on unknown input keys. If False, ignores them. Default is True.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

seed int

Random seed. Default is None.

save_applied_params bool

If True, saves the applied parameters of each transform. Default is False.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n...     A.RandomCrop(width=256, height=256),\n...     A.HorizontalFlip(p=0.5),\n...     A.RandomBrightnessContrast(p=0.2),\n... ])\n>>> transformed = transform(image=image)\n

Note

  • The class checks the validity of input data and shapes if is_check_args and is_check_shapes are True.
  • When bbox_params or keypoint_params are provided, it sets up the corresponding processors.
  • The transform can handle additional targets specified in the additional_targets dictionary.

Interactive Tool Available!

Explore this transform visually and adjust parameters interactively using this tool:

Open Tool

Source code in albumentations/core/composition.py Python
class Compose(BaseCompose, HubMixin):\n    \"\"\"Compose multiple transforms together and apply them sequentially to input data.\n\n    This class allows you to chain multiple image augmentation transforms and apply them\n    in a specified order. It also handles bounding box and keypoint transformations if\n    the appropriate parameters are provided.\n\n    Args:\n        transforms (List[Union[BasicTransform, BaseCompose]]): A list of transforms to apply.\n        bbox_params (Union[dict, BboxParams, None]): Parameters for bounding box transforms.\n            Can be a dict of params or a BboxParams object. Default is None.\n        keypoint_params (Union[dict, KeypointParams, None]): Parameters for keypoint transforms.\n            Can be a dict of params or a KeypointParams object. Default is None.\n        additional_targets (Dict[str, str], optional): A dictionary mapping additional target names\n            to their types. For example, {'image2': 'image'}. Default is None.\n        p (float): Probability of applying all transforms. Should be in range [0, 1]. Default is 1.0.\n        is_check_shapes (bool): If True, checks consistency of shapes for image/mask/masks on each call.\n            Disable only if you are sure about your data consistency. Default is True.\n        strict (bool): If True, raises an error on unknown input keys. If False, ignores them. Default is True.\n        mask_interpolation (int, optional): Interpolation method for mask transforms. When defined,\n            it overrides the interpolation method specified in individual transforms. Default is None.\n        seed (int, optional): Random seed. Default is None.\n        save_applied_params (bool): If True, saves the applied parameters of each transform. Default is False.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        ...     A.RandomCrop(width=256, height=256),\n        ...     A.HorizontalFlip(p=0.5),\n        ...     A.RandomBrightnessContrast(p=0.2),\n        ... 
])\n        >>> transformed = transform(image=image)\n\n    Note:\n        - The class checks the validity of input data and shapes if is_check_args and is_check_shapes are True.\n        - When bbox_params or keypoint_params are provided, it sets up the corresponding processors.\n        - The transform can handle additional targets specified in the additional_targets dictionary.\n    \"\"\"\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        bbox_params: dict[str, Any] | BboxParams | None = None,\n        keypoint_params: dict[str, Any] | KeypointParams | None = None,\n        additional_targets: dict[str, str] | None = None,\n        p: float = 1.0,\n        is_check_shapes: bool = True,\n        strict: bool = True,\n        mask_interpolation: int | None = None,\n        seed: int | None = None,\n        save_applied_params: bool = False,\n    ):\n        super().__init__(\n            transforms=transforms,\n            p=p,\n            mask_interpolation=mask_interpolation,\n            seed=seed,\n            save_applied_params=save_applied_params,\n        )\n\n        if bbox_params:\n            if isinstance(bbox_params, dict):\n                b_params = BboxParams(**bbox_params)\n            elif isinstance(bbox_params, BboxParams):\n                b_params = bbox_params\n            else:\n                msg = \"unknown format of bbox_params, please use `dict` or `BboxParams`\"\n                raise ValueError(msg)\n            self.processors[\"bboxes\"] = BboxProcessor(b_params)\n\n        if keypoint_params:\n            if isinstance(keypoint_params, dict):\n                k_params = KeypointParams(**keypoint_params)\n            elif isinstance(keypoint_params, KeypointParams):\n                k_params = keypoint_params\n            else:\n                msg = \"unknown format of keypoint_params, please use `dict` or `KeypointParams`\"\n                raise ValueError(msg)\n            self.processors[\"keypoints\"] = KeypointsProcessor(k_params)\n\n        for proc in self.processors.values():\n            proc.ensure_transforms_valid(self.transforms)\n\n        self.add_targets(additional_targets)\n        if not self.transforms:  # if no transforms -> do nothing, all keys will be available\n            self._available_keys.update(AVAILABLE_KEYS)\n\n        self.is_check_args = True\n        self.strict = strict\n\n        self.is_check_shapes = is_check_shapes\n        self.check_each_transform = tuple(  # processors that checks after each transform\n            proc for proc in self.processors.values() if getattr(proc.params, \"check_each_transform\", False)\n        )\n        self._set_check_args_for_transforms(self.transforms)\n\n        self._set_processors_for_transforms(self.transforms)\n\n        self.save_applied_params = save_applied_params\n        self._images_was_list = False\n        self._masks_was_list = False\n\n    def _set_processors_for_transforms(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, BasicTransform):\n                if hasattr(transform, \"set_processors\"):\n                    transform.set_processors(self.processors)\n            elif isinstance(transform, BaseCompose):\n                self._set_processors_for_transforms(transform.transforms)\n\n    def _set_check_args_for_transforms(self, transforms: TransformsSeqType) -> None:\n        for transform in transforms:\n            if isinstance(transform, 
BaseCompose):\n                self._set_check_args_for_transforms(transform.transforms)\n                transform.check_each_transform = self.check_each_transform\n                transform.processors = self.processors\n            if isinstance(transform, Compose):\n                transform.disable_check_args_private()\n\n    def disable_check_args_private(self) -> None:\n        self.is_check_args = False\n        self.strict = False\n        self.main_compose = False\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if args:\n            msg = \"You have to pass data to augmentations as named arguments, for example: aug(image=image)\"\n            raise KeyError(msg)\n\n        if not isinstance(force_apply, (bool, int)):\n            msg = \"force_apply must have bool or int type\"\n            raise TypeError(msg)\n\n        # Initialize applied_transforms only in top-level Compose if requested\n        if self.save_applied_params and self.main_compose:\n            data[\"applied_transforms\"] = []\n\n        need_to_run = force_apply or self.py_random.random() < self.p\n        if not need_to_run:\n            return data\n\n        self.preprocess(data)\n\n        for t in self.transforms:\n            data = t(**data)\n            self._track_transform_params(t, data)\n            data = self.check_data_post_transform(data)\n\n        return self.postprocess(data)\n\n    def preprocess(self, data: Any) -> None:\n        \"\"\"Preprocess input data before applying transforms.\"\"\"\n        self._validate_data(data)\n        self._preprocess_processors(data)\n        self._preprocess_arrays(data)\n\n    def _validate_data(self, data: dict[str, Any]) -> None:\n        \"\"\"Validate input data keys and arguments.\"\"\"\n        if not self.strict:\n            return\n\n        for data_name in data:\n            if not self._is_valid_key(data_name):\n                raise ValueError(f\"Key {data_name} is not in available keys.\")\n\n        if self.is_check_args:\n            self._check_args(**data)\n\n    def _is_valid_key(self, key: str) -> bool:\n        \"\"\"Check if the key is valid for processing.\"\"\"\n        return key in self._available_keys or key in MASK_KEYS or key in IMAGE_KEYS or key == \"applied_transforms\"\n\n    def _preprocess_processors(self, data: dict[str, Any]) -> None:\n        \"\"\"Run preprocessors if this is the main compose.\"\"\"\n        if not self.main_compose:\n            return\n\n        for processor in self.processors.values():\n            processor.ensure_data_valid(data)\n        for processor in self.processors.values():\n            processor.preprocess(data)\n\n    def _preprocess_arrays(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert lists to numpy arrays for images and masks.\"\"\"\n        self._preprocess_images(data)\n        self._preprocess_masks(data)\n\n    def _preprocess_images(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert image lists to numpy arrays.\"\"\"\n        if \"images\" not in data:\n            return\n\n        if isinstance(data[\"images\"], (list, tuple)):\n            self._images_was_list = True\n            data[\"images\"] = np.stack(data[\"images\"])\n        else:\n            self._images_was_list = False\n\n    def _preprocess_masks(self, data: dict[str, Any]) -> None:\n        \"\"\"Convert mask lists to numpy arrays.\"\"\"\n        if \"masks\" not in data:\n            return\n\n        if 
isinstance(data[\"masks\"], (list, tuple)):\n            self._masks_was_list = True\n            data[\"masks\"] = np.stack(data[\"masks\"])\n        else:\n            self._masks_was_list = False\n\n    def postprocess(self, data: dict[str, Any]) -> dict[str, Any]:\n        if self.main_compose:\n            for p in self.processors.values():\n                p.postprocess(data)\n\n            # Convert back to list if original input was a list\n            if \"images\" in data and self._images_was_list:\n                data[\"images\"] = list(data[\"images\"])\n\n            if \"masks\" in data and self._masks_was_list:\n                data[\"masks\"] = list(data[\"masks\"])\n\n        return data\n\n    def to_dict_private(self) -> dict[str, Any]:\n        dictionary = super().to_dict_private()\n        bbox_processor = self.processors.get(\"bboxes\")\n        keypoints_processor = self.processors.get(\"keypoints\")\n        dictionary.update(\n            {\n                \"bbox_params\": bbox_processor.params.to_dict_private() if bbox_processor else None,\n                \"keypoint_params\": (keypoints_processor.params.to_dict_private() if keypoints_processor else None),\n                \"additional_targets\": self.additional_targets,\n                \"is_check_shapes\": self.is_check_shapes,\n            },\n        )\n        return dictionary\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        dictionary = super().get_dict_with_id()\n        bbox_processor = self.processors.get(\"bboxes\")\n        keypoints_processor = self.processors.get(\"keypoints\")\n        dictionary.update(\n            {\n                \"bbox_params\": bbox_processor.params.to_dict_private() if bbox_processor else None,\n                \"keypoint_params\": (keypoints_processor.params.to_dict_private() if keypoints_processor else None),\n                \"additional_targets\": self.additional_targets,\n                \"params\": None,\n                \"is_check_shapes\": self.is_check_shapes,\n            },\n        )\n        return dictionary\n\n    @staticmethod\n    def _check_single_data(data_name: str, data: Any) -> tuple[int, int]:\n        if not isinstance(data, np.ndarray):\n            raise TypeError(f\"{data_name} must be numpy array type\")\n        return data.shape[:2]\n\n    @staticmethod\n    def _check_masks_data(data_name: str, data: Any) -> tuple[int, int]:\n        \"\"\"Check masks data format and return shape.\n\n        Args:\n            data_name: Name of the data field being checked\n            data: Input data in one of these formats:\n                - List of numpy arrays, each of shape (H, W) or (H, W, C)\n                - Numpy array of shape (N, H, W) or (N, H, W, C)\n\n        Returns:\n            tuple: (height, width) of the first mask\n\n        Raises:\n            TypeError: If data format is invalid\n        \"\"\"\n        if isinstance(data, np.ndarray):\n            if data.ndim not in [3, 4]:  # (N,H,W) or (N,H,W,C)\n                raise TypeError(f\"{data_name} as numpy array must be 3D or 4D\")\n            return data.shape[1:3]  # Return (H,W)\n\n        if isinstance(data, (list, tuple)):\n            if not data:\n                raise ValueError(f\"{data_name} cannot be empty\")\n            if not all(isinstance(m, np.ndarray) for m in data):\n                raise TypeError(f\"All elements in {data_name} must be numpy arrays\")\n            if any(m.ndim not in [2, 3] for m in data):\n                raise TypeError(f\"All 
masks in {data_name} must be 2D or 3D numpy arrays\")\n            return data[0].shape[:2]\n\n        raise TypeError(f\"{data_name} must be either a numpy array or a sequence of numpy arrays\")\n\n    @staticmethod\n    def _check_multi_data(data_name: str, data: Any) -> tuple[int, int]:\n        \"\"\"Check multi-image data format and return shape.\n\n        Args:\n            data_name: Name of the data field being checked\n            data: Input data in one of these formats:\n                - List-like of numpy arrays\n                - Numpy array of shape (N, H, W, C) or (N, H, W)\n\n        Returns:\n            tuple: (height, width) of the first image\n\n        Raises:\n            TypeError: If data format is invalid\n        \"\"\"\n        if isinstance(data, np.ndarray):\n            if data.ndim not in {3, 4}:  # (N,H,W) or (N,H,W,C)\n                raise TypeError(f\"{data_name} as numpy array must be 3D or 4D\")\n            return data.shape[1:3]  # Return (H,W)\n\n        if not isinstance(data, Sequence) or not isinstance(data[0], np.ndarray):\n            raise TypeError(f\"{data_name} must be either a numpy array or a list of numpy arrays\")\n        return data[0].shape[:2]\n\n    @staticmethod\n    def _check_bbox_keypoint_params(internal_data_name: str, processors: dict[str, Any]) -> None:\n        if internal_data_name in CHECK_BBOX_PARAM and processors.get(\"bboxes\") is None:\n            raise ValueError(\"bbox_params must be specified for bbox transformations\")\n        if internal_data_name in CHECK_KEYPOINTS_PARAM and processors.get(\"keypoints\") is None:\n            raise ValueError(\"keypoints_params must be specified for keypoint transformations\")\n\n    @staticmethod\n    def _check_shapes(shapes: list[tuple[int, ...]], is_check_shapes: bool) -> None:\n        if is_check_shapes and shapes and shapes.count(shapes[0]) != len(shapes):\n            raise ValueError(\n                \"Height and Width of image, mask or masks should be equal. 
You can disable shapes check \"\n                \"by setting a parameter is_check_shapes=False of Compose class (do it only if you are sure \"\n                \"about your data consistency).\",\n            )\n\n    def _check_args(self, **kwargs: Any) -> None:\n        shapes = []  # For H,W checks\n        volume_shapes = []  # For D,H,W checks\n\n        for data_name, data in kwargs.items():\n            internal_name = self._additional_targets.get(data_name, data_name)\n\n            # For CHECKED_SINGLE, we must validate even if None\n            if internal_name in CHECKED_SINGLE:\n                if not isinstance(data, np.ndarray):\n                    raise TypeError(f\"{data_name} must be numpy array type\")\n                shapes.append(data.shape[:2])\n                continue\n\n            # Skip empty data or non-array/list inputs for other types\n            if data is None:\n                continue\n            if not isinstance(data, (np.ndarray, list)):\n                continue\n\n            self._check_bbox_keypoint_params(internal_name, self.processors)\n\n            shape = self._get_data_shape(data_name, internal_name, data)\n            if shape is None:\n                continue\n\n            # Handle different shape types\n            if internal_name in CHECKED_VOLUME | CHECKED_MASK3D:\n                shapes.append(shape[1:3])  # H,W from (D,H,W)\n                volume_shapes.append(shape[:3])  # D,H,W\n            elif internal_name in {\"volumes\", \"masks3d\"}:\n                shapes.append(shape[2:4])  # H,W from (N,D,H,W)\n                volume_shapes.append(shape[1:4])  # D,H,W from (N,D,H,W)\n            else:\n                shapes.append(shape[:2])  # H,W\n\n        self._check_shape_consistency(shapes, volume_shapes)\n\n    def _get_data_shape(self, data_name: str, internal_name: str, data: Any) -> tuple[int, ...] 
| None:\n        \"\"\"Get shape of data based on its type.\"\"\"\n        if internal_name in CHECKED_SINGLE:\n            if not isinstance(data, np.ndarray):\n                raise TypeError(f\"{data_name} must be numpy array type\")\n            return data.shape\n\n        if internal_name in CHECKED_VOLUME:\n            return self._check_volume_data(data_name, data)\n\n        if internal_name in CHECKED_MASK3D:\n            return self._check_mask3d_data(data_name, data)\n\n        if internal_name in CHECKED_MULTI:\n            if internal_name == \"masks\":\n                return self._check_masks_data(data_name, data)\n            if internal_name in {\"volumes\", \"masks3d\"}:  # Group these together\n                if not isinstance(data, np.ndarray):\n                    raise TypeError(f\"{data_name} must be numpy array type\")\n                if data.ndim not in {4, 5}:  # (N,D,H,W) or (N,D,H,W,C)\n                    raise TypeError(f\"{data_name} must be 4D or 5D array\")\n                return data.shape  # Return full shape\n            return self._check_multi_data(data_name, data)\n\n        return None\n\n    def _check_shape_consistency(self, shapes: list[tuple[int, ...]], volume_shapes: list[tuple[int, ...]]) -> None:\n        \"\"\"Check consistency of shapes.\"\"\"\n        # Check H,W consistency\n        self._check_shapes(shapes, self.is_check_shapes)\n\n        # Check D,H,W consistency for volumes and 3D masks\n        if self.is_check_shapes and volume_shapes and volume_shapes.count(volume_shapes[0]) != len(volume_shapes):\n            raise ValueError(\n                \"Depth, Height and Width of volume, mask3d, volumes and masks3d should be equal. \"\n                \"You can disable shapes check by setting is_check_shapes=False.\",\n            )\n\n    @staticmethod\n    def _check_volume_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        if data.ndim not in {3, 4}:  # (D,H,W) or (D,H,W,C)\n            raise TypeError(f\"{data_name} must be 3D or 4D array\")\n        return data.shape[:3]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_volumes_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        if data.ndim not in {4, 5}:  # (N,D,H,W) or (N,D,H,W,C)\n            raise TypeError(f\"{data_name} must be 4D or 5D array\")\n        return data.shape[1:4]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_mask3d_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        \"\"\"Check single volumetric mask data format and return shape.\"\"\"\n        if data.ndim not in {3, 4}:  # (D,H,W) or (D,H,W,C)\n            raise TypeError(f\"{data_name} must be 3D or 4D array\")\n        return data.shape[:3]  # Return (D,H,W)\n\n    @staticmethod\n    def _check_masks3d_data(data_name: str, data: np.ndarray) -> tuple[int, int, int]:\n        \"\"\"Check multiple volumetric masks data format and return shape.\"\"\"\n        if data.ndim not in [4, 5]:  # (N,D,H,W) or (N,D,H,W,C)\n            raise TypeError(f\"{data_name} must be 4D or 5D array\")\n        return data.shape[1:4]  # Return (D,H,W)\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.OneOf","title":"class OneOf (transforms, p=0.5) [view source on GitHub]","text":"

Select one of the given transforms to apply. The selected transform will be called with force_apply=True. Transform probabilities will be normalized to sum to 1, so in this case they act as selection weights.

Parameters:

Name Type Description transforms list

list of transformations to compose.

p float

probability of applying selected transform. Default: 0.5.
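
A minimal usage sketch (illustrative, not part of the original docstring): the per-transform p values (0.75 and 0.25 below) are normalized and used as selection weights, while the outer p controls whether any of them is applied.

Python
>>> import numpy as np
>>> import albumentations as A
>>> transform = A.Compose([
...     A.OneOf([
...         A.HorizontalFlip(p=0.75),  # picked roughly 3x more often than VerticalFlip
...         A.VerticalFlip(p=0.25),
...     ], p=0.5),
... ])
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> augmented = transform(image=image)["image"]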

Source code in albumentations/core/composition.py Python
class OneOf(BaseCompose):\n    \"\"\"Select one of transforms to apply. Selected transform will be called with `force_apply=True`.\n    Transforms probabilities will be normalized to one 1, so in this case transforms probabilities works as weights.\n\n    Args:\n        transforms (list): list of transformations to compose.\n        p (float): probability of applying selected transform. Default: 0.5.\n\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, p: float = 0.5):\n        super().__init__(transforms=transforms, p=p)\n        transforms_ps = [t.p for t in self.transforms]\n        s = sum(transforms_ps)\n        self.transforms_ps = [t / s for t in transforms_ps]\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n            return data\n\n        if self.transforms_ps and (force_apply or self.py_random.random() < self.p):\n            idx: int = self.random_generator.choice(len(self.transforms), p=self.transforms_ps)\n            t = self.transforms[idx]\n            data = t(force_apply=True, **data)\n            self._track_transform_params(t, data)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.OneOrOther","title":"class OneOrOther (first=None, second=None, transforms=None, p=0.5) [view source on GitHub]","text":"

Select one or another transform to apply. Selected transform will be called with force_apply=True.
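
A usage sketch (illustrative): with probability p the first transform is applied, otherwise the second one is.

Python
>>> import numpy as np
>>> import albumentations as A
>>> transform = A.Compose([
...     A.OneOrOther(
...         first=A.HorizontalFlip(p=1),
...         second=A.VerticalFlip(p=1),
...         p=0.5,
...     ),
... ])
>>> image = np.zeros((100, 100, 3), dtype=np.uint8)
>>> augmented = transform(image=image)["image"]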

Source code in albumentations/core/composition.py Python
class OneOrOther(BaseCompose):\n    \"\"\"Select one or another transform to apply. Selected transform will be called with `force_apply=True`.\"\"\"\n\n    def __init__(\n        self,\n        first: TransformType | None = None,\n        second: TransformType | None = None,\n        transforms: TransformsSeqType | None = None,\n        p: float = 0.5,\n    ):\n        if transforms is None:\n            if first is None or second is None:\n                msg = \"You must set both first and second or set transforms argument.\"\n                raise ValueError(msg)\n            transforms = [first, second]\n        super().__init__(transforms, p)\n        if len(self.transforms) != NUM_ONEOF_TRANSFORMS:\n            warnings.warn(\"Length of transforms is not equal to 2.\", stacklevel=2)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n                self._track_transform_params(t, data)\n            return data\n\n        if self.py_random.random() < self.p:\n            return self.transforms[0](force_apply=True, **data)\n\n        return self.transforms[-1](force_apply=True, **data)\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.RandomOrder","title":"class RandomOrder (transforms, n=1, replace=False, p=1) [view source on GitHub]","text":"

Apply a random subset of transforms from the given list in a random order.

The RandomOrder class allows you to select a specified number of transforms from a list and apply them to the input data in a random order. This is useful for creating more diverse augmentation pipelines where the order of transformations can vary, potentially leading to different results.

Attributes:

Name Type Description transforms TransformsSeqType

A list of transformations to choose from.

n int

The number of transforms to apply. If n is greater than the number of available transforms and replace is False, n will be set to the number of available transforms.

replace bool

Whether to sample transforms with replacement. If True, the same transform can be selected multiple times. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.RandomOrder([\n...     A.HorizontalFlip(p=1),\n...     A.VerticalFlip(p=1),\n...     A.RandomBrightnessContrast(p=1),\n... ], n=2, replace=False, p=0.5)\n>>> # This will apply 2 out of the 3 transforms in a random order with 50% probability\n

Note

  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.
  • The random order of transforms will not be replayed in ReplayCompose.

Source code in albumentations/core/composition.py Python
class RandomOrder(SomeOf):\n    \"\"\"Apply a random subset of transforms from the given list in a random order.\n\n    The `RandomOrder` class allows you to select a specified number of transforms from a list and apply them\n    to the input data in a random order. This is useful for creating more diverse augmentation pipelines\n    where the order of transformations can vary, potentially leading to different results.\n\n    Attributes:\n        transforms (TransformsSeqType): A list of transformations to choose from.\n        n (int): The number of transforms to apply. If `n` is greater than the number of available transforms\n                 and `replace` is False, `n` will be set to the number of available transforms.\n        replace (bool): Whether to sample transforms with replacement. If True, the same transform can be\n                        selected multiple times. Default is False.\n        p (float): Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.RandomOrder([\n        ...     A.HorizontalFlip(p=1),\n        ...     A.VerticalFlip(p=1),\n        ...     A.RandomBrightnessContrast(p=1),\n        ... ], n=2, replace=False, p=0.5)\n        >>> # This will apply 2 out of the 3 transforms in a random order with 50% probability\n\n    Note:\n        - The probabilities of individual transforms are used as weights for sampling.\n        - When `replace` is True, the same transform can be selected multiple times.\n        - The random order of transforms will not be replayed in `ReplayCompose`.\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, n: int = 1, replace: bool = False, p: float = 1):\n        super().__init__(transforms=transforms, n=n, replace=replace, p=p)\n\n    def _get_idx(self) -> np.ndarray[np.int_]:\n        return self.random_generator.choice(\n            len(self.transforms),\n            size=self.n,\n            replace=self.replace,\n            p=self.transforms_ps,\n        )\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.SelectiveChannelTransform","title":"class SelectiveChannelTransform (transforms, channels=(0, 1, 2), p=1.0) [view source on GitHub]","text":"

A transformation class to apply specified transforms to selected channels of an image.

This class extends BaseCompose to allow selective application of transformations to specified image channels. It extracts the selected channels, applies the transformations, and then reinserts the transformed channels back into their original positions in the image.

Parameters:

Name Type Description transforms TransformsSeqType

A sequence of transformations (from Albumentations) to be applied to the specified channels.

channels Sequence[int]

A sequence of integers specifying the indices of the channels to which the transforms should be applied.

p float

Probability that the transform will be applied; the default is 1.0 (always apply).

Methods

__call__(*args, **kwargs): Applies the transforms to the image according to the specified channels. The input data should include the 'image' key with the image array.

Returns:

Type Description dict[str, Any]

The transformed data dictionary, which includes the transformed 'image' key.
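
A usage sketch (illustrative): apply a pixel-level transform only to the first and last channels of an RGB image, leaving the middle channel untouched.

Python
>>> import numpy as np
>>> import albumentations as A
>>> transform = A.Compose([
...     A.SelectiveChannelTransform(
...         transforms=[A.RandomBrightnessContrast(p=1)],
...         channels=[0, 2],
...         p=1.0,
...     ),
... ])
>>> image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
>>> augmented = transform(image=image)["image"]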

Source code in albumentations/core/composition.py Python
class SelectiveChannelTransform(BaseCompose):\n    \"\"\"A transformation class to apply specified transforms to selected channels of an image.\n\n    This class extends BaseCompose to allow selective application of transformations to\n    specified image channels. It extracts the selected channels, applies the transformations,\n    and then reinserts the transformed channels back into their original positions in the image.\n\n    Parameters:\n        transforms (TransformsSeqType):\n            A sequence of transformations (from Albumentations) to be applied to the specified channels.\n        channels (Sequence[int]):\n            A sequence of integers specifying the indices of the channels to which the transforms should be applied.\n        p (float):\n            Probability that the transform will be applied; the default is 1.0 (always apply).\n\n    Methods:\n        __call__(*args, **kwargs):\n            Applies the transforms to the image according to the specified channels.\n            The input data should include 'image' key with the image array.\n\n    Returns:\n        dict[str, Any]: The transformed data dictionary, which includes the transformed 'image' key.\n    \"\"\"\n\n    def __init__(\n        self,\n        transforms: TransformsSeqType,\n        channels: Sequence[int] = (0, 1, 2),\n        p: float = 1.0,\n    ) -> None:\n        super().__init__(transforms, p)\n        self.channels = channels\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if force_apply or self.py_random.random() < self.p:\n            image = data[\"image\"]\n\n            selected_channels = image[:, :, self.channels]\n            sub_image = np.ascontiguousarray(selected_channels)\n\n            for t in self.transforms:\n                sub_image = t(image=sub_image)[\"image\"]\n                self._track_transform_params(t, sub_image)\n\n            transformed_channels = cv2.split(sub_image)\n            output_img = image.copy()\n\n            for idx, channel in zip(self.channels, transformed_channels):\n                output_img[:, :, idx] = channel\n\n            data[\"image\"] = np.ascontiguousarray(output_img)\n\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.Sequential","title":"class Sequential (transforms, p=0.5) [view source on GitHub]","text":"

Sequentially applies all transforms to targets.

Note

This transform is not intended to be a replacement for Compose. Instead, it should be used inside Compose the same way OneOf or OneOrOther are used. For instance, you can combine OneOf with Sequential to create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly chosen sequence to the input data (see the Examples section for an example definition of such a pipeline).

Examples:

Python
>>> import albumentations as A\n>>> transform = A.Compose([\n>>>    A.OneOf([\n>>>        A.Sequential([\n>>>            A.HorizontalFlip(p=0.5),\n>>>            A.ShiftScaleRotate(p=0.5),\n>>>        ]),\n>>>        A.Sequential([\n>>>            A.VerticalFlip(p=0.5),\n>>>            A.RandomBrightnessContrast(p=0.5),\n>>>        ]),\n>>>    ], p=1)\n>>> ])\n

Source code in albumentations/core/composition.py Python
class Sequential(BaseCompose):\n    \"\"\"Sequentially applies all transforms to targets.\n\n    Note:\n        This transform is not intended to be a replacement for `Compose`. Instead, it should be used inside `Compose`\n        the same way `OneOf` or `OneOrOther` are used. For instance, you can combine `OneOf` with `Sequential` to\n        create an augmentation pipeline that contains multiple sequences of augmentations and applies one randomly\n        chose sequence to input data (see the `Example` section for an example definition of such pipeline).\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.Compose([\n        >>>    A.OneOf([\n        >>>        A.Sequential([\n        >>>            A.HorizontalFlip(p=0.5),\n        >>>            A.ShiftScaleRotate(p=0.5),\n        >>>        ]),\n        >>>        A.Sequential([\n        >>>            A.VerticalFlip(p=0.5),\n        >>>            A.RandomBrightnessContrast(p=0.5),\n        >>>        ]),\n        >>>    ], p=1)\n        >>> ])\n\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, p: float = 0.5):\n        super().__init__(transforms, p)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode or force_apply or self.py_random.random() < self.p:\n            for t in self.transforms:\n                data = t(**data)\n                self._track_transform_params(t, data)\n                data = self.check_data_post_transform(data)\n        return data\n
"},{"location":"api_reference/core/composition/#albumentations.core.composition.SomeOf","title":"class SomeOf (transforms, n=1, replace=False, p=1) [view source on GitHub]","text":"

Apply a random subset of transforms from the given list.

This class selects a specified number of transforms from the provided list and applies them to the input data. The selection can be done with or without replacement, allowing for the same transform to be potentially applied multiple times.

Parameters:

Name Type Description transforms List[Union[BasicTransform, BaseCompose]]

A list of transforms to choose from.

n int

The number of transforms to apply. If greater than the number of transforms and replace=False, it will be set to the number of transforms.

replace bool

Whether to sample transforms with replacement. Default is False.

p float

Probability of applying the selected transforms. Should be in the range [0, 1]. Default is 1.0.

mask_interpolation int

Interpolation method for mask transforms. When defined, it overrides the interpolation method specified in individual transforms. Default is None.

Note

  • If n is greater than the number of transforms and replace is False, n will be set to the number of transforms with a warning.
  • The probabilities of individual transforms are used as weights for sampling.
  • When replace is True, the same transform can be selected multiple times.

Examples:

Python
>>> import albumentations as A\n>>> transform = A.SomeOf([\n...     A.HorizontalFlip(p=1),\n...     A.VerticalFlip(p=1),\n...     A.RandomBrightnessContrast(p=1),\n... ], n=2, replace=False, p=0.5)\n>>> # This will apply 2 out of the 3 transforms with 50% probability\n

Source code in albumentations/core/composition.py Python
class SomeOf(BaseCompose):\n    \"\"\"Apply a random subset of transforms from the given list.\n\n    This class selects a specified number of transforms from the provided list\n    and applies them to the input data. The selection can be done with or without\n    replacement, allowing for the same transform to be potentially applied multiple times.\n\n    Args:\n        transforms (List[Union[BasicTransform, BaseCompose]]): A list of transforms to choose from.\n        n (int): The number of transforms to apply. If greater than the number of\n                 transforms and replace=False, it will be set to the number of transforms.\n        replace (bool): Whether to sample transforms with replacement. Default is True.\n        p (float): Probability of applying the selected transforms. Should be in the range [0, 1].\n                   Default is 1.0.\n        mask_interpolation (int, optional): Interpolation method for mask transforms.\n                                            When defined, it overrides the interpolation method\n                                            specified in individual transforms. Default is None.\n\n    Note:\n        - If `n` is greater than the number of transforms and `replace` is False,\n          `n` will be set to the number of transforms with a warning.\n        - The probabilities of individual transforms are used as weights for sampling.\n        - When `replace` is True, the same transform can be selected multiple times.\n\n    Example:\n        >>> import albumentations as A\n        >>> transform = A.SomeOf([\n        ...     A.HorizontalFlip(p=1),\n        ...     A.VerticalFlip(p=1),\n        ...     A.RandomBrightnessContrast(p=1),\n        ... ], n=2, replace=False, p=0.5)\n        >>> # This will apply 2 out of the 3 transforms with 50% probability\n    \"\"\"\n\n    def __init__(self, transforms: TransformsSeqType, n: int = 1, replace: bool = False, p: float = 1):\n        super().__init__(transforms, p)\n        self.n = n\n        if not replace and n > len(self.transforms):\n            self.n = len(self.transforms)\n            warnings.warn(\n                f\"`n` is greater than number of transforms. 
`n` will be set to {self.n}.\",\n                UserWarning,\n                stacklevel=2,\n            )\n        self.replace = replace\n        transforms_ps = [t.p for t in self.transforms]\n        s = sum(transforms_ps)\n        self.transforms_ps = [t / s for t in transforms_ps]\n\n    def __call__(self, *arg: Any, force_apply: bool = False, **data: Any) -> dict[str, Any]:\n        if self.replay_mode:\n            for t in self.transforms:\n                data = t(**data)\n                data = self.check_data_post_transform(data)\n            return data\n\n        if self.transforms_ps and (force_apply or self.py_random.random() < self.p):\n            for i in self._get_idx():\n                t = self.transforms[i]\n                data = t(force_apply=True, **data)\n                self._track_transform_params(t, data)\n                data = self.check_data_post_transform(data)\n        return data\n\n    def _get_idx(self) -> np.ndarray[np.int_]:\n        idx = self.random_generator.choice(\n            len(self.transforms),\n            size=self.n,\n            replace=self.replace,\n            p=self.transforms_ps,\n        )\n        idx.sort()\n        return idx\n\n    def to_dict_private(self) -> dict[str, Any]:\n        dictionary = super().to_dict_private()\n        dictionary.update({\"n\": self.n, \"replace\": self.replace})\n        return dictionary\n
"},{"location":"api_reference/core/keypoints_utils/","title":"Helper functions for working with keypoints (augmentations.core.keypoints_utils)","text":""},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.KeypointParams","title":"class KeypointParams (format, label_fields=None, remove_invisible=True, angle_in_degrees=True, check_each_transform=True) [view source on GitHub]","text":"

Parameters of keypoints

Parameters:

Name Type Description format str

format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa', 'xyz'.

x - X coordinate,

y - Y coordinate

z - Z coordinate (for 3D keypoints)

s - Keypoint scale

a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)

label_fields list

list of fields that are joined with keypoints, e.g. labels. Should be the same type as keypoints.

remove_invisible bool

whether to remove invisible points after the transform

angle_in_degrees bool

whether the angle is given in degrees or radians for 'xya', 'xyas', 'xysa' keypoints

check_each_transform bool

if True, then keypoints will be checked after each dual transform. Default: True

Note

The internal Albumentations format is [x, y, z, angle, scale]. For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0. For formats without angle or scale, these values are set to 0.
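
A usage sketch (illustrative): declare keypoints in 'xy' format together with a label field, so labels stay aligned with the keypoints that survive filtering.

Python
>>> import numpy as np
>>> import albumentations as A
>>> transform = A.Compose(
...     [A.HorizontalFlip(p=1)],
...     keypoint_params=A.KeypointParams(format="xy", label_fields=["keypoint_labels"]),
... )
>>> out = transform(
...     image=np.zeros((50, 50, 3), dtype=np.uint8),
...     keypoints=[(10.0, 20.0), (35.0, 5.0)],
...     keypoint_labels=["nose", "ear"],
... )
>>> out["keypoints"], out["keypoint_labels"]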

Source code in albumentations/core/keypoints_utils.py Python
class KeypointParams(Params):\n    \"\"\"Parameters of keypoints\n\n    Args:\n        format (str): format of keypoints. Should be 'xy', 'yx', 'xya', 'xys', 'xyas', 'xysa', 'xyz'.\n\n            x - X coordinate,\n\n            y - Y coordinate\n\n            z - Z coordinate (for 3D keypoints)\n\n            s - Keypoint scale\n\n            a - Keypoint orientation in radians or degrees (depending on KeypointParams.angle_in_degrees)\n\n        label_fields (list): list of fields that are joined with keypoints, e.g labels.\n            Should be same type as keypoints.\n        remove_invisible (bool): to remove invisible points after transform or not\n        angle_in_degrees (bool): angle in degrees or radians in 'xya', 'xyas', 'xysa' keypoints\n        check_each_transform (bool): if `True`, then keypoints will be checked after each dual transform.\n            Default: `True`\n\n    Note:\n        The internal Albumentations format is [x, y, z, angle, scale]. For 2D formats (xy, yx, xya, xys, xyas, xysa),\n        z coordinate is set to 0. For formats without angle or scale, these values are set to 0.\n    \"\"\"\n\n    def __init__(\n        self,\n        format: str,  # noqa: A002\n        label_fields: Sequence[str] | None = None,\n        remove_invisible: bool = True,\n        angle_in_degrees: bool = True,\n        check_each_transform: bool = True,\n    ):\n        super().__init__(format, label_fields)\n        self.remove_invisible = remove_invisible\n        self.angle_in_degrees = angle_in_degrees\n        self.check_each_transform = check_each_transform\n\n    def to_dict_private(self) -> dict[str, Any]:\n        data = super().to_dict_private()\n        data.update(\n            {\n                \"remove_invisible\": self.remove_invisible,\n                \"angle_in_degrees\": self.angle_in_degrees,\n                \"check_each_transform\": self.check_each_transform,\n            },\n        )\n        return data\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return \"KeypointParams\"\n\n    def __repr__(self) -> str:\n        return (\n            f\"KeypointParams(format={self.format}, label_fields={self.label_fields},\"\n            f\" remove_invisible={self.remove_invisible}, angle_in_degrees={self.angle_in_degrees},\"\n            f\" check_each_transform={self.check_each_transform})\"\n        )\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.KeypointsProcessor","title":"class KeypointsProcessor (params, additional_targets=None) [view source on GitHub]","text":"

Source code in albumentations/core/keypoints_utils.py Python
class KeypointsProcessor(DataProcessor):\n    def __init__(self, params: KeypointParams, additional_targets: dict[str, str] | None = None):\n        super().__init__(params, additional_targets)\n\n    @property\n    def default_data_name(self) -> str:\n        return \"keypoints\"\n\n    def ensure_data_valid(self, data: dict[str, Any]) -> None:\n        if self.params.label_fields and not all(i in data for i in self.params.label_fields):\n            msg = \"Your 'label_fields' are not valid - them must have same names as params in 'keypoint_params' dict\"\n            raise ValueError(msg)\n\n    def filter(\n        self,\n        data: np.ndarray,\n        shape: ShapeType,\n    ) -> np.ndarray:\n        \"\"\"Filter keypoints based on visibility within given shape.\n\n        Args:\n            data: Keypoints in [x, y, z, angle, scale] format\n            shape: Shape to check against as {'height': height, 'width': width, 'depth': depth}\n\n        Returns:\n            Filtered keypoints\n        \"\"\"\n        self.params: KeypointParams\n        return filter_keypoints(data, shape, remove_invisible=self.params.remove_invisible)\n\n    def check(self, data: np.ndarray, shape: ShapeType) -> None:\n        check_keypoints(data, shape)\n\n    def convert_from_albumentations(\n        self,\n        data: np.ndarray,\n        shape: ShapeType,\n    ) -> np.ndarray:\n        if not data.size:\n            return data\n\n        params = self.params\n        return convert_keypoints_from_albumentations(\n            data,\n            params.format,\n            shape,\n            check_validity=params.remove_invisible,\n            angle_in_degrees=params.angle_in_degrees,\n        )\n\n    def convert_to_albumentations(\n        self,\n        data: np.ndarray,\n        shape: ShapeType,\n    ) -> np.ndarray:\n        if not data.size:\n            return data\n        params = self.params\n        return convert_keypoints_to_albumentations(\n            data,\n            params.format,\n            shape,\n            check_validity=params.remove_invisible,\n            angle_in_degrees=params.angle_in_degrees,\n        )\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.check_keypoints","title":"def check_keypoints (keypoints, shape) [view source on GitHub]","text":"

Check if keypoint coordinates are within valid ranges for the given shape.

This function validates that:

1. All x-coordinates are within [0, width)
2. All y-coordinates are within [0, height)
3. If depth is provided in shape, z-coordinates are within [0, depth)
4. Angles are within the range [0, 2π)

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 5+), where N is the number of keypoints.

  • First 2 columns are always x, y
  • Column 3 (if present) is z
  • Column 4 (if present) is angle
  • Column 5+ (if present) are additional attributes

shape ShapeType

The shape of the image/volume:

  • For 2D: {'height': int, 'width': int}
  • For 3D: {'height': int, 'width': int, 'depth': int}

Exceptions:

Type Description ValueError

If any keypoint coordinate is outside the valid range, or if angles are invalid. The error message will detail which keypoints are invalid and why.

Note

  • The function assumes that keypoint coordinates are in absolute pixel values, not normalized
  • Angles are in radians
  • Z-coordinates are only checked if 'depth' is present in shape
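
A small sketch of the check (using the module path shown below): a keypoint whose x equals the width falls outside the half-open range [0, width) and raises ValueError.

Python
>>> import numpy as np
>>> from albumentations.core.keypoints_utils import check_keypoints
>>> # rows are [x, y, z, angle, scale] in the internal Albumentations format
>>> good = np.array([[10.0, 20.0, 0.0, 0.5, 1.0]])
>>> check_keypoints(good, {"height": 100, "width": 100})  # passes silently
>>> bad = np.array([[100.0, 20.0, 0.0, 0.5, 1.0]])  # x == width, so it is invalid
>>> try:
...     check_keypoints(bad, {"height": 100, "width": 100})
... except ValueError as err:
...     print(err)
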
Source code in albumentations/core/keypoints_utils.py Python
def check_keypoints(keypoints: np.ndarray, shape: ShapeType) -> None:\n    \"\"\"Check if keypoint coordinates are within valid ranges for the given shape.\n\n    This function validates that:\n    1. All x-coordinates are within [0, width)\n    2. All y-coordinates are within [0, height)\n    3. If depth is provided in shape, z-coordinates are within [0, depth)\n    4. Angles are within the range [0, 2\u03c0)\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 5+), where N is the number of keypoints.\n            - First 2 columns are always x, y\n            - Column 3 (if present) is z\n            - Column 4 (if present) is angle\n            - Column 5+ (if present) are additional attributes\n        shape (ShapeType): The shape of the image/volume:\n                           - For 2D: {'height': int, 'width': int}\n                           - For 3D: {'height': int, 'width': int, 'depth': int}\n\n    Raises:\n        ValueError: If any keypoint coordinate is outside the valid range, or if angles are invalid.\n                   The error message will detail which keypoints are invalid and why.\n\n    Note:\n        - The function assumes that keypoint coordinates are in absolute pixel values, not normalized\n        - Angles are in radians\n        - Z-coordinates are only checked if 'depth' is present in shape\n    \"\"\"\n    height, width = shape[\"height\"], shape[\"width\"]\n    has_depth = \"depth\" in shape\n\n    # Check x and y coordinates (always present)\n    x, y = keypoints[:, 0], keypoints[:, 1]\n    invalid_x = np.where((x < 0) | (x >= width))[0]\n    invalid_y = np.where((y < 0) | (y >= height))[0]\n\n    error_messages = []\n\n    # Handle x, y errors\n    for idx in sorted(set(invalid_x) | set(invalid_y)):\n        if idx in invalid_x:\n            error_messages.append(\n                f\"Expected x for keypoint {keypoints[idx]} to be in range [0, {width}), got {x[idx]}\",\n            )\n        if idx in invalid_y:\n            error_messages.append(\n                f\"Expected y for keypoint {keypoints[idx]} to be in range [0, {height}), got {y[idx]}\",\n            )\n\n    # Check z coordinates if depth is provided and keypoints have z\n    if has_depth and keypoints.shape[1] > 2:\n        z = keypoints[:, 2]\n        depth = shape[\"depth\"]\n        invalid_z = np.where((z < 0) | (z >= depth))[0]\n        error_messages.extend(\n            f\"Expected z for keypoint {keypoints[idx]} to be in range [0, {depth}), got {z[idx]}\" for idx in invalid_z\n        )\n\n    # Check angles only if keypoints have angle column\n    if keypoints.shape[1] > 3:\n        angles = keypoints[:, 3]\n        invalid_angles = np.where((angles < 0) | (angles >= 2 * math.pi))[0]\n        error_messages.extend(\n            f\"Expected angle for keypoint {keypoints[idx]} to be in range [0, 2\u03c0), got {angles[idx]}\"\n            for idx in invalid_angles\n        )\n\n    if error_messages:\n        raise ValueError(\"\\n\".join(error_messages))\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.convert_keypoints_from_albumentations","title":"def convert_keypoints_from_albumentations (keypoints, target_format, shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]","text":"

Convert keypoints from Albumentations format to various other formats.

This function takes keypoints in the standard Albumentations format [x, y, z, angle, scale] and converts them to the specified target format.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints in Albumentations format with shape (N, 5+), where N is the number of keypoints. Each row represents a keypoint [x, y, z, angle, scale, ...].

target_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]

The desired output format.

  • "xy": [x, y]
  • "yx": [y, x]
  • "xya": [x, y, angle]
  • "xys": [x, y, scale]
  • "xyas": [x, y, angle, scale]
  • "xysa": [x, y, scale, angle]
  • "xyz": [x, y, z]

shape ShapeType

The shape of the image {'height': height, 'width': width, 'depth': depth}.

check_validity bool

If True, check if the keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert output angles to degrees. If False, angles remain in radians. Defaults to True.

Returns:

Type Description np.ndarray

Array of keypoints in the specified target format with shape (N, 2+). Any additional columns from the input keypoints beyond the first 5 are preserved and appended after the converted columns.

Exceptions:

Type Description ValueError

If the target_format is not one of the supported formats.

Note

  • Input angles are assumed to be in the range [0, 2π) radians
  • If the input keypoints have additional columns beyond the first 5, these columns are preserved in the output
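
A conversion sketch (illustrative): an internal [x, y, z, angle, scale] row converted to the 'xya' format, with the angle reported in degrees by default.

Python
>>> import numpy as np
>>> from albumentations.core.keypoints_utils import convert_keypoints_from_albumentations
>>> keypoints = np.array([[10.0, 20.0, 0.0, np.pi / 2, 1.0]])  # angle of 90 degrees, given in radians
>>> convert_keypoints_from_albumentations(
...     keypoints, target_format="xya", shape={"height": 100, "width": 100},
... )  # roughly array([[10., 20., 90.]])
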
Source code in albumentations/core/keypoints_utils.py Python
def convert_keypoints_from_albumentations(\n    keypoints: np.ndarray,\n    target_format: Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\", \"xyz\"],\n    shape: ShapeType,\n    check_validity: bool = False,\n    angle_in_degrees: bool = True,\n) -> np.ndarray:\n    \"\"\"Convert keypoints from Albumentations format to various other formats.\n\n    This function takes keypoints in the standard Albumentations format [x, y, z, angle, scale]\n    and converts them to the specified target format.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints in Albumentations format with shape (N, 5+),\n                                where N is the number of keypoints. Each row represents a keypoint\n                                [x, y, z, angle, scale, ...].\n        target_format (Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\", \"xyz\"]): The desired output format.\n            - \"xy\": [x, y]\n            - \"yx\": [y, x]\n            - \"xya\": [x, y, angle]\n            - \"xys\": [x, y, scale]\n            - \"xyas\": [x, y, angle, scale]\n            - \"xysa\": [x, y, scale, angle]\n            - \"xyz\": [x, y, z]\n        shape (ShapeType): The shape of the image {'height': height, 'width': width, 'depth': depth}.\n        check_validity (bool, optional): If True, check if the keypoints are within the image boundaries.\n                                         Defaults to False.\n        angle_in_degrees (bool, optional): If True, convert output angles to degrees.\n                                           If False, angles remain in radians.\n                                           Defaults to True.\n\n    Returns:\n        np.ndarray: Array of keypoints in the specified target format with shape (N, 2+).\n                    Any additional columns from the input keypoints beyond the first 5\n                    are preserved and appended after the converted columns.\n\n    Raises:\n        ValueError: If the target_format is not one of the supported formats.\n\n    Note:\n        - Input angles are assumed to be in the range [0, 2\u03c0) radians\n        - If the input keypoints have additional columns beyond the first 5,\n          these columns are preserved in the output\n    \"\"\"\n    if target_format not in keypoint_formats:\n        raise ValueError(f\"Unknown target_format {target_format}. Supported formats are: {keypoint_formats}\")\n\n    x, y, z, angle, scale = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2], keypoints[:, 3], keypoints[:, 4]\n    angle = angle_to_2pi_range(angle)\n\n    if check_validity:\n        check_keypoints(np.column_stack((x, y, z, angle, scale)), shape)\n\n    if angle_in_degrees:\n        angle = np.degrees(angle)\n\n    format_to_columns = {\n        \"xy\": [x, y],\n        \"yx\": [y, x],\n        \"xya\": [x, y, angle],\n        \"xys\": [x, y, scale],\n        \"xyas\": [x, y, angle, scale],\n        \"xysa\": [x, y, scale, angle],\n        \"xyz\": [x, y, z],\n    }\n\n    result = np.column_stack(format_to_columns[target_format])\n\n    # Add any additional columns from the original keypoints\n    if keypoints.shape[1] > NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:\n        return np.column_stack((result, keypoints[:, NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS:]))\n\n    return result\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.convert_keypoints_to_albumentations","title":"def convert_keypoints_to_albumentations (keypoints, source_format, shape, check_validity=False, angle_in_degrees=True) [view source on GitHub]","text":"

Convert keypoints from various formats to the Albumentations format.

This function takes keypoints in different formats and converts them to the standard Albumentations format: [x, y, z, angle, scale]. For 2D formats, z is set to 0. For formats without angle or scale, these values are set to 0.

Parameters:

Name Type Description keypoints np.ndarray

Array of keypoints with shape (N, 2+), where N is the number of keypoints. The number of columns depends on the source_format.

source_format Literal["xy", "yx", "xya", "xys", "xyas", "xysa", "xyz"]

The format of the input keypoints.

  • "xy": [x, y]
  • "yx": [y, x]
  • "xya": [x, y, angle]
  • "xys": [x, y, scale]
  • "xyas": [x, y, angle, scale]
  • "xysa": [x, y, scale, angle]
  • "xyz": [x, y, z]

shape ShapeType

The shape of the image {'height': height, 'width': width, 'depth': depth}.

check_validity bool

If True, check if the converted keypoints are within the image boundaries. Defaults to False.

angle_in_degrees bool

If True, convert input angles from degrees to radians. Defaults to True.

Returns:

Type Description np.ndarray

Array of keypoints in Albumentations format [x, y, z, angle, scale] with shape (N, 5+). Any additional columns from the input keypoints are preserved and appended after the first 5 columns.

Exceptions:

Type Description ValueError

If the source_format is not one of the supported formats.

Note

  • For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0
  • Angles are converted to the range [0, 2π) radians
  • If the input keypoints have additional columns beyond what's specified in the source_format, these columns are preserved in the output
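
The reverse direction as a sketch (illustrative): 'xya' keypoints with a degree angle become internal [x, y, z, angle, scale] rows, with the angle converted to radians and the missing z and scale filled with 0.

Python
>>> import numpy as np
>>> from albumentations.core.keypoints_utils import convert_keypoints_to_albumentations
>>> keypoints = np.array([[10.0, 20.0, 90.0]])  # x, y, angle in degrees
>>> convert_keypoints_to_albumentations(
...     keypoints, source_format="xya", shape={"height": 100, "width": 100},
... )  # roughly array([[10., 20., 0., 1.5708, 0.]])
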
Source code in albumentations/core/keypoints_utils.py Python
def convert_keypoints_to_albumentations(\n    keypoints: np.ndarray,\n    source_format: Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\", \"xyz\"],\n    shape: ShapeType,\n    check_validity: bool = False,\n    angle_in_degrees: bool = True,\n) -> np.ndarray:\n    \"\"\"Convert keypoints from various formats to the Albumentations format.\n\n    This function takes keypoints in different formats and converts them to the standard\n    Albumentations format: [x, y, z, angle, scale]. For 2D formats, z is set to 0.\n    For formats without angle or scale, these values are set to 0.\n\n    Args:\n        keypoints (np.ndarray): Array of keypoints with shape (N, 2+), where N is the number of keypoints.\n                                The number of columns depends on the source_format.\n        source_format (Literal[\"xy\", \"yx\", \"xya\", \"xys\", \"xyas\", \"xysa\", \"xyz\"]): The format of the input keypoints.\n            - \"xy\": [x, y]\n            - \"yx\": [y, x]\n            - \"xya\": [x, y, angle]\n            - \"xys\": [x, y, scale]\n            - \"xyas\": [x, y, angle, scale]\n            - \"xysa\": [x, y, scale, angle]\n            - \"xyz\": [x, y, z]\n        shape (ShapeType): The shape of the image {'height': height, 'width': width, 'depth': depth}.\n        check_validity (bool, optional): If True, check if the converted keypoints are within the image boundaries.\n                                         Defaults to False.\n        angle_in_degrees (bool, optional): If True, convert input angles from degrees to radians.\n                                           Defaults to True.\n\n    Returns:\n        np.ndarray: Array of keypoints in Albumentations format [x, y, z, angle, scale] with shape (N, 5+).\n                    Any additional columns from the input keypoints are preserved and appended after the\n                    first 5 columns.\n\n    Raises:\n        ValueError: If the source_format is not one of the supported formats.\n\n    Note:\n        - For 2D formats (xy, yx, xya, xys, xyas, xysa), z coordinate is set to 0\n        - Angles are converted to the range [0, 2\u03c0) radians\n        - If the input keypoints have additional columns beyond what's specified in the source_format,\n          these columns are preserved in the output\n    \"\"\"\n    if source_format not in keypoint_formats:\n        raise ValueError(f\"Unknown source_format {source_format}. 
Supported formats are: {keypoint_formats}\")\n\n    format_to_indices: dict[str, list[int | None]] = {\n        \"xy\": [0, 1, None, None, None],\n        \"yx\": [1, 0, None, None, None],\n        \"xya\": [0, 1, None, 2, None],\n        \"xys\": [0, 1, None, None, 2],\n        \"xyas\": [0, 1, None, 2, 3],\n        \"xysa\": [0, 1, None, 3, 2],\n        \"xyz\": [0, 1, 2, None, None],\n    }\n\n    indices: list[int | None] = format_to_indices[source_format]\n\n    processed_keypoints = np.zeros((keypoints.shape[0], NUM_KEYPOINTS_COLUMNS_IN_ALBUMENTATIONS), dtype=np.float32)\n\n    for i, idx in enumerate(indices):\n        if idx is not None:\n            processed_keypoints[:, i] = keypoints[:, idx]\n\n    if angle_in_degrees and indices[3] is not None:  # angle is now at index 3\n        processed_keypoints[:, 3] = np.radians(processed_keypoints[:, 3])\n\n    processed_keypoints[:, 3] = angle_to_2pi_range(processed_keypoints[:, 3])  # angle is now at index 3\n\n    if keypoints.shape[1] > len(source_format):\n        processed_keypoints = np.column_stack((processed_keypoints, keypoints[:, len(source_format) :]))\n\n    if check_validity:\n        check_keypoints(processed_keypoints, shape)\n\n    return processed_keypoints\n
"},{"location":"api_reference/core/keypoints_utils/#albumentations.core.keypoints_utils.filter_keypoints","title":"def filter_keypoints (keypoints, shape, remove_invisible) [view source on GitHub]","text":"

Filter keypoints to remove those outside the boundaries.

Parameters:

Name Type Description keypoints np.ndarray

A numpy array of shape (N, 5+) where N is the number of keypoints. Each row represents a keypoint (x, y, z, angle, scale, ...).

shape ShapeType

Shape to check against as {'height': height, 'width': width, 'depth': depth}.

remove_invisible bool

If True, remove keypoints outside the boundaries.

Returns:

Type Description np.ndarray

A numpy array of filtered keypoints.
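
A filtering sketch (illustrative): with remove_invisible=True, keypoints whose x or y falls outside the given width/height are dropped.

Python
>>> import numpy as np
>>> from albumentations.core.keypoints_utils import filter_keypoints
>>> keypoints = np.array([
...     [10.0, 20.0, 0.0, 0.0, 1.0],   # inside a 100x100 image
...     [150.0, 20.0, 0.0, 0.0, 1.0],  # x >= width, removed
... ])
>>> filter_keypoints(keypoints, {"height": 100, "width": 100}, remove_invisible=True)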

Source code in albumentations/core/keypoints_utils.py Python
def filter_keypoints(\n    keypoints: np.ndarray,\n    shape: ShapeType,\n    remove_invisible: bool,\n) -> np.ndarray:\n    \"\"\"Filter keypoints to remove those outside the boundaries.\n\n    Args:\n        keypoints: A numpy array of shape (N, 5+) where N is the number of keypoints.\n                   Each row represents a keypoint (x, y, z, angle, scale, ...).\n        shape: Shape to check against as {'height': height, 'width': width, 'depth': depth}.\n        remove_invisible: If True, remove keypoints outside the boundaries.\n\n    Returns:\n        A numpy array of filtered keypoints.\n    \"\"\"\n    if not remove_invisible:\n        return keypoints\n\n    if not keypoints.size:\n        return keypoints\n\n    height, width, depth = shape[\"height\"], shape[\"width\"], shape.get(\"depth\", None)\n\n    # Create boolean mask for visible keypoints\n    x, y, z = keypoints[:, 0], keypoints[:, 1], keypoints[:, 2]\n    visible = (x >= 0) & (x < width) & (y >= 0) & (y < height)\n\n    if depth is not None:\n        visible &= (z >= 0) & (z < depth)\n\n    # Apply the mask to filter keypoints\n    return keypoints[visible]\n
"},{"location":"api_reference/core/serialization/","title":"Serialization API (core.serialization)","text":""},{"location":"api_reference/core/serialization/#albumentations.core.serialization.Serializable","title":"class Serializable [view source on GitHub]","text":"

Source code in albumentations/core/serialization.py Python
class Serializable(metaclass=SerializableMeta):\n    @classmethod\n    @abstractmethod\n    def is_serializable(cls) -> bool:\n        raise NotImplementedError\n\n    @classmethod\n    @abstractmethod\n    def get_class_fullname(cls) -> str:\n        raise NotImplementedError\n\n    @abstractmethod\n    def to_dict_private(self) -> dict[str, Any]:\n        raise NotImplementedError\n\n    def to_dict(self, on_not_implemented_error: str = \"raise\") -> dict[str, Any]:\n        \"\"\"Take a transform pipeline and convert it to a serializable representation that uses only standard\n        python data types: dictionaries, lists, strings, integers, and floats.\n\n        Args:\n            self: A transform that should be serialized. If the transform doesn't implement the `to_dict`\n                method and `on_not_implemented_error` equals to 'raise' then `NotImplementedError` is raised.\n                If `on_not_implemented_error` equals to 'warn' then `NotImplementedError` will be ignored\n                but no transform parameters will be serialized.\n            on_not_implemented_error (str): `raise` or `warn`.\n\n        \"\"\"\n        if on_not_implemented_error not in {\"raise\", \"warn\"}:\n            msg = f\"Unknown on_not_implemented_error value: {on_not_implemented_error}. Supported values are: 'raise' \"\n            \"and 'warn'\"\n            raise ValueError(msg)\n        try:\n            transform_dict = self.to_dict_private()\n        except NotImplementedError:\n            if on_not_implemented_error == \"raise\":\n                raise\n\n            transform_dict = {}\n            warnings.warn(\n                f\"Got NotImplementedError while trying to serialize {self}. Object arguments are not preserved. \"\n                f\"Implement either '{self.__class__.__name__}.get_transform_init_args_names' \"\n                f\"or '{self.__class__.__name__}.get_transform_init_args' \"\n                \"method to make the transform serializable\",\n                stacklevel=2,\n            )\n        return {\"__version__\": __version__, \"transform\": transform_dict}\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.SerializableMeta","title":"class SerializableMeta [view source on GitHub]","text":"

A metaclass that is used to register classes in SERIALIZABLE_REGISTRY or NON_SERIALIZABLE_REGISTRY so they can be found later when deserializing a transformation pipeline by the classes' full names.

Source code in albumentations/core/serialization.py Python
class SerializableMeta(ABCMeta):\n    \"\"\"A metaclass that is used to register classes in `SERIALIZABLE_REGISTRY` or `NON_SERIALIZABLE_REGISTRY`\n    so they can be found later while deserializing transformation pipeline using classes full names.\n    \"\"\"\n\n    def __new__(cls, name: str, bases: tuple[type, ...], *args: Any, **kwargs: Any) -> SerializableMeta:\n        cls_obj = super().__new__(cls, name, bases, *args, **kwargs)\n        if name != \"Serializable\" and ABC not in bases:\n            if cls_obj.is_serializable():\n                SERIALIZABLE_REGISTRY[cls_obj.get_class_fullname()] = cls_obj\n            else:\n                NON_SERIALIZABLE_REGISTRY[cls_obj.get_class_fullname()] = cls_obj\n        return cls_obj\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return False\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def _to_dict(cls) -> dict[str, Any]:\n        return {}\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.from_dict","title":"def from_dict (transform_dict, nonserializable=None) [view source on GitHub]","text":"

Parameters:

transform_dict: A dictionary with a serialized transform pipeline.

nonserializable (dict): A dictionary that contains non-serializable transforms. This dictionary is required when you are restoring a pipeline that contains non-serializable transforms. Keys in that dictionary should be named the same as the name arguments in the respective transforms from the serialized pipeline.
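
A round-trip sketch (illustrative): serialize a pipeline with to_dict and rebuild it with from_dict.

Python
>>> import albumentations as A
>>> pipeline = A.Compose([A.HorizontalFlip(p=0.5)])
>>> transform_dict = pipeline.to_dict()
>>> restored = A.from_dict(transform_dict)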

Source code in albumentations/core/serialization.py Python
def from_dict(\n    transform_dict: dict[str, Any],\n    nonserializable: dict[str, Any] | None = None,\n) -> Serializable | None:\n    \"\"\"Args:\n    transform_dict: A dictionary with serialized transform pipeline.\n    nonserializable (dict): A dictionary that contains non-serializable transforms.\n        This dictionary is required when you are restoring a pipeline that contains non-serializable transforms.\n        Keys in that dictionary should be named same as `name` arguments in respective transforms from\n        a serialized pipeline.\n\n    \"\"\"\n    register_additional_transforms()\n    transform = transform_dict[\"transform\"]\n    lmbd = instantiate_nonserializable(transform, nonserializable)\n    if lmbd:\n        return lmbd\n    name = transform[\"__class_fullname__\"]\n    args = {k: v for k, v in transform.items() if k != \"__class_fullname__\"}\n    cls = SERIALIZABLE_REGISTRY[shorten_class_name(name)]\n    if \"transforms\" in args:\n        args[\"transforms\"] = [from_dict({\"transform\": t}, nonserializable=nonserializable) for t in args[\"transforms\"]]\n    return cls(**args)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.get_shortest_class_fullname","title":"def get_shortest_class_fullname (cls) [view source on GitHub]","text":"

The function get_shortest_class_fullname takes a class object as input and returns its shortened full name.

Parameters:

cls (Type[BasicCompose]): a class object that is a subclass of BasicCompose.

Returns:

A string, the shortened version of the full class name.

Source code in albumentations/core/serialization.py Python
def get_shortest_class_fullname(cls: type[Any]) -> str:\n    \"\"\"The function `get_shortest_class_fullname` takes a class object as input and returns its shortened\n    full name.\n\n    :param cls: The parameter `cls` is of type `Type[BasicCompose]`, which means it expects a class that\n    is a subclass of `BasicCompose`\n    :type cls: Type[BasicCompose]\n    :return: a string, which is the shortened version of the full class name.\n    \"\"\"\n    class_fullname = f\"{cls.__module__}.{cls.__name__}\"\n    return shorten_class_name(class_fullname)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.load","title":"def load (filepath_or_buffer, data_format='json', nonserializable=None) [view source on GitHub]","text":"

Load a serialized pipeline from a file or file-like object and construct a transform pipeline.

Parameters:

Name Type Description filepath_or_buffer Union[str, Path, TextIO]

The file path or file-like object to read the serialized data from. If a string is provided, it is interpreted as a path to a file. If a file-like object is provided, the serialized data will be read from it directly.

data_format str

The format of the serialized data. Valid options are 'json' and 'yaml'. Defaults to 'json'.

nonserializable Optional[dict[str, Any]]

A dictionary that contains non-serializable transforms. This dictionary is required when restoring a pipeline that contains non-serializable transforms. Keys in the dictionary should be named the same as the name arguments in respective transforms from the serialized pipeline. Defaults to None.

Returns:

Type Description object

The deserialized transform pipeline.

Exceptions:

Type Description ValueError

If data_format is 'yaml' but PyYAML is not installed.
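
A usage sketch (the file path is a hypothetical placeholder): load a pipeline that was previously written to disk with save.

Python
>>> import albumentations as A
>>> restored = A.load("pipeline.json", data_format="json")  # "pipeline.json" written earlier by A.save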

Source code in albumentations/core/serialization.py Python
def load(\n    filepath_or_buffer: str | Path | TextIO,\n    data_format: str = \"json\",\n    nonserializable: dict[str, Any] | None = None,\n) -> object:\n    \"\"\"Load a serialized pipeline from a file or file-like object and construct a transform pipeline.\n\n    Args:\n        filepath_or_buffer (Union[str, Path, TextIO]): The file path or file-like object to read the serialized\n            data from.\n            If a string is provided, it is interpreted as a path to a file. If a file-like object is provided,\n            the serialized data will be read from it directly.\n        data_format (str): The format of the serialized data. Valid options are 'json' and 'yaml'.\n            Defaults to 'json'.\n        nonserializable (Optional[dict[str, Any]]): A dictionary that contains non-serializable transforms.\n            This dictionary is required when restoring a pipeline that contains non-serializable transforms.\n            Keys in the dictionary should be named the same as the `name` arguments in respective transforms\n            from the serialized pipeline. Defaults to None.\n\n    Returns:\n        object: The deserialized transform pipeline.\n\n    Raises:\n        ValueError: If `data_format` is 'yaml' but PyYAML is not installed.\n\n    \"\"\"\n    check_data_format(data_format)\n\n    if isinstance(filepath_or_buffer, (str, Path)):  # Assume it's a filepath\n        with open(filepath_or_buffer) as f:\n            if data_format == \"json\":\n                transform_dict = json.load(f)\n            else:\n                if not yaml_available:\n                    msg = \"You need to install PyYAML to load a pipeline in yaml format\"\n                    raise ValueError(msg)\n                transform_dict = yaml.safe_load(f)\n    elif data_format == \"json\":\n        transform_dict = json.load(filepath_or_buffer)\n    else:\n        if not yaml_available:\n            msg = \"You need to install PyYAML to load a pipeline in yaml format\"\n            raise ValueError(msg)\n        transform_dict = yaml.safe_load(filepath_or_buffer)\n\n    return from_dict(transform_dict, nonserializable=nonserializable)\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.register_additional_transforms","title":"def register_additional_transforms () [view source on GitHub]","text":"

Register transforms that are not imported directly into the albumentations module by checking the availability of optional dependencies.

Source code in albumentations/core/serialization.py Python
def register_additional_transforms() -> None:\n    \"\"\"Register transforms that are not imported directly into the `albumentations` module by checking\n    the availability of optional dependencies.\n    \"\"\"\n    if importlib.util.find_spec(\"torch\") is not None:\n        try:\n            # Import `albumentations.pytorch` only if `torch` is installed.\n            import albumentations.pytorch\n\n            # Use a dummy operation to acknowledge the use of the imported module and avoid linting errors.\n            _ = albumentations.pytorch.ToTensorV2\n        except ImportError:\n            pass\n
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.save","title":"def save (transform, filepath_or_buffer, data_format='json', on_not_implemented_error='raise') [view source on GitHub]","text":"

Serialize a transform pipeline and save it to either a file specified by a path or a file-like object in either JSON or YAML format.

Parameters:

transform (Serializable)

The transform pipeline to serialize.

filepath_or_buffer (Union[str, Path, TextIO])

The file path or file-like object to write the serialized data to. If a string is provided, it is interpreted as a path to a file. If a file-like object is provided, the serialized data will be written to it directly.

data_format (str)

The format to serialize the data in. Valid options are 'json' and 'yaml'. Defaults to 'json'.

on_not_implemented_error (str)

Determines the behavior if a transform does not implement the to_dict method. If set to 'raise', a NotImplementedError is raised. If set to 'warn', the exception is ignored, and no transform arguments are saved. Defaults to 'raise'.

Exceptions:

ValueError

If data_format is 'yaml' but PyYAML is not installed.

Source code in albumentations/core/serialization.py Python
def save(\n    transform: Serializable,\n    filepath_or_buffer: str | Path | TextIO,\n    data_format: str = \"json\",\n    on_not_implemented_error: str = \"raise\",\n) -> None:\n    \"\"\"Serialize a transform pipeline and save it to either a file specified by a path or a file-like object\n    in either JSON or YAML format.\n\n    Args:\n        transform (Serializable): The transform pipeline to serialize.\n        filepath_or_buffer (Union[str, Path, TextIO]): The file path or file-like object to write the serialized\n            data to.\n            If a string is provided, it is interpreted as a path to a file. If a file-like object is provided,\n            the serialized data will be written to it directly.\n        data_format (str): The format to serialize the data in. Valid options are 'json' and 'yaml'.\n            Defaults to 'json'.\n        on_not_implemented_error (str): Determines the behavior if a transform does not implement the `to_dict` method.\n            If set to 'raise', a `NotImplementedError` is raised. If set to 'warn', the exception is ignored, and\n            no transform arguments are saved. Defaults to 'raise'.\n\n    Raises:\n        ValueError: If `data_format` is 'yaml' but PyYAML is not installed.\n\n    \"\"\"\n    check_data_format(data_format)\n    transform_dict = transform.to_dict(on_not_implemented_error=on_not_implemented_error)\n    transform_dict = serialize_enum(transform_dict)\n\n    # Determine whether to write to a file or a file-like object\n    if isinstance(filepath_or_buffer, (str, Path)):  # It's a filepath\n        with open(filepath_or_buffer, \"w\") as f:\n            if data_format == \"yaml\":\n                if not yaml_available:\n                    msg = \"You need to install PyYAML to save a pipeline in YAML format\"\n                    raise ValueError(msg)\n                yaml.safe_dump(transform_dict, f, default_flow_style=False)\n            elif data_format == \"json\":\n                json.dump(transform_dict, f)\n    elif data_format == \"yaml\":\n        if not yaml_available:\n            msg = \"You need to install PyYAML to save a pipeline in YAML format\"\n            raise ValueError(msg)\n        yaml.safe_dump(transform_dict, filepath_or_buffer, default_flow_style=False)\n    elif data_format == \"json\":\n        json.dump(transform_dict, filepath_or_buffer, indent=2)\n
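
A minimal usage sketch (the output path is hypothetical): a Compose pipeline is serialized to JSON so that it can later be restored with load:

Python
import albumentations as A\n\ntransform = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n\n# \"my_pipeline.json\" is a hypothetical output path used only for illustration.\nA.save(transform, \"my_pipeline.json\", data_format=\"json\")\n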
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.serialize_enum","title":"def serialize_enum (obj) [view source on GitHub]","text":"

Recursively search for Enum objects and convert them to their value. Also handle any Mapping or Sequence types.

Source code in albumentations/core/serialization.py Python
def serialize_enum(obj: Any) -> Any:\n    \"\"\"Recursively search for Enum objects and convert them to their value.\n    Also handle any Mapping or Sequence types.\n    \"\"\"\n    if isinstance(obj, Mapping):\n        return {k: serialize_enum(v) for k, v in obj.items()}\n    if isinstance(obj, Sequence) and not isinstance(obj, str):  # exclude strings since they're also sequences\n        return [serialize_enum(v) for v in obj]\n    return obj.value if isinstance(obj, Enum) else obj\n
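
A small illustration with a hypothetical Enum (Interpolation is not part of the library): mappings and sequences are walked recursively and every Enum member is replaced by its value:

Python
from enum import Enum\n\nfrom albumentations.core.serialization import serialize_enum\n\n\nclass Interpolation(Enum):  # hypothetical enum, used only for illustration\n    NEAREST = 0\n    LINEAR = 1\n\n\nserialize_enum({\"interpolation\": Interpolation.LINEAR, \"sizes\": [Interpolation.NEAREST, 3]})\n# -> {\"interpolation\": 1, \"sizes\": [0, 3]}\n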
"},{"location":"api_reference/core/serialization/#albumentations.core.serialization.to_dict","title":"def to_dict (transform, on_not_implemented_error='raise') [view source on GitHub]","text":"

Take a transform pipeline and convert it to a serializable representation that uses only standard Python data types: dictionaries, lists, strings, integers, and floats.

Parameters:

transform (Serializable)

A transform that should be serialized. If the transform doesn't implement the to_dict method and on_not_implemented_error is set to 'raise', then NotImplementedError is raised. If on_not_implemented_error is set to 'warn', the NotImplementedError is ignored and no transform parameters are serialized.

on_not_implemented_error (str)

raise or warn.

Source code in albumentations/core/serialization.py Python
def to_dict(transform: Serializable, on_not_implemented_error: str = \"raise\") -> dict[str, Any]:\n    \"\"\"Take a transform pipeline and convert it to a serializable representation that uses only standard\n    python data types: dictionaries, lists, strings, integers, and floats.\n\n    Args:\n        transform: A transform that should be serialized. If the transform doesn't implement the `to_dict`\n            method and `on_not_implemented_error` equals to 'raise' then `NotImplementedError` is raised.\n            If `on_not_implemented_error` equals to 'warn' then `NotImplementedError` will be ignored\n            but no transform parameters will be serialized.\n        on_not_implemented_error (str): `raise` or `warn`.\n\n    \"\"\"\n    return transform.to_dict(on_not_implemented_error)\n
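
A minimal sketch of how the result can be used: the returned dictionary contains only plain Python types, so it can be passed straight to json.dumps (this is essentially what save does internally):

Python
import json\n\nimport albumentations as A\n\ntransform = A.Compose([A.HorizontalFlip(p=0.5)])\ntransform_dict = A.to_dict(transform)  # plain dicts, lists, strings, ints and floats\nprint(json.dumps(transform_dict, indent=2))\n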
"},{"location":"api_reference/core/transforms_interface/","title":"Transforms Interface (core.transforms_interface)","text":""},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.BaseTransformInitSchema","title":"class BaseTransformInitSchema ","text":"

Source code in albumentations/core/transforms_interface.py Python
class BaseTransformInitSchema(BaseModel):\n    model_config = ConfigDict(arbitrary_types_allowed=True)\n    always_apply: bool | None\n    p: ProbabilityType\n
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.BasicTransform","title":"class BasicTransform (p=0.5, always_apply=None) [view source on GitHub]","text":"

Source code in albumentations/core/transforms_interface.py Python
class BasicTransform(Serializable, metaclass=CombinedMeta):\n    _targets: tuple[Targets, ...] | Targets  # targets that this transform can work on\n    _available_keys: set[str]  # targets that this transform, as string, lower-cased\n    _key2func: dict[\n        str,\n        Callable[..., Any],\n    ]  # mapping for targets (plus additional targets) and methods for which they depend\n    call_backup = None\n    interpolation: int\n    fill: DropoutFillValue\n    fill_mask: ColorType | None\n    # replay mode params\n    deterministic: bool = False\n    save_key = \"replay\"\n    replay_mode = False\n    applied_in_replay = False\n\n    class InitSchema(BaseTransformInitSchema):\n        pass\n\n    def __init__(self, p: float = 0.5, always_apply: bool | None = None):\n        self.p = p\n        if always_apply is not None:\n            if always_apply:\n                warn(\n                    \"always_apply is deprecated. Use `p=1` if you want to always apply the transform.\"\n                    \" self.p will be set to 1.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n                self.p = 1.0\n            else:\n                warn(\n                    \"always_apply is deprecated.\",\n                    DeprecationWarning,\n                    stacklevel=2,\n                )\n        self._additional_targets: dict[str, str] = {}\n        # replay mode params\n        self.params: dict[Any, Any] = {}\n        self._key2func = {}\n        self._set_keys()\n        self.processors: dict[str, BboxProcessor | KeypointsProcessor] = {}\n        self.seed: int | None = None\n        self.random_generator = np.random.default_rng(self.seed)\n        self.py_random = random.Random(self.seed)\n\n    def set_random_state(\n        self,\n        random_generator: np.random.Generator,\n        py_random: random.Random,\n    ) -> None:\n        \"\"\"Set random state directly from generators.\n\n        Args:\n            random_generator: numpy random generator to use\n            py_random: python random generator to use\n        \"\"\"\n        self.random_generator = random_generator\n        self.py_random = py_random\n\n    def set_random_seed(self, seed: int | None) -> None:\n        \"\"\"Set random state from seed.\n\n        Args:\n            seed: Random seed to use\n        \"\"\"\n        self.seed = seed\n        self.random_generator = np.random.default_rng(seed)\n        self.py_random = random.Random(seed)\n\n    def get_dict_with_id(self) -> dict[str, Any]:\n        d = self.to_dict_private()\n        d[\"id\"] = id(self)\n        return d\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        \"\"\"Returns names of arguments that are used in __init__ method of the transform.\"\"\"\n        msg = (\n            f\"Class {self.get_class_fullname()} is not serializable because the `get_transform_init_args_names` \"\n            \"method is not implemented\"\n        )\n        raise NotImplementedError(msg)\n\n    def set_processors(self, processors: dict[str, BboxProcessor | KeypointsProcessor]) -> None:\n        self.processors = processors\n\n    def get_processor(self, key: str) -> BboxProcessor | KeypointsProcessor | None:\n        return self.processors.get(key)\n\n    def __call__(self, *args: Any, force_apply: bool = False, **kwargs: Any) -> Any:\n        if args:\n            msg = \"You have to pass data to augmentations as named arguments, for example: aug(image=image)\"\n            
raise KeyError(msg)\n        if self.replay_mode:\n            if self.applied_in_replay:\n                return self.apply_with_params(self.params, **kwargs)\n            return kwargs\n\n        # Reset params at the start of each call\n        self.params = {}\n\n        if self.should_apply(force_apply=force_apply):\n            params = self.get_params()\n            params = self.update_params_shape(params=params, data=kwargs)\n\n            if self.targets_as_params:  # check if all required targets are in kwargs.\n                missing_keys = set(self.targets_as_params).difference(kwargs.keys())\n                if missing_keys and not (missing_keys == {\"image\"} and \"images\" in kwargs):\n                    msg = f\"{self.__class__.__name__} requires {self.targets_as_params} missing keys: {missing_keys}\"\n                    raise ValueError(msg)\n\n            params_dependent_on_data = self.get_params_dependent_on_data(params=params, data=kwargs)\n            params.update(params_dependent_on_data)\n\n            if self.targets_as_params:  # this block will be removed after removing `get_params_dependent_on_targets`\n                targets_as_params = {k: kwargs.get(k) for k in self.targets_as_params}\n                if missing_keys:  # here we expecting case when missing_keys == {\"image\"} and \"images\" in kwargs\n                    targets_as_params[\"image\"] = kwargs[\"images\"][0]\n                params_dependent_on_targets = self.get_params_dependent_on_targets(targets_as_params)\n                params.update(params_dependent_on_targets)\n\n            # Store the final params\n            self.params = params\n\n            if self.deterministic:\n                kwargs[self.save_key][id(self)] = deepcopy(params)\n            return self.apply_with_params(params, **kwargs)\n\n        return kwargs\n\n    def get_applied_params(self) -> dict[str, Any]:\n        \"\"\"Returns the parameters that were used in the last transform application.\n        Returns empty dict if transform was not applied.\n        \"\"\"\n        return self.params\n\n    def should_apply(self, force_apply: bool = False) -> bool:\n        if self.p <= 0.0:\n            return False\n        if self.p >= 1.0 or force_apply:\n            return True\n        return self.py_random.random() < self.p\n\n    def apply_with_params(self, params: dict[str, Any], *args: Any, **kwargs: Any) -> dict[str, Any]:\n        \"\"\"Apply transforms with parameters.\"\"\"\n        params = self.update_params(params, **kwargs)  # remove after move parameters like interpolation\n        res = {}\n        for key, arg in kwargs.items():\n            if key in self._key2func and arg is not None:\n                target_function = self._key2func[key]\n                res[key] = ensure_contiguous_output(\n                    target_function(ensure_contiguous_output(arg), **params),\n                )\n            else:\n                res[key] = arg\n        return res\n\n    def set_deterministic(self, flag: bool, save_key: str = \"replay\") -> BasicTransform:\n        \"\"\"Set transform to be deterministic.\"\"\"\n        if save_key == \"params\":\n            msg = \"params save_key is reserved\"\n            raise KeyError(msg)\n\n        self.deterministic = flag\n        if self.deterministic and self.targets_as_params:\n            warn(\n                self.get_class_fullname() + \" could work incorrectly in ReplayMode for other input data\"\n                \" because its' params depend on 
targets.\",\n                stacklevel=2,\n            )\n        self.save_key = save_key\n        return self\n\n    def __repr__(self) -> str:\n        state = self.get_base_init_args()\n        state.update(self.get_transform_init_args())\n        return f\"{self.__class__.__name__}({format_args(state)})\"\n\n    def apply(self, img: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform on image.\"\"\"\n        raise NotImplementedError\n\n    def apply_to_images(self, images: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform on images.\n\n        Args:\n            images: Input images as numpy array of shape:\n                - (num_images, height, width, channels)\n                - (num_images, height, width) for grayscale\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed images as numpy array in the same format as input\n        \"\"\"\n        # Handle batched numpy array input\n        transformed = np.stack([self.apply(image, **params) for image in images])\n        return np.require(transformed, requirements=[\"C_CONTIGUOUS\"])\n\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform slice by slice to a volume.\n\n        Args:\n            volume: Input volume of shape (depth, height, width) or (depth, height, width, channels)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed volume as numpy array in the same format as input\n        \"\"\"\n        return self.apply_to_images(volume, *args, **params)\n\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to multiple volumes.\"\"\"\n        return np.stack([self.apply_to_volume(vol, *args, **params) for vol in volumes])\n\n    def get_params(self) -> dict[str, Any]:\n        \"\"\"Returns parameters independent of input.\"\"\"\n        return {}\n\n    def update_params_shape(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Updates parameters with input shape.\"\"\"\n        # Extract shape from volume, volumes, image, or images\n        if \"volume\" in data:\n            shape = data[\"volume\"][0].shape  # Take first slice of volume\n        elif \"volumes\" in data:\n            shape = data[\"volumes\"][0][0].shape  # Take first slice of first volume\n        elif \"image\" in data:\n            shape = data[\"image\"].shape\n        else:\n            shape = data[\"images\"][0].shape\n\n        # For volumes/images, shape will be either (H, W) or (H, W, C)\n        params[\"shape\"] = shape\n        return params\n\n    def get_params_dependent_on_data(self, params: dict[str, Any], data: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"Returns parameters dependent on input.\"\"\"\n        return params\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        # mapping for targets and methods for which they depend\n        # for example:\n        # >>  {\"image\": self.apply}\n        # >>  {\"masks\": self.apply_to_masks}\n        raise NotImplementedError\n\n    def _set_keys(self) -> None:\n        \"\"\"Set _available_keys.\"\"\"\n        if not hasattr(self, \"_targets\"):\n            self._available_keys = set()\n 
       else:\n            self._available_keys = {\n                target.value.lower()\n                for target in (self._targets if isinstance(self._targets, tuple) else [self._targets])\n            }\n        self._available_keys.update(self.targets.keys())\n        self._key2func = {key: self.targets[key] for key in self._available_keys if key in self.targets}\n\n    @property\n    def available_keys(self) -> set[str]:\n        \"\"\"Returns set of available keys.\"\"\"\n        return self._available_keys\n\n    def update_params(self, params: dict[str, Any], **kwargs: Any) -> dict[str, Any]:\n        \"\"\"Update parameters with transform specific params.\n        This method is deprecated, use:\n        - `get_params` for transform specific params like interpolation and\n        - `update_params_shape` for data like shape.\n        \"\"\"\n        if hasattr(self, \"interpolation\"):\n            params[\"interpolation\"] = self.interpolation\n        if hasattr(self, \"fill\"):\n            params[\"fill\"] = self.fill\n        if hasattr(self, \"fill_mask\"):\n            params[\"fill_mask\"] = self.fill_mask\n\n        # Use update_params_shape to get shape consistently\n        return self.update_params_shape(params, kwargs)\n\n    def add_targets(self, additional_targets: dict[str, str]) -> None:\n        \"\"\"Add targets to transform them the same way as one of existing targets.\n        ex: {'target_image': 'image'}\n        ex: {'obj1_mask': 'mask', 'obj2_mask': 'mask'}\n        by the way you must have at least one object with key 'image'\n\n        Args:\n            additional_targets (dict): keys - new target name, values - old target name. ex: {'image2': 'image'}\n\n        \"\"\"\n        for k, v in additional_targets.items():\n            if k in self._additional_targets and v != self._additional_targets[k]:\n                raise ValueError(\n                    f\"Trying to overwrite existed additional targets. 
\"\n                    f\"Key={k} Exists={self._additional_targets[k]} New value: {v}\",\n                )\n            if v in self._available_keys:\n                self._additional_targets[k] = v\n                self._key2func[k] = self.targets[v]\n                self._available_keys.add(k)\n\n    @property\n    def targets_as_params(self) -> list[str]:\n        \"\"\"Targets used to get params dependent on targets.\n        This is used to check input has all required targets.\n        \"\"\"\n        return []\n\n    def get_params_dependent_on_targets(self, params: dict[str, Any]) -> dict[str, Any]:\n        \"\"\"This method is deprecated.\n        Use `get_params_dependent_on_data` instead.\n        Returns parameters dependent on targets.\n        Dependent target is defined in `self.targets_as_params`\n        \"\"\"\n        return {}\n\n    @classmethod\n    def get_class_fullname(cls) -> str:\n        return get_shortest_class_fullname(cls)\n\n    @classmethod\n    def is_serializable(cls) -> bool:\n        return True\n\n    def get_base_init_args(self) -> dict[str, Any]:\n        \"\"\"Returns base init args - p\"\"\"\n        return {\"p\": self.p}\n\n    def get_transform_init_args(self) -> dict[str, Any]:\n        \"\"\"Exclude seed from init args during serialization\"\"\"\n        args = {k: getattr(self, k) for k in self.get_transform_init_args_names()}\n        args.pop(\"seed\", None)  # Remove seed from args\n        return args\n\n    def to_dict_private(self) -> dict[str, Any]:\n        state = {\"__class_fullname__\": self.get_class_fullname()}\n        state.update(self.get_base_init_args())\n        state.update(self.get_transform_init_args())\n        return state\n
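
A brief sketch of the target mechanics defined above: add_targets registers an extra key that is processed the same way as an existing target, and set_random_seed makes parameter sampling reproducible when the transform is called directly:

Python
import numpy as np\nimport albumentations as A\n\ntransform = A.HorizontalFlip(p=1.0)\ntransform.add_targets({\"image2\": \"image\"})  # process \"image2\" exactly like \"image\"\ntransform.set_random_seed(137)\n\nimage = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)\nresult = transform(image=image, image2=image.copy())\n# result[\"image\"] and result[\"image2\"] are flipped identically\n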
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.DualTransform","title":"class DualTransform [view source on GitHub]","text":"

A base class for transformations that should be applied both to an image and its corresponding properties such as masks, bounding boxes, and keypoints. This class ensures that when a transform is applied to an image, all associated entities are transformed accordingly to maintain consistency between the image and its annotations.

Methods

apply(img: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to the image.

img: Input image of shape (H, W, C) or (H, W) for grayscale.\n**params: Additional parameters specific to the transform.\n\nReturns Transformed image of the same shape as input.\n

apply_to_images(images: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple images.

images: Input images of shape (N, H, W, C) or (N, H, W) for grayscale.\n**params: Additional parameters specific to the transform.\n\nReturns Transformed images in the same format as input.\n

apply_to_mask(mask: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a mask.

mask: Input mask of shape (H, W), (H, W, C) for multi-channel masks\n**params: Additional parameters specific to the transform.\n\nReturns Transformed mask in the same format as input.\n

apply_to_masks(masks: np.ndarray, **params: Any) -> np.ndarray | list[np.ndarray]: Apply the transform to multiple masks.

masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n**params: Additional parameters specific to the transform.\nReturns Transformed masks in the same format as input.\n

apply_to_keypoints(keypoints: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to keypoints.

keypoints: Array of shape (N, 2+) where N is the number of keypoints.\n**params: Additional parameters specific to the transform.\n\nReturns Transformed keypoints array of shape (N, 2+).\n

apply_to_bboxes(bboxes: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to bounding boxes.

bboxes: Array of shape (N, 4+) where N is the number of bounding boxes, and each row is in the format [x_min, y_min, x_max, y_max].\n**params: Additional parameters specific to the transform.\n\nReturns Transformed bounding boxes array of shape (N, 4+).\n

apply_to_volume(volume: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a volume.

volume: Input volume of shape (D, H, W) or (D, H, W, C).\n**params: Additional parameters specific to the transform.\n\nReturns Transformed volume of the same shape as input.\n

apply_to_volumes(volumes: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple volumes.

volumes: Input volumes of shape (N, D, H, W) or (N, D, H, W, C).\n**params: Additional parameters specific to the transform.\n\nReturns Transformed volumes in the same format as input.\n

apply_to_mask3d(mask: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to a 3D mask.

mask: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n**params: Additional parameters specific to the transform.\n\nReturns Transformed 3D mask in the same format as input.\n

apply_to_masks3d(masks: np.ndarray, **params: Any) -> np.ndarray: Apply the transform to multiple 3D masks.

masks: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n**params: Additional parameters specific to the transform.\n\nReturns Transformed 3D masks in the same format as input.\n

Note

  • All apply_* methods should maintain the input shape and format of the data.
  • When applying transforms to masks, ensure that discrete values (e.g., class labels) are preserved.
  • For keypoints and bounding boxes, the transformation should maintain their relative positions with respect to the transformed image.
  • The difference between apply_to_mask and apply_to_masks is mainly in how they handle 3D arrays: apply_to_mask treats a 3D array as a multi-channel mask, while apply_to_masks treats it as multiple single-channel masks.

Source code in albumentations/core/transforms_interface.py Python
class DualTransform(BasicTransform):\n    \"\"\"A base class for transformations that should be applied both to an image and its corresponding properties\n    such as masks, bounding boxes, and keypoints. This class ensures that when a transform is applied to an image,\n    all associated entities are transformed accordingly to maintain consistency between the image and its annotations.\n\n    Methods:\n        apply(img: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to the image.\n\n            img: Input image of shape (H, W, C) or (H, W) for grayscale.\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed image of the same shape as input.\n\n        apply_to_images(images: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple images.\n\n            images: Input images of shape (N, H, W, C) or (N, H, W) for grayscale.\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed images in the same format as input.\n\n        apply_to_mask(mask: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a mask.\n\n            mask: Input mask of shape (H, W), (H, W, C) for multi-channel masks\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed mask in the same format as input.\n\n        apply_to_masks(masks: np.ndarray, **params: Any) -> np.ndarray | list[np.ndarray]:\n            Apply the transform to multiple masks.\n\n            masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n            **params: Additional parameters specific to the transform.\n            Returns Transformed masks in the same format as input.\n\n        apply_to_keypoints(keypoints: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to keypoints.\n\n            keypoints: Array of shape (N, 2+) where N is the number of keypoints.\n                **params: Additional parameters specific to the transform.\n            Returns Transformed keypoints array of shape (N, 2+).\n\n        apply_to_bboxes(bboxes: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to bounding boxes.\n\n            bboxes: Array of shape (N, 4+) where N is the number of bounding boxes,\n                    and each row is in the format [x_min, y_min, x_max, y_max].\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed bounding boxes array of shape (N, 4+).\n\n        apply_to_volume(volume: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a volume.\n\n            volume: Input volume of shape (D, H, W) or (D, H, W, C).\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed volume of the same shape as input.\n\n        apply_to_volumes(volumes: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple volumes.\n\n            volumes: Input volumes of shape (N, D, H, W) or (N, D, H, W, C).\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed volumes in the same format as input.\n\n        apply_to_mask3d(mask: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to a 3D mask.\n\n            mask: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n            **params: Additional parameters specific to 
the transform.\n\n            Returns Transformed 3D mask in the same format as input.\n\n        apply_to_masks3d(masks: np.ndarray, **params: Any) -> np.ndarray:\n            Apply the transform to multiple 3D masks.\n\n            masks: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n            **params: Additional parameters specific to the transform.\n\n            Returns Transformed 3D masks in the same format as input.\n\n    Note:\n        - All `apply_*` methods should maintain the input shape and format of the data.\n        - When applying transforms to masks, ensure that discrete values (e.g., class labels) are preserved.\n        - For keypoints and bounding boxes, the transformation should maintain their relative positions\n            with respect to the transformed image.\n        - The difference between `apply_to_mask` and `apply_to_masks` is mainly in how they handle 3D arrays:\n            `apply_to_mask` treats a 3D array as a multi-channel mask, while `apply_to_masks` treats it as\n            multiple single-channel masks.\n\n    \"\"\"\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"mask\": self.apply_to_mask,\n            \"masks\": self.apply_to_masks,\n            \"mask3d\": self.apply_to_mask3d,\n            \"masks3d\": self.apply_to_masks3d,\n            \"bboxes\": self.apply_to_bboxes,\n            \"keypoints\": self.apply_to_keypoints,\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n        }\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        msg = f\"Method apply_to_keypoints is not implemented in class {self.__class__.__name__}\"\n        raise NotImplementedError(msg)\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        raise NotImplementedError(f\"BBoxes not implemented for {self.__class__.__name__}\")\n\n    def apply_to_mask(self, mask: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        return self.apply(mask, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=False)\n    def apply_to_masks(self, masks: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to multiple masks.\n\n        Args:\n            masks: Array of shape (N, H, W) or (N, H, W, C) where N is number of masks\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Array of transformed masks with same shape as input\n        \"\"\"\n        return self.apply(masks, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=False, has_depth_dim=True)\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D mask.\n\n        Args:\n            mask3d: Input 3D mask of shape (D, H, W) or (D, H, W, C)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed 3D mask in the same format as input\n        \"\"\"\n        return self.apply_to_mask(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, 
**params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D masks.\n\n        Args:\n            masks3d: Input 3D masks of shape (N, D, H, W) or (N, D, H, W, C)\n            *args: Additional positional arguments\n            **params: Additional parameters specific to the transform\n\n        Returns:\n            Transformed 3D masks in the same format as input\n        \"\"\"\n        return self.apply_to_mask(masks3d, *args, **params)\n
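
A minimal sketch of a custom DualTransform (a vertical flip), assuming the internal conventions used by the processors: bboxes arrive as normalized [x_min, y_min, x_max, y_max, ...] rows, keypoints as absolute (x, y, ...) pixel coordinates, and the image shape is available through the shape entry of params. apply_to_mask is inherited and reuses apply, so masks are flipped the same way as images:

Python
import numpy as np\nfrom albumentations.core.transforms_interface import DualTransform\n\n\nclass SimpleVerticalFlip(DualTransform):\n    \"\"\"Illustrative sketch only, not the built-in VerticalFlip transform.\"\"\"\n\n    def apply(self, img, **params):\n        return np.ascontiguousarray(img[::-1, ...])\n\n    def apply_to_bboxes(self, bboxes, **params):\n        # Assumes normalized [x_min, y_min, x_max, y_max, ...] rows.\n        flipped = bboxes.copy()\n        flipped[:, 1] = 1.0 - bboxes[:, 3]\n        flipped[:, 3] = 1.0 - bboxes[:, 1]\n        return flipped\n\n    def apply_to_keypoints(self, keypoints, **params):\n        # Assumes absolute (x, y, ...) pixel coordinates; angle handling is omitted.\n        height = params[\"shape\"][0]\n        flipped = keypoints.copy()\n        flipped[:, 1] = (height - 1) - keypoints[:, 1]\n        return flipped\n\n    def get_transform_init_args_names(self):\n        return ()\n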
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.ImageOnlyTransform","title":"class ImageOnlyTransform [view source on GitHub]","text":"

Transform applied to image only.

Source code in albumentations/core/transforms_interface.py Python
class ImageOnlyTransform(BasicTransform):\n    \"\"\"Transform applied to image only.\"\"\"\n\n    _targets = (Targets.IMAGE, Targets.VOLUME)\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n        }\n
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.NoOp","title":"class NoOp [view source on GitHub]","text":"

Identity transform (does nothing).

Targets

image, mask, bboxes, keypoints, volume, mask3d

Source code in albumentations/core/transforms_interface.py Python
class NoOp(DualTransform):\n    \"\"\"Identity transform (does nothing).\n\n    Targets:\n        image, mask, bboxes, keypoints, volume, mask3d\n    \"\"\"\n\n    _targets = ALL_TARGETS\n\n    def apply_to_keypoints(self, keypoints: np.ndarray, **params: Any) -> np.ndarray:\n        return keypoints\n\n    def apply_to_bboxes(self, bboxes: np.ndarray, **params: Any) -> np.ndarray:\n        return bboxes\n\n    def apply(self, img: np.ndarray, **params: Any) -> np.ndarray:\n        return img\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> np.ndarray:\n        return mask\n\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> np.ndarray:\n        return volume\n\n    def apply_to_volumes(self, volumes: np.ndarray, **params: Any) -> np.ndarray:\n        return volumes\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> np.ndarray:\n        return mask3d\n\n    def apply_to_masks3d(self, masks3d: np.ndarray, **params: Any) -> np.ndarray:\n        return masks3d\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
"},{"location":"api_reference/core/transforms_interface/#albumentations.core.transforms_interface.Transform3D","title":"class Transform3D [view source on GitHub]","text":"

Base class for all 3D transforms.

Transform3D inherits from DualTransform because 3D transforms can be applied to both volumes and masks, similar to how 2D DualTransforms work with images and masks.

Targets

  • volume: 3D numpy array of shape (D, H, W) or (D, H, W, C)
  • volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C)
  • mask: 3D numpy array of shape (D, H, W)
  • masks: Batch of 3D arrays of shape (N, D, H, W)
  • keypoints: 3D numpy array of shape (N, 3)

Source code in albumentations/core/transforms_interface.py Python
class Transform3D(DualTransform):\n    \"\"\"Base class for all 3D transforms.\n\n    Transform3D inherits from DualTransform because 3D transforms can be applied to both\n    volumes and masks, similar to how 2D DualTransforms work with images and masks.\n\n    Targets:\n        volume: 3D numpy array of shape (D, H, W) or (D, H, W, C)\n        volumes: Batch of 3D arrays of shape (N, D, H, W) or (N, D, H, W, C)\n        mask: 3D numpy array of shape (D, H, W)\n        masks: Batch of 3D arrays of shape (N, D, H, W)\n        keypoints: 3D numpy array of shape (N, 3)\n    \"\"\"\n\n    def apply_to_volume(self, volume: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D volume.\"\"\"\n        raise NotImplementedError\n\n    @batch_transform(\"spatial\", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)\n    def apply_to_volumes(self, volumes: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D volumes.\"\"\"\n        return self.apply_to_volume(volumes, *args, **params)\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to single 3D mask.\"\"\"\n        return self.apply_to_volume(mask3d, *args, **params)\n\n    @batch_transform(\"spatial\", keep_depth_dim=True, has_batch_dim=True, has_depth_dim=True)\n    def apply_to_masks3d(self, masks3d: np.ndarray, *args: Any, **params: Any) -> np.ndarray:\n        \"\"\"Apply transform to batch of 3D masks.\"\"\"\n        return self.apply_to_mask3d(masks3d, *args, **params)\n\n    @property\n    def targets(self) -> dict[str, Callable[..., Any]]:\n        \"\"\"Define valid targets for 3D transforms.\"\"\"\n        return {\n            \"volume\": self.apply_to_volume,\n            \"volumes\": self.apply_to_volumes,\n            \"mask3d\": self.apply_to_mask3d,\n            \"masks3d\": self.apply_to_masks3d,\n            \"keypoints\": self.apply_to_keypoints,\n        }\n
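
A minimal sketch of a custom Transform3D that flips a volume along the depth axis. Per the base-class wiring above, mask3d and the batched variants reuse apply_to_volume, while keypoint support would require an explicit apply_to_keypoints override (omitted here):

Python
import numpy as np\nfrom albumentations.core.transforms_interface import Transform3D\n\n\nclass DepthFlip(Transform3D):\n    \"\"\"Illustrative sketch only: flips (D, H, W[, C]) arrays along the depth axis.\"\"\"\n\n    def apply_to_volume(self, volume, **params):\n        return np.ascontiguousarray(volume[::-1, ...])\n\n    def get_transform_init_args_names(self):\n        return ()\n\n\nflip = DepthFlip(p=1.0)\nvolume = np.random.rand(16, 64, 64).astype(np.float32)\nflipped_volume = flip(volume=volume)[\"volume\"]\n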
"},{"location":"api_reference/pytorch/","title":"Index","text":"
  • Transforms (albumentations.pytorch.transforms)
"},{"location":"api_reference/pytorch/transforms/","title":"Transforms (pytorch.transforms)","text":""},{"location":"api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensor3D","title":"class ToTensor3D (p=1.0, always_apply=None) [view source on GitHub]","text":"

Convert 3D volumes and masks to PyTorch tensors.

This transform is designed for 3D medical imaging data. It converts numpy arrays to PyTorch tensors and ensures consistent channel positioning.

For all inputs (volumes and masks):

  • Input: (D, H, W, C) or (D, H, W) - depth, height, width, [channels]
  • Output: (C, D, H, W) - channels-first format for PyTorch. For single-channel input, a C=1 dimension is added.

Note

This transform always moves channels to first position as this is the standard PyTorch format. For masks that need to stay in DHWC format, use a different transform or handle the transposition after this transform.

Parameters:

p (float)

Probability of applying the transform. Default: 1.0

Source code in albumentations/pytorch/transforms.py Python
class ToTensor3D(BasicTransform):\n    \"\"\"Convert 3D volumes and masks to PyTorch tensors.\n\n    This transform is designed for 3D medical imaging data. It converts numpy arrays\n    to PyTorch tensors and ensures consistent channel positioning.\n\n    For all inputs (volumes and masks):\n        - Input:  (D, H, W, C) or (D, H, W) - depth, height, width, [channels]\n        - Output: (C, D, H, W) - channels first format for PyTorch\n                 For single-channel input, adds C=1 dimension\n\n    Note:\n        This transform always moves channels to first position as this is\n        the standard PyTorch format. For masks that need to stay in DHWC format,\n        use a different transform or handle the transposition after this transform.\n\n    Args:\n        p (float): Probability of applying the transform. Default: 1.0\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    def __init__(self, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p, always_apply=always_apply)\n\n    @property\n    def targets(self) -> dict[str, Any]:\n        return {\n            \"volume\": self.apply_to_volume,\n            \"mask3d\": self.apply_to_mask3d,\n        }\n\n    def apply_to_volume(self, volume: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert 3D volume to channels-first tensor.\"\"\"\n        if volume.ndim == NUM_VOLUME_DIMENSIONS:  # D,H,W,C\n            return torch.from_numpy(volume.transpose(3, 0, 1, 2))\n        if volume.ndim == NUM_VOLUME_DIMENSIONS - 1:  # D,H,W\n            return torch.from_numpy(volume[np.newaxis, ...])\n        raise ValueError(f\"Expected 3D or 4D array (D,H,W) or (D,H,W,C), got {volume.ndim}D array\")\n\n    def apply_to_mask3d(self, mask3d: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert 3D mask to channels-first tensor.\"\"\"\n        return self.apply_to_volume(mask3d, **params)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return ()\n
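
A small usage sketch of the channel handling described above (the transform is applied directly; expected output shapes are shown in the comments):

Python
import numpy as np\nfrom albumentations.pytorch.transforms import ToTensor3D\n\nto_tensor = ToTensor3D(p=1.0)\n\nvolume = np.random.rand(64, 128, 128, 3).astype(np.float32)  # (D, H, W, C)\nmask3d = np.zeros((64, 128, 128), dtype=np.uint8)  # (D, H, W)\n\nout = to_tensor(volume=volume, mask3d=mask3d)\n# out[\"volume\"].shape -> torch.Size([3, 64, 128, 128])\n# out[\"mask3d\"].shape -> torch.Size([1, 64, 128, 128])\n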
"},{"location":"api_reference/pytorch/transforms/#albumentations.pytorch.transforms.ToTensorV2","title":"class ToTensorV2 (transpose_mask=False, p=1.0, always_apply=None) [view source on GitHub]","text":"

Converts images/masks to PyTorch Tensors, inheriting from BasicTransform. For images:

  • If input is in HWC format, converts to PyTorch CHW format
  • If input is in HW format, converts to PyTorch 1HW format (adds channel dimension)

Attributes:

transpose_mask (bool)

If True, transposes 3D input mask dimensions from [height, width, num_channels] to [num_channels, height, width].

p (float)

Probability of applying the transform. Default: 1.0.

Source code in albumentations/pytorch/transforms.py Python
class ToTensorV2(BasicTransform):\n    \"\"\"Converts images/masks to PyTorch Tensors, inheriting from BasicTransform.\n    For images:\n        - If input is in `HWC` format, converts to PyTorch `CHW` format\n        - If input is in `HW` format, converts to PyTorch `1HW` format (adds channel dimension)\n\n    Attributes:\n        transpose_mask (bool): If True, transposes 3D input mask dimensions from `[height, width, num_channels]` to\n            `[num_channels, height, width]`.\n        p (float): Probability of applying the transform. Default: 1.0.\n    \"\"\"\n\n    _targets = (Targets.IMAGE, Targets.MASK)\n\n    def __init__(self, transpose_mask: bool = False, p: float = 1.0, always_apply: bool | None = None):\n        super().__init__(p=p, always_apply=always_apply)\n        self.transpose_mask = transpose_mask\n\n    @property\n    def targets(self) -> dict[str, Any]:\n        return {\n            \"image\": self.apply,\n            \"images\": self.apply_to_images,\n            \"mask\": self.apply_to_mask,\n            \"masks\": self.apply_to_masks,\n        }\n\n    def apply(self, img: np.ndarray, **params: Any) -> torch.Tensor:\n        if img.ndim not in {MONO_CHANNEL_DIMENSIONS, NUM_MULTI_CHANNEL_DIMENSIONS}:\n            msg = \"Albumentations only supports images in HW or HWC format\"\n            raise ValueError(msg)\n\n        if img.ndim == MONO_CHANNEL_DIMENSIONS:\n            img = np.expand_dims(img, 2)\n\n        return torch.from_numpy(img.transpose(2, 0, 1))\n\n    def apply_to_mask(self, mask: np.ndarray, **params: Any) -> torch.Tensor:\n        if self.transpose_mask and mask.ndim == NUM_MULTI_CHANNEL_DIMENSIONS:\n            mask = mask.transpose(2, 0, 1)\n        return torch.from_numpy(mask)\n\n    @overload\n    def apply_to_masks(self, masks: list[np.ndarray], **params: Any) -> list[torch.Tensor]: ...\n\n    @overload\n    def apply_to_masks(self, masks: np.ndarray, **params: Any) -> torch.Tensor: ...\n\n    def apply_to_masks(self, masks: np.ndarray | list[np.ndarray], **params: Any) -> torch.Tensor | list[torch.Tensor]:\n        \"\"\"Convert numpy array or list of numpy array masks to torch tensor(s).\n\n        Args:\n            masks: Numpy array of shape (N, H, W) or (N, H, W, C),\n                or a list of numpy arrays with shape (H, W) or (H, W, C).\n            params: Additional parameters.\n\n        Returns:\n            If transpose_mask is True and input is (N, H, W, C), returns tensor of shape (N, C, H, W).\n            If transpose_mask is True and input is (H, W, C), returns a list of tensors with shape (C, H, W).\n            Otherwise, returns tensors with the same shape as input.\n        \"\"\"\n        if isinstance(masks, list):\n            return [self.apply_to_mask(mask, **params) for mask in masks]\n\n        if self.transpose_mask and masks.ndim == NUM_VOLUME_DIMENSIONS:  # (N, H, W, C)\n            masks = np.transpose(masks, (0, 3, 1, 2))  # -> (N, C, H, W)\n        return torch.from_numpy(masks)\n\n    def apply_to_images(self, images: np.ndarray, **params: Any) -> torch.Tensor:\n        \"\"\"Convert batch of images from (N, H, W, C) to (N, C, H, W).\"\"\"\n        if images.ndim != NUM_VOLUME_DIMENSIONS:  # N,H,W,C\n            raise ValueError(f\"Expected 4D array (N,H,W,C), got {images.ndim}D array\")\n        return torch.from_numpy(images.transpose(0, 3, 1, 2))  # -> (N,C,H,W)\n\n    def get_transform_init_args_names(self) -> tuple[str, ...]:\n        return (\"transpose_mask\",)\n
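
A typical usage sketch: ToTensorV2 is placed at the end of a Compose pipeline so that the numpy-based augmentations run first and the final outputs are PyTorch tensors:

Python
import numpy as np\nimport albumentations as A\nfrom albumentations.pytorch import ToTensorV2\n\ntransform = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.Normalize(),\n    ToTensorV2(transpose_mask=True),\n])\n\nimage = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)\nmask = np.zeros((224, 224, 5), dtype=np.uint8)  # multi-channel mask\n\nout = transform(image=image, mask=mask)\n# out[\"image\"].shape -> torch.Size([3, 224, 224])\n# out[\"mask\"].shape -> torch.Size([5, 224, 224]) because transpose_mask=True\n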
"},{"location":"autoalbument/","title":"AutoAlbument Overview","text":"

AutoAlbument is an AutoML tool that learns image augmentation policies from data using the Faster AutoAugment algorithm. It relieves the user from manually selecting augmentations and tuning their parameters. AutoAlbument provides a complete ready-to-use configuration for an augmentation pipeline.

AutoAlbument supports image classification and semantic segmentation tasks. The library requires Python 3.6 or higher.

The source code and issue tracker are available at https://github.com/albumentations-team/autoalbument

Table of contents:

  • AutoAlbument introduction and core concepts
  • Installation
  • Benchmarks and a comparison with baseline augmentation strategies
  • How to use AutoAlbument
  • How to use an AutoAlbument Docker image
  • How to use a custom classification or semantic segmentation model
  • Metrics and their meaning
  • Tuning parameters
  • Examples
  • Search algorithms
  • FAQ
"},{"location":"autoalbument/benchmarks/","title":"Benchmarks and a comparison with baseline augmentation strategies","text":"

Here is a comparison between a baseline augmentation strategy and an augmentation policy discovered by AutoAlbument for different classification and semantic segmentation tasks. You can read more about these benchmarks in the autoalbument-benchmarks repository.

"},{"location":"autoalbument/benchmarks/#classification","title":"Classification","text":"Dataset Baseline Top-1 Accuracy AutoAlbument Top-1 Accuracy CIFAR10 91.79 96.02 SVHN 98.31 98.48 ImageNet 73.27 75.17"},{"location":"autoalbument/benchmarks/#semantic-segmentation","title":"Semantic segmentation","text":"Dataset Baseline mIOU AutoAlbument mIOU Pascal VOC 73.34 75.55 Cityscapes 79.47 79.92"},{"location":"autoalbument/custom_model/","title":"How to use a custom classification or semantic segmentation model","text":"

By default AutoAlbument uses pytorch-image-models for classification and segmentation_models.pytorch for semantic segmentation. You can use any model from these packages by providing an appropriate model name.

However, you can also use a custom model with AutoAlbument. To do so, you need to define a Discriminator model. This Discriminator model should have two outputs.

  • The first output should provide a prediction for a classification or semantic segmentation task. For classification, it should output a tensor with a shape [batch_size, num_classes] with logits. For semantic segmentation, it should output a tensor with the shape [batch_size, num_classes, height, width] with logits.

  • The second (auxiliary) output should return a tensor with the shape [batch_size] that contains logits for the Discriminator's predictions (whether the Discriminator thinks that an image was augmented or not).

To create such a model, you need to subclass the autoalbument.faster_autoaugment.models.BaseDiscriminator class and implement the forward method. This method should take a batch of images, that is, a tensor with the shape [batch_size, num_channels, height, width]. It should return a tuple that contains tensors from the two outputs described above.
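
A rough sketch of such a model for a classification task, assuming BaseDiscriminator can be initialized without arguments (check the linked models.py for the exact signature); the backbone and layer sizes are arbitrary and only illustrate the two required outputs:

Python
import torch\nfrom torch import nn\n\nfrom autoalbument.faster_autoaugment.models import BaseDiscriminator\n\n\nclass MyClassificationModel(BaseDiscriminator):\n    def __init__(self, num_classes=10):\n        super().__init__()  # assumed to take no required arguments\n        # Arbitrary tiny backbone, used only for illustration.\n        self.backbone = nn.Sequential(\n            nn.Conv2d(3, 32, kernel_size=3, padding=1),\n            nn.ReLU(),\n            nn.AdaptiveAvgPool2d(1),\n            nn.Flatten(),\n        )\n        self.classifier = nn.Linear(32, num_classes)\n        self.discriminator = nn.Linear(32, 1)\n\n    def forward(self, x):\n        features = self.backbone(x)\n        # First output: [batch_size, num_classes] logits for the main task.\n        # Second output: [batch_size] logits telling whether the image was augmented.\n        return self.classifier(features), self.discriminator(features).view(-1)\n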

As an example, take a look at how the default classification and semantic segmentation models are defined in AutoAlbument - https://github.com/albumentations-team/autoalbument/blob/master/autoalbument/faster_autoaugment/models.py - or explore an example of a custom model for the CIFAR10 dataset.

Next, you need to specify this custom model in config.yaml, an AutoAlbument config file. AutoAlbument uses the instantiate function from Hydra to instantiate an object. You need to set the _target_ config variable in the classification_model or semantic_segmentation_model section, depending on the task. In this config variable, you need to provide a path to the class with the model. This path should be located inside PYTHONPATH so that Hydra can correctly import it. The simplest way is to define your model in a file such as model.py and place this file in the same directory as dataset.py and search.yaml, because this directory is automatically added to PYTHONPATH. You can then define _target_ as, for example, _target_: model.MyClassificationModel.

Take a look at the CIFAR10 example config that uses a custom model defined in model.py as a starting point for defining a custom model.

"},{"location":"autoalbument/docker/","title":"How to use an AutoAlbument Docker image","text":"

You can run AutoAlbument from a Docker image. The ghcr.io/albumentations-team/autoalbument:latest Docker image contains the latest release version of AutoAlbument.

You can also use an image that contains a specific version of AutoAlbument. In that case, you need to use the AutoAlbument version as a tag for a Docker image, e.g., the ghcr.io/albumentations-team/autoalbument:0.3.0 image contains AutoAlbument 0.3.0.

The latest AutoAlbument image is based on the pytorch/pytorch:1.7.0-cuda11.0-cudnn8-runtime image.

When you run a Docker container with AutoAlbument, you need to mount a config directory (a directory containing dataset.py and search.yaml files) and other required directories, such as a directory that contains training data.

Here is an example command that runs a Docker container that will search for CIFAR10 augmentation policies.

docker run -it --rm --gpus all --ipc=host -v ~/projects/autoalbument/examples/cifar10:/config -v ~/data:/home/autoalbument/data -u $(id -u ${USER}):$(id -g ${USER}) ghcr.io/albumentations-team/autoalbument:latest

Let's take a look at the arguments:

  • -it. Tells Docker that you are running an interactive process. Read more in the Docker documentation.
  • --rm. Automatically clean up a container when it exits. Read more in the Docker documentation.
  • --gpus all. Specify GPUs to use. Read more in the Docker documentation.
  • --ipc=host. Increase shared memory size for PyTorch DataLoader. Read more in the PyTorch documentation.
  • -v ~/projects/autoalbument/examples/cifar10:/config. Mounts the ~/projects/autoalbument/examples/cifar10 directory from the host into the /config directory inside the container. This example assumes that you have the AutoAlbument repository in the ~/projects/autoalbument/ directory. Generally speaking, you need to mount a directory containing dataset.py and search.yaml into the /config directory in a container.
  • -v ~/data:/home/autoalbument/data. Mounts the directory ~/data that contains the CIFAR10 dataset into the /home/autoalbument/data directory. You can mount a host directory with a dataset into any container directory, but you need to specify config parameters accordingly. In this example, we mount the directory into /home/autoalbument/data because we set this directory (~/data/cifar10) in the config as a root directory for the dataset. Note that Docker doesn't support tilde expansion for the HOME directory, so we spell out the HOME directory explicitly as /home/autoalbument because autoalbument is the default user inside the container.
  • -u $(id -u ${USER}):$(id -g ${USER}). We use this option to tell Docker to run code inside the container with the host's user ID. We need it because AutoAlbument will produce artifacts in the config directory (such as augmentation configs and logs), and the host user should own those files (not root, for example) so that you can access them afterward.
  • ghcr.io/albumentations-team/autoalbument:latest is the Docker image's name. latest is a tag for the latest stable release. Alternatively, you can use a tag that specifies an AutoAlbument version, e.g., ghcr.io/albumentations-team/autoalbument:0.3.0.
"},{"location":"autoalbument/faq/","title":"FAQ","text":""},{"location":"autoalbument/faq/#search-takes-a-lot-of-time-how-can-i-speed-it-up","title":"Search takes a lot of time. How can I speed it up?","text":"

Instead of a full training dataset, you can use a reduced version to search for augmentation policies. For example, the authors of Faster AutoAugment used 6000 images from the 120 selected classes to find augmentation policies for ImageNet (while the full dataset for ILSVRC contains 1.2 million images and 1000 classes).

"},{"location":"autoalbument/how_to_use/","title":"How to use AutoAlbument","text":"
  1. You need to create a configuration file with AutoAlbument parameters and a Python file that implements a custom PyTorch Dataset for your data. Next, you need to pass those files to AutoAlbument.
  2. AutoAlbument will use a Generative Adversarial Network to discover augmentation policies and then create a file containing those policies.
  3. Finally, you can use Albumentations to load augmentation policies from the file and utilize them in your computer vision pipeline.
"},{"location":"autoalbument/how_to_use/#step-1-create-a-configuration-file-and-a-custom-pytorch-dataset-for-your-data","title":"Step 1. Create a configuration file and a custom PyTorch Dataset for your data","text":""},{"location":"autoalbument/how_to_use/#a-create-a-directory-with-configuration-files","title":"a. Create a directory with configuration files","text":"

Run autoalbument-create --config-dir </path/to/directory> --task <deep_learning_task> --num-classes <num_classes>, e.g. autoalbument-create --config-dir ~/experiments/autoalbument-search-cifar10 --task classification --num-classes 10.

  • A value for the --config-dir option should contain a path to the directory. AutoAlbument will create this directory and put two files into it: dataset.py and search.yaml (more on them later).
  • A value for the --task option should contain the name of a deep learning task. Supported values are classification and semantic_segmentation.
  • A value for the --num-classes option should contain the number of distinct classes in the classification or segmentation dataset.

By default, AutoAlbument creates a search.yaml file that contains only the most important configuration parameters. To explore all available parameters, you can create a config file that contains them all by passing the --generate-full-config argument, e.g. autoalbument-create --config-dir ~/experiments/autoalbument-search-cifar10 --task classification --num-classes 10 --generate-full-config

"},{"location":"autoalbument/how_to_use/#b-add-implementation-for-__len__-and-__getitem__-methods-in-datasetpy","title":"b. Add implementation for __len__ and __getitem__ methods in dataset.py","text":"

The dataset.py file created at step 1 by autoalbument-create contains stubs for implementing a PyTorch dataset (you can read more about creating custom PyTorch datasets here). You need to add implementation for __len__ and __getitem__ methods (and optionally add the initialization logic if required).

A dataset for a classification task should return an image and a class label. A dataset for a segmentation task should return an image and an associated mask.
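
For reference, here is a minimal, hedged sketch of filled-in __len__ and __getitem__ methods for a classification dataset. The SearchDataset class name and the transform(image=image) calling convention follow the stub generated by autoalbument-create (keep the structure of your generated stub if it differs); the sample list and file path are placeholders for your own data-loading logic.

Python
import cv2\nimport torch.utils.data\n\n\nclass SearchDataset(torch.utils.data.Dataset):\n    def __init__(self, transform=None):\n        # Placeholder: replace with your own list of (image_path, label) pairs.\n        self.samples = [(\"/path/to/image.jpg\", 0)]\n        self.transform = transform\n\n    def __len__(self):\n        return len(self.samples)\n\n    def __getitem__(self, index):\n        image_path, label = self.samples[index]\n        image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)\n        if self.transform is not None:\n            # AutoAlbument passes an Albumentations-style transform, so call it with named arguments.\n            image = self.transform(image=image)[\"image\"]\n        return image, label\n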

"},{"location":"autoalbument/how_to_use/#c-optional-adjust-search-parameters-in-searchyaml","title":"c. [Optional] Adjust search parameters in search.yaml","text":"

You may want to change the parameters that AutoAlbument will use to search for augmentation policies. To do this, you need to edit the search.yaml file created by autoalbument-create at step 1. Each configuration parameter contains a comment that describes the meaning of the setting. Please refer to the \"Tuning the search parameters\" section that includes a description of the most critical parameters.

search.yaml is a Hydra config file. You can use all Hydra features inside it.

"},{"location":"autoalbument/how_to_use/#step-2-use-autoalbument-to-search-for-augmentation-policies","title":"Step 2. Use AutoAlbument to search for augmentation policies.","text":"

To search for augmentation policies, run autoalbument-search --config-dir </path/to/directory>, e.g. autoalbument-search --config-dir ~/experiments/autoalbument-search-cifar10. The value of --config-dir should be the same value that was passed to autoalbument-create at step 1.

autoalbument-search will create a directory with output files (by default the path of the directory will be <config_dir>/outputs/<current_date>/<current_time>, but you can customize it in search.yaml). The policy subdirectory will contain JSON files with policies found at each search phase's epoch.

autoalbument-search is a command wrapped with the @hydra.main decorator from Hydra. You can use all Hydra features when calling this command.

AutoAlbument uses PyTorch to search for augmentation policies. You can speed up the search by using a CUDA-capable GPU.

"},{"location":"autoalbument/how_to_use/#step-3-use-albumentations-to-load-augmentation-policies-and-utilize-them-in-your-training-pipeline","title":"Step 3. Use Albumentations to load augmentation policies and utilize them in your training pipeline.","text":"

AutoAlbument produces a JSON file that contains a configuration for an augmentation pipeline. You can load that JSON file with Albumentations:

Text Only
import albumentations as A\ntransform = A.load(\"/path/to/policy.json\")\n

Then you can use the created augmentation pipeline to augment the input data.

For example, to augment an image for a classification task:

Text Only
transformed = transform(image=image)\ntransformed_image = transformed[\"image\"]\n

To augment an image and a mask for a semantic segmentation task:

Text Only
transformed = transform(image=image, mask=mask)\ntransformed_image = transformed[\"image\"]\ntransformed_mask = transformed[\"mask\"]\n

"},{"location":"autoalbument/how_to_use/#additional-resources","title":"Additional resources","text":"
  • You can read more about the most important configuration parameters for AutoAlbument in Tuning the search parameters.

  • To see examples of configuration files and custom PyTorch Datasets, please refer to Examples

  • You can read more about using Albumentations for augmentation in these articles: Image augmentation for classification and Mask augmentation for segmentation.

  • Refer to this section of the documentation to get examples of how to use Albumentations with PyTorch and TensorFlow 2.

"},{"location":"autoalbument/installation/","title":"Installation","text":"

AutoAlbument requires Python 3.6 or higher.

"},{"location":"autoalbument/installation/#pypi","title":"PyPI","text":"

To install the latest stable version from PyPI:

pip install -U autoalbument

"},{"location":"autoalbument/installation/#github","title":"GitHub","text":"

To install the latest version from GitHub:

pip install -U git+https://github.com/albumentations-team/autoalbument

"},{"location":"autoalbument/introduction/","title":"AutoAlbument introduction and core concepts","text":""},{"location":"autoalbument/introduction/#what-is-autoalbument","title":"What is AutoAlbument","text":"

AutoAlbument is a tool that automatically searches for the best augmentation policies for your data.

Under the hood, it uses the Faster AutoAugment algorithm. In short, the idea is to use a GAN-like architecture in which the Generator applies augmentations to input images, and the Discriminator must determine whether an image was augmented. This process helps to find augmentation policies that produce images similar to the original ones.

"},{"location":"autoalbument/introduction/#how-to-use-autoalbument","title":"How to use AutoAlbument","text":"

To use AutoAlbument, you need to define two things: a PyTorch Dataset for your data and configuration parameters for AutoAlbument. You can read the detailed instruction in the How to use AutoAlbument article.

Internally AutoAlbument uses PyTorch Lightning for training a GAN and Hydra for handling configuration parameters.

Here are a few things worth knowing about how AutoAlbument uses Hydra and PyTorch Lightning.

"},{"location":"autoalbument/introduction/#hydra","title":"Hydra","text":"

The main internal configuration file is located at autoalbument/cli/conf/config.yaml

Here is its content:

Text Only
defaults:\n - _version\n - task\n - policy_model: default\n - classification_model: default\n - semantic_segmentation_model: default\n - data: default\n - searcher: default\n - trainer: default\n - optim: default\n - callbacks: default\n - logger: default\n - hydra: default\n - seed\n - search\n

Basically, it includes a bunch of config files with default values. Those config files are split into sets of closely related parameters such as model parameters or optimizer parameters. All default config files are located in their respective directories inside autoalbument/cli/conf

The main config file also includes the search.yaml file, which you will use for overriding default parameters for your specific dataset and task (you can read more about creating the search.yaml file with autoalbument-create in How to use AutoAlbument)

To allow great flexibility, AutoAlbument relies heavily on the instantiate function from Hydra. This function allows you to define a path to a Python class in a YAML config (using the _target_ parameter) along with arguments for that class, and Hydra will create an instance of the class with the provided arguments.

As a practical example, if a config contains a definition like this:

Text Only
_target_: autoalbument.faster_autoaugment.models.ClassificationModel\nnum_classes: 10\narchitecture: resnet18\npretrained: False\n

AutoAlbument will translate it approximately to the following call:

Text Only
from autoalbument.faster_autoaugment.models import ClassificationModel\n\nmodel = ClassificationModel(num_classes=10, architecture='resnet18', pretrained=False)\n

By relying on this feature, AutoAlbument lets you customize its behavior without changing the library's internal code.

"},{"location":"autoalbument/introduction/#pytorch-lightning","title":"PyTorch Lightning","text":"

AutoAlbument relies on PyTorch Lightning to train a GAN. In AutoAlbument configs, you can configure PyTorch Lightning by passing the appropriate arguments to Trainer through the trainer config or defining a list of Callbacks through the callbacks config.

"},{"location":"autoalbument/metrics/","title":"Metrics and their meaning","text":"

During the search phase, AutoAlbument outputs four metrics: loss, d_loss, a_loss, and Average Parameter Change (at the end of an epoch).

"},{"location":"autoalbument/metrics/#a_loss","title":"a_loss","text":"

a_loss is a loss for the policy network (or Generator in terms of GAN), which applies augmentations to input images.

"},{"location":"autoalbument/metrics/#d_loss","title":"d_loss","text":"

d_loss is a loss for the Discriminator, the network that tries to guess whether the input image is an augmented or non-augmented one.

"},{"location":"autoalbument/metrics/#loss","title":"loss","text":"

loss is a task-specific loss (CrossEntropyLoss for classification, BCEWithLogitsLoss for semantic segmentation) that acts as a regularizer and prevents the policy network from applying augmentations that would make an object of class A look like an object of class B.

"},{"location":"autoalbument/metrics/#average-parameter-change","title":"Average Parameter Change","text":"

Average Parameter Change is the difference between the magnitudes of the augmentation parameters multiplied by their probabilities at the end of an epoch and the same quantities at the beginning of the epoch. The metric is calculated using the following formula:

  • m' and m are magnitude values for the i-th augmentation at the end and the beginning of the epoch, respectively.
  • p' and p are probability values for the i-th augmentation at the end and the beginning of the epoch, respectively.
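
Based on the description and the symbol definitions above, the formula is presumably of the form shown below; the averaging over the N augmentation operations is an assumption implied by the metric's name, so treat this as a reconstruction rather than the exact expression from the original page:

$$\text{Average Parameter Change} = \frac{1}{N} \sum_{i=1}^{N} \left| m'_i \, p'_i - m_i \, p_i \right|$$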

The intuition behind this metric is that at the beginning, augmentation parameters are initialized at random, so they are not optimal and prone to change heavily at each epoch. After some time, these parameters should begin to converge and change less at each epoch.

"},{"location":"autoalbument/metrics/#examples-for-metric-values","title":"Examples for metric values","text":"

Below are TensorBoard logs for AutoAlbument on different datasets. The search was performed using AutoAlbument configs from the examples directory.

  • CIFAR10
  • SVHN
  • ImageNet
  • Pascal VOC
  • Cityscapes

As you can see in all these charts, loss decreases slightly at each epoch, while a_loss and d_loss may either decrease or increase. Average Parameter Change is usually large during the first epochs, but then it starts to decrease. As a rule of thumb, to decide whether you should stop the AutoAlbument search and use the resulting policy, check that Average Parameter Change has stopped decreasing and started to oscillate, wait for a few more epochs, and use the policy found at that epoch.

In autoalbument-benchmarks, we use the AutoAlbument policies produced by the last epoch in these charts.

"},{"location":"autoalbument/search_algorithms/","title":"Search algorithms","text":"

AutoAlbument uses the following algorithms to search for augmentation policies.

"},{"location":"autoalbument/search_algorithms/#faster-autoaugment","title":"Faster AutoAugment","text":"

\"Faster AutoAugment: Learning Augmentation Strategies using Backpropagation\" by Ryuichiro Hataya, Jan Zdenek, Kazuki Yoshizoe, and Hideki Nakayama. Paper | Original implementation

"},{"location":"autoalbument/tuning_parameters/","title":"Tuning the search parameters","text":"

The search.yaml file contains parameters for the search of augmentation policies. Here is an example search.yaml for image classification on the CIFAR-10 dataset, and here is an example search.yaml for semantic segmentation on the Pascal VOC dataset.

"},{"location":"autoalbument/tuning_parameters/#task-specific-model","title":"Task-specific model","text":"

A task-specific model is a model that classifies images for a classification task or outputs masks for a semantic segmentation task. Settings for a task-specific model are defined by either classification_model or semantic_segmentation_model depending on a selected task. Ideally, you should select the same model (the same architecture and the same pretrained weights) that you will use in an actual task. AutoAlbument uses models from PyTorch Image Models and Segmentation models packages for classification and semantic segmentation respectively.

"},{"location":"autoalbument/tuning_parameters/#base-pytorch-parameters","title":"Base PyTorch parameters.","text":"

You may want to adjust the following parameters for a PyTorch pipeline:

  • data.dataloader parameters such as batch_size and num_workers
  • Number of epochs to search for best augmentation policies in optim.epochs.
  • Learning rate for optimizers in optim.main.lr and optim.policy.lr.
"},{"location":"autoalbument/tuning_parameters/#parameters-for-the-augmentations-search","title":"Parameters for the augmentations search.","text":"

Those parameters are defined in policy_model. You may want to tune the following ones:

  • num_sub_policies - the number of distinct augmentation sub-policies. In each iteration, a random sub-policy is selected and applied to the input data. A larger number of sub-policies produces a more diverse set of augmentations. On the other hand, the more sub-policies you have, the more time and data you need to tune them correctly.

  • num_chunks controls the balance between speed and diversity of augmentations during the search phase. Each batch is split into num_chunks chunks, and a random sub-policy is applied to each chunk separately. Larger values of num_chunks help to learn augmentation policies better but simultaneously increase the search time. The authors of Faster AutoAugment used values of num_chunks such that each chunk consisted of 8 to 16 images.

  • operation_count - the number of augmentation operations that will be applied to each input data instance. For example, operation_count: 1 means that only one operation will be applied to an input image/mask, and operation_count: 4 means that four sequential operations will be applied to each input image/mask. A larger number of operations produces a more diverse set of augmentations but simultaneously increases the search time.

"},{"location":"autoalbument/tuning_parameters/#preprocessing-transforms","title":"Preprocessing transforms","text":"

If your images have different sizes or you want to train a model on image patches, you can define preprocessing transforms (such as Resizing, Cropping, and Padding) in data.preprocessing. Those transforms will always be applied to all input data, and the found augmentation policies will also contain them.

Note that it is crucial for the Policy Model (the model that searches for augmentation parameters) to receive images of the same size that will be used during the training of the actual model. For some augmentations, parameters depend on the input data's height and width (for example, hole sizes for the Cutout augmentation).

"},{"location":"autoalbument/examples/cifar10/","title":"Image classification on the CIFAR10 dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/cifar10

"},{"location":"autoalbument/examples/cifar10/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/cifar10/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/cifar10/#modelpy","title":"model.py","text":"Python"},{"location":"autoalbument/examples/cityscapes/","title":"Semantic segmentation on Cityscapes dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/cityscapes

"},{"location":"autoalbument/examples/cityscapes/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/cityscapes/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/imagenet/","title":"Image classification on the ImageNet dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/imagenet

"},{"location":"autoalbument/examples/imagenet/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/imagenet/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/list/","title":"List of examples","text":"
  • Image classification on the CIFAR10 dataset.
  • Image classification on the SVHN dataset.
  • Image classification on the ImageNet dataset.
  • Semantic segmentation on the Pascal VOC dataset.
  • Semantic segmentation on the Cityscapes dataset.

To run the search with an example config:

Bash
autoalbument-search --config-dir </path/to/directory_with_dataset.py_and_search.yaml>\n
"},{"location":"autoalbument/examples/pascal_voc/","title":"Semantic segmentation on the Pascal VOC dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/pascal_voc

"},{"location":"autoalbument/examples/pascal_voc/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/pascal_voc/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/svhn/","title":"Image classification on the SVHN dataset","text":"

The following files are also available on GitHub - https://github.com/albumentations-team/autoalbument/tree/master/examples/svhn

"},{"location":"autoalbument/examples/svhn/#datasetpy","title":"dataset.py","text":"Python"},{"location":"autoalbument/examples/svhn/#searchyaml","title":"search.yaml","text":"YAML"},{"location":"autoalbument/examples/svhn/#modelpy","title":"model.py","text":"Python"},{"location":"contributing/coding_guidelines/","title":"Coding Guidelines","text":"

This document outlines the coding standards and best practices for contributing to Albumentations.

"},{"location":"contributing/coding_guidelines/#important-note-about-guidelines","title":"Important Note About Guidelines","text":"

These guidelines represent our current best practices, developed through experience maintaining and expanding the Albumentations codebase. While some existing code may not strictly follow these standards (due to historical reasons), we are gradually refactoring the codebase to align with these guidelines.

For new contributions:

  • All new code must follow these guidelines
  • All modifications to existing code should move it closer to these standards
  • Pull requests that introduce patterns we're trying to move away from will not be accepted

For existing code:

  • You may encounter patterns that don't match these guidelines (e.g., transforms with \"Random\" prefix or Union types for parameters)
  • These are considered technical debt that we're working to address
  • When modifying existing code, take the opportunity to align it with current standards where possible
"},{"location":"contributing/coding_guidelines/#code-style-and-formatting","title":"Code Style and Formatting","text":""},{"location":"contributing/coding_guidelines/#pre-commit-hooks","title":"Pre-commit Hooks","text":"

We use pre-commit hooks to maintain consistent code quality. These hooks automatically check and format your code before each commit.

  • Install pre-commit if you haven't already:
Bash
pip install pre-commit\npre-commit install\n
  • The hooks will run automatically on git commit. To run manually:
Bash
pre-commit run --files $(find albumentations -type f)\n
"},{"location":"contributing/coding_guidelines/#python-version-and-type-hints","title":"Python Version and Type Hints","text":"
  • Use Python 3.9+ features and syntax
  • Always include type hints using Python 3.10+ typing syntax:
Python
# Correct\ndef transform(self, value: float, range: tuple[float, float]) -> float:\n\n# Incorrect - don't use capital-case types\ndef transform(self, value: float, range: Tuple[float, float]) -> Float:\n
  • Use | instead of Union, including for optional types:
Python
# Correct\ndef process(value: int | float | None) -> str:\n\n# Incorrect\ndef process(value: Optional[Union[int, float]]) -> str:\n
"},{"location":"contributing/coding_guidelines/#naming-conventions","title":"Naming Conventions","text":""},{"location":"contributing/coding_guidelines/#transform-names","title":"Transform Names","text":"
  • Avoid adding \"Random\" prefix to new transforms
Python
# Correct\nclass Brightness(ImageOnlyTransform):\n\n# Incorrect (historical pattern)\nclass RandomBrightness(ImageOnlyTransform):\n
"},{"location":"contributing/coding_guidelines/#parameter-naming","title":"Parameter Naming","text":"
  • Use _range suffix for interval parameters:
Python
# Correct\nbrightness_range: tuple[float, float]\nshadow_intensity_range: tuple[float, float]\n\n# Incorrect\nbrightness_limit: tuple[float, float]\nshadow_intensity: tuple[float, float]\n
"},{"location":"contributing/coding_guidelines/#standard-parameter-names","title":"Standard Parameter Names","text":"

For transforms that handle gaps or boundaries, use these consistent names (a short constructor sketch follows the list):

  • border_mode: Specifies how to handle gaps, not mode or pad_mode
  • fill: Defines how to fill holes (pixel value or method), not fill_value, cval, fill_color, pad_value, pad_cval, value, color
  • fill_mask: Same as fill but for mask filling, not fill_mask_value, fill_mask_color, fill_mask_cval
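
As an illustration of these names, here is a hedged constructor sketch; the Erode transform and its default values are hypothetical and serve only to show the naming convention (a real transform would also define an InitSchema, as described later in this guide):

Python
import cv2\n\nfrom albumentations.core.transforms_interface import DualTransform\n\n\nclass Erode(DualTransform):  # hypothetical transform, used only to illustrate parameter naming\n    def __init__(\n        self,\n        border_mode: int = cv2.BORDER_CONSTANT,  # how gaps are handled, not \"mode\" or \"pad_mode\"\n        fill: float = 0,  # how holes in the image are filled, not \"fill_value\", \"cval\", or \"value\"\n        fill_mask: float = 0,  # same as fill but for masks, not \"fill_mask_value\"\n        p: float = 0.5,\n    ):\n        super().__init__(p=p)\n        self.border_mode = border_mode\n        self.fill = fill\n        self.fill_mask = fill_mask\n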
"},{"location":"contributing/coding_guidelines/#parameter-types-and-ranges","title":"Parameter Types and Ranges","text":""},{"location":"contributing/coding_guidelines/#parameter-definitions","title":"Parameter Definitions","text":"
  • Prefer range parameters over fixed values:
Python
# Correct\ndef __init__(self, brightness_range: tuple[float, float] = (-0.2, 0.2)):\n\n# Avoid\ndef __init__(self, brightness: float = 0.2):\n
"},{"location":"contributing/coding_guidelines/#avoid-union-types-for-parameters","title":"Avoid Union Types for Parameters","text":"
  • Don't use Union[float, tuple[float, float]] for parameters
  • Instead, always use ranges where sampling is needed:
Python
# Correct\nscale_range: tuple[float, float] = (0.5, 1.5)\n\n# Avoid\nscale: float | tuple[float, float] = 1.0\n
  • For fixed values, use same value for both range ends:
Python
brightness_range = (0.1, 0.1)  # Fixed brightness of 0.1\n
"},{"location":"contributing/coding_guidelines/#transform-design-principles","title":"Transform Design Principles","text":""},{"location":"contributing/coding_guidelines/#relative-parameters","title":"Relative Parameters","text":"
  • Prefer parameters that are relative to image dimensions rather than fixed pixel values:
Python
# Correct - relative to image size\ndef __init__(self, crop_size_range: tuple[float, float] = (0.1, 0.3)):\n    # crop_size will be fraction of min(height, width)\n\n# Avoid - fixed pixel values\ndef __init__(self, crop_size_range: tuple[int, int] = (32, 96)):\n    # crop_size will be fixed regardless of image size\n
"},{"location":"contributing/coding_guidelines/#data-type-consistency","title":"Data Type Consistency","text":"
  • Ensure transforms produce consistent results regardless of input data type
  • Use provided decorators to handle type conversions:
  • @uint8_io: For transforms that work with uint8 images
  • @float32_io: For transforms that work with float32 images

The decorators will:

  • Pass through images that are already in the target type without conversion
  • Convert other types as needed and convert back after processing
Python
@uint8_io  # If input is uint8 => use as is; if float32 => convert to uint8, process, convert back\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img is guaranteed to be uint8\n    # if input was float32 => result will be converted back to float32\n    # if input was uint8 => result will stay uint8\n    return cv2.blur(img, (3, 3))\n\n@float32_io  # If input is float32 => use as is; if uint8 => convert to float32, process, convert back\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img is guaranteed to be float32 in range [0, 1]\n    # if input was uint8 => result will be converted back to uint8\n    # if input was float32 => result will stay float32\n    return img * 0.5\n\n# Avoid - manual type conversion\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    original_dtype = img.dtype\n    if original_dtype != np.uint8:\n        img = (img * 255).clip(0, 255).astype(np.uint8)\n    result = cv2.blur(img, (3, 3))\n    if original_dtype != np.uint8:\n        result = result.astype(np.float32) / 255\n    return result\n
"},{"location":"contributing/coding_guidelines/#channel-flexibility","title":"Channel Flexibility","text":"
  • Support arbitrary number of channels unless specifically constrained:

Python
# Correct - works with any number of channels\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    # img shape is (H, W, C), works for any C\n    return img * self.factor\n\n# Also correct - explicitly requires RGB\ndef apply(self, img: np.ndarray, **params) -> np.ndarray:\n    if img.shape[-1] != 3:\n        raise ValueError(\"Transform requires RGB image\")\n    return rgb_to_hsv(img)  # RGB-specific processing\n

"},{"location":"contributing/coding_guidelines/#random-number-generation","title":"Random Number Generation","text":""},{"location":"contributing/coding_guidelines/#using-random-generators","title":"Using Random Generators","text":"
  • Use class-level random generators instead of direct numpy or random calls:
Python
# Correct\nvalue = self.random_generator.uniform(0, 1, size=image.shape)\nchoice = self.py_random.choice(options)\n\n# Incorrect\nvalue = np.random.uniform(0, 1, size=image.shape)\nchoice = random.choice(options)\n
  • Prefer Python's standard library random over numpy.random:
Python
# Correct - using standard library random (faster)\nvalue = self.py_random.uniform(0, 1)\nchoice = self.py_random.choice(options)\n\n# Use numpy.random only when needed\nvalue = self.random_generator.randint(0, 255, size=image.shape)\n
"},{"location":"contributing/coding_guidelines/#parameter-sampling","title":"Parameter Sampling","text":"
  • Handle all probability calculations in get_params or get_params_dependent_on_data
  • Don't perform random operations in apply_xxx or __init__ methods:
Python
def get_params(self):\n    return {\n        \"brightness\": self.random_generator.uniform(\n            self.brightness_range[0],\n            self.brightness_range[1]\n        )\n    }\n
"},{"location":"contributing/coding_guidelines/#transform-development","title":"Transform Development","text":""},{"location":"contributing/coding_guidelines/#method-definitions","title":"Method Definitions","text":"
  • Don't use default arguments in apply_xxx methods:
Python
# Correct\ndef apply_to_mask(self, mask: np.ndarray, fill_mask: int) -> np.ndarray:\n\n# Incorrect\ndef apply_to_mask(self, mask: np.ndarray, fill_mask: int = 0) -> np.ndarray:\n
"},{"location":"contributing/coding_guidelines/#parameter-generation","title":"Parameter Generation","text":""},{"location":"contributing/coding_guidelines/#using-get_params_dependent_on_data","title":"Using get_params_dependent_on_data","text":"

This method provides access to image shape and target data for parameter generation:

Python
def get_params_dependent_on_data(\n    self,\n    params: dict[str, Any],\n    data: dict[str, Any]\n) -> dict[str, Any]:\n    # Access image shape - always available\n    height, width = params[\"shape\"][:2]\n\n    # Access targets if they were passed to transform\n    image = data.get(\"image\")  # Original image\n    mask = data.get(\"mask\")    # Segmentation mask\n    bboxes = data.get(\"bboxes\")  # Bounding boxes\n    keypoints = data.get(\"keypoints\")  # Keypoint coordinates\n\n    # Example: Calculate parameters based on image size\n    crop_size = min(height, width) // 2\n    center_x = width // 2\n    center_y = height // 2\n\n    return {\n        \"crop_size\": crop_size,\n        \"center\": (center_x, center_y)\n    }\n

The method receives:

  • params: Dictionary containing image metadata, where params[\"shape\"] is always available
  • data: Dictionary containing all targets passed to the transform

Use this method when you need to:

  • Calculate parameters based on image dimensions
  • Access target data for parameter generation
  • Ensure transform parameters are appropriate for the input data
"},{"location":"contributing/coding_guidelines/#parameter-validation-with-initschema","title":"Parameter Validation with InitSchema","text":"

Each transform must include an InitSchema class that inherits from BaseTransformInitSchema. This class is responsible for:

  • Validating input parameters before __init__ execution
  • Converting parameter types if needed
  • Ensuring consistent parameter handling
Python
# Correct - full parameter validation\nclass RandomGravel(ImageOnlyTransform):\n    class InitSchema(BaseTransformInitSchema):\n        slant_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]\n        brightness_coefficient: float = Field(gt=0, le=1)\n\n    def __init__(self, slant_range: tuple[float, float], brightness_coefficient: float, p: float = 0.5):\n        super().__init__(p=p)\n        self.slant_range = slant_range\n        self.brightness_coefficient = brightness_coefficient\n
Python
# Incorrect - missing InitSchema\nclass RandomGravel(ImageOnlyTransform):\n    def __init__(self, slant_range: tuple[float, float], brightness_coefficient: float, p: float = 0.5):\n        super().__init__(p=p)\n        self.slant_range = slant_range\n        self.brightness_coefficient = brightness_coefficient\n
"},{"location":"contributing/coding_guidelines/#coordinate-systems","title":"Coordinate Systems","text":""},{"location":"contributing/coding_guidelines/#image-center-calculations","title":"Image Center Calculations","text":"

The center point calculation differs slightly between targets:

  • For images, masks, and keypoints:
Python
# Correct - using helper function\nfrom albumentations.augmentations.geometric.functional import center\ncenter_x, center_y = center(image_shape)  # Returns ((width-1)/2, (height-1)/2)\n\n# Incorrect - manual calculation might miss the -1\ncenter_x = width / 2  # Wrong!\ncenter_y = height / 2  # Wrong!\n
  • For bounding boxes:
Python
# Correct - using helper function\nfrom albumentations.augmentations.geometric.functional import center_bbox\ncenter_x, center_y = center_bbox(image_shape)  # Returns (width/2, height/2)\n\n# Incorrect - using wrong center calculation\ncenter_x, center_y = center(image_shape)  # Wrong for bboxes!\n

This small difference is crucial for pixel-perfect accuracy. Always use the appropriate helper functions:

  • center() for image, mask, and keypoint transformations
  • center_bbox() for bounding box transformations
"},{"location":"contributing/coding_guidelines/#serialization-compatibility","title":"Serialization Compatibility","text":"
  • Ensure transforms work with both tuples and lists for range parameters
  • Test serialization/deserialization with JSON and YAML formats (a minimal round-trip sketch follows this list)
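
A minimal round-trip test sketch, assuming pytest's tmp_path fixture and the public A.save, A.load, and A.to_dict helpers; the chosen transform and its parameter values are arbitrary:

Python
import albumentations as A\n\n\ndef test_serialization_roundtrip(tmp_path):\n    # A list (rather than a tuple) is used for the range parameter on purpose.\n    transform = A.Compose([A.RandomBrightnessContrast(brightness_limit=[-0.2, 0.2], p=1.0)])\n\n    for data_format in (\"json\", \"yaml\"):\n        path = str(tmp_path / f\"transform.{data_format}\")\n        A.save(transform, path, data_format=data_format)\n        restored = A.load(path, data_format=data_format)\n        # The restored pipeline should describe the same configuration as the original one.\n        assert A.to_dict(restored) == A.to_dict(transform)\n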
"},{"location":"contributing/coding_guidelines/#documentation","title":"Documentation","text":""},{"location":"contributing/coding_guidelines/#docstrings","title":"Docstrings","text":"
  • Use Google-style docstrings
  • Include type information, parameter descriptions, and examples:
Python
def transform(self, image: np.ndarray) -> np.ndarray:\n    \"\"\"Apply brightness transformation to the image.\n\n    Args:\n        image: Input image in RGB format.\n\n    Returns:\n        Transformed image.\n\n    Examples:\n        >>> transform = Brightness(brightness_range=(-0.2, 0.2))\n        >>> transformed = transform(image=image)\n    \"\"\"\n
"},{"location":"contributing/coding_guidelines/#comments","title":"Comments","text":"
  • Add comments for complex logic
  • Explain why, not what (the code shows what)
  • Keep comments up to date with code changes
"},{"location":"contributing/coding_guidelines/#updating-transform-documentation","title":"Updating Transform Documentation","text":"

When adding a new transform or modifying the targets of an existing one, you must update the transforms documentation in the README:

  1. Generate the updated documentation by running:
Bash
python -m tools.make_transforms_docs make\n
  2. This will output a formatted list of all transforms and their supported targets

  3. Update the relevant section in README.md with the new information

  4. Ensure the documentation accurately reflects which targets (image, mask, bboxes, keypoints, etc.) are supported by each transform

This helps maintain accurate and up-to-date documentation about transform capabilities.

"},{"location":"contributing/coding_guidelines/#testing","title":"Testing","text":""},{"location":"contributing/coding_guidelines/#test-coverage","title":"Test Coverage","text":"
  • Write tests for all new functionality
  • Include edge cases and error conditions
  • Ensure reproducibility with fixed random seeds
"},{"location":"contributing/coding_guidelines/#test-organization","title":"Test Organization","text":"
  • Place tests in the appropriate module under tests/
  • Follow existing test patterns and naming conventions
  • Use pytest fixtures when appropriate
"},{"location":"contributing/coding_guidelines/#code-review-guidelines","title":"Code Review Guidelines","text":"

Before submitting your PR:

  1. Run all tests
  2. Run pre-commit hooks
  3. Check type hints
  4. Update documentation if needed
  5. Ensure code follows these guidelines
"},{"location":"contributing/coding_guidelines/#getting-help","title":"Getting Help","text":"

If you have questions about these guidelines:

  1. Join our Discord community
  2. Open a GitHub issue
  3. Ask in your pull request
"},{"location":"contributing/environment_setup/","title":"Setting Up Your Development Environment","text":"

This guide will help you set up your development environment for contributing to Albumentations.

"},{"location":"contributing/environment_setup/#prerequisites","title":"Prerequisites","text":"
  • Python 3.9 or higher
  • Git
  • A GitHub account
"},{"location":"contributing/environment_setup/#step-by-step-setup","title":"Step-by-Step Setup","text":""},{"location":"contributing/environment_setup/#1-fork-and-clone-the-repository","title":"1. Fork and Clone the Repository","text":"
  1. Fork the Albumentations repository on GitHub
  2. Clone your fork locally:
Bash
git clone https://github.com/YOUR_USERNAME/albumentations.git\ncd albumentations\n
"},{"location":"contributing/environment_setup/#2-create-a-virtual-environment","title":"2. Create a Virtual Environment","text":"

Choose the appropriate commands for your operating system:

"},{"location":"contributing/environment_setup/#linux-macos","title":"Linux / macOS","text":"Bash
python3 -m venv env\nsource env/bin/activate\n
"},{"location":"contributing/environment_setup/#windows-cmdexe","title":"Windows (cmd.exe)","text":"Bash
python -m venv env\nenv\\Scripts\\activate.bat\n
"},{"location":"contributing/environment_setup/#windows-powershell","title":"Windows (PowerShell)","text":"Bash
python -m venv env\nenv\\Scripts\\activate.ps1\n
"},{"location":"contributing/environment_setup/#3-install-dependencies","title":"3. Install Dependencies","text":"
  1. Install the project in editable mode:
Bash
pip install -e .\n
  2. Install development dependencies:
Bash
pip install -r requirements-dev.txt\n
"},{"location":"contributing/environment_setup/#4-set-up-pre-commit-hooks","title":"4. Set Up Pre-commit Hooks","text":"

Pre-commit hooks help maintain code quality by automatically checking your changes before each commit.

  1. Install pre-commit:
Bash
pip install pre-commit\n
  2. Set up the hooks:
Bash
pre-commit install\n
  3. (Optional) Run hooks manually on all files:
Bash
pre-commit run --files $(find albumentations -type f)\n
"},{"location":"contributing/environment_setup/#verifying-your-setup","title":"Verifying Your Setup","text":""},{"location":"contributing/environment_setup/#run-tests","title":"Run Tests","text":"

Ensure everything is set up correctly by running the test suite:

Bash
pytest\n
"},{"location":"contributing/environment_setup/#common-issues-and-solutions","title":"Common Issues and Solutions","text":""},{"location":"contributing/environment_setup/#permission-errors","title":"Permission Errors","text":"
  • Linux/macOS: If you encounter permission errors, try using sudo for system-wide installations or consider using --user flag with pip
  • Windows: Run your terminal as administrator if you encounter permission issues
"},{"location":"contributing/environment_setup/#virtual-environment-not-activating","title":"Virtual Environment Not Activating","text":"
  • Ensure you're in the correct directory
  • Check that Python is properly installed and in your system PATH
  • Try creating the virtual environment with the full Python path
"},{"location":"contributing/environment_setup/#import-errors-after-installation","title":"Import Errors After Installation","text":"
  • Verify that you're using the correct virtual environment
  • Confirm that all dependencies were installed successfully
  • Try reinstalling the package in editable mode
"},{"location":"contributing/environment_setup/#next-steps","title":"Next Steps","text":"

After setting up your environment:

  1. Create a new branch for your work
  2. Make your changes
  3. Run tests and pre-commit hooks
  4. Submit a pull request

For more detailed information about contributing, please refer to Coding Guidelines

"},{"location":"contributing/environment_setup/#getting-help","title":"Getting Help","text":"

If you encounter any issues with the setup:

  1. Check our Discord community
  2. Open an issue on GitHub
  3. Review existing issues for similar problems and solutions
"},{"location":"examples/","title":"List of examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Using Albumentations for a semantic segmentation task
  • Using Albumentations to augment keypoints
  • Applying the same augmentation with the same parameters to multiple images, masks, bounding boxes, or keypoints
  • Weather augmentations in Albumentations
  • Example of applying XYMasking transform
  • Example of applying ChromaticAberration transform
  • Example of applying Morphological transform
  • Example of applying D4 transform
  • Example of applying RandomGridShuffle transform
  • Example of applying OverlayElements transform
  • Example of applying TextImage transform
  • Migrating from torchvision to Albumentations
  • Debugging an augmentation pipeline with ReplayCompose
  • How to save and load parameters of an augmentation pipeline
  • Showcase. Cool augmentation examples on a diverse set of images from various real-world tasks.
  • How to save and load transforms to HuggingFace Hub.
"},{"location":"examples/#examples-of-how-to-use-albumentations-with-different-deep-learning-frameworks","title":"Examples of how to use Albumentations with different deep learning frameworks","text":"
  • PyTorch
  • PyTorch and Albumentations for image classification
  • PyTorch and Albumentations for semantic segmentation
  • TensorFlow 2
  • Using Albumentations with Tensorflow
"},{"location":"external_resources/blog_posts_podcasts_talks/","title":"Blog posts, podcasts, talks, and videos about Albumentations","text":""},{"location":"external_resources/blog_posts_podcasts_talks/#blog-posts","title":"Blog posts","text":"
  • Custom Image Augmentation with Keras. Solving CIFAR-10 with Albumentations and TPU on Google Colab.
  • Road detection using segmentation models and albumentations libraries on Keras.
  • Image Data Augmentation for TensorFlow 2, Keras and PyTorch with Albumentations in Python
  • Explore image augmentations using a convenient tool
  • Image Augmentation using PyTorch and Albumentations
  • Employing the albumentation library in PyTorch workflows. Bonus: Helper for selecting appropriate values!
  • Overview of Albumentations: Open-source library for advanced image augmentations
"},{"location":"external_resources/blog_posts_podcasts_talks/#podcasts-talks-and-videos","title":"Podcasts, talks, and videos","text":"
  • PyConBY 2020: Eugene Khvedchenya - Albumentations: Fast and Flexible image augmentations
  • Albumentations Framework: a fast image augmentations library | Interview with Dr. Vladimir Iglovikov
  • Image Data Augmentation for TensorFlow 2, Keras and PyTorch with Albumentations in Python
  • Bengali.AI competition - Ch 5. Image augmentations using albumentations
  • Albumentations Tutorial for Data Augmentation
"},{"location":"external_resources/books/","title":"Books that mention Albumentations","text":"
  • Deep Learning For Dummies. John Paul Mueller, Luca Massaron. May 2019.
  • Data Science Programming All-in-One For Dummies. John Paul Mueller, Luca Massaron. January 2020.
  • PyTorch Computer Vision Cookbook. Michael Avendi. March 2020.
  • Approaching (Almost) Any Machine Learning Problem. Abhishek Thakur. June 2020.
"},{"location":"external_resources/online_courses/","title":"Online classes that cover Albumentations","text":""},{"location":"external_resources/online_courses/#udemy","title":"Udemy","text":"
  • Modern Computer Vision & Deep Learning with Python & PyTorch
  • Deep Learning for Image Segmentation with Python & Pytorch
  • Deep Learning Masterclass with TensorFlow 2 Over 20 Projects
  • Master Deep Learning for Computer Vision in TensorFlow
  • Deep Learning : Image Classification with Tensorflow in 2024
  • Deep learning with PyTorch | Medical Imaging Competitions
  • Veri Art\u0131r\u0131m\u0131: Albumentations ile Projelerle Veri Art\u0131r\u0131m\u0131
  • Mastering Advanced Representation Learning (CV)
"},{"location":"external_resources/online_courses/#coursera","title":"Coursera","text":"
  • Deep Learning with PyTorch : Image Segmentation
  • Facial Keypoint Detection with PyTorch
  • Deep Learning with PyTorch : Object Localization
  • Aerial Image Segmentation with PyTorch
"},{"location":"getting_started/augmentation_mapping/","title":"Transform Library Comparison Guide","text":"

This guide helps you find equivalent transforms between Albumentations and other popular libraries (torchvision and Kornia).

"},{"location":"getting_started/augmentation_mapping/#key-differences","title":"Key Differences","text":""},{"location":"getting_started/augmentation_mapping/#compared-to-torchvision","title":"Compared to TorchVision","text":"
  • Albumentations operates on numpy arrays (TorchVision uses PyTorch tensors)
  • More parameters for fine-tuning transformations
  • Built-in support for mask augmentation
  • Better handling of bounding boxes and keypoints (see the sketch after this list)
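
To make these points concrete, here is a small sketch of a typical Albumentations call on numpy inputs; the specific transforms and the toy data are arbitrary:

Python
import albumentations as A\nimport numpy as np\n\n# One Compose pipeline augments the image, mask, and bounding boxes consistently.\ntransform = A.Compose(\n    [A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.5)],\n    bbox_params=A.BboxParams(format=\"pascal_voc\", label_fields=[\"labels\"]),\n)\n\nimage = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)  # plain numpy array, not a tensor\nmask = np.zeros((256, 256), dtype=np.uint8)\nbboxes = [[30, 40, 120, 150]]\n\naugmented = transform(image=image, mask=mask, bboxes=bboxes, labels=[1])\nimage_aug = augmented[\"image\"]\nmask_aug = augmented[\"mask\"]\nbboxes_aug = augmented[\"bboxes\"]\n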
"},{"location":"getting_started/augmentation_mapping/#compared-to-kornia","title":"Compared to Kornia","text":"
  • CPU-based numpy operations (Kornia uses GPU tensors)
  • More comprehensive support for detection/segmentation
  • Generally better CPU performance
  • Simpler API for common tasks
"},{"location":"getting_started/augmentation_mapping/#common-transform-mappings","title":"Common Transform Mappings","text":""},{"location":"getting_started/augmentation_mapping/#basic-geometric-transforms","title":"Basic Geometric Transforms","text":"TorchVision Transform Albumentations Equivalent Notes Resize Resize / LongestMaxSize - TorchVision's Resize combines two Albumentations behaviors:\u00a0\u00a01. When given (h,w): equivalent to Albumentations Resize\u00a0\u00a02. When given single int + max_size: similar to LongestMaxSize- Albumentations allows separate interpolation method for masks- TorchVision has antialias parameter, Albumentations doesn't ScaleJitter OneOf + multiple Resize - Can be approximated in Albumentations using OneOf container with multiple Resize transforms- Example: transforms = A.OneOf([ A.Resize(height=int(target_h * scale), width=int(target_w * scale)) for scale in np.linspace(0.1, 2.0, num=20) ])- Not exactly the same as continuous random scaling, but provides similar functionality RandomShortestSize OneOf + SmallestMaxSize - Can be approximated in Albumentations using: transforms = A.OneOf([ A.SmallestMaxSize(max_size=size, max_height=max_size, max_width=max_size) for size in [480, 512, 544, 576, 608] ])- Randomly selects size for shortest side while maintaining aspect ratio- Optional max_size parameter limits longest side- TorchVision has antialias parameter, Albumentations doesn't RandomResize OneOf + Resize - TorchVision: randomly selects single size S between min_size and max_size, sets both width and height to S- No direct equivalent in Albumentations (RandomScale preserves aspect ratio)- Can be approximated using: transforms = A.OneOf([ A.Resize(size, size) for size in range(min_size, max_size + 1, step) ]) RandomCrop RandomCrop - Both perform random cropping with similar core functionality- Key differences:\u00a0\u00a01. TorchVision accepts single int for square crop, Albumentations requires both height and width\u00a0\u00a02. Padding options differ:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports padding parameter for pre-padding\u00a0\u00a0\u00a0\u00a0- Albumentations: offers pad_position parameter ('center', 'top_left', etc.)\u00a0\u00a03. Fill value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports dict mapping for different types\u00a0\u00a0\u00a0\u00a0- Albumentations: separate fill and fill_mask parameters\u00a0\u00a04. Padding modes:\u00a0\u00a0\u00a0\u00a0- TorchVision: 'constant', 'edge', 'reflect', 'symmetric'\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV border modes RandomResizedCrop RandomResizedCrop - Nearly identical functionality and parameters- Key differences:\u00a0\u00a01. TorchVision accepts single int for square output, Albumentations requires (height, width) tuple\u00a0\u00a02. Default values are the same (scale=(0.08, 1.0), ratio=(0.75, 1.3333))\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Separate mask_interpolation parameter\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomIoUCrop RandomSizedBBoxSafeCrop - Both ensure safe cropping with respect to bounding boxes- Key differences:\u00a0\u00a01. TorchVision:\u00a0\u00a0\u00a0\u00a0- Implements exact SSD paper approach\u00a0\u00a0\u00a0\u00a0- Uses IoU-based sampling strategy\u00a0\u00a0\u00a0\u00a0- Requires explicit sanitization of boxes after crop\u00a0\u00a02. 
Albumentations:\u00a0\u00a0\u00a0\u00a0- Simpler approach ensuring bbox safety\u00a0\u00a0\u00a0\u00a0- Directly specifies target size\u00a0\u00a0\u00a0\u00a0- Automatically handles bbox cleanup- For exact SSD-style cropping, might need custom implementation in Albumentations CenterCrop CenterCrop - Both crop the center part of the input- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: accepts single int for square crop or (height, width) tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: requires separate height and width parameters\u00a0\u00a02. Padding behavior:\u00a0\u00a0\u00a0\u00a0- TorchVision: always pads with 0 if image is smaller\u00a0\u00a0\u00a0\u00a0- Albumentations: optional padding with pad_if_needed\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Configurable padding mode and position\u00a0\u00a0\u00a0\u00a0- Separate fill values for image and mask\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomHorizontalFlip HorizontalFlip - Identical functionality- Both have default probability p=0.5- Only naming difference: TorchVision includes \"Random\" in name RandomVerticalFlip VerticalFlip - Identical functionality- Both have default probability p=0.5- Only naming difference: TorchVision includes \"Random\" in name Pad Pad - Similar core padding functionality- Both support:\u00a0\u00a0- Single int for all sides\u00a0\u00a0- (pad_x, pad_y) for symmetric padding\u00a0\u00a0- (left, top, right, bottom) for per-side padding- Key differences:\u00a0\u00a01. Padding modes:\u00a0\u00a0\u00a0\u00a0- TorchVision: 'constant', 'edge', 'reflect', 'symmetric'\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV border modes\u00a0\u00a02. Fill value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports dict mapping for different types\u00a0\u00a0\u00a0\u00a0- Albumentations: separate fill and fill_mask parameters\u00a0\u00a03. Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomZoomOut RandomScale + PadIfNeeded - No direct equivalent in Albumentations- Can be approximated by combining: A.Compose([ A.RandomScale(scale_limit=(0.0, 3.0), p=0.5), # scale_limit=(0.0, 3.0) maps to side_range=(1.0, 4.0) A.PadIfNeeded(min_height=height, min_width=width, border_mode=cv2.BORDER_CONSTANT, value=fill) ])- Key differences:\u00a0\u00a01. TorchVision implements specific SSD paper approach\u00a0\u00a02. Albumentations requires composition of two transforms RandomRotation Rotate - Similar core rotation functionality but with different parameters- Key differences:\u00a0\u00a01. Angle specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: degrees parameter (-degrees, +degrees) or (min, max)\u00a0\u00a0\u00a0\u00a0- Albumentations: limit parameter (-limit, +limit) or (min, max)\u00a0\u00a02. Output size control:\u00a0\u00a0\u00a0\u00a0- TorchVision: expand=True/False\u00a0\u00a0\u00a0\u00a0- Albumentations: crop_border=True/False\u00a0\u00a03. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- Bbox rotation methods ('largest_box' or 'ellipse')\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a04. Center specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: supports custom center point\u00a0\u00a0\u00a0\u00a0- Albumentations: always uses image center RandomAffine Affine - Both support core affine operations (translation, rotation, scale, shear)- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single parameters for each transform\u00a0\u00a0\u00a0\u00a0- Albumentations: more flexible with dict options for x/y axes\u00a0\u00a02. Scale handling:\u00a0\u00a0\u00a0\u00a0- Albumentations adds keep_ratio and balanced_scale\u00a0\u00a0\u00a0\u00a0- Albumentations supports independent x/y scaling\u00a0\u00a03. Translation:\u00a0\u00a0\u00a0\u00a0- TorchVision: fraction only\u00a0\u00a0\u00a0\u00a0- Albumentations: both percent and pixels\u00a0\u00a04. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- fit_output to adjust image plane\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Bbox rotation methods\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomPerspective Perspective - Both apply random perspective transformations- Key differences:\u00a0\u00a01. Distortion control:\u00a0\u00a0\u00a0\u00a0- TorchVision: single distortion_scale (0 to 1)\u00a0\u00a0\u00a0\u00a0- Albumentations: scale tuple for corner movement range\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- Albumentations adds keep_size and fit_output options\u00a0\u00a0\u00a0\u00a0- Can control whether to maintain original size\u00a0\u00a03. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Separate mask interpolation\u00a0\u00a0\u00a0\u00a0- More border modes\u00a0\u00a0\u00a0\u00a0- Better control over output size and fitting ElasticTransform ElasticTransform - Similar core functionality: both apply elastic deformations to images- Key differences:\u00a0\u00a01. Parameters have opposite meanings:\u00a0\u00a0\u00a0\u00a0- TorchVision: alpha (displacement), sigma (smoothness)\u00a0\u00a0\u00a0\u00a0- Albumentations: alpha (smoothness), sigma (displacement)\u00a0\u00a02. Default values reflect this difference:\u00a0\u00a0\u00a0\u00a0- TorchVision: alpha=50.0, sigma=5.0\u00a0\u00a0\u00a0\u00a0- Albumentations: alpha=1.0, sigma=50.0- Note on implementation:\u00a0\u00a0- Albumentations follows Simard et al. 2003 paper more closely:\u00a0\u00a0\u00a0\u00a0- \u03c3 should be ~0.05 * image_size\u00a0\u00a0\u00a0\u00a0- \u03b1 should be proportional to \u03c3- Additional Albumentations features:\u00a0\u00a0- approximate mode\u00a0\u00a0- same_dxdy option\u00a0\u00a0- Choice of noise distribution\u00a0\u00a0- Separate mask interpolation ColorJitter ColorJitter - Similar core functionality: both randomly adjust brightness, contrast, saturation, and hue- Key similarities:\u00a0\u00a01. Same parameter names and meanings\u00a0\u00a02. Same value ranges (e.g., hue should be in [-0.5, 0.5])\u00a0\u00a03. Random order of transformations- Key differences:\u00a0\u00a01. Default values:\u00a0\u00a0\u00a0\u00a0- TorchVision: all None by default\u00a0\u00a0\u00a0\u00a0- Albumentations: defaults to (0.8, 1.2) for brightness/contrast/saturation\u00a0\u00a02. Implementation:\u00a0\u00a0\u00a0\u00a0- TorchVision: uses Pillow\u00a0\u00a0\u00a0\u00a0- Albumentations: uses OpenCV (may produce slightly different results)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Explicit probability parameter p\u00a0\u00a0\u00a0\u00a0- Value saturation instead of uint8 overflow RandomChannelPermutation ChannelShuffle - Both randomly permute image channels- Key similarities:\u00a0\u00a01. Same core functionality\u00a0\u00a02. Work on multi-channel images (typically RGB)- Key differences:\u00a0\u00a01. Naming convention only\u00a0\u00a02. 
Albumentations adds:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomPhotometricDistort RandomOrder + ColorJitter + ChannelShuffle - TorchVision's transform is from SSD paper, combines:\u00a0\u00a01. Color jittering (brightness, contrast, saturation, hue)\u00a0\u00a02. Random channel permutation- Can be replicated in Albumentations using: A.RandomOrder([ A.ColorJitter(brightness=(0.875, 1.125), contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=(-0.05, 0.05), p=0.5), A.ChannelShuffle(p=0.5) ]) Grayscale ToGray - Similar core functionality: convert RGB to grayscale- Key differences:\u00a0\u00a01. Output channels:\u00a0\u00a0\u00a0\u00a0- TorchVision: only 1 or 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: supports any number of output channels\u00a0\u00a02. Conversion methods:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method (weighted RGB)\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods via method parameter:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 weighted_average (default, same as TorchVision)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 from_lab, desaturation, average, max, pca\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- More flexible channel handling RGB ToRGB - Similar core functionality: convert to RGB format- Key differences:\u00a0\u00a01. Input handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: accepts 1 or 3 channel inputs\u00a0\u00a0\u00a0\u00a0- Albumentations: only accepts single-channel inputs\u00a0\u00a02. Output channels:\u00a0\u00a0\u00a0\u00a0- TorchVision: always 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable via num_output_channels\u00a0\u00a03. Behavior:\u00a0\u00a0\u00a0\u00a0- TorchVision: converts to RGB if not already RGB\u00a0\u00a0\u00a0\u00a0- Albumentations: strictly grayscale to RGB conversion\u00a0\u00a04. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomGrayscale ToGray - Similar core functionality: convert to grayscale with probability- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- TorchVision: p=0.1\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: always preserves input channels\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable output channels\u00a0\u00a03. Conversion methods:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods with different channel support:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 weighted_average, from_lab: 3-channel only\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 desaturation, average, max, pca: any number of channels\u00a0\u00a04. Channel requirements:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with 1 or 3 channels\u00a0\u00a0\u00a0\u00a0- Albumentations: depends on method chosen GaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random kernel size- Key similarities:\u00a0\u00a01. Both support random kernel sizes\u00a0\u00a02. Both support random sigma values- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: kernel_size (exact size), sigma (range)\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (size range), sigma_limit (range)\u00a0\u00a02. Kernel size constraints:\u00a0\u00a0\u00a0\u00a0- TorchVision: must specify exact size\u00a0\u00a0\u00a0\u00a0- Albumentations: can specify range (3, 7) or auto-compute\u00a0\u00a03. 
Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- Auto-computation of kernel size from sigma GaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images- Key similarities:\u00a0\u00a01. Both support mean and standard deviation parameters- Key differences:\u00a0\u00a01. Parameter ranges:\u00a0\u00a0\u00a0\u00a0- TorchVision: fixed values for mean and sigma\u00a0\u00a0\u00a0\u00a0- Albumentations: ranges for both (std_range, mean_range)\u00a0\u00a02. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: expects float [0,1], has clip option\u00a0\u00a0\u00a0\u00a0- Albumentations: auto-scales based on dtype\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Per-channel noise option\u00a0\u00a0\u00a0\u00a0- Noise scale factor for performance\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomInvert InvertImg - Similar core functionality: invert image colors- Key similarities:\u00a0\u00a01. Both invert pixel values\u00a0\u00a02. Both have default probability of 0.5- Key differences:\u00a0\u00a01. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with [0,1] float tensors\u00a0\u00a0\u00a0\u00a0- Albumentations: auto-handles uint8 (255) and float32 (1.0) RandomPosterize Posterize - Similar core functionality: reduce color bits- Key similarities:\u00a0\u00a01. Both posterize images with probability p=0.5- Key differences:\u00a0\u00a01. Bits specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single fixed value [0-8]\u00a0\u00a0\u00a0\u00a0- Albumentations: flexible options with [1-7] (recommended):\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Single value for all channels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Range (min_bits, max_bits)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Per-channel values [r,g,b]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 Per-channel ranges [(r_min,r_max), ...]\u00a0\u00a02. Practical range:\u00a0\u00a0\u00a0\u00a0- TorchVision: includes 0 (black) and 8 (unchanged)\u00a0\u00a0\u00a0\u00a0- Albumentations: recommended [1-7] for actual posterization RandomSolarize Solarize - Similar core functionality: invert pixels above threshold- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both invert values above threshold- Key differences:\u00a0\u00a01. Threshold specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single fixed threshold value\u00a0\u00a0\u00a0\u00a0- Albumentations: range via threshold_range\u00a0\u00a02. Value handling:\u00a0\u00a0\u00a0\u00a0- TorchVision: works with raw threshold values\u00a0\u00a0\u00a0\u00a0- Albumentations: uses normalized [0,1] range:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 uint8: multiplied by 255\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 float32: multiplied by 1.0 RandomAdjustSharpness Sharpen - Similar core functionality: adjust image sharpness- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- TorchVision: single sharpness_factor\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 0: blurred\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 1: original\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 2: doubled sharpness\u00a0\u00a0\u00a0\u00a0- Albumentations: more controls:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 alpha: effect visibility [0,1]\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 lightness: contrast control\u00a0\u00a02. 
Method options:\u00a0\u00a0\u00a0\u00a0- TorchVision: single method\u00a0\u00a0\u00a0\u00a0- Albumentations: two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 'kernel': Laplacian operator\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 'gaussian': blur interpolation RandomAutocontrast AutoContrast Same core functionality with identical parameters (p=0.5) RandomEqualize Equalize - Similar core functionality: histogram equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Additional Albumentations features:\u00a0\u00a0\u00a0\u00a0- Choice of algorithm (cv/pil methods)\u00a0\u00a0\u00a0\u00a0- Per-channel or luminance-based equalization\u00a0\u00a0\u00a0\u00a0- Optional masking support Normalize Normalize - Similar core functionality: normalize image values- Key similarities:\u00a0\u00a01. Both support mean/std normalization- Key differences:\u00a0\u00a01. Normalization options:\u00a0\u00a0\u00a0\u00a0- TorchVision: only (input - mean) / std\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 standard (same as TorchVision)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 image (global stats)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 image_per_channel\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 min_max\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 min_max_per_channel\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- max_pixel_value parameter\u00a0\u00a0\u00a0\u00a0- Probability parameter p RandomErasing Erasing - Similar core functionality: randomly erase image regions- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Same default scale=(0.02, 0.33)\u00a0\u00a03. Same default ratio=(0.3, 3.3)- Key differences:\u00a0\u00a01. Fill value options:\u00a0\u00a0\u00a0\u00a0- TorchVision: number/tuple or 'random'\u00a0\u00a0\u00a0\u00a0- Albumentations: additional options:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 random_uniform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 inpaint_telea\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u2022 inpaint_ns\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Mask fill value option\u00a0\u00a0\u00a0\u00a0- Support for masks, bboxes, keypoints JPEG ImageCompression - Similar core functionality: apply JPEG compression- Key similarities:\u00a0\u00a01. Both use quality range 1-100\u00a0\u00a02. Both support quality ranges- Key differences:\u00a0\u00a01. Compression types:\u00a0\u00a0\u00a0\u00a0- TorchVision: JPEG only\u00a0\u00a0\u00a0\u00a0- Albumentations: JPEG and WebP\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Probability parameter p\u00a0\u00a0\u00a0\u00a0- Default quality range (99, 100)"},{"location":"getting_started/augmentation_mapping/#kornia-to-albumentations","title":"Kornia to Albumentations","text":"Kornia Albumentations Notes ColorJitter ColorJitter - Similar core functionality: randomly adjust brightness, contrast, saturation, and hue- Key similarities:\u00a0\u00a01. Both support same parameters (brightness, contrast, saturation, hue)\u00a0\u00a02. Both allow float or tuple ranges for parameters- Key differences:\u00a0\u00a01. Default values:\u00a0\u00a0\u00a0\u00a0- Albumentations: (0.8, 1.2) for brightness/contrast/saturation\u00a0\u00a0\u00a0\u00a0- Kornia: 0.0 for all parameters\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a03. 
Note: Kornia recommends using ColorJiggle instead as it follows color theory better RandomAutoContrast AutoContrast - Similar core functionality: enhance image contrast automatically- Key similarities:\u00a0\u00a01. Both stretch intensity range to use full range\u00a0\u00a02. Both preserve relative intensities- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter to control value clipping RandomBoxBlur Blur - Similar core functionality: apply box/average blur to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both apply box/average blur filter- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit parameter for range (e.g., (3, 7))\u00a0\u00a0\u00a0\u00a0- Kornia: fixed kernel_size tuple (default (3, 3))\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- border_type parameter ('reflect', 'replicate', 'circular')\u00a0\u00a0\u00a0\u00a0- normalized parameter for L1 norm control RandomBrightness RandomBrightnessContrast - Different scope:\u00a0\u00a0- Kornia: brightness only\u00a0\u00a0- Albumentations: combines brightness and contrast- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: brightness tuple (default: (1.0, 1.0))\u00a0\u00a0\u00a0\u00a0- Albumentations: brightness_limit (default: (-0.2, 0.2))\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- brightness_by_max parameter for adjustment method\u00a0\u00a0\u00a0\u00a0- ensure_safe_range to prevent overflow/underflow\u00a0\u00a0\u00a0\u00a0- Combined contrast control\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter RandomChannelDropout ChannelDropout - Similar core functionality: randomly drop image channels- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both allow specifying fill value for dropped channels- Key differences:\u00a0\u00a01. Channel drop specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed num_drop_channels (default: 1)\u00a0\u00a0\u00a0\u00a0- Albumentations: flexible channel_drop_range tuple (default: (1, 1))\u00a0\u00a02. Error handling:\u00a0\u00a0\u00a0\u00a0- Albumentations: explicit checks for single-channel images and invalid ranges\u00a0\u00a0\u00a0\u00a0- Kornia: simpler parameter validation RandomChannelShuffle ChannelShuffle - Identical core functionality: randomly shuffle image channels RandomClahe CLAHE - Similar core functionality: apply Contrast Limited Adaptive Histogram Equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both allow configuring grid size and clip limit- Key differences:\u00a0\u00a01. Parameter defaults:\u00a0\u00a0\u00a0\u00a0- Kornia: clip_limit=(40.0, 40.0), grid_size=(8, 8)\u00a0\u00a0\u00a0\u00a0- Albumentations: clip_limit=(1, 4), tile_grid_size=(8, 8)\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- slow_and_differentiable parameter for implementation choice RandomContrast RandomBrightnessContrast - Different scope:\u00a0\u00a0- Kornia: contrast only\u00a0\u00a0- Albumentations: combines brightness and contrast- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: contrast tuple (default: (1.0, 1.0))\u00a0\u00a0\u00a0\u00a0- Albumentations: contrast_limit (default: (-0.2, 0.2))\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- ensure_safe_range to prevent overflow/underflow\u00a0\u00a0\u00a0\u00a0- Combined brightness control\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- clip_output parameter RandomEqualize Equalize - Similar core functionality: apply histogram equalization- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- mode parameter to choose between 'cv' and 'pil' methods\u00a0\u00a0\u00a0\u00a0- by_channels parameter for per-channel or luminance-based equalization\u00a0\u00a0\u00a0\u00a0- mask parameter to selectively apply equalization\u00a0\u00a0\u00a0\u00a0- mask_params for dynamic mask generation RandomGamma RandomGamma - Similar core functionality: apply random gamma correction- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: separate gamma (1.0, 1.0) and gain (1.0, 1.0) tuples\u00a0\u00a0\u00a0\u00a0- Albumentations: single gamma_limit (80, 120) as percentage range\u00a0\u00a02. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- eps parameter to prevent numerical errors RandomGaussianBlur GaussianBlur - Similar core functionality: apply Gaussian blur with random parameters- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support kernel size and sigma parameters- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: requires explicit kernel_size and sigma range\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (default: (3, 7)) and sigma_limit (default: 0)\u00a0\u00a02. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- border_type parameter for padding mode\u00a0\u00a0\u00a0\u00a0- separable parameter for 1D convolution optimization RandomGaussianIllumination Illumination - Similar core functionality: apply illumination effects- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support controlling effect intensity and position- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: Gaussian illumination patterns only\u00a0\u00a0\u00a0\u00a0- Albumentations: Multiple modes (linear, corner, gaussian)\u00a0\u00a02. Parameter ranges:\u00a0\u00a0\u00a0\u00a0- Kornia: gain=(0.01, 0.15), center=(0.1, 0.9), sigma=(0.2, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range=(0.01, 0.2), center_range=(0.1, 0.9), sigma_range=(0.2, 1.0)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- mode parameter for different effect types\u00a0\u00a0\u00a0\u00a0- effect_type for brighten/darken control\u00a0\u00a0\u00a0\u00a0- angle_range for linear gradients\u00a0\u00a04. Additional in Kornia:\u00a0\u00a0\u00a0\u00a0- sign parameter for effect direction RandomGaussianNoise GaussNoise - Similar core functionality: add Gaussian noise to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed mean (default: 0.0) and std (default: 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: ranges via std_range (0.2, 0.44) and mean_range (0.0, 0.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- per_channel parameter for independent channel noise\u00a0\u00a0\u00a0\u00a0- noise_scale_factor for performance optimization\u00a0\u00a0\u00a0\u00a0- Automatic value scaling based on image dtype RandomGrayscale ToGray - Similar core functionality: convert images to grayscale- Key differences:\u00a0\u00a01. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=0.1\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5\u00a0\u00a02. Conversion options:\u00a0\u00a0\u00a0\u00a0- Kornia: customizable rgb_weights for channel mixing\u00a0\u00a0\u00a0\u00a0- Albumentations: multiple method options (weighted_average, from_lab, desaturation, average, max, pca)\u00a0\u00a03. Output control:\u00a0\u00a0\u00a0\u00a0- Kornia: always 3-channel output\u00a0\u00a0\u00a0\u00a0- Albumentations: configurable num_output_channels RandomHue ColorJitter (hue parameter) - Similar core functionality: adjust image hue- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: hue-only transform\u00a0\u00a0\u00a0\u00a0- Albumentations: part of ColorJitter with brightness, contrast, and saturation\u00a0\u00a02. Default values:\u00a0\u00a0\u00a0\u00a0- Kornia: hue=(0.0, 0.0), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: hue=(-0.5, 0.5), p=0.5 RandomInvert InvertImg - Similar core functionality: invert image values- Key differences:\u00a0\u00a01. Maximum value handling:\u00a0\u00a0\u00a0\u00a0- Kornia: configurable via max_val parameter (default: 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: automatically determined by dtype (255 for uint8, 1.0 for float32) RandomJPEG ImageCompression - Similar core functionality: apply image compression- Key differences:\u00a0\u00a01. Compression options:\u00a0\u00a0\u00a0\u00a0- Kornia: JPEG only\u00a0\u00a0\u00a0\u00a0- Albumentations: supports both JPEG and WebP\u00a0\u00a02. Quality specification:\u00a0\u00a0\u00a0\u00a0- Kornia: jpeg_quality (default: 50.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: quality_range (default: (99, 100))\u00a0\u00a03. Default probability:\u00a0\u00a0\u00a0\u00a0- Kornia: p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: p=0.5 RandomLinearCornerIllumination Illumination (corner mode) - Similar core functionality: apply corner illumination effects- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: corner illumination only\u00a0\u00a0\u00a0\u00a0- Albumentations: part of general Illumination transform with multiple modes\u00a0\u00a02. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range (0.01, 0.2) and effect_type (brighten/darken/both)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Multiple illumination modes (linear, corner, gaussian)\u00a0\u00a0\u00a0\u00a0- More control over effect parameters RandomLinearIllumination Illumination (linear mode) - Similar core functionality: apply linear illumination effects- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: linear illumination only\u00a0\u00a0\u00a0\u00a0- Albumentations: part of general Illumination transform with multiple modes\u00a0\u00a02. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: gain (0.01, 0.2) and sign (-1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: intensity_range (0.01, 0.2), effect_type (brighten/darken/both), and angle_range (0, 360)\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Multiple illumination modes (linear, corner, gaussian)\u00a0\u00a0\u00a0\u00a0- Explicit angle control for gradient direction RandomMedianBlur MedianBlur - Similar core functionality: apply median blur filter- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: fixed kernel_size tuple (default: (3, 3))\u00a0\u00a0\u00a0\u00a0- Albumentations: range via blur_limit (default: (3, 7))\u00a0\u00a02. Kernel constraints:\u00a0\u00a0\u00a0\u00a0- Albumentations: enforces odd kernel sizes RandomMotionBlur MotionBlur - Similar core functionality: apply directional motion blur- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support angle and direction control- Key differences:\u00a0\u00a01. Kernel size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: kernel_size as int or tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: blur_limit (default: (3, 7))\u00a0\u00a02. Angle control:\u00a0\u00a0\u00a0\u00a0- Kornia: angle parameter with symmetric range (-angle, angle)\u00a0\u00a0\u00a0\u00a0- Albumentations: angle_range (default: (0, 360))\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- allow_shifted parameter for kernel position control RandomPlanckianJitter PlanckianJitter - Similar core functionality: apply physics-based color temperature variations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support 'blackbody' and 'cied' modes- Key differences:\u00a0\u00a01. Temperature control:\u00a0\u00a0\u00a0\u00a0- Kornia: select_from parameter for discrete jitter selection\u00a0\u00a0\u00a0\u00a0- Albumentations: temperature_limit for continuous range\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- sampling_method parameter ('uniform' or 'gaussian')\u00a0\u00a0\u00a0\u00a0- More detailed control over temperature ranges\u00a0\u00a0\u00a0\u00a0- Better documentation of physics-based effects RandomPlasmaBrightness PlasmaBrightnessContrast - Similar core functionality: apply fractal-based brightness adjustments- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7) and intensity (0.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: brightness_range (-0.3, 0.3), contrast_range (-0.3, 0.3), roughness (default: 3.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Combined brightness and contrast adjustment\u00a0\u00a0\u00a0\u00a0- plasma_size parameter for pattern detail control\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation and documentation RandomPlasmaContrast PlasmaBrightnessContrast - Similar core functionality: apply fractal-based contrast adjustments- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. 
Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7) only\u00a0\u00a0\u00a0\u00a0- Albumentations: contrast_range (-0.3, 0.3), roughness (default: 3.0), plasma_size (default: 256)\u00a0\u00a02. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: contrast-only adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: combined brightness and contrast adjustment\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation\u00a0\u00a0\u00a0\u00a0- Pattern size control via plasma_size RandomPlasmaShadow PlasmaShadow - Similar core functionality: apply fractal-based shadow effects- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Diamond-Square algorithm for pattern generation- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: roughness (0.1, 0.7), shade_intensity (-1.0, 0.0), shade_quantity (0.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: shadow_intensity_range (0.3, 0.7), plasma_size (default: 256), roughness (default: 3.0)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Pattern size control via plasma_size\u00a0\u00a0\u00a0\u00a0- More intuitive intensity range (0 to 1)\u00a0\u00a0\u00a0\u00a0- More detailed mathematical formulation and documentation RandomPosterize Posterize - Similar core functionality: reduce color bits in image- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both operate on color bit reduction- Key differences:\u00a0\u00a01. Bit specification:\u00a0\u00a0\u00a0\u00a0- Kornia: bits parameter (default: 3) with range (0, 8], can be float or tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: num_bits parameter (default: 4) with range [1, 7], supports multiple formats:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single int for all channels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Tuple for random range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* List for per-channel specification\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* List of tuples for per-channel ranges\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- More flexible channel-wise control\u00a0\u00a0\u00a0\u00a0- More detailed documentation and mathematical background RandomRain RandomRain - Similar core functionality: add rain effects to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Rain parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: number_of_drops (1000, 2000), drop_height (5, 20), drop_width (-5, 5)\u00a0\u00a0\u00a0\u00a0- Albumentations: slant_range (-10, 10), drop_length (20), drop_width (1)\u00a0\u00a02. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- drop_color customization\u00a0\u00a0\u00a0\u00a0- blur_value for atmospheric effect\u00a0\u00a0\u00a0\u00a0- brightness_coefficient for lighting adjustment\u00a0\u00a0\u00a0\u00a0- rain_type presets (drizzle, heavy, torrential)\u00a0\u00a03. Approach:\u00a0\u00a0\u00a0\u00a0- Kornia: Direct drop placement\u00a0\u00a0\u00a0\u00a0- Albumentations: More realistic simulation with slant, blur, and brightness effects RandomRGBShift AdditiveNoise - Similar core functionality: add noise/shifts to image channels- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both can affect individual channels- Key differences:\u00a0\u00a01. 
Approach:\u00a0\u00a0\u00a0\u00a0- Kornia: Simple RGB channel shifts with individual limits\u00a0\u00a0\u00a0\u00a0- Albumentations: More sophisticated noise generation with multiple distributions\u00a0\u00a02. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: r_shift_limit, g_shift_limit, b_shift_limit (all default: 0.5)\u00a0\u00a0\u00a0\u00a0- Albumentations: Flexible noise configuration with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple noise types (uniform, gaussian, laplace, beta)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Different spatial modes (constant, per_pixel, shared)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Customizable distribution parameters\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Performance optimization options\u00a0\u00a0\u00a0\u00a0- More detailed control over noise distribution\u00a0\u00a0\u00a0\u00a0- Spatial application modes RandomSaltAndPepperNoise SaltAndPepper - Similar core functionality: apply salt and pepper noise to images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use same default parameters:\u00a0\u00a0\u00a0\u00a0- amount (0.01, 0.06)\u00a0\u00a0\u00a0\u00a0- salt_vs_pepper (0.4, 0.6)- Key differences:\u00a0\u00a01. Parameter flexibility:\u00a0\u00a0\u00a0\u00a0- Kornia: Supports single float or tuple for parameters\u00a0\u00a0\u00a0\u00a0- Albumentations: Requires tuples for ranges\u00a0\u00a02. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Detailed mathematical formulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Clear examples for different noise levels\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Implementation notes and edge cases\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* References to academic sources RandomSaturation ColorJitter - Different scope and functionality:- Key differences:\u00a0\u00a01. Scope:\u00a0\u00a0\u00a0\u00a0- Kornia: Saturation-only adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: Combined brightness, contrast, saturation, and hue adjustment\u00a0\u00a02. Default parameters:\u00a0\u00a0\u00a0\u00a0- Kornia: saturation (1.0, 1.0), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: saturation (0.8, 1.2), p=0.5\u00a0\u00a03. Implementation:\u00a0\u00a0\u00a0\u00a0- Kornia: Aligns with PIL/TorchVision implementation\u00a0\u00a0\u00a0\u00a0- Albumentations: Uses OpenCV with noted differences in HSV conversion\u00a0\u00a04. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Brightness adjustment\u00a0\u00a0\u00a0\u00a0- Contrast adjustment\u00a0\u00a0\u00a0\u00a0- Hue adjustment\u00a0\u00a0\u00a0\u00a0- Random order of transformations RandomSharpness Sharpen - Similar core functionality: sharpen images- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single sharpness parameter (default: 0.5)\u00a0\u00a0\u00a0\u00a0- Albumentations: More detailed control with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* alpha (0.2, 0.5) for effect visibility\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* lightness (0.5, 1.0) for contrast\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* method choice ('kernel' or 'gaussian')\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* kernel_size and sigma for gaussian method\u00a0\u00a02. 
Implementation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: Single approach\u00a0\u00a0\u00a0\u00a0- Albumentations: Two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Kernel-based using Laplacian operator\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Gaussian interpolation\u00a0\u00a03. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides detailed mathematical formulation and references RandomSnow RandomSnow - Similar core functionality: add snow effects to images- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: snow_coefficient (0.5, 0.5), brightness (2, 2), p=1.0\u00a0\u00a0\u00a0\u00a0- Albumentations: snow_point_range (0.1, 0.3), brightness_coeff (2.5), p=0.5\u00a0\u00a02. Implementation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: Single approach\u00a0\u00a0\u00a0\u00a0- Albumentations: Two methods:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"bleach\": Simple pixel value thresholding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"texture\": Advanced snow texture simulation\u00a0\u00a03. Additional in Albumentations:\u00a0\u00a0\u00a0\u00a0- Detailed snow simulation with:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* HSV color space manipulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Gaussian noise for texture\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Depth effect simulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Sparkle effects\u00a0\u00a04. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides detailed mathematical formulation and implementation notes RandomSolarize Solarize - Similar core functionality: invert pixel values above threshold- Key similarities:\u00a0\u00a01. Both have default probability p=0.5- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Two parameters:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* thresholds (default: 0.1) for threshold range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* additions (default: 0.1) for value adjustment\u00a0\u00a0\u00a0\u00a0- Albumentations: Single parameter:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* threshold_range (default: (0.5, 0.5))\u00a0\u00a02. Threshold handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Generates from (0.5 - x, 0.5 + x) for float input\u00a0\u00a0\u00a0\u00a0- Albumentations: Direct range specification, scaled by image type max value\u00a0\u00a03. Documentation:\u00a0\u00a0\u00a0\u00a0- Albumentations provides:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Detailed examples for both uint8 and float32 images\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Clear mathematical formulation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Image type-specific behavior explanation CenterCrop CenterCrop - Similar core functionality: crop center of image- Key similarities:\u00a0\u00a01. Both have default probability p=1.0- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size parameter (int or tuple)\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate height and width parameters\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners for interpolation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* resample mode selection\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* cropping_mode ('slice' or 'resample')\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* pad_if_needed for handling small images\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* border_mode for padding method\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* fill and fill_mask for padding values\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* pad_position options\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Supports images, masks, bboxes, and keypoints PadTo PadIfNeeded - Can achieve same core functionality- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Can pad to exact size:\u00a0\u00a0\u00a0\u00a0- Kornia: size=(height, width)\u00a0\u00a0\u00a0\u00a0- Albumentations: min_height=height, min_width=width- Key differences:\u00a0\u00a01. Parameter naming:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size tuple\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate dimension parameters\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simple pad_mode selection\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single pad_value\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible position options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate fill and fill_mask\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Optional divisibility padding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple target support\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomAffine Affine - Similar core functionality: apply affine transformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support rotation, translation, scaling, and shear- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* degrees for rotation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* translate as fraction\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale as tuple\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear in degrees\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More flexible parameter formats\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Supports both percent and pixel translation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Dictionary format for independent axis control\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* fit_output for automatic size adjustment\u00a0\u00a0\u00a0\u00a0* keep_ratio for aspect ratio preservation\u00a0\u00a0\u00a0\u00a0* rotate_method options\u00a0\u00a0\u00a0\u00a0* balanced_scale for even scale distribution\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomCrop RandomCrop - Similar core functionality: randomly crop image patches- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Both support padding if needed- Key differences:\u00a0\u00a01. Size specification:\u00a0\u00a0\u00a0\u00a0- Kornia: Single size tuple (height, width)\u00a0\u00a0\u00a0\u00a0- Albumentations: Separate height and width parameters\u00a0\u00a02. Padding options:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible padding sizes (int, tuple[2], tuple[4])\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Multiple padding modes (constant, reflect, replicate)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single fill value\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler padding interface\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate fill values for image and mask\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Flexible pad positioning\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomElasticTransform ElasticTransform - Similar core functionality: apply elastic deformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use Gaussian smoothing for displacement fields\u00a0\u00a03. Both support independent control of x/y deformations:\u00a0\u00a0\u00a0\u00a0- Kornia: via separate values in sigma/alpha tuples\u00a0\u00a0\u00a0\u00a0- Albumentations: via same_dxdy parameter- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* kernel_size tuple (63, 63)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* sigma tuple (32.0, 32.0)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* alpha tuple (1.0, 1.0)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single sigma (default: 50.0)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single alpha (default: 1.0)\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Control over padding mode\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* approximate mode for faster processing\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Choice of noise distribution\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomErasing Erasing - Similar core functionality: randomly erase rectangular regions- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Same default parameters:\u00a0\u00a0\u00a0\u00a0* scale (0.02, 0.33)\u00a0\u00a0\u00a0\u00a0* ratio (0.3, 3.3)- Key differences:\u00a0\u00a01. Fill value options:\u00a0\u00a0\u00a0\u00a0- Kornia: Simple numeric value (default: 0.0)\u00a0\u00a0\u00a0\u00a0- Albumentations: Rich fill options:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Numeric values\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"random\" per pixel\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"random_uniform\" per region\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"inpaint_telea\" method\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* \"inpaint_ns\" method\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate mask_fill value\u00a0\u00a0\u00a0\u00a0* Support for masks, bboxes, keypoints\u00a0\u00a0\u00a0\u00a0* Inpainting options for more natural-looking results\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomFisheye OpticalDistortion - Similar core functionality: apply optical/fisheye distortion- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support fisheye distortion- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate center_x, center_y for distortion center\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* gamma for distortion strength\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single distort_limit parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* mode selection ('camera' or 'fisheye')\u00a0\u00a02. 
Distortion models:\u00a0\u00a0\u00a0\u00a0- Kornia: Fisheye only\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Camera matrix model\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fisheye model\u00a0\u00a03. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate interpolation methods for image and mask\u00a0\u00a0\u00a0\u00a0* Support for masks, bboxes, keypoints\u00a0\u00a04. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomHorizontalFlip HorizontalFlip - Similar core functionality: flip image horizontally- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Simple operation with same visual result- Key differences:\u00a0\u00a01. Batch handling:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Additional p_batch parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* same_on_batch option\u00a0\u00a0\u00a0\u00a0- Albumentations: No batch-specific parameters\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomPerspective Perspective - Similar core functionality: apply perspective transformation- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both transform image by moving corners\u00a0\u00a03. Both support different interpolation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: via resample (BILINEAR, NEAREST)\u00a0\u00a0\u00a0\u00a0- Albumentations: via interpolation (INTER_LINEAR, INTER_NEAREST, etc.)- Key differences:\u00a0\u00a01. Distortion control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* distortion_scale (0 to 1, default: 0.5)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* sampling_method ('basic' or 'area_preserving')\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale tuple for corner movement range\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* fit_output option for image capture\u00a0\u00a02. Output handling:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keepdim for batch form\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keep_size for output dimensions\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Border mode and fill options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomResizedCrop RandomResizedCrop - Similar core functionality: crop random patches and resize- Key similarities:\u00a0\u00a01. Both have default probability p=1.0\u00a0\u00a02. Same default parameters:\u00a0\u00a0\u00a0\u00a0* scale (0.08, 1.0)\u00a0\u00a0\u00a0\u00a0* ratio (~0.75, ~1.33)\u00a0\u00a03. Both support different interpolation methods:\u00a0\u00a0\u00a0\u00a0- Kornia: via resample\u00a0\u00a0\u00a0\u00a0- Albumentations: via interpolation- Key differences:\u00a0\u00a01. 
Implementation options:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* cropping_mode ('slice' or 'resample')\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* keepdim for batch form\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fallback to center crop after 10 attempts\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomRotation90 RandomRotate90 - Similar core functionality: rotate image by 90 degrees- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both rotate in 90-degree increments- Key differences:\u00a0\u00a01. Rotation control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* times parameter to specify range of rotations\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* resample and align_corners for interpolation\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler implementation (0-3 rotations)\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomRotation Rotate - Similar core functionality: rotate image by random angle- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support different interpolation methods- Key differences:\u00a0\u00a01. Angle specification:\u00a0\u00a0\u00a0\u00a0- Kornia: degrees parameter (if single value, range is (-degrees, +degrees))\u00a0\u00a0\u00a0\u00a0- Albumentations: limit parameter (default: (-90, 90))\u00a0\u00a02. Additional features:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners for interpolation\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Border mode options\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fill values for padding\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* rotate_method for bboxes\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* crop_border option\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints RandomShear Affine (shear parameter) - Similar core functionality: apply shear transformation- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both support different interpolation methods\u00a0\u00a03. Both support independent x/y shear control- Key differences:\u00a0\u00a01. Parameter specification:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Dedicated shear transform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear parameter supports float, tuple(2), or tuple(4)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simple padding modes (zeros, border, reflection)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Part of general Affine transform\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* shear supports number, tuple, or dict format\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More border modes and fill options\u00a0\u00a02. Additional features in Albumentations:\u00a0\u00a0\u00a0\u00a0* Separate mask interpolation\u00a0\u00a0\u00a0\u00a0* fit_output option\u00a0\u00a0\u00a0\u00a0* Combined with other affine transforms\u00a0\u00a03. 
Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomThinPlateSpline ThinPlateSpline - Similar core functionality: apply smooth, non-rigid deformations- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Both use thin plate spline algorithm\u00a0\u00a03. Both support interpolation options- Key differences:\u00a0\u00a01. Deformation control:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Single scale parameter (default: 0.2)\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Fixed control point grid\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* scale_range tuple for range of deformation\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Configurable num_control_points\u00a0\u00a02. Implementation details:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* align_corners parameter\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Binary mode choice (bilinear/nearest)\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* OpenCV interpolation flags\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* More granular control over grid\u00a0\u00a03. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, keypoints, bboxes RandomVerticalFlip VerticalFlip - Similar core functionality: flip image vertically- Key similarities:\u00a0\u00a01. Both have default probability p=0.5\u00a0\u00a02. Simple operation with same visual result- Key differences:\u00a0\u00a01. Implementation:\u00a0\u00a0\u00a0\u00a0- Kornia:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Additional p_batch parameter\u00a0\u00a0\u00a0\u00a0- Albumentations:\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0* Simpler implementation\u00a0\u00a02. Target handling:\u00a0\u00a0\u00a0\u00a0- Kornia: Image tensors only\u00a0\u00a0\u00a0\u00a0- Albumentations: Images, masks, bboxes, keypoints"},{"location":"getting_started/augmentation_mapping/#key-differences_1","title":"Key Differences","text":""},{"location":"getting_started/augmentation_mapping/#compared-to-torchvision_1","title":"Compared to TorchVision","text":"
  • Albumentations operates on numpy arrays instead of PyTorch tensors
  • Albumentations typically provides more parameters for fine-tuning transformations
  • Most Albumentations transforms support both image and mask augmentation
  • Albumentations has better support for bounding box and keypoint augmentation
"},{"location":"getting_started/augmentation_mapping/#compared-to-kornia_1","title":"Compared to Kornia","text":"
  • Kornia operates directly on GPU tensors, while Albumentations works with numpy arrays (a small conversion sketch follows this list)
  • Albumentations provides more comprehensive support for object detection and segmentation tasks
  • Albumentations typically offers better performance for CPU-based augmentations
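As a small illustrative sketch of that first point, moving data between the two libraries usually amounts to a layout and device conversion; the tensor shape below is an arbitrary example and not a requirement of either library:

Python
# Kornia works on (B)CHW float tensors, Albumentations on HWC numpy arrays.
import numpy as np
import torch

tensor_chw = torch.rand(3, 256, 256)                    # Kornia-style input
image_hwc = tensor_chw.permute(1, 2, 0).cpu().numpy()   # Albumentations-style input

# ... apply Albumentations transforms to image_hwc here ...

# Convert back to a CHW tensor if the rest of the pipeline expects one.
tensor_back = torch.from_numpy(np.ascontiguousarray(image_hwc)).permute(2, 0, 1)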
"},{"location":"getting_started/augmentation_mapping/#performance-comparison","title":"Performance Comparison","text":"

According to benchmarking results, Albumentations generally offers superior CPU performance compared to TorchVision and Kornia for most transforms. Here are some key highlights:

Common Transforms Performance (images/second, higher is better)

| Transform | Albumentations | TorchVision | Kornia | Notes |
| --- | --- | --- | --- | --- |
| HorizontalFlip | 8,618 | 914 | 390 | Albumentations is ~9x faster than TorchVision, ~22x faster than Kornia |
| VerticalFlip | 22,847 | 3,198 | 1,212 | Albumentations is ~7x faster than TorchVision, ~19x faster than Kornia |
| RandomResizedCrop | 2,828 | 511 | 287 | Albumentations is ~5.5x faster than TorchVision, ~10x faster than Kornia |
| Normalize | 1,196 | 519 | 626 | Albumentations is ~2x faster than both |
| ColorJitter | 628 | 46 | 55 | Albumentations is ~13x faster than both |
"},{"location":"getting_started/augmentation_mapping/#key-performance-insights","title":"Key Performance Insights:","text":"
  • Basic Operations: Albumentations excels at basic transforms like flips and crops, often being 5-20x faster than alternatives
  • Complex Operations: For more complex transforms like elastic deformation, the performance gap narrows
  • Memory Efficiency: Working with numpy arrays (Albumentations) is generally more memory efficient than tensor operations (Kornia/TorchVision) on CPU
"},{"location":"getting_started/augmentation_mapping/#when-to-choose-each-library","title":"When to Choose Each Library:","text":"
  • Albumentations: Best choice for CPU-based preprocessing pipelines and when maximum performance is needed
  • Kornia: Consider when doing augmentation on GPU with existing PyTorch tensors
  • TorchVision: Good choice when you are deeply integrated into the PyTorch ecosystem and GPU performance isn't critical

Note: Benchmarks performed on macOS-15.0.1-arm64 with Python 3.12.7. Your results may vary based on hardware and setup.
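Because results vary with hardware and setup, a minimal, hedged sketch of a quick throughput check for a single transform is shown below; the image size and iteration count are arbitrary choices, and this is not the harness used to produce the table above:

Python
import time

import albumentations as A
import numpy as np

transform = A.HorizontalFlip(p=1.0)  # p=1.0 so every call actually flips
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

n_iters = 1000
start = time.perf_counter()
for _ in range(n_iters):
    transform(image=image)
elapsed = time.perf_counter() - start
print(f"~{n_iters / elapsed:.0f} images/second")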

"},{"location":"getting_started/augmentation_mapping/#code-examples","title":"Code Examples","text":""},{"location":"getting_started/augmentation_mapping/#torchvision-to-albumentations","title":"TorchVision to Albumentations","text":"Python
# TorchVision\ntransforms = T.Compose([\n    T.RandomHorizontalFlip(p=0.5),\n    T.RandomRotation(10),\n    T.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n\n# Albumentations equivalent\ntransforms = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.Rotate(limit=10),\n    A.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n
"},{"location":"getting_started/augmentation_mapping/#kornia-to-albumentations_1","title":"Kornia to Albumentations","text":"Python
# Kornia\ntransforms = K.AugmentationSequential(\n    K.RandomHorizontalFlip(p=0.5),\n    K.RandomRotation(degrees=10),\n    K.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n)\n\n# Albumentations equivalent\ntransforms = A.Compose([\n    A.HorizontalFlip(p=0.5),\n    A.Rotate(limit=10),\n    A.Normalize(mean=[0.485, 0.456, 0.406],\n                std=[0.229, 0.224, 0.225])\n])\n
"},{"location":"getting_started/augmentation_mapping/#additional-resources","title":"Additional Resources","text":"
  • TorchVision Transforms Documentation
  • Kornia Augmentation Documentation
  • Albumentations Documentation
"},{"location":"getting_started/bounding_boxes_augmentation/","title":"Bounding boxes augmentation for object detection","text":""},{"location":"getting_started/bounding_boxes_augmentation/#different-annotations-formats","title":"Different annotations formats","text":"

Bounding boxes are rectangles that mark objects on an image. There are multiple formats of bounding boxes annotations. Each format uses its specific representation of bounding box coordinates. Albumentations supports four formats: pascal_voc, albumentations, coco, and yolo.

Let's take a look at each of those formats and how they represent coordinates of bounding boxes.

As an example, we will use an image from the dataset named Common Objects in Context. It contains one bounding box that marks a cat. The image width is 640 pixels, and its height is 480 pixels. The width of the bounding box is 322 pixels, and its height is 117 pixels.

The bounding box has the following (x, y) coordinates of its corners: top-left is (x_min, y_min) or (98px, 345px), top-right is (x_max, y_min) or (420px, 345px), bottom-left is (x_min, y_max) or (98px, 462px), bottom-right is (x_max, y_max) or (420px, 462px). As you can see, the coordinates of the bounding box's corners are calculated with respect to the top-left corner of the image, which has (x, y) coordinates (0, 0).

An example image with a bounding box from the COCO dataset

"},{"location":"getting_started/bounding_boxes_augmentation/#pascal_voc","title":"pascal_voc","text":"

pascal_voc is a format used by the Pascal VOC dataset. Coordinates of a bounding box are encoded with four values in pixels: [x_min, y_min, x_max, y_max]. x_min and y_min are coordinates of the top-left corner of the bounding box. x_max and y_max are coordinates of the bottom-right corner of the bounding box.

Coordinates of the example bounding box in this format are [98, 345, 420, 462].

"},{"location":"getting_started/bounding_boxes_augmentation/#albumentations","title":"albumentations","text":"

albumentations is similar to pascal_voc, because it also uses four values [x_min, y_min, x_max, y_max] to represent a bounding box. But unlike pascal_voc, albumentations uses normalized values: the x-coordinates are divided by the width of the image and the y-coordinates by its height.

Coordinates of the example bounding box in this format are [98 / 640, 345 / 480, 420 / 640, 462 / 480] which are [0.153125, 0.71875, 0.65625, 0.9625].

Albumentations uses this format internally to work with bounding boxes and augment them.
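As a quick sanity check, the normalization above can be reproduced in a few lines of Python; the variable names are illustrative only:

Python
# Normalize the example pascal_voc box to the albumentations format.
image_width, image_height = 640, 480
x_min, y_min, x_max, y_max = 98, 345, 420, 462

normalized_bbox = [
    x_min / image_width,   # 0.153125
    y_min / image_height,  # 0.71875
    x_max / image_width,   # 0.65625
    y_max / image_height,  # 0.9625
]
print(normalized_bbox)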

"},{"location":"getting_started/bounding_boxes_augmentation/#coco","title":"coco","text":"

coco is a format used by the Common Objects in Context (COCO) dataset.

In coco, a bounding box is defined by four values in pixels [x_min, y_min, width, height]. They are coordinates of the top-left corner along with the width and height of the bounding box.

Coordinates of the example bounding box in this format are [98, 345, 322, 117].

"},{"location":"getting_started/bounding_boxes_augmentation/#yolo","title":"yolo","text":"

In yolo, a bounding box is represented by four values [x_center, y_center, width, height]. x_center and y_center are the normalized coordinates of the center of the bounding box: we take the pixel coordinates of the box center and divide the x value by the width of the image and the y value by its height. width and height represent the width and the height of the bounding box; they are normalized as well.

Coordinates of the example bounding box in this format are [((420 + 98) / 2) / 640, ((462 + 345) / 2) / 480, 322 / 640, 117 / 480] which are [0.4046875, 0.840625, 0.503125, 0.24375].
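The same example box can be converted to the yolo format with a similar short sketch (again, variable names are illustrative):

Python
# Convert the example pascal_voc corners to the yolo format:
# normalized center coordinates plus normalized width and height.
image_width, image_height = 640, 480
x_min, y_min, x_max, y_max = 98, 345, 420, 462

x_center = ((x_min + x_max) / 2) / image_width   # 0.4046875
y_center = ((y_min + y_max) / 2) / image_height  # 0.840625
width = (x_max - x_min) / image_width            # 0.503125
height = (y_max - y_min) / image_height          # 0.24375
print([x_center, y_center, width, height])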

How different formats represent coordinates of a bounding box

"},{"location":"getting_started/bounding_boxes_augmentation/#bounding-boxes-augmentation","title":"Bounding boxes augmentation","text":"

Just like with image and mask augmentation, the process of augmenting bounding boxes consists of 4 steps.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and bounding boxes from the disk.
  4. You pass an image and bounding boxes to the augmentation pipeline and receive augmented images and boxes.

Note

Some transforms in Albumentations don't support bounding boxes. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment bounding boxes.

"},{"location":"getting_started/bounding_boxes_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"Python
import albumentations as A\nimport cv2\n
"},{"location":"getting_started/bounding_boxes_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

Here is an example of a minimal declaration of an augmentation pipeline that works with bounding boxes.

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco'))\n

Note that unlike image and mask augmentation, Compose now has an additional parameter bbox_params. You need to pass an instance of A.BboxParams to that argument. A.BboxParams specifies settings for working with bounding boxes. format sets the format for bounding box coordinates.

It can be either pascal_voc, albumentations, coco, or yolo. This value is required because Albumentations needs to know the source format of the bounding box coordinates to apply augmentations correctly.

Besides format, A.BboxParams supports a few more settings.

Here is an example of Compose that shows all available settings with A.BboxParams:

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco', min_area=1024, min_visibility=0.1, label_fields=['class_labels']))\n
"},{"location":"getting_started/bounding_boxes_augmentation/#min_area-and-min_visibility","title":"min_area and min_visibility","text":"

min_area and min_visibility parameters control what Albumentations should do to the augmented bounding boxes if their size has changed after augmentation. The size of bounding boxes could change if you apply spatial augmentations, for example, when you crop a part of an image or when you resize an image.

min_area is a value in pixels. If the area of a bounding box after augmentation becomes smaller than min_area, Albumentations will drop that box. So the returned list of augmented bounding boxes won't contain that bounding box.

min_visibility is a value between 0 and 1. If the ratio of the bounding box area after augmentation to the area of the bounding box before augmentation becomes smaller than min_visibility, Albumentations will drop that box. So if the augmentation process cuts away most of the bounding box, that box won't be present in the returned list of the augmented bounding boxes.

Here is an example image that contains two bounding boxes. Bounding boxes coordinates are declared using the coco format.

An example image with two bounding boxes

First, we apply the CenterCrop augmentation without declaring parameters min_area and min_visibility. The augmented image contains two bounding boxes.

An example image with two bounding boxes after applying augmentation

Next, we apply the same CenterCrop augmentation, but now we also use the min_area parameter. Now, the augmented image contains only one bounding box, because the other bounding box's area after augmentation became smaller than min_area, so Albumentations dropped that bounding box.

An example image with one bounding box after applying augmentation with 'min_area'

Finally, we apply the CenterCrop augmentation with the min_visibility parameter. After that augmentation, the resulting image doesn't contain any bounding boxes, because the visibility of all bounding boxes after augmentation is below the threshold set by min_visibility.

An example image with zero bounding boxes after applying augmentation with 'min_visibility'
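
A minimal sketch of the three pipelines described above; the crop size is an illustrative assumption, while the thresholds reuse the values from the Compose example earlier:

Python
import albumentations as A\n\n# 1. No thresholds: every bounding box that survives the crop is kept\ntransform_plain = A.Compose([\n    A.CenterCrop(width=350, height=350),\n], bbox_params=A.BboxParams(format='coco'))\n\n# 2. Drop boxes whose area after augmentation is smaller than 1024 pixels\ntransform_min_area = A.Compose([\n    A.CenterCrop(width=350, height=350),\n], bbox_params=A.BboxParams(format='coco', min_area=1024))\n\n# 3. Drop boxes that keep less than 10% of their original area\ntransform_min_visibility = A.Compose([\n    A.CenterCrop(width=350, height=350),\n], bbox_params=A.BboxParams(format='coco', min_visibility=0.1))\n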

"},{"location":"getting_started/bounding_boxes_augmentation/#class-labels-for-bounding-boxes","title":"Class labels for bounding boxes","text":"

Besides coordinates, each bounding box should have an associated class label that tells which object lies inside the bounding box. There are two ways to pass a label for a bounding box.

Let's say you have an example image with three objects: dog, cat, and sports ball. Bounding boxes coordinates in the coco format for those objects are [23, 74, 295, 388], [377, 294, 252, 161], and [333, 421, 49, 49].

An example image with 3 bounding boxes from the COCO dataset

"},{"location":"getting_started/bounding_boxes_augmentation/#1-you-can-pass-labels-along-with-bounding-boxes-coordinates-by-adding-them-as-additional-values-to-the-list-of-coordinates","title":"1. You can pass labels along with bounding boxes coordinates by adding them as additional values to the list of coordinates.","text":"

For the image above, bounding boxes with class labels will become [23, 74, 295, 388, 'dog'], [377, 294, 252, 161, 'cat'], and [333, 421, 49, 49, 'sports ball'].

Class labels could be of any type: integer, string, or any other Python data type. For example, with integer class labels the same bounding boxes will look like this: [23, 74, 295, 388, 18], [377, 294, 252, 161, 17], and [333, 421, 49, 49, 37].

Also, you can use multiple class values for each bounding box, for example [23, 74, 295, 388, 'dog', 'animal'], [377, 294, 252, 161, 'cat', 'animal'], and [333, 421, 49, 49, 'sports ball', 'item'].

"},{"location":"getting_started/bounding_boxes_augmentation/#2you-can-pass-labels-for-bounding-boxes-as-a-separate-list-the-preferred-way","title":"2.You can pass labels for bounding boxes as a separate list (the preferred way).","text":"

For example, if you have three bounding boxes like [23, 74, 295, 388], [377, 294, 252, 161], and [333, 421, 49, 49], you can create a separate list with values like ['dog', 'cat', 'sports ball'] or [18, 17, 37] that contains the class labels for those bounding boxes. Next, you pass that list with class labels as a separate argument to the transform function. Albumentations needs to know the names of all those lists with class labels to join them with the augmented bounding boxes correctly. Then, if a bounding box is dropped after augmentation because it is no longer visible, Albumentations will drop the class label for that box as well. Use the label_fields parameter to set the names of all arguments in transform that will contain label descriptions for bounding boxes (more on that in Step 4).

"},{"location":"getting_started/bounding_boxes_augmentation/#step-3-read-images-and-bounding-boxes-from-the-disk","title":"Step 3. Read images and bounding boxes from the disk.","text":"

Read an image from the disk.

Python
image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n

Bounding boxes can be stored on the disk in different serialization formats: JSON, XML, YAML, CSV, etc. So the code to read bounding boxes depends on the actual format of data on the disk.

After you read the data from the disk, you need to prepare bounding boxes for Albumentations.

Albumentations expects that bounding boxes will be represented as a list of lists. Each list contains information about a single bounding box. A bounding box definition should have at least four elements that represent the coordinates of that bounding box. The actual meaning of those four values depends on the format of bounding boxes (either pascal_voc, albumentations, coco, or yolo). Besides four coordinates, each definition of a bounding box may contain one or more extra values. You can use those extra values to store additional information about the bounding box, such as a class label of the object inside the box. During augmentation, Albumentations will not process those extra values. The library will return them as is along with the updated coordinates of the augmented bounding box.
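
For example, here is a hedged sketch of preparing that list of lists from a hypothetical COCO-style annotation file; the file path and dictionary keys are assumptions about your data, not an Albumentations API:

Python
import json\n\nwith open('/path/to/annotations.json') as f:\n    annotations = json.load(f)  # hypothetical structure: a list of {'bbox': [...], 'label': ...} records\n\nbboxes = [ann['bbox'] for ann in annotations]         # [[x_min, y_min, width, height], ...] for the coco format\nclass_labels = [ann['label'] for ann in annotations]  # extra values, passed separately or appended to each box\n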

"},{"location":"getting_started/bounding_boxes_augmentation/#step-4-pass-an-image-and-bounding-boxes-to-the-augmentation-pipeline-and-receive-augmented-images-and-boxes","title":"Step 4. Pass an image and bounding boxes to the augmentation pipeline and receive augmented images and boxes.","text":"

As discussed in Step 2, there are two ways of passing class labels along with bounding boxes coordinates:

"},{"location":"getting_started/bounding_boxes_augmentation/#1-pass-class-labels-along-with-coordinates","title":"1. Pass class labels along with coordinates","text":"

So, if you have coordinates of three bounding boxes that look like this:

Python
bboxes = [\n    [23, 74, 295, 388],\n    [377, 294, 252, 161],\n    [333, 421, 49, 49],\n]\n

you can add a class label for each bounding box as an additional element of the list along with the four coordinates. The list with bounding boxes and their coordinates will now look like this:

Python
bboxes = [\n    [23, 74, 295, 388, 'dog'],\n    [377, 294, 252, 161, 'cat'],\n    [333, 421, 49, 49, 'sports ball'],\n]\n

or with multiple labels per bounding box: Python

bboxes = [\n    [23, 74, 295, 388, 'dog', 'animal'],\n    [377, 294, 252, 161, 'cat', 'animal'],\n    [333, 421, 49, 49, 'sports ball', 'item'],\n]\n

You can use any data type for declaring class labels. It can be string, integer, or any other Python data type.

Next, you pass an image and bounding boxes for it to the transform function and receive the augmented image and bounding boxes.

Python
transformed = transform(image=image, bboxes=bboxes)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\n

Example input and output data for bounding boxes augmentation

"},{"location":"getting_started/bounding_boxes_augmentation/#2-pass-class-labels-in-a-separate-argument-to-transform-the-preferred-way","title":"2. Pass class labels in a separate argument to transform (the preferred way).","text":"

Let's say you have the coordinates of three bounding boxes: Python

bboxes = [\n    [23, 74, 295, 388],\n    [377, 294, 252, 161],\n    [333, 421, 49, 49],\n]\n

You can create a separate list that contains class labels for those bounding boxes:

Python
class_labels = ['dog', 'cat', 'sports ball']\n

Then you pass both bounding boxes and class labels to transform. Note that to pass class labels, you need to use the name of the argument that you declared in label_fields when creating an instance of Compose in step 2. In our case, we set the name of the argument to class_labels.

Python
transformed = transform(image=image, bboxes=bboxes, class_labels=class_labels)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\ntransformed_class_labels = transformed['class_labels']\n

Example input and output data for bounding boxes augmentation with a separate argument for class labels

Note that label_fields expects a list, so you can set multiple fields that contain labels for your bounding boxes. So if you declare Compose like

Python
transform = A.Compose([\n    A.RandomCrop(width=450, height=450),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], bbox_params=A.BboxParams(format='coco', label_fields=['class_labels', 'class_categories']))\n

you can use those multiple arguments to pass info about class labels, like

Python
class_labels = ['dog', 'cat', 'sports ball']\nclass_categories = ['animal', 'animal', 'item']\n\ntransformed = transform(image=image, bboxes=bboxes, class_labels=class_labels, class_categories=class_categories)\ntransformed_image = transformed['image']\ntransformed_bboxes = transformed['bboxes']\ntransformed_class_labels = transformed['class_labels']\ntransformed_class_categories = transformed['class_categories']\n
"},{"location":"getting_started/bounding_boxes_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations to augment bounding boxes for object detection tasks
  • How to use Albumentations for detection tasks if you need to keep all bounding boxes
  • Showcase. Cool augmentation examples on a diverse set of images from various real-world tasks.
"},{"location":"getting_started/image_augmentation/","title":"Image augmentation for classification","text":"

We can divide the process of image augmentation into four steps:

  1. Import albumentations and a library to read images from the disk (e.g., OpenCV).
  2. Define an augmentation pipeline.
  3. Read images from the disk.
  4. Pass images to the augmentation pipeline and receive augmented images.
"},{"location":"getting_started/image_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"
  • Import Albumentations
Python
import albumentations as A\n
  • Import a library to read images from the disk. In this example, we will use OpenCV. It is an open-source computer vision library that supports many image formats. Albumentations has OpenCV as a dependency, so you already have OpenCV installed.
Python
import cv2\n
"},{"location":"getting_started/image_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

To define an augmentation pipeline, you need to create an instance of the Compose class. As an argument to the Compose class, you need to pass a list of augmentations you want to apply. A call to Compose will return a transform function that will perform image augmentation.

Let's look at an example:

Python
transform = A.Compose([\n    A.RandomCrop(width=256, height=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n

In the example, Compose receives a list with three augmentations: A.RandomCrop, A.HorizontalFlip, and A.RandomBrightnessContrast. You can find the full list of all available augmentations in the GitHub repository and in the API Docs. A demo playground that demonstrates how augmentations will transform the input image is available at https://explore.albumentations.ai.

To create an augmentation, you create an instance of the required augmentation class and pass augmentation parameters to it. A.RandomCrop receives two parameters, height and width. A.RandomCrop(width=256, height=256) means that A.RandomCrop will take an input image, extract a random patch with size 256 by 256 pixels from it and then pass the result to the next augmentation in the pipeline (in this case to A.HorizontalFlip).

A.HorizontalFlip in this example has one parameter named p. p is a special parameter that is supported by almost all augmentations. It controls the probability of applying the augmentation. p=0.5 means that with a probability of 50%, the transform will flip the image horizontally, and with a probability of 50%, the transform won't modify the input image.

A.RandomBrightnessContrast in the example also has one parameter, p. With a probability of 20%, this augmentation will change the brightness and contrast of the image received from A.HorizontalFlip. And with a probability of 80%, it will keep the received image unchanged.

A visualized version of the augmentation pipeline. You pass an image to it, the image goes through all transformations, and then you receive an augmented image from the pipeline.

"},{"location":"getting_started/image_augmentation/#step-3-read-images-from-the-disk","title":"Step 3. Read images from the disk.","text":"

To pass an image to the augmentation pipeline, you need to read it from the disk. The pipeline expects to receive an image in the form of a NumPy array. If it is a color image, it should have three channels in the following order: Red, Green, Blue (so a regular RGB image).

To read images from the disk, you can use OpenCV - a popular library for image processing. It supports a lot of input formats and is installed along with Albumentations since Albumentations utilizes that library under the hood for a lot of augmentations.

To import OpenCV

Python
import cv2\n

To read an image with OpenCV

Python

image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
Note the usage of cv2.cvtColor. For historical reasons, OpenCV reads an image in BGR format (so color channels of the image have the following order: Blue, Green, Red). Albumentations uses the most common and popular RGB image format. So when using OpenCV, we need to convert the image format to RGB explicitly.

Besides OpenCV, you can use other image processing libraries.

"},{"location":"getting_started/image_augmentation/#pillow","title":"Pillow","text":"

Pillow is a popular Python image processing library.

  • Install Pillow
Bash
    pip install pillow\n
  • Import Pillow and NumPy (we need NumPy to convert a Pillow image to a NumPy array. NumPy is already installed along with Albumentations).
Python
from PIL import Image\nimport numpy as np\n
  • Read an image with Pillow and convert it to a NumPy array. Python
    pillow_image = Image.open(\"image.jpg\")\nimage = np.array(pillow_image)\n
"},{"location":"getting_started/image_augmentation/#step-4-pass-images-to-the-augmentation-pipeline-and-receive-augmented-images","title":"Step 4. Pass images to the augmentation pipeline and receive augmented images.","text":"

To pass an image to the augmentation pipeline you need to call the transform function created by a call to A.Compose at Step 2. In the image argument to that function, you need to pass an image that you want to augment.

Python
transformed = transform(image=image)\n

transform will return a dictionary with a single key image. Value at that key will contain an augmented image.

Python
transformed_image = transformed[\"image\"]\n

To augment the next image, you need to call transform again and pass a new image as the image argument:

Python
another_transformed_image = transform(image=another_image)[\"image\"]\n

Each augmentation will change the input image with the probability set by the parameter p. Also, many augmentations have parameters that control the magnitude of changes that will be applied to an image. For example, A.RandomBrightnessContrast has two parameters: brightness_limit that controls the magnitude of adjusting brightness and contrast_limit that controls the magnitude of adjusting contrast. The bigger the value, the more the augmentation will change an image. During augmentation, a magnitude of the transformation is sampled from a uniform distribution limited by brightness_limit and contrast_limit. That means that if you make multiple calls to transform with the same input image, you will get a different output image each time.

Python
transform = A.Compose([\n    A.RandomBrightnessContrast(brightness_limit=1, contrast_limit=1, p=1.0),\n])\ntransformed_image_1 = transform(image=image)['image']\ntransformed_image_2 = transform(image=image)['image']\ntransformed_image_3 = transform(image=image)['image']\n

"},{"location":"getting_started/image_augmentation/#examples","title":"Examples","text":"
  • Defining a simple augmentation pipeline for image augmentation
  • Working with non-8-bit images
  • Weather augmentations in Albumentations
  • Showcase. Cool augmentation examples on a diverse set of images from various real-world tasks.
"},{"location":"getting_started/installation/","title":"Installation","text":"

Albumentations requires Python 3.8 or higher.

"},{"location":"getting_started/installation/#install-the-latest-stable-version-from-pypi","title":"Install the latest stable version from PyPI","text":"Bash
pip install -U albumentations\n
"},{"location":"getting_started/installation/#install-the-latest-version-from-the-master-branch-on-github","title":"Install the latest version from the master branch on GitHub","text":"Bash
pip install -U git+https://github.com/albumentations-team/albumentations\n
"},{"location":"getting_started/installation/#note-on-opencv-dependencies","title":"Note on OpenCV dependencies","text":"

By default, pip downloads a wheel distribution of Albumentations. This distribution has opencv-python-headless as its dependency.

If you already have some OpenCV distribution (such as opencv-python-headless, opencv-python, opencv-contrib-python or opencv-contrib-python-headless) installed in your Python environment, you can force Albumentations to use it by providing the --no-binary qudida,albumentations argument to pip, e.g.

Bash
pip install -U albumentations --no-binary qudida,albumentations\n

pip will use the following logic to determine the required OpenCV distribution:

  1. If your Python environment already contains opencv-python, opencv-contrib-python, opencv-contrib-python-headless or opencv-python-headless pip will use it.
  2. If your Python environment doesn't contain any OpenCV distribution from step 1, pip will download opencv-python-headless.
"},{"location":"getting_started/installation/#install-the-latest-stable-version-from-conda-forge","title":"Install the latest stable version from conda-forge","text":"

If you are using Anaconda or Miniconda you can install Albumentations from conda-forge:

Bash
conda install -c conda-forge albumentations\n
"},{"location":"getting_started/keypoints_augmentation/","title":"Keypoints augmentation","text":"

Computer vision tasks such as human pose estimation, face detection, and emotion recognition usually work with keypoints on the image.

In the case of pose estimation, keypoints mark human joints such as shoulder, elbow, wrist, knee, etc.

Keypoints annotations along with visualized edges between keypoints. Images are from the COCO dataset.

In the case of face detection, keypoints mark important areas of the face such as eyes, nose, corners of the mouth, etc.

Facial keypoints. Source: the \"Facial Keypoints Detection\" competition on Kaggle.

To define a keypoint, you usually need two values, x and y coordinates of the keypoint. Coordinates of the keypoint are calculated with respect to the top-left corner of the image which has (x, y) coordinates (0, 0). Often keypoints have associated labels such as right_elbow, left_wrist, etc.

An example image with five keypoints from the COCO dataset

Some classical computer vision algorithms, such as SIFT, may use four values to describe a keypoint. In addition to the x and y coordinates, there are keypoint scale and keypoint angle. Albumentations supports those values as well.

A keypoint may also have associated scale and angle values

Keypoint angles are measured counter-clockwise. For example, in the following image, the angle value is 65\u00b0. You can read more about the angle of rotation in the Wikipedia article.

"},{"location":"getting_started/keypoints_augmentation/#supported-formats-for-keypoints-coordinates","title":"Supported formats for keypoints' coordinates.","text":"
  • xy. A keypoint is defined by x and y coordinates in pixels.

  • yx. A keypoint is defined by y and x coordinates in pixels.

  • xya. A keypoint is defined by x and y coordinates in pixels and the angle.

  • xys. A keypoint is defined by x and y coordinates in pixels, and the scale.

  • xyas. A keypoint is defined by x and y coordinates in pixels, the angle, and the scale.

  • xysa. A keypoint is defined by x and y coordinates in pixels, the scale, and the angle.
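
To make these formats concrete, here is a small sketch that writes one illustrative keypoint (the specific numbers are assumptions) in each of them:

Python
x, y, angle, scale = 264, 203, 65, 10  # pixel coordinates, angle in degrees, scale in pixels\n\nkeypoint_xy = (x, y)\nkeypoint_yx = (y, x)\nkeypoint_xya = (x, y, angle)\nkeypoint_xys = (x, y, scale)\nkeypoint_xyas = (x, y, angle, scale)\nkeypoint_xysa = (x, y, scale, angle)\n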

"},{"location":"getting_started/keypoints_augmentation/#augmenting-keypoints","title":"Augmenting keypoints","text":"

The process of augmenting keypoints looks very similar to the bounding boxes augmentation. It consists of 4 steps.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and keypoints from the disk.
  4. You pass an image and keypoints to the augmentation pipeline and receive augmented images and keypoints.

Note

Some transforms in Albumentations don't support keypoints. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment keypoints.

"},{"location":"getting_started/keypoints_augmentation/#step-1-import-the-required-libraries","title":"Step 1. Import the required libraries.","text":"Python
import albumentations as A\nimport cv2\n
"},{"location":"getting_started/keypoints_augmentation/#step-2-define-an-augmentation-pipeline","title":"Step 2. Define an augmentation pipeline.","text":"

Here is an example of a minimal declaration of an augmentation pipeline that works with keypoints.

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy'))\n

Note that just like with bounding boxes, Compose has an additional parameter that defines the format for keypoints' coordinates. In the case of keypoints, it is called keypoint_params. Here we pass an instance of A.KeypointParams that says that xy coordinates format should be used.

Besides format, A.KeypointParams supports a few more settings.

Here is an example of Compose that shows all available settings with A.KeypointParams

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels'], remove_invisible=True, angle_in_degrees=True))\n
"},{"location":"getting_started/keypoints_augmentation/#label_fields","title":"label_fields","text":"

In some computer vision tasks, keypoints have not only coordinates but associated labels as well. For example, in pose estimation, each keypoint has a label such as elbow, knee or wrist. You need to pass those labels in a separate argument (or arguments, because you can use multiple fields) to the transform function that will augment keypoints. label_fields defines names of those fields. Step 4 describes how you need to use the transform function.

"},{"location":"getting_started/keypoints_augmentation/#remove_invisible","title":"remove_invisible","text":"

After the augmentation, some keypoints may become invisible because they will be located outside of the augmented image's visible area. For example, if you crop a part of the image, all the keypoints outside of the cropped area will become invisible. If remove_invisible is set to True, Albumentations won't return invisible keypoints. remove_invisible is set to True by default, so if you don't pass that argument, Albumentations won't return invisible keypoints.
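
As a minimal sketch, a pipeline that keeps invisible keypoints could be declared like this:

Python
import albumentations as A\n\ntransform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n], keypoint_params=A.KeypointParams(format='xy', remove_invisible=False))\n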

"},{"location":"getting_started/keypoints_augmentation/#angle_in_degrees","title":"angle_in_degrees","text":"

If angle_in_degrees is set to True (this is the default value), then Albumentations expects that the angle value in the formats xya, xyas, and xysa is specified in degrees. If angle_in_degrees is set to False, Albumentations expects that the angle value is specified in radians.

This setting doesn't affect xy and yx formats, because those formats don't use angles.
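
As a small sketch, assuming a keypoint whose angle is 65 degrees, the same kind of pipeline declared with angles in radians could look like this:

Python
import math\n\nimport albumentations as A\n\ntransform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n], keypoint_params=A.KeypointParams(format='xya', angle_in_degrees=False))\n\nkeypoints = [(264, 203, math.radians(65))]  # the angle is now passed in radians\n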

"},{"location":"getting_started/keypoints_augmentation/#3-read-images-and-keypoints-from-the-disk","title":"3. Read images and keypoints from the disk.","text":"

Read an image from the disk.

Python

image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
Keypoints can be stored on the disk in different serialization formats: JSON, XML, YAML, CSV, etc. So the code to read keypoints depends on the actual format of data on the disk.

After you read the data from the disk, you need to prepare keypoints for Albumentations.

Albumentations expects that keypoints will be represented as a list of lists. Each list contains information about a single keypoint. A keypoint definition should have two to four elements depending on the selected format of keypoints. The first two elements are the x and y coordinates of a keypoint in pixels (or the y and x coordinates in the yx format). The third and fourth elements may be the angle and the scale of the keypoint if you select a format that uses those values.

"},{"location":"getting_started/keypoints_augmentation/#step-4-pass-an-image-and-keypoints-to-the-augmentation-pipeline-and-receive-augmented-images-and-boxes","title":"Step 4. Pass an image and keypoints to the augmentation pipeline and receive augmented images and boxes.","text":"

Let's say you have an example image with five keypoints.

A list with those five keypoints' coordinates in the xy format will look like this:

Python
keypoints = [\n    (264, 203),\n    (86, 88),\n    (254, 160),\n    (193, 103),\n    (65, 341),\n]\n

Then you pass those keypoints to the transform function along with the image and receive the augmented versions of image and keypoints.

Python
transformed = transform(image=image, keypoints=keypoints)\ntransformed_image = transformed['image']\ntransformed_keypoints = transformed['keypoints']\n

The augmented image with augmented keypoints

If you set remove_invisible to False in keypoint_params, then Albumentations will return all keypoints, even if they lie outside the visible area. In the example image below, you can see that the keypoint for the right hip is located outside the image, but Albumentations still returned it. The area outside the image is highlighted in yellow.

When remove_invisible is set to False Albumentations will return all keypoints, even those located outside the image

If keypoints have associated class labels, you need to create a list that contains those labels:

Python
class_labels = [\n    'left_elbow',\n    'right_elbow',\n    'left_wrist',\n    'right_wrist',\n    'right_hip',\n]\n

Also, you need to declare the name of the argument to transform that will contain those labels. For the declaration, you need to use the label_fields parameter of A.KeypointParams.

For example, we could use the class_labels name for the argument with labels.

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels']))\n

Next, you pass both keypoints' coordinates and class labels to transform.

Python
transformed = transform(image=image, keypoints=keypoints, class_labels=class_labels)\ntransformed_image = transformed['image']\ntransformed_keypoints = transformed['keypoints']\ntransformed_class_labels = transformed['class_labels']\n

Note that label_fields expects a list, so you can set multiple fields that contain labels for your keypoints. So if you declare Compose like

Python
transform = A.Compose([\n    A.RandomCrop(width=330, height=330),\n    A.RandomBrightnessContrast(p=0.2),\n], keypoint_params=A.KeypointParams(format='xy', label_fields=['class_labels', 'class_sides']))\n

you can use those multiple arguments to pass info about class labels, like

Python
class_labels = [\n    'left_elbow',\n    'right_elbow',\n    'left_wrist',\n    'right_wrist',\n    'right_hip',\n]\n\nclass_sides = ['left', 'right', 'left', 'right', 'right']\n\ntransformed = transform(image=image, keypoints=keypoints, class_labels=class_labels, class_sides=class_sides)\ntransformed_class_sides = transformed['class_sides']\ntransformed_class_labels = transformed['class_labels']\ntransformed_keypoints = transformed['keypoints']\ntransformed_image = transformed['image']\n

Example input and output data for keypoints augmentation with two separate arguments for class labels

Note

Some augmentations may affect class labels and make them incorrect. For example, the HorizontalFlip augmentation mirrors the input image. When you apply that augmentation to keypoints that mark the side of body parts (left or right), those keypoints will point to the wrong side (since left on the mirrored image becomes right). So when you are creating an augmentation pipeline, look carefully at which augmentations could be applied to the input data.

HorizontalFlip may make keypoints' labels incorrect

"},{"location":"getting_started/keypoints_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations to augment keypoints
"},{"location":"getting_started/mask_augmentation/","title":"Mask augmentation for segmentation","text":"

For instance and semantic segmentation tasks, you need to augment both the input image and one or more output masks.

Albumentations ensures that the input image and the output mask will receive the same set of augmentations with the same parameters.

The process of augmenting images and masks looks very similar to the regular image-only augmentation.

  1. You import the required libraries.
  2. You define an augmentation pipeline.
  3. You read images and masks from the disk.
  4. You pass an image and one or more masks to the augmentation pipeline and receive augmented images and masks.
"},{"location":"getting_started/mask_augmentation/#steps-1-and-2-import-the-required-libraries-and-define-an-augmentation-pipeline","title":"Steps 1 and 2. Import the required libraries and define an augmentation pipeline.","text":"

Image augmentation for classification described Steps 1 and 2 in great detail. These are the same steps for the simultaneous augmentation of images and masks.

Python
import albumentations as A\nimport cv2\n\ntransform = A.Compose([\n    A.RandomCrop(width=256, height=256),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n
"},{"location":"getting_started/mask_augmentation/#step-3-read-images-and-masks-from-the-disk","title":"Step 3. Read images and masks from the disk.","text":"
  • Reading an image
Python
image = cv2.imread(\"/path/to/image.jpg\")\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n
  • For semantic segmentation, you usually read one mask per image. Albumentations expects the mask to be a NumPy array. The height and width of the mask should have the same values as the height and width of the image.
Python
mask = cv2.imread(\"/path/to/mask.png\")\n
  • For instance segmentation, you sometimes need to read multiple masks per image. Then you create a list that contains all the masks.
Python
mask_1 = cv2.imread(\"/path/to/mask_1.png\")\nmask_2 = cv2.imread(\"/path/to/mask_2.png\")\nmask_3 = cv2.imread(\"/path/to/mask_3.png\")\nmasks = [mask_1, mask_2, mask_3]\n

Some datasets use other formats to store masks. For example, they can use Run-Length Encoding or polygon coordinates. In that case, you need to convert a mask to a NumPy array before augmenting it with Albumentations. Often dataset authors provide special libraries and tools to simplify the conversion.
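
For example, here is a hedged sketch of rasterizing a polygon annotation into a NumPy mask with OpenCV; the polygon coordinates and mask size are illustrative, and for Run-Length Encoded masks you would typically use the decoding utilities shipped with the dataset (such as pycocotools for COCO):

Python
import cv2\nimport numpy as np\n\n# Hypothetical polygon annotation: a list of (x, y) vertices for one object\npolygon = np.array([[10, 10], [100, 10], [100, 100], [10, 100]], dtype=np.int32)\n\nmask = np.zeros((256, 256), dtype=np.uint8)  # same height and width as the image\ncv2.fillPoly(mask, [polygon], color=1)       # rasterize the polygon into the mask\n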

"},{"location":"getting_started/mask_augmentation/#step-4-pass-image-and-masks-to-the-augmentation-pipeline-and-receive-augmented-images-and-masks","title":"Step 4. Pass image and masks to the augmentation pipeline and receive augmented images and masks.","text":"

If the image has one associated mask, you need to call transform with two arguments: image and mask. In image you should pass the input image, in mask you should pass the output mask. transform will return a dictionary with two keys: image will contain the augmented image, and mask will contain the augmented mask.

Python
transformed = transform(image=image, mask=mask)\ntransformed_image = transformed['image']\ntransformed_mask = transformed['mask']\n

An image and a mask before and after augmentation. Inria Aerial Image Labeling dataset contains aerial photos as well as their segmentation masks. Each pixel of the mask is marked as 1 if the pixel belongs to the class building and 0 otherwise.

If the image has multiple associated masks, you should use the masks argument instead of mask. In masks you should pass a list of masks.

Python
transformed = transform(image=image, masks=masks)\ntransformed_image = transformed['image']\ntransformed_masks = transformed['masks']\n
"},{"location":"getting_started/mask_augmentation/#examples","title":"Examples","text":"
  • Using Albumentations for a semantic segmentation task
  • Showcase. Cool augmentation examples on a diverse set of images from various real-world tasks.
"},{"location":"getting_started/setting_probabilities/","title":"Setting probabilities for transforms in an augmentation pipeline","text":"

Each augmentation in Albumentations has a parameter named p that sets the probability of applying that augmentation to input data.

The following augmentations have the default value of p set to 1 (which means that by default they will be applied to each instance of input data): Compose, ReplayCompose, CenterCrop, Crop, CropNonEmptyMaskIfExists, FromFloat, IAACropAndPad, Lambda, LongestMaxSize, Normalize, PadIfNeeded, RandomCrop, RandomCropNearBBox, RandomResizedCrop, RandomSizedBBoxSafeCrop, RandomSizedCrop, Resize, SmallestMaxSize, ToFloat.

All other augmentations have the default value of p set to 0.5, which means that by default, they will be applied to 50% of instances of input data.

Let's take a look at the example:

Python
import albumentations as A\nimport cv2\n\np1 = 0.95\np2 = 0.85\np3 = 0.75\n\ntransform = A.Compose([\n    A.RandomRotate90(p=p2),\n    A.OneOf([\n        A.IAAAdditiveGaussianNoise(p=0.9),\n        A.GaussNoise(p=0.6),\n    ], p=p3)\n], p=p1)\n\nimage = cv2.imread('some/image.jpg')\nimage = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n\ntransformed = transform(image=image)\ntransformed_image = transformed['image']\n

We declare an augmentation pipeline. In this pipeline, we use three placeholder values to set probabilities: p1, p2, and p3. Let's take a closer look at them.

"},{"location":"getting_started/setting_probabilities/#p1","title":"p1","text":"

p1 sets the probability that the augmentation pipeline will apply augmentations at all.

If p1 is set to 0, then augmentations inside Compose will never be applied to the input image, so the augmentation pipeline will always return the input image unchanged.

If p1 is set to 1, then all augmentations inside Compose will have a chance to be applied. The example above contains two augmentations inside Compose: RandomRotate90 and the OneOf block with two child augmentations (more on their probabilities later). Any value of p1 between 0 and 1 means that augmentations inside Compose could be applied with the probability between 0 and 100%.

If p1 equals 1, or if p1 is less than 1 but the random generator decides to apply the augmentations inside Compose, the probabilities p2 and p3 come into play.

"},{"location":"getting_started/setting_probabilities/#p2","title":"p2","text":"

Each augmentation inside Compose has a probability of being applied. p2 sets the probability of applying RandomRotate90. In the example above, p2 equals 0.85, so RandomRotate90 has an 85% chance to be applied to the input image.

"},{"location":"getting_started/setting_probabilities/#p3","title":"p3","text":"

p3 sets the probability of applying the OneOf block. If the random generator decided to apply RandomRotate90 at the previous step, then OneOf will receive data augmented by it. If the random generator decided not to apply RandomRotate90 then OneOf will receive the input data (that was passed to Compose) since RandomRotate90 is skipped.

The OneOf block applies one of the augmentations inside it. That means that if the random generator chooses to apply OneOf, then one child augmentation from it will be applied to the input data.

To decide which augmentation within the OneOf block is used, Albumentations uses the following rule:

The OneOf block normalizes the probabilities of all augmentations inside it, so their probabilities sum up to 1. Next, OneOf chooses one of the augmentations inside it with a chance defined by its normalized probability and applies it to the input data. In the example above, IAAAdditiveGaussianNoise has probability 0.9 and GaussNoise has probability 0.6. After normalization, they become 0.6 and 0.4. This means that OneOf will apply IAAAdditiveGaussianNoise with probability 0.6 and GaussNoise with probability 0.4.

"},{"location":"getting_started/setting_probabilities/#example-calculations","title":"Example calculations","text":"

Thus, each augmentation in the example above will be applied with the probability:

  • RandomRotate90: p1 * p2
  • IAAAdditiveGaussianNoise: p1 * p3 * (0.9 / (0.9 + 0.6))
  • GaussNoise: p1 * p3 * (0.6 / (0.9 + 0.6))
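
Plugging the values from the pipeline above into these formulas gives the following numbers:

Python
p1, p2, p3 = 0.95, 0.85, 0.75\n\np_random_rotate90 = p1 * p2                        # 0.8075\np_additive_gaussian_noise = p1 * p3 * (0.9 / 1.5)  # 0.4275\np_gauss_noise = p1 * p3 * (0.6 / 1.5)              # 0.285\n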
"},{"location":"getting_started/simultaneous_augmentation/","title":"Simultaneous augmentation of multiple targets: masks, bounding boxes, keypoints","text":"

Albumentations can apply the same set of transformations to the input images and all the targets that are passed to transform: masks, bounding boxes, and keypoints.

Please refer to articles Image augmentation for classification, Mask augmentation for segmentation, Bounding boxes augmentation for object detection, and Keypoints augmentation for the detailed description of each data type.

Note

Some transforms in Albumentations don't support bounding boxes or keypoints. If you try to use them, you will get an exception. Please refer to this article to check whether a transform can augment bounding boxes and keypoints.

Below is an example of how you can simultaneously augment the input image, mask, bounding boxes with their labels, and keypoints with their labels. Note that the only required argument to transform is image; all other arguments are optional, and you can combine them in any way.

"},{"location":"getting_started/simultaneous_augmentation/#step-1-define-compose-with-parameters-that-specify-formats-for-bounding-boxes-and-keypoints","title":"Step 1. Define Compose with parameters that specify formats for bounding boxes and keypoints.","text":"Python
transform = A.Compose(\n  [A.RandomCrop(width=330, height=330), A.RandomBrightnessContrast(p=0.2)],\n  bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"bbox_classes\"]),\n  keypoint_params=A.KeypointParams(format=\"xy\", label_fields=[\"keypoints_classes\"]),\n)\n
"},{"location":"getting_started/simultaneous_augmentation/#step-2-load-all-required-data-from-the-disk","title":"Step 2. Load all required data from the disk","text":"

Please refer to articles Image augmentation for classification, Mask augmentation for segmentation, Bounding boxes augmentation for object detection, and Keypoints augmentation for more information about loading the input data.

For example, here is an image from the COCO dataset that has one associated mask, one bounding box with the class label person, and five keypoints that define body parts.

An example image with mask, bounding boxes and keypoints

"},{"location":"getting_started/simultaneous_augmentation/#step-3-pass-all-targets-to-transform-and-receive-their-augmented-versions","title":"Step 3. Pass all targets to transform and receive their augmented versions","text":"Python
transformed = transform(\n  image=img,\n  mask=mask,\n  bboxes=bboxes,\n  bbox_classes=bbox_classes,\n  keypoints=keypoints,\n  keypoints_classes=keypoints_classes,\n)\ntransformed_image = transformed[\"image\"]\ntransformed_mask = transformed[\"mask\"]\ntransformed_bboxes = transformed[\"bboxes\"]\ntransformed_bbox_classes = transformed[\"bbox_classes\"]\ntransformed_keypoints = transformed[\"keypoints\"]\ntransformed_keypoints_classes = transformed[\"keypoints_classes\"]\n

The augmented version of the image and its targets

"},{"location":"getting_started/simultaneous_augmentation/#examples","title":"Examples","text":"
  • Showcase. Cool augmentation examples on a diverse set of images from various real-world tasks.
"},{"location":"getting_started/transforms_and_targets/","title":"A list of transforms and their supported targets","text":"

We can split all transforms into two groups: pixel-level transforms and spatial-level transforms. Pixel-level transforms change only the input image and leave any additional targets such as masks, bounding boxes, and keypoints unchanged. Spatial-level transforms simultaneously change both the input image and additional targets such as masks, bounding boxes, and keypoints. For additional information, please refer to this section of \"Why you need a dedicated library for image augmentation\".
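
A minimal sketch illustrating the difference (the random input data is only for demonstration): a pixel-level transform returns the mask untouched, while a spatial-level transform changes the image and the mask together.

Python
import albumentations as A\nimport numpy as np\n\nimage = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)\nmask = np.random.randint(0, 2, (256, 256), dtype=np.uint8)\n\n# Pixel-level transform: only the image changes, the mask is returned as is\npixel_level = A.Compose([A.RandomBrightnessContrast(p=1.0)])\nout = pixel_level(image=image, mask=mask)\nassert np.array_equal(out['mask'], mask)\n\n# Spatial-level transform: the image and the mask are flipped together\nspatial_level = A.Compose([A.HorizontalFlip(p=1.0)])\nout = spatial_level(image=image, mask=mask)\nassert np.array_equal(out['mask'], mask[:, ::-1])\n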

"},{"location":"getting_started/transforms_and_targets/#pixel-level-transforms","title":"Pixel-level transforms","text":"

Here is a list of all available pixel-level transforms. You can apply a pixel-level transform to any target, and under the hood, the transform will change only the input image and return any other input targets such as masks, bounding boxes, or keypoints unchanged.

  • AdditiveNoise
  • AdvancedBlur
  • AutoContrast
  • Blur
  • CLAHE
  • ChannelDropout
  • ChannelShuffle
  • ChromaticAberration
  • ColorJitter
  • Defocus
  • Downscale
  • Emboss
  • Equalize
  • FDA
  • FancyPCA
  • FromFloat
  • GaussNoise
  • GaussianBlur
  • GlassBlur
  • HistogramMatching
  • HueSaturationValue
  • ISONoise
  • Illumination
  • ImageCompression
  • InvertImg
  • MedianBlur
  • MotionBlur
  • MultiplicativeNoise
  • Normalize
  • PixelDistributionAdaptation
  • PlanckianJitter
  • PlasmaBrightnessContrast
  • PlasmaShadow
  • Posterize
  • RGBShift
  • RandomBrightnessContrast
  • RandomFog
  • RandomGamma
  • RandomGravel
  • RandomRain
  • RandomShadow
  • RandomSnow
  • RandomSunFlare
  • RandomToneCurve
  • RingingOvershoot
  • SaltAndPepper
  • Sharpen
  • ShotNoise
  • Solarize
  • Spatter
  • Superpixels
  • TemplateTransform
  • TextImage
  • ToFloat
  • ToGray
  • ToRGB
  • ToSepia
  • UnsharpMask
  • ZoomBlur
"},{"location":"getting_started/transforms_and_targets/#spatial-level-transforms","title":"Spatial-level transforms","text":"

Here is a table with spatial-level transforms and targets they support. If you try to apply a spatial-level transform to an unsupported target, Albumentations will raise an error.

Transform Image Mask BBoxes Keypoints Volume Mask3D Affine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 AtLeastOneBBoxRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 BBoxSafeRandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CenterCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CoarseDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Crop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropAndPad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 CropNonEmptyMaskIfExists \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 D4 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ElasticTransform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Erasing \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 FrequencyMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 GridElasticDeform \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 HorizontalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Lambda \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 LongestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 MaskDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Morphological \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 NoOp \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OpticalDistortion \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 OverlayElements \u2713 \u2713 Pad \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PadIfNeeded \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Perspective \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PiecewiseAffine \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 PixelDropout \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropFromBorders \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomCropNearBBox \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomGridShuffle \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomResizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomRotate90 \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomScale \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedBBoxSafeCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 RandomSizedCrop \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Resize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Rotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SafeRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ShiftScaleRotate \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 SmallestMaxSize \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 ThinPlateSpline \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 TimeReverse \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 Transpose \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 VerticalFlip \u2713 \u2713 \u2713 \u2713 \u2713 \u2713 XYMasking \u2713 \u2713 \u2713 \u2713 \u2713 \u2713"},{"location":"getting_started/video_augmentation/","title":"Working with Video Data in Albumentations","text":""},{"location":"getting_started/video_augmentation/#overview","title":"Overview","text":"

While Albumentations is primarily known for image augmentation, it can effectively process video data by treating it as a sequence of frames. When you pass a video as a numpy array, Albumentations will apply the same transform with identical parameters to each frame, ensuring temporal consistency.

"},{"location":"getting_started/video_augmentation/#data-format","title":"Data Format","text":""},{"location":"getting_started/video_augmentation/#video-frames","title":"Video Frames","text":"

Albumentations accepts video data as numpy arrays in the following formats:
  • (N, H, W) - Grayscale video (N frames)
  • (N, H, W, C) - Color video (N frames)

Where:
  • N = Number of frames
  • H = Height
  • W = Width
  • C = Channels (e.g., 3 for RGB)

"},{"location":"getting_started/video_augmentation/#video-masks","title":"Video Masks","text":"

For video segmentation tasks, masks should match the frame dimensions:
  • (N, H, W) - Binary or single-class masks
  • (N, H, W, C) - Multi-class masks

"},{"location":"getting_started/video_augmentation/#basic-usage","title":"Basic Usage","text":"Python
import albumentations as A\nimport numpy as np\n
"},{"location":"getting_started/video_augmentation/#create-transform-pipeline","title":"Create transform pipeline","text":"Python
transform = A.Compose([\n    A.RandomCrop(height=224, width=224),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n], seed=42)\n
"},{"location":"getting_started/video_augmentation/#example-video-data","title":"Example video data","text":"Python
video = np.random.rand(32, 256, 256, 3) # 32 RGB frames\nmasks = np.zeros((32, 256, 256)) # 32 binary masks\n
"},{"location":"getting_started/video_augmentation/#apply-transform","title":"Apply transform","text":"Python
augmented_video = transform(images=video, masks=masks)\n
"},{"location":"getting_started/video_augmentation/#apply-transforms-same-parameters-for-all-frames","title":"Apply transforms - same parameters for all frames","text":"Python
transformed = transform(images=video, masks=masks)\ntransformed_video = transformed['images']\ntransformed_masks = transformed['masks']\n
"},{"location":"getting_started/video_augmentation/#key-features","title":"Key Features","text":"
  1. Temporal Consistency: The same transform with identical parameters is applied to all frames, preserving temporal consistency.

  2. Memory Efficiency: Frames are processed as a batch, avoiding repeated parameter generation.

  3. Compatible with All Transforms: Works with any Albumentations transform that supports the image target.

"},{"location":"getting_started/video_augmentation/#example-pipeline-for-video-processing","title":"Example Pipeline for Video Processing","text":"Python
def create_video_pipeline(\n    crop_size=(224, 224),\n    p_spatial=0.5,\n    p_color=0.3\n    ):\n    return A.Compose([\n        # Spatial transforms - same crop/flip for all frames\n        A.RandomCrop(\n            height=crop_size[0],\n            width=crop_size[1],\n            p=1.0\n        ),\n        A.HorizontalFlip(p=p_spatial),\n        # Color transforms - same adjustment for all frames\n        A.ColorJitter(\n            brightness=0.2,\n            contrast=0.2,\n            saturation=0.2,\n            hue=0.1,\n            p=p_color\n        ),\n        # Noise/blur - same pattern for all frames\n        A.GaussianBlur(p=0.3),\n    ])\n
"},{"location":"getting_started/video_augmentation/#best-practices","title":"Best Practices","text":"
  1. Performance Optimization:
  • Place cropping operations first to reduce computation
  • Consider frame rate and whether all frames need processing
"},{"location":"getting_started/video_augmentation/#next-steps","title":"Next Steps","text":"
  • Learn about Volumetric Data (3D) for volumetric data
"},{"location":"getting_started/volumetric_augmentation/","title":"Introduction to 3D Medical Image Augmentation","text":""},{"location":"getting_started/volumetric_augmentation/#overview","title":"Overview","text":"

While primarily used for medical imaging (CT scans, MRI), Albumentations' 3D transforms can be applied to various volumetric data types:

"},{"location":"getting_started/volumetric_augmentation/#medical-imaging","title":"Medical Imaging","text":"
  • CT and MRI scans
  • Ultrasound volumes
  • PET scans
  • Multi-modal medical imaging
"},{"location":"getting_started/volumetric_augmentation/#scientific-data","title":"Scientific Data","text":"
  • Microscopy z-stacks
  • Cryo-EM volumes
  • Geological seismic data
  • Weather radar volumes
"},{"location":"getting_started/volumetric_augmentation/#industrial-applications","title":"Industrial Applications","text":"
  • 3D NDT (Non-Destructive Testing) scans
  • Industrial CT for quality control
  • Material analysis volumes
  • 3D ultrasonic testing data
"},{"location":"getting_started/volumetric_augmentation/#computer-vision","title":"Computer Vision","text":"
  • Depth camera sequences
  • LiDAR point cloud voxelizations
  • Multi-view stereo reconstructions
"},{"location":"getting_started/volumetric_augmentation/#data-format","title":"Data Format","text":""},{"location":"getting_started/volumetric_augmentation/#volumes","title":"Volumes","text":"

Albumentations expects 3D volumes as numpy arrays in the following formats:
  • (D, H, W) - Single-channel volumes (e.g., CT scans)
  • (D, H, W, C) - Multi-channel volumes (e.g., multi-modal MRI)

Where:
  • D = Depth (number of slices)
  • H = Height
  • W = Width
  • C = Channels (optional)

"},{"location":"getting_started/volumetric_augmentation/#3d-masks","title":"3D Masks","text":"

Segmentation masks should match the volume dimensions:
  • (D, H, W) - Binary or single-class masks
  • (D, H, W, C) - Multi-class masks

"},{"location":"getting_started/volumetric_augmentation/#basic-usage","title":"Basic Usage","text":"Python
import albumentations as A\nimport numpy as np\n
"},{"location":"getting_started/volumetric_augmentation/#create-a-basic-3d-augmentation-pipeline","title":"Create a basic 3D augmentation pipeline","text":"Python
transform = A.Compose([\n    # Crop volume to a fixed size for memory efficiency\n    A.RandomCrop3D(size=(64, 128, 128), p=1.0),    \n    # Randomly remove cubic regions to simulate occlusions\n    A.CoarseDropout3D(\n        num_holes_range=(2, 6),\n        hole_depth_range=(0.1, 0.3),\n        hole_height_range=(0.1, 0.3),\n        hole_width_range=(0.1, 0.3),\n        p=0.5\n    ),    \n])\n
"},{"location":"getting_started/volumetric_augmentation/#apply-to-volume-and-mask","title":"Apply to volume and mask","text":"Python
volume = np.random.rand(96, 256, 256) # Your 3D medical volume\nmask = np.zeros((96, 256, 256)) # Your 3D segmentation mask\ntransformed = transform(volume=volume, mask3d=mask)\ntransformed_volume = transformed['volume']\ntransformed_mask = transformed['mask3d']\n
"},{"location":"getting_started/volumetric_augmentation/#available-3d-transforms","title":"Available 3D Transforms","text":"

Here are some examples of available 3D transforms:

  • CenterCrop3D - Crop the center part of a 3D volume
  • RandomCrop3D - Randomly crop a part of a 3D volume
  • Pad3D - Pad a 3D volume
  • PadIfNeeded3D - Pad if volume size is less than desired size
  • CoarseDropout3D - Random dropout of 3D cubic regions
  • CubicSymmetry - Apply random cubic symmetry transformations

For a complete and up-to-date list of all available 3D transforms, please see our API Reference.

"},{"location":"getting_started/volumetric_augmentation/#combining-2d-and-3d-transforms","title":"Combining 2D and 3D Transforms","text":"

You can combine 2D and 3D transforms in the same pipeline. 2D transforms will be applied slice-by-slice in the XY plane:

Python
transform = A.Compose([\n    # 3D transforms\n    A.RandomCrop3D(size=(64, 128, 128)),\n    # 2D transforms (applied to each XY slice)\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n])\n\ntransformed = transform(volume=volume, mask3d=mask)\ntransformed_volume = transformed['volume']\ntransformed_mask = transformed['mask3d']\n
"},{"location":"getting_started/volumetric_augmentation/#best-practices","title":"Best Practices","text":"
  1. Memory Management: 3D volumes can be large. Consider using smaller crop sizes or processing in patches.
  2. Place cropping operations at the beginning of your pipeline for better performance
  3. Example: A 256x256x256 volume cropped to 64x64x64 will process subsequent transforms ~64x faster
"},{"location":"getting_started/volumetric_augmentation/#efficient-pipeline-cropping-first","title":"Efficient pipeline - cropping first","text":"Python
efficient_transform = A.Compose([\n    A.RandomCrop3D(size=(64, 64, 64)),  # Do this first!\n    A.CoarseDropout3D(...),\n    A.RandomBrightnessContrast(...),\n])\n
"},{"location":"getting_started/volumetric_augmentation/#less-efficient-pipeline-processing-full-volume-unnecessarily","title":"Less efficient pipeline - processing full volume unnecessarily","text":"Python
inefficient_transform = A.Compose([\n    A.CoarseDropout3D(...),             # Processing full volume\n    A.RandomBrightnessContrast(...),    # Processing full volume\n    A.RandomCrop3D(size=(64, 64, 64)),  # Cropping at the end\n])\n
  1. Avoid Interpolation Artifacts: For highest quality augmentation, prefer transforms that only rearrange existing voxels without interpolation:

a) Available Artifact-Free Transforms:
  • HorizontalFlip, VerticalFlip - Mirror images across X or Y axes
  • RandomRotate90 - Rotate by 90 degrees in XY plane
  • D4 - All possible combinations of flips and 90-degree rotations in XY plane (8 variants)
  • CubicSymmetry - 3D extension of D4, includes all 48 possible cube symmetries

These transforms maintain perfect image quality because they only move existing voxels to new positions without creating new values through interpolation.

b) Benefits of Artifact-Free Transforms:
  • Preserve original voxel values exactly
  • Maintain spatial relationships between tissues
  • No blurring or information loss
  • Faster computation (no interpolation needed)
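
A minimal sketch of a pipeline built only from artifact-free transforms (the volume shape is an illustrative assumption):

Python
import albumentations as A\nimport numpy as np\n\nartifact_free = A.Compose([\n    A.CubicSymmetry(p=0.5),   # 3D flips and 90-degree rotations, no interpolation\n    A.RandomRotate90(p=0.5),  # 90-degree rotations in the XY plane\n    A.HorizontalFlip(p=0.5),  # mirror each XY slice\n])\n\nvolume = np.random.rand(64, 128, 128)\nmask = np.zeros((64, 128, 128))\ntransformed = artifact_free(volume=volume, mask3d=mask)\n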
"},{"location":"getting_started/volumetric_augmentation/#example-pipeline","title":"Example Pipeline","text":"

Here's a complete example of a medical image augmentation pipeline:

Python
import albumentations as A\nimport numpy as np\n\ndef create_3d_pipeline(\n    crop_size=(64, 128, 128),\n    p_spatial=0.5,\n    p_intensity=0.3\n    ):\n    return A.Compose([\n        # Spatial transforms\n        A.RandomCrop3D(\n            size=crop_size,\n            p=1.0\n        ),\n        A.CubicSymmetry(p=p_spatial),\n        # Intensity transforms\n        A.CoarseDropout3D(\n            num_holes_range=(2, 5),\n            hole_depth_range=(0.1, 0.2),\n            hole_height_range=(0.1, 0.2),\n            hole_width_range=(0.1, 0.2),\n            p=p_intensity\n        ),\n    ])\n
"},{"location":"getting_started/volumetric_augmentation/#usage","title":"Usage","text":"Python
transform = create_3d_pipeline()\nvolume = np.random.rand(96, 256, 256)\nmask = np.zeros((96, 256, 256))\ntransformed = transform(volume=volume, mask3d=mask)\n
"},{"location":"getting_started/volumetric_augmentation/#next-steps","title":"Next Steps","text":"
  • Learn about Video Augmentation for sequential data
"},{"location":"integrations/","title":"Integrations","text":"

Here are some examples of how to use Albumentations with different deep learning frameworks and tools:

  • HuggingFace
  • FiftyOne
  • Roboflow
"},{"location":"integrations/fiftyone/","title":"FiftyOne integration","text":""},{"location":"integrations/fiftyone/#introduction","title":"Introduction","text":"

FiftyOne is an open-source visualization and analysis tool for machine learning datasets, particularly useful in computer vision projects. It facilitates detailed dataset examination and the fine-tuning of model performance.

Albumentations can be used in FiftyOne via the FiftyOne Albumentations Plugin.

With the FiftyOne Albumentations plugin, you can transform any and all labels of type Detections, Keypoints, Segmentation, and Heatmap, or just the images themselves.

Info

This tutorial is almost entirely based on the FiftyOne Documentation and serves as an overview of the functionality of the FiftyOne Albumentations plugin.

For the most up-to-date information, check the original source.

This integration guide will focus on the setup process and the functionality of the plugin.

For a tutorial on how to curate your augmentations, check out the Data Augmentation Tutorial in the FiftyOne Documentation.

"},{"location":"integrations/fiftyone/#overview","title":"Overview","text":"

Albumentations supports 80+ transforms, spanning pixel-level transformations, geometric transformations, and more.

As of April 29, 2024, FiftyOne supports:

  • AdvancedBlur
  • GridDropout
  • MaskDropout
  • PiecewiseAffine
  • RandomGravel
  • RandomGridShuffle
  • RandomShadow
  • RandomSunFlare
  • Rotate
"},{"location":"integrations/fiftyone/#functionality","title":"Functionality","text":"

The FiftyOne Albumentations plugin provides the following functionality:

  • Apply Albumentations transformations to your dataset, your current view, or selected samples
  • Visualize the effects of these transformations directly within the FiftyOne App
  • View samples generated by the last applied transformation
  • Save augmented samples to the dataset
  • Get info about the last applied transformation
  • Save transformation pipelines to the dataset for reproducibility
"},{"location":"integrations/fiftyone/#setup","title":"Setup","text":"

Make sure you have FiftyOne and Albumentations installed:

Bash
pip install -U fiftyone albumentations\n

Next, install the FiftyOne Albumentations plugin:

Bash
fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin\n

Note

If you have the FiftyOne Plugin Utils plugin installed, you can also install the Albumentations plugin via the install_plugin operator, selecting the Albumentations plugin from the community dropdown menu.

You will also need to load (and download if necessary) a dataset to apply the augmentations to. For this guide, we'll use the quickstart dataset:

Python
import fiftyone as fo\nimport fiftyone.zoo as foz\n\n## only take 5 samples for quick demonstration\ndataset = foz.load_zoo_dataset(\"quickstart\", max_samples=5)\n\n# only keep the ground truth labels\ndataset.select_fields(\"ground_truth\").keep_fields()\n\nsession = fo.launch_app(dataset)\n

Note

The quickstart dataset only contains Detections labels. If you want to test Albumentations transformations on other label types, here are some quick examples to get you started, using FiftyOne's Hugging Face Transformers and Ultralytics integrations: Bash

pip install -U transformers ultralytics\n
Python
import fiftyone as fo\nimport fiftyone.zoo as foz\n\nfrom ultralytics import YOLO\n\n# Keypoints\nmodel = YOLO(\"yolov8l-pose.pt\")\ndataset.apply_model(model, label_field=\"keypoints\")\n\n# Instance Segmentation\nmodel = YOLO(\"yolov8l-seg.pt\")\ndataset.apply_model(model, label_field=\"instances\")\n\n# Semantic Segmentation\nmodel = foz.load_zoo_model(\n    \"segmentation-transformer-torch\",\n    name_or_path=\"Intel/dpt-large-ade\",\n)\ndataset.apply_model(model, label_field=\"mask\")\n\n# Heatmap\nmodel = foz.load_zoo_model(\n    \"depth-estimation-transformer-torch\",\n    name_or_path=\"LiheYoung/depth-anything-small-hf\",\n)\ndataset.apply_model(model, label_field=\"depth_map\")\n

"},{"location":"integrations/fiftyone/#apply-transformations","title":"Apply transformations","text":"

To apply Albumentations transformations to your dataset, you can use the augment_with_albumentations operator. Press the backtick key to open the operator modal, and select the augment_with_albumentations operator from the dropdown menu.

You can then configure the transformations to apply:

  • Number of augmentations per sample: The number of augmented samples to generate for each input sample. The default is 1, which is sufficient for deterministic transformations, but for probabilistic transformations, you may want to generate multiple samples to see the range of possible outputs.
  • Number of transforms: The number of transformations to compose into the pipeline to be applied to each sample. The default is 1, but you can set this as high as you'd like \u2014 the more transformations, the more complex the augmentations will be. You will be able to configure each transform separately.
  • Target view: The view to which the transformations will be applied. The default is dataset, but you can also apply the transformations to the current view or to currently selected samples within the app.
  • Execution mode: If you set delegated=False, the operation will be executed immediately. If you set delegated=True, the operation will be queued as a job, which you can then run in the background from your terminal with:
Bash
fiftyone delegated launch\n

For each transformation, you can select either a \"primitive\" transformation from the Albumentations library, or a \"saved\" transformation pipeline that you have previously saved to the dataset. These saved pipelines can consist of one or more transformations.

When you apply a primitive transformation, you can configure the parameters of the transformation directly within the app. The available parameters, their default values, types, and docstrings are all integrated directly from the Albumentations library.

When you apply a saved pipeline, there will not be any parameters to configure.

"},{"location":"integrations/fiftyone/#visualize-transformations","title":"Visualize transformations","text":"

Once you've applied the transformations, you can visualize the effects of the transformations directly within the FiftyOne App. All augmented samples will be added to the dataset, and will be tagged as augmented so that you can easily filter for just augmented or non-augmented samples in the app.

You can also filter for augmented samples programmatically with the match_tags() method:

Python
# get just the augmented samples\naugmented_view = dataset.match_tags(\"augmented\")\n\n# get just the non-augmented samples\nnon_augmented_view = dataset.match_tags(\"augmented\", bool=False)\n

However, matching on these tags will return all samples that have been generated by an augmentation, not just the samples that were generated by the last applied transformation \u2014 as you will see shortly, we can save augmentations to the dataset. To get just the samples generated by the last applied transformation, you can use the view_last_albumentations_run operator:

Note

For all samples added to the dataset by the FiftyOne Albumentations plugin, there will be a field \"transform\", which contains the information not just about the pipeline that was applied, but also about the specific parameters that were used for this application of the pipeline. For example, if you had a HorizontalFlip transformation with an application probability of p=0.5, the contents of the \"transform\" field tell you whether or not this transformation was applied to the sample!
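As a quick illustration, here is a minimal sketch of how you could read that field programmatically (the exact structure of the stored value may differ between plugin versions):

Python
# Sketch: inspect the \"transform\" field on augmented samples\naugmented_view = dataset.match_tags(\"augmented\")\nfor sample in augmented_view:\n    print(sample[\"transform\"])  # pipeline info plus the parameters actually applied to this sample\n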

"},{"location":"integrations/fiftyone/#save-augmentations","title":"Save augmentations","text":"

By default, all augmentations are temporary, as the FiftyOne Albumentations plugin is primarily designed for rapid prototyping and experimentation. This means that when you generate a new batch of augmented samples, the previous batch of augmented samples will be removed from the dataset, and the image files will be deleted from disk.

However, if you want to save the augmented samples to the dataset, you can use the save_albumentations_augmentations operator, which will save the augmented samples to the dataset while keeping the augmented tag on the samples.

"},{"location":"integrations/fiftyone/#get-last-transformation-info","title":"Get last transformation info","text":"

When you apply a transformation pipeline to samples in your dataset using the FiftyOne Albumentations plugin, this information is captured and stored using FiftyOne's custom runs. This means that you can easily access the information about the last applied transformation.

In the FiftyOne App, you can use the get_last_albumentations_run_info operator to display a formatted summary of the relevant information:

Note

You can also access this information programmatically by getting info about the custom run that the information is stored in. For the Albumentations plugin, this info is stored via the key '_last_albumentations_run':

Python
last_run_info = dataset.get_run_info(\"_last_albumentations_run\")\nprint(last_run_info)\n
"},{"location":"integrations/fiftyone/#save-transformations","title":"Save transformations","text":"

If you are satisfied with the transformation pipeline you have created, you can save the entire composition of transformations to the dataset, hyperparameters and all. This means that after your rapid prototyping phase, you can easily move to a more reproducible workflow, and you can share your transformations or port them to other datasets.

To save a transformation pipeline, you can use the save_albumentations_transform operator:

After doing so, you will be able to view the information about this saved transformation pipeline using the get_albumentations_run_info operator:

Additionally, you will have access to this saved transformation pipeline under the \"saved\" tab for each transformation in the augment_with_albumentations operator modal.

"},{"location":"integrations/huggingface/","title":"HuggingFace","text":"
  • Image classification
  • Object Detection
"},{"location":"integrations/huggingface/image_classification_albumentations/","title":"Fine-tuning for Image Classification with \ud83e\udd17 Transformers","text":"

This notebook shows how to fine-tune any pretrained Vision model for Image Classification on a custom dataset. The idea is to add a randomly initialized classification head on top of a pre-trained encoder, and fine-tune the model altogether on a labeled dataset.

"},{"location":"integrations/huggingface/image_classification_albumentations/#imagefolder-feature","title":"ImageFolder feature","text":"

This notebook leverages the ImageFolder feature to easily run the notebook on a custom dataset (namely, EuroSAT in this tutorial). You can either load a Dataset from local folders or from local/remote files, like zip or tar.

"},{"location":"integrations/huggingface/image_classification_albumentations/#any-model","title":"Any model","text":"

This notebook is built to run on any image classification dataset with any vision model checkpoint from the Model Hub, as long as that model has a version with an Image Classification head, such as:

  • ViT
  • Swin Transformer
  • ConvNeXT

  • in short, any model supported by AutoModelForImageClassification.
"},{"location":"integrations/huggingface/image_classification_albumentations/#albumentations","title":"Albumentations","text":"

In this notebook, we are going to leverage the Albumentations library for data augmentation. Note that we have other versions of this notebook available as well with other libraries including:

  • Torchvision's Transforms
  • Kornia
  • imgaug.

Depending on the model and the GPU you are using, you might need to adjust the batch size to avoid out-of-memory errors. Set those two parameters, then the rest of the notebook should run smoothly.

In this notebook, we'll fine-tune from the https://huggingface.co/facebook/convnext-tiny-224 checkpoint, but note that there are many, many more available on the hub.

Python
model_checkpoint = \"facebook/convnext-tiny-224\" # pre-trained model from which to fine-tune\nbatch_size = 32 # batch size for training and evaluation\n

Before we start, let's install the datasets, transformers and albumentations libraries.

Python
!pip install -q datasets transformers\n
Python
!pip install -q albumentations\n

If you're opening this notebook locally, make sure your environment has the latest versions of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First you have to store your authentication token from the Hugging Face website (sign up here if you haven't already!) then execute the following cell and input your token:

Python
from huggingface_hub import notebook_login\n\nnotebook_login()\n
Login successful\nYour token has been saved to /root/.huggingface/token\n\u001b[1m\u001b[31mAuthenticated through git-credential store but this isn't the helper defined on your machine.\nYou might have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default\n\ngit config --global credential.helper store\u001b[0m\n

Then you need to install Git-LFS to upload your model checkpoints:

Python
%%capture\n!sudo apt -qq install git-lfs\n!git config --global credential.helper store\n

We also quickly upload some telemetry - this tells us which examples and software versions are getting used so we know where to prioritize our maintenance efforts. We don't collect (or care about) any personally identifiable information, but if you'd prefer not to be counted, feel free to skip this step or delete this cell entirely.

Python
from transformers.utils import send_example_telemetry\n\nsend_example_telemetry(\"image_classification_albumentations_notebook\", framework=\"pytorch\")\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#fine-tuning-a-model-on-an-image-classification-task","title":"Fine-tuning a model on an image classification task","text":"

In this notebook, we will see how to fine-tune one of the \ud83e\udd17 Transformers vision models on an Image Classification dataset.

Given an image, the goal is to predict an appropriate class for it, like \"tiger\". The screenshot below is taken from a ViT fine-tuned on ImageNet-1k - try out the inference widget!

"},{"location":"integrations/huggingface/image_classification_albumentations/#loading-the-dataset","title":"Loading the dataset","text":"

We will use the \ud83e\udd17 Datasets library's ImageFolder feature to download our custom dataset into a DatasetDict.

In this case, the EuroSAT dataset is hosted remotely, so we provide the data_files argument. Alternatively, if you have local folders with images, you can load them using the data_dir argument.

Python
from datasets import load_dataset \n\n# load a custom dataset from local/remote files using the ImageFolder feature\n\n# option 1: local/remote files (supporting the following formats: tar, gzip, zip, xz, rar, zstd)\ndataset = load_dataset(\"imagefolder\", data_files=\"https://madm.dfki.de/files/sentinel/EuroSAT.zip\")\n\n# note that you can also provide several splits:\n# dataset = load_dataset(\"imagefolder\", data_files={\"train\": [\"path/to/file1\", \"path/to/file2\"], \"test\": [\"path/to/file3\", \"path/to/file4\"]})\n\n# note that you can push your dataset to the hub very easily (and reload afterwards using load_dataset)!\n# dataset.push_to_hub(\"nielsr/eurosat\")\n# dataset.push_to_hub(\"nielsr/eurosat\", private=True)\n\n# option 2: local folder\n# dataset = load_dataset(\"imagefolder\", data_dir=\"path_to_folder\")\n\n# option 3: just load any existing dataset from the hub ...\n# dataset = load_dataset(\"cifar10\")\n
Using custom data configuration default-0537267e6f812d56\n\n\nDownloading and preparing dataset image_folder/default to /root/.cache/huggingface/datasets/image_folder/default-0537267e6f812d56/0.0.0/ee92df8e96c6907f3c851a987be3fd03d4b93b247e727b69a8e23ac94392a091...\n\n\n\nDownloading data files: 0it [00:00, ?it/s]\n\n\n\nDownloading data files:   0%|          | 0/1 [00:00<?, ?it/s]\n\n\n\nDownloading data:   0%|          | 0.00/94.3M [00:00<?, ?B/s]\n\n\n\nExtracting data files:   0%|          | 0/1 [00:00<?, ?it/s]\n\n\n\nGenerating train split: 0 examples [00:00, ? examples/s]\n\n\nDataset image_folder downloaded and prepared to /root/.cache/huggingface/datasets/image_folder/default-0537267e6f812d56/0.0.0/ee92df8e96c6907f3c851a987be3fd03d4b93b247e727b69a8e23ac94392a091. Subsequent calls will reuse this data.\n\n\n\n  0%|          | 0/1 [00:00<?, ?it/s]\n

Let us also load the Accuracy metric, which we'll use to evaluate our model both during and after training.

Python
from datasets import load_metric\n\nmetric = load_metric(\"accuracy\")\n
Downloading builder script:   0%|          | 0.00/1.41k [00:00<?, ?B/s]\n

The dataset object itself is a DatasetDict, which contains one key per split (in this case, only \"train\" for a training split).

Python
dataset\n
DatasetDict({\n    train: Dataset({\n        features: ['image', 'label'],\n        num_rows: 27000\n    })\n})\n

To access an actual element, you need to select a split first, then give an index:

Python
example = dataset[\"train\"][10]\nexample\n
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=64x64 at 0x7FD62DA6B2D0>,\n 'label': 2}\n

Each example consists of an image and a corresponding label. We can also verify this by checking the features of the dataset:

Python
dataset[\"train\"].features\n
{'image': Image(decode=True, id=None),\n 'label': ClassLabel(num_classes=10, names=['AnnualCrop', 'Forest', 'HerbaceousVegetation', 'Highway', 'Industrial', 'Pasture', 'PermanentCrop', 'Residential', 'River', 'SeaLake'], id=None)}\n

The cool thing is that we can directly view the image (as the 'image' field is an Image feature), as follows:

Python
example['image']\n

Let's make it a little bigger as the images in the EuroSAT dataset are of low resolution (64x64 pixels):

Python
example['image'].resize((200, 200))\n

Let's check the corresponding label:

Python
example['label']\n
2\n

As you can see, the label field is not an actual string label. By default the ClassLabel fields are encoded into integers for convenience:

Python
dataset[\"train\"].features[\"label\"]\n
ClassLabel(num_classes=10, names=['AnnualCrop', 'Forest', 'HerbaceousVegetation', 'Highway', 'Industrial', 'Pasture', 'PermanentCrop', 'Residential', 'River', 'SeaLake'], id=None)\n

Let's create an id2label dictionary to decode them back to strings and see what they are. The inverse label2id will be useful too, when we load the model later.

Python
labels = dataset[\"train\"].features[\"label\"].names\nlabel2id, id2label = dict(), dict()\nfor i, label in enumerate(labels):\n    label2id[label] = i\n    id2label[i] = label\n\nid2label[2]\n
'HerbaceousVegetation'\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#preprocessing-the-data","title":"Preprocessing the data","text":"

Before we can feed these images to our model, we need to preprocess them.

Preprocessing images typically comes down to (1) resizing them to a particular size and (2) normalizing the color channels (R, G, B) using a mean and standard deviation. These are referred to as image transformations.
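For intuition, the normalization step is simple per-channel arithmetic. A minimal numpy sketch (the mean and std values below are the common ImageNet statistics, used here purely for illustration):

Python
import numpy as np\n\nimage = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)  # H x W x C, RGB\nmean = np.array([0.485, 0.456, 0.406])\nstd = np.array([0.229, 0.224, 0.225])\n\n# scale pixel values to [0, 1], then shift and scale each channel\nnormalized = (image / 255.0 - mean) / std\n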

In addition, one typically performs what is called data augmentation during training (like random cropping and flipping) to make the model more robust and achieve higher accuracy. Data augmentation is also a great technique to increase the size of the training data.

We will use Albumentations for the image transformations/data augmentation in this tutorial, but note that one can use any other package (like torchvision's transforms, imgaug, Kornia, etc.).

To make sure we (1) resize to the appropriate size and (2) use the appropriate image mean and standard deviation for the model architecture we are going to use, we instantiate what is called an image processor with the AutoImageProcessor.from_pretrained method.

This image processor is a minimal preprocessor that can be used to prepare images for inference.

Python
from transformers import AutoImageProcessor\n\nimage_processor = AutoImageProcessor.from_pretrained(model_checkpoint)\nimage_processor\n
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.\n\n\n\n\n\nConvNextImageProcessor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_rescale\": true,\n  \"do_resize\": true,\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_processor_type\": \"ConvNextImageProcessor\",\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"rescale_factor\": 0.00392156862745098,\n  \"size\": {\n    \"shortest_edge\": 224\n  }\n}\n

The Datasets library is designed to make processing data easy. We can write custom functions, which can then be applied to an entire dataset (either using .map() or .set_transform()).

Here we define 2 separate functions, one for training (which includes data augmentation) and one for validation (which only includes resizing and normalizing).

Python
import cv2\nimport albumentations as A\nimport numpy as np\n\nif \"height\" in image_processor.size:\n    size = (image_processor.size[\"height\"], image_processor.size[\"width\"])\n    crop_size = size\n    max_size = None\nelif \"shortest_edge\" in image_processor.size:\n    size = image_processor.size[\"shortest_edge\"]\n    crop_size = (size, size)\n    max_size = image_processor.size.get(\"longest_edge\")\n\ntrain_transforms = A.Compose([\n    A.Resize(height=size, width=size),\n    A.RandomRotate90(),\n    A.HorizontalFlip(p=0.5),\n    A.RandomBrightnessContrast(p=0.2),\n    A.Normalize(),\n])\n\nval_transforms = A.Compose([\n    A.Resize(height=size, width=size),\n    A.Normalize(),\n])\n\ndef preprocess_train(examples):\n    examples[\"pixel_values\"] = [\n        train_transforms(image=np.array(image))[\"image\"] for image in examples[\"image\"]\n    ]\n\n    return examples\n\ndef preprocess_val(examples):\n    examples[\"pixel_values\"] = [\n        val_transforms(image=np.array(image))[\"image\"] for image in examples[\"image\"]\n    ]\n\n    return examples\n

Next, we can preprocess our dataset by applying these functions. We will use the set_transform functionality, which allows us to apply the functions above on-the-fly (meaning that they will only be applied when the images are loaded into RAM).

Python
# split up training into training + validation\nsplits = dataset[\"train\"].train_test_split(test_size=0.1)\ntrain_ds = splits['train']\nval_ds = splits['test']\n
Python
train_ds.set_transform(preprocess_train)\nval_ds.set_transform(preprocess_val)\n
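If you would rather precompute the preprocessing once instead of applying it on-the-fly, the same functions can also be applied with .map(), as mentioned above. A minimal sketch, applied to a split that has not had set_transform called on it; note that this caches the results to disk and freezes the augmentations, so set_transform is usually preferable for training:

Python
# Hedged sketch of the .map() alternative: precompute \"pixel_values\" once and cache them\nprecomputed_train = dataset[\"train\"].map(preprocess_train, batched=True)\n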

Let's check the first example:

Python
train_ds[0]\n
{'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=64x64 at 0x7FD610178490>,\n 'label': 5,\n 'pixel_values': array([[[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052]],\n\n        [[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052],\n         [-1.34729   , -0.897759  , -0.37525052]],\n\n        [[-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         [-1.415789  , -0.53011197, -0.37525052],\n         ...,\n         [-1.3986642 , -0.93277305, -0.4101089 ],\n         [-1.3986642 , -0.93277305, -0.4101089 ],\n         [-1.3986642 , -0.93277305, -0.4101089 ]],\n\n        ...,\n\n        [[-1.5014129 , -0.582633  , -0.35782132],\n         [-1.5014129 , -0.582633  , -0.35782132],\n         [-1.5014129 , -0.582633  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]],\n\n        [[-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]],\n\n        [[-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.3403921 ],\n         [-1.5356623 , -0.565126  , -0.35782132],\n         ...,\n         [-1.4842881 , -0.98529404, -0.5146841 ],\n         [-1.4671633 , -1.0028011 , -0.49725488],\n         [-1.4671633 , -1.0028011 , -0.49725488]]], dtype=float32)}\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#training-the-model","title":"Training the model","text":"

Now that our data is ready, we can download the pretrained model and fine-tune it. For classification we use the AutoModelForImageClassification class. Like with the image processor, the from_pretrained method will download and cache the model for us. As the label ids and the number of labels are dataset dependent, we pass label2id and id2label alongside the model_checkpoint here.

NOTE: in case you're planning to fine-tune an already fine-tuned checkpoint, like facebook/convnext-tiny-224 (which has already been fine-tuned on ImageNet-1k), then you need to provide the additional argument ignore_mismatched_sizes=True to the from_pretrained method. This will make sure the output head is thrown away and replaced by a new, randomly initialized classification head that includes a custom number of output neurons.

Python
from transformers import AutoModelForImageClassification, TrainingArguments, Trainer\n\nnum_labels = len(id2label)\nmodel = AutoModelForImageClassification.from_pretrained(\n    model_checkpoint, \n    label2id=label2id,\n    id2label=id2label,\n    ignore_mismatched_sizes = True, # provide this in case you'd like to fine-tune an already fine-tuned checkpoint\n)\n
Downloading:   0%|          | 0.00/68.0k [00:00<?, ?B/s]\n\n\n\nDownloading:   0%|          | 0.00/109M [00:00<?, ?B/s]\n\n\nSome weights of ConvNextForImageClassification were not initialized from the model checkpoint at facebook/convnext-tiny-224 and are newly initialized because the shapes did not match:\n- classifier.weight: found shape torch.Size([1000, 768]) in the checkpoint and torch.Size([10, 768]) in the model instantiated\n- classifier.bias: found shape torch.Size([1000]) in the checkpoint and torch.Size([10]) in the model instantiated\nYou should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n

The warning is telling us we are throwing away some weights (the weights and bias of the pooler layer) and randomly initializing some other (the weights and bias of the classifier layer). This is expected in this case, because we are adding a new head for which we don't have pretrained weights, so the library warns us we should fine-tune this model before using it for inference, which is exactly what we are going to do.

To instantiate a Trainer, we will need to define the training configuration and the evaluation metric. The most important is the TrainingArguments, which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model.

Most of the training arguments are pretty self-explanatory, but one that is quite important here is remove_unused_columns=False. When this is left at its default of True, the Trainer drops any dataset columns not used by the model's call function, which usually makes it easier to unpack inputs into the model. But in our case, we need the unused 'image' column in order to create 'pixel_values'.

Python
model_name = model_checkpoint.split(\"/\")[-1]\n\nargs = TrainingArguments(\n    f\"{model_name}-finetuned-eurosat-albumentations\",\n    remove_unused_columns=False,\n    evaluation_strategy = \"epoch\",\n    save_strategy = \"epoch\",\n    learning_rate=5e-5,\n    per_device_train_batch_size=batch_size,\n    gradient_accumulation_steps=4,\n    per_device_eval_batch_size=batch_size,\n    num_train_epochs=3,\n    warmup_ratio=0.1,\n    logging_steps=10,\n    load_best_model_at_end=True,\n    metric_for_best_model=\"accuracy\",\n    push_to_hub=True,\n)\n

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the batch_size defined at the top of the notebook, and customize the number of epochs for training, as well as the warmup ratio. Since the best model might not be the one at the end of training, we ask the Trainer to load the best model it saved (according to metric_for_best_model) at the end of training.

The last argument push_to_hub allows the Trainer to push the model to the Hub regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally with a name that is different from the name of the repository, or if you want to push your model under an organization and not your name space, use the hub_model_id argument to set the repo name (it needs to be the full name, including your namespace: for instance \"nielsr/vit-finetuned-cifar10\" or \"huggingface/nielsr/vit-finetuned-cifar10\").
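For example, a minimal sketch of how hub_model_id fits into the arguments above (the repository name below is hypothetical, and only the relevant arguments are shown):

Python
# Hypothetical full repo name including an organization namespace; other arguments as above\nargs = TrainingArguments(\n    f\"{model_name}-finetuned-eurosat-albumentations\",\n    hub_model_id=\"my-org/convnext-tiny-finetuned-eurosat\",  # hypothetical\n    push_to_hub=True,\n    remove_unused_columns=False,\n)\n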

Next, we need to define a function for how to compute the metrics from the predictions, which will just use the metric we loaded earlier. The only preprocessing we have to do is to take the argmax of our predicted logits:

Python
import numpy as np\n\n# the compute_metrics function takes a Named Tuple as input:\n# predictions, which are the logits of the model as Numpy arrays,\n# and label_ids, which are the ground-truth labels as Numpy arrays.\ndef compute_metrics(eval_pred):\n    \"\"\"Computes accuracy on a batch of predictions\"\"\"\n    predictions = np.argmax(eval_pred.predictions, axis=1)\n    return metric.compute(predictions=predictions, references=eval_pred.label_ids)\n

We also define a collate_fn, which will be used to batch examples together. Each batch consists of 2 keys, namely pixel_values and labels.

Python
import torch\n\ndef collate_fn(examples):\n    images = []\n    labels = []\n    for example in examples:\n        image = np.moveaxis(example[\"pixel_values\"], source=2, destination=0)\n        images.append(torch.from_numpy(image))\n        labels.append(example[\"label\"])\n\n    pixel_values = torch.stack(images)\n    labels = torch.tensor(labels)\n    return {\"pixel_values\": pixel_values, \"labels\": labels}\n

Then we just need to pass all of this along with our datasets to the Trainer:

Python
trainer = Trainer(\n    model,\n    args,\n    train_dataset=train_ds,\n    eval_dataset=val_ds,\n    tokenizer=image_processor,\n    compute_metrics=compute_metrics,\n    data_collator=collate_fn,\n)\n
/content/convnext-tiny-224-finetuned-eurosat-albumentations is already a clone of https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations. Make sure you pull the latest changes with `repo.git_pull()`.\n

You might wonder why we pass along the image_processor as a tokenizer when we already preprocessed our data. This is only to make sure the image processor configuration file (stored as JSON) will also be uploaded to the repo on the hub.

Now we can fine-tune our model by calling the train method:

Python
trainer.train()\n
/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:309: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning\n  FutureWarning,\n***** Running training *****\n  Num examples = 24300\n  Num Epochs = 3\n  Instantaneous batch size per device = 32\n  Total train batch size (w. parallel, distributed & accumulation) = 128\n  Gradient Accumulation steps = 4\n  Total optimization steps = 570\n\n[570/570 15:59, Epoch 3/3]\n
Epoch | Training Loss | Validation Loss | Accuracy
----- | ------------- | --------------- | --------
1     | 0.141000      | 0.149633        | 0.954444
2     | 0.073600      | 0.095782        | 0.971852
3     | 0.056800      | 0.072716        | 0.974815

***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-190/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-380/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\nSaving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570/preprocessor_config.json\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n\n\nTraining completed. Do not forget to share your model on huggingface.co/models =)\n\n\nLoading best model from convnext-tiny-224-finetuned-eurosat-albumentations/checkpoint-570 (score: 0.9748148148148148).\n\n\n\n\n\nTrainOutput(global_step=570, training_loss=0.34729809766275843, metrics={'train_runtime': 961.6293, 'train_samples_per_second': 75.809, 'train_steps_per_second': 0.593, 'total_flos': 1.8322098956292096e+18, 'train_loss': 0.34729809766275843, 'epoch': 3.0})\n

We can check with the evaluate method that our Trainer did reload the best model properly (if it was not the last one):

Python
metrics = trainer.evaluate()\nprint(metrics)\n
***** Running Evaluation *****\n  Num examples = 2700\n  Batch size = 32\n
[85/85 00:12]
{'eval_loss': 0.0727163776755333, 'eval_accuracy': 0.9748148148148148, 'eval_runtime': 13.0419, 'eval_samples_per_second': 207.026, 'eval_steps_per_second': 6.517, 'epoch': 3.0}\n

You can now upload the result of the training to the Hub, just execute this instruction (note that the Trainer will automatically create a model card for you, as well as adding Tensorboard metrics - see the \"Training metrics\" tab!):

Python
trainer.push_to_hub()\n
Saving model checkpoint to convnext-tiny-224-finetuned-eurosat-albumentations\nConfiguration saved in convnext-tiny-224-finetuned-eurosat-albumentations/config.json\nModel weights saved in convnext-tiny-224-finetuned-eurosat-albumentations/pytorch_model.bin\nFeature extractor saved in convnext-tiny-224-finetuned-eurosat-albumentations/preprocessor_config.json\n\n\n\nUpload file runs/Apr12_12-03-24_1ad162e1ead9/events.out.tfevents.1649765159.1ad162e1ead9.73.4:  24%|##4       \u2026\n\n\n\nUpload file runs/Apr12_12-03-24_1ad162e1ead9/events.out.tfevents.1649767032.1ad162e1ead9.73.6: 100%|##########\u2026\n\n\nTo https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\n   c500b3f..2143b42  main -> main\n\nTo https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\n   2143b42..71339cf  main -> main\n\n\n\n\n\n\n'https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/commit/2143b423b5cacdde6daebd3ee2b5971ecab463f6'\n

You can now share this model with all your friends, family, favorite pets: they can all load it with the identifier \"your-username/the-name-you-picked\" so for instance:

Python
from transformers import AutoModelForImageClassification, AutoImageProcessor\n\nimage_processor = AutoImageProcessor.from_pretrained(\"nielsr/my-awesome-model\")\nmodel = AutoModelForImageClassification.from_pretrained(\"nielsr/my-awesome-model\")\n
"},{"location":"integrations/huggingface/image_classification_albumentations/#inference","title":"Inference","text":"

Let's say you have a new image, on which you'd like to make a prediction. Let's load a satellite image of a highway (that's not part of the EuroSAT dataset), and see how the model does.

Python
from PIL import Image\nimport requests\n\nurl = 'https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/highway.jpg'\nimage = Image.open(requests.get(url, stream=True).raw)\nimage\n

We'll load the image processor and model from the hub (here, we use the Auto Classes, which will make sure the appropriate classes will be loaded automatically based on the config.json and preprocessor_config.json files of the repo on the hub):

Python
from transformers import AutoModelForImageClassification, AutoImageProcessor\n\nrepo_name = \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\"\n\nimage_processor = AutoImageProcessor.from_pretrained(repo_name)\nmodel = AutoModelForImageClassification.from_pretrained(repo_name)\n
https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmp04g0zg5n\n\n\n\nDownloading:   0%|          | 0.00/266 [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json in cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\ncreating metadata file for /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nloading feature extractor configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json from cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nFeature extractor ConvNextFeatureExtractor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_resize\": true,\n  \"feature_extractor_type\": \"ConvNextFeatureExtractor\",\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"size\": 224\n}\n\nhttps://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmpbf9y4q39\n\n\n\nDownloading:   0%|          | 0.00/1.03k [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json in cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\ncreating metadata file for /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nloading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    
\"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nhttps://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin not found in cache or force_download set to True, downloading to /root/.cache/huggingface/transformers/tmpzr_9yxjo\n\n\n\nDownloading:   0%|          | 0.00/106M [00:00<?, ?B/s]\n\n\nstoring https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin in cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\ncreating metadata file for /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nloading weights file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nAll model checkpoint weights were used when initializing ConvNextForImageClassification.\n\nAll the weights of ConvNextForImageClassification were initialized from the model checkpoint at nielsr/convnext-tiny-224-finetuned-eurosat-albumentations.\nIf your task is similar to the task the model of the checkpoint was trained on, you can already use ConvNextForImageClassification for predictions without further training.\n
Python
# prepare image for the model\nencoding = image_processor(image.convert(\"RGB\"), return_tensors=\"pt\")\nprint(encoding.pixel_values.shape)\n
torch.Size([1, 3, 224, 224])\n
Python
import torch\n\n# forward pass\nwith torch.no_grad():\n    outputs = model(**encoding)\n    logits = outputs.logits\n
Python
predicted_class_idx = logits.argmax(-1).item()\nprint(\"Predicted class:\", model.config.id2label[predicted_class_idx])\n
Predicted class: Highway\n

Looks like our model got it correct!

"},{"location":"integrations/huggingface/image_classification_albumentations/#pipeline-api","title":"Pipeline API","text":"

An alternative way to quickly perform inference with any model on the Hub is by leveraging the Pipeline API, which abstracts away all the steps we did manually above: it performs the preprocessing, forward pass, and postprocessing in a single object.

Let's showcase this for our trained model:

Python
from transformers import pipeline\n\npipe = pipeline(\"image-classification\", \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\")\n
loading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    \"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nloading configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/config.json from cache at /root/.cache/huggingface/transformers/25088566ab29cf0ff360b05880b5f20cdc0c79ab995056a1fb4f98212d021154.4637c3f271a8dfbcfe5c4ee777270112d841a5af95814f0fd086c3c2761e7370\nModel config ConvNextConfig {\n  \"_name_or_path\": \"nielsr/convnext-tiny-224-finetuned-eurosat-albumentations\",\n  \"architectures\": [\n    \"ConvNextForImageClassification\"\n  ],\n  \"depths\": [\n    3,\n    3,\n    9,\n    3\n  ],\n  \"drop_path_rate\": 0.0,\n  \"hidden_act\": \"gelu\",\n  \"hidden_sizes\": [\n    96,\n    192,\n    384,\n    768\n  ],\n  \"id2label\": {\n    \"0\": \"AnnualCrop\",\n    \"1\": \"Forest\",\n    \"2\": \"HerbaceousVegetation\",\n    \"3\": \"Highway\",\n    \"4\": \"Industrial\",\n    \"5\": \"Pasture\",\n    \"6\": \"PermanentCrop\",\n    \"7\": \"Residential\",\n    \"8\": \"River\",\n    \"9\": \"SeaLake\"\n  },\n  \"image_size\": 224,\n  \"initializer_range\": 0.02,\n  \"label2id\": {\n    \"AnnualCrop\": 0,\n    \"Forest\": 1,\n    \"HerbaceousVegetation\": 2,\n    \"Highway\": 3,\n    \"Industrial\": 4,\n    \"Pasture\": 5,\n    \"PermanentCrop\": 6,\n    \"Residential\": 7,\n    \"River\": 8,\n    \"SeaLake\": 9\n  },\n  \"layer_norm_eps\": 1e-12,\n  \"layer_scale_init_value\": 1e-06,\n  \"model_type\": \"convnext\",\n  \"num_channels\": 3,\n  \"num_stages\": 4,\n  \"patch_size\": 4,\n  \"problem_type\": \"single_label_classification\",\n  \"torch_dtype\": \"float32\",\n  \"transformers_version\": \"4.18.0\"\n}\n\nloading weights file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/pytorch_model.bin from cache at /root/.cache/huggingface/transformers/3f4bcce35d3279d19b07fb762859d89bce636d8f0235685031ef6494800b9769.d611c768c0b0939188b05c3d505f0b36c97aa57649d4637e3384992d3c5c0b89\nAll model checkpoint weights were used when 
initializing ConvNextForImageClassification.\n\nAll the weights of ConvNextForImageClassification were initialized from the model checkpoint at nielsr/convnext-tiny-224-finetuned-eurosat-albumentations.\nIf your task is similar to the task the model of the checkpoint was trained on, you can already use ConvNextForImageClassification for predictions without further training.\nloading feature extractor configuration file https://huggingface.co/nielsr/convnext-tiny-224-finetuned-eurosat-albumentations/resolve/main/preprocessor_config.json from cache at /root/.cache/huggingface/transformers/38b41a2c904b6ce5bb10403bf902ee4263144d862c5a602c83cd120c0c1ba0e6.37be7274d6b5860aee104bb1fbaeb0722fec3850a85bb2557ae9491f17f89433\nFeature extractor ConvNextFeatureExtractor {\n  \"crop_pct\": 0.875,\n  \"do_normalize\": true,\n  \"do_resize\": true,\n  \"feature_extractor_type\": \"ConvNextFeatureExtractor\",\n  \"image_mean\": [\n    0.485,\n    0.456,\n    0.406\n  ],\n  \"image_std\": [\n    0.229,\n    0.224,\n    0.225\n  ],\n  \"resample\": 3,\n  \"size\": 224\n}\n
Python
pipe(image)\n
[{'label': 'Highway', 'score': 0.5163754224777222},\n {'label': 'River', 'score': 0.11824000626802444},\n {'label': 'AnnualCrop', 'score': 0.05467210337519646},\n {'label': 'PermanentCrop', 'score': 0.05066365748643875},\n {'label': 'Industrial', 'score': 0.049283623695373535}]\n

As we can see, the pipeline does not only return the class label with the highest probability, but the top 5 labels with their corresponding scores. Note that pipelines also work with a local model and image_processor:

Python
pipe = pipeline(\"image-classification\", \n                model=model,\n                feature_extractor=image_processor)\n
Python
pipe(image)\n
[{'label': 'Highway', 'score': 0.5163754224777222},\n {'label': 'River', 'score': 0.11824000626802444},\n {'label': 'AnnualCrop', 'score': 0.05467210337519646},\n {'label': 'PermanentCrop', 'score': 0.05066365748643875},\n {'label': 'Industrial', 'score': 0.049283623695373535}]\n
Python
\n
"},{"location":"integrations/huggingface/object_detection/","title":"Object Detection","text":""},{"location":"integrations/huggingface/object_detection/#object-detection","title":"Object detection","text":"

Object detection is the computer vision task of detecting instances (such as humans, buildings, or cars) in an image. Object detection models receive an image as input and output coordinates of the bounding boxes and associated labels of the detected objects. An image can contain multiple objects, each with its own bounding box and a label (e.g. it can have a car and a building), and each object can be present in different parts of an image (e.g. the image can have several cars). This task is commonly used in autonomous driving for detecting things like pedestrians, road signs, and traffic lights. Other applications include counting objects in images, image search, and more.

In this guide, you will learn how to:

  1. Finetune DETR, a model that combines a convolutional backbone with an encoder-decoder Transformer, on the CPPE-5 dataset.
  2. Use your finetuned model for inference.

To see all architectures and checkpoints compatible with this task, we recommend checking the task page.

Before you begin, make sure you have all the necessary libraries installed:

Bash
pip install -q datasets transformers accelerate timm\npip install -q -U \"albumentations>=1.4.5\" torchmetrics pycocotools\n

You'll use \ud83e\udd17 Datasets to load a dataset from the Hugging Face Hub, \ud83e\udd17 Transformers to train your model, and albumentations to augment the data.

We encourage you to share your model with the community. Log in to your Hugging Face account to upload it to the Hub. When prompted, enter your token to log in:

Python
>>> from huggingface_hub import notebook_login\n\n>>> notebook_login()\n

To get started, we'll define global constants, namely the model name and image size. For this tutorial, we'll use the conditional DETR model due to its faster convergence. Feel free to select any object detection model available in the transformers library.

Python
>>> MODEL_NAME = \"microsoft/conditional-detr-resnet-50\"  # or \"facebook/detr-resnet-50\"\n>>> IMAGE_SIZE = 480\n
"},{"location":"integrations/huggingface/object_detection/#load-the-cppe-5-dataset","title":"Load the CPPE-5 dataset","text":"

The CPPE-5 dataset contains images with annotations identifying medical personal protective equipment (PPE) in the context of the COVID-19 pandemic.

Start by loading the dataset and creating a validation split from train:

Python
>>> from datasets import load_dataset\n\n>>> cppe5 = load_dataset(\"cppe-5\")\n\n>>> if \"validation\" not in cppe5:\n...     split = cppe5[\"train\"].train_test_split(0.15, seed=1337)\n...     cppe5[\"train\"] = split[\"train\"]\n...     cppe5[\"validation\"] = split[\"test\"]\n\n>>> cppe5\nDatasetDict({\n    train: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 850\n    })\n    test: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 29\n    })\n    validation: Dataset({\n        features: ['image_id', 'image', 'width', 'height', 'objects'],\n        num_rows: 150\n    })\n})\n

You'll see that this dataset has 1,000 images split between the train (850 images) and validation (150 images) sets, and a test set with 29 images.

To get familiar with the data, explore what the examples look like.

Python
>>> cppe5[\"train\"][0]\n{\n  'image_id': 366,\n  'image': <PIL.PngImagePlugin.PngImageFile image mode=RGBA size=500x290>,\n  'width': 500,\n  'height': 500,\n  'objects': {\n    'id': [1932, 1933, 1934],\n    'area': [27063, 34200, 32431],\n    'bbox': [[29.0, 11.0, 97.0, 279.0],\n      [201.0, 1.0, 120.0, 285.0],\n      [382.0, 0.0, 113.0, 287.0]],\n    'category': [0, 0, 0]\n  }\n}\n

The examples in the dataset have the following fields:

  • image_id: the example image id
  • image: a PIL.Image.Image object containing the image
  • width: width of the image
  • height: height of the image
  • objects: a dictionary containing bounding box metadata for the objects in the image:
    • id: the annotation id
    • area: the area of the bounding box
    • bbox: the object's bounding box (in the COCO format)
    • category: the object's category, with possible values including Coverall (0), Face_Shield (1), Gloves (2), Goggles (3) and Mask (4)

You may notice that the bbox field follows the COCO format, which is the format that the DETR model expects. However, the grouping of the fields inside objects differs from the annotation format DETR requires. You will need to apply some preprocessing transformations before using this data for training.

To get an even better understanding of the data, visualize an example in the dataset.

Python
>>> import numpy as np\n>>> import os\n>>> from PIL import Image, ImageDraw\n\n>>> image = cppe5[\"train\"][2][\"image\"]\n>>> annotations = cppe5[\"train\"][2][\"objects\"]\n>>> width, height = image.size\n>>> draw = ImageDraw.Draw(image)\n\n>>> categories = cppe5[\"train\"].features[\"objects\"].feature[\"category\"].names\n\n>>> id2label = {index: x for index, x in enumerate(categories, start=0)}\n>>> label2id = {v: k for k, v in id2label.items()}\n\n>>> for i in range(len(annotations[\"id\"])):\n...     box = annotations[\"bbox\"][i]\n...     class_idx = annotations[\"category\"][i]\n...     x, y, w, h = tuple(box)\n...     # Check if coordinates are normalized or not\n...     if max(box) > 1.0:\n...         # Coordinates are un-normalized, no need to re-scale them\n...         x1, y1 = int(x), int(y)\n...         x2, y2 = int(x + w), int(y + h)\n...     else:\n...         # Coordinates are normalized, re-scale them to absolute pixel values\n...         x1 = int(x * width)\n...         y1 = int(y * height)\n...         x2 = int((x + w) * width)\n...         y2 = int((y + h) * height)\n...     # Draw using the corner coordinates computed above\n...     draw.rectangle((x1, y1, x2, y2), outline=\"red\", width=1)\n...     draw.text((x1, y1), id2label[class_idx], fill=\"white\")\n\n>>> image\n

To visualize the bounding boxes with associated labels, you can get the labels from the dataset's metadata, specifically the category field. You'll also want to create dictionaries that map a label id to a label class (id2label) and the other way around (label2id). You can use them later when setting up the model. Including these maps will make your model reusable by others if you share it on the Hugging Face Hub. Please note that the part of the above code that draws the bounding boxes assumes that they are in COCO format (x_min, y_min, width, height). It has to be adjusted to work with other formats, such as (x_min, y_min, x_max, y_max).
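
For example, if your annotations were already in Pascal VOC format (x_min, y_min, x_max, y_max), a minimal sketch of the drawing loop could look like the following. This is purely illustrative: CPPE-5 itself stores boxes in COCO format, so the loop below only applies to VOC-style data.

Python
>>> # Illustrative sketch: assumes boxes are already (x_min, y_min, x_max, y_max)\n>>> for box, class_idx in zip(annotations[\"bbox\"], annotations[\"category\"]):\n...     x_min, y_min, x_max, y_max = box\n...     draw.rectangle((x_min, y_min, x_max, y_max), outline=\"red\", width=1)\n...     draw.text((x_min, y_min), id2label[class_idx], fill=\"white\")\n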

As a final step of getting familiar with the data, explore it for potential issues. One common problem with datasets for object detection is bounding boxes that \"stretch\" beyond the edge of the image. Such \"runaway\" bounding boxes can raise errors during training and should be addressed. There are a few examples with this issue in this dataset. To keep things simple in this guide, we will set clip=True for BboxParams in transformations below.
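
If you want to check how many boxes are affected before training, a simple (and admittedly slow) scan over the raw splits is enough. The loop below is only a sketch and is not part of the original guide:

Python
>>> # Illustrative sketch: count COCO boxes (x_min, y_min, width, height) that leave the image\n>>> for split in [\"train\", \"validation\", \"test\"]:\n...     runaway = 0\n...     for example in cppe5[split]:\n...         width, height = example[\"width\"], example[\"height\"]\n...         for x_min, y_min, box_w, box_h in example[\"objects\"][\"bbox\"]:\n...             if x_min + box_w > width or y_min + box_h > height:\n...                 runaway += 1\n...     print(split, \"boxes outside the image:\", runaway)\n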

"},{"location":"integrations/huggingface/object_detection/#preprocess-the-data","title":"Preprocess the data","text":"

To finetune a model, you must preprocess the data you plan to use to match precisely the approach used for the pre-trained model. [AutoImageProcessor] takes care of processing image data to create pixel_values, pixel_mask, and labels that a DETR model can train with. The image processor has some attributes that you won't have to worry about:

  • image_mean = [0.485, 0.456, 0.406]
  • image_std = [0.229, 0.224, 0.225]

These are the mean and standard deviation used to normalize images during the model pre-training. These values are crucial to replicate when doing inference or finetuning a pre-trained image model.
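
Roughly speaking, this normalization scales pixel values to [0, 1] and then standardizes them channel-wise with the statistics above. The image processor does this internally; the snippet below is only an illustrative sketch:

Python
>>> import numpy as np\n\n>>> image_mean = np.array([0.485, 0.456, 0.406])\n>>> image_std = np.array([0.229, 0.224, 0.225])\n\n>>> # Illustrative only: scale to [0, 1], then normalize channel-wise\n>>> pixels = np.asarray(image.convert(\"RGB\"), dtype=np.float32) / 255.0\n>>> normalized = (pixels - image_mean) / image_std\n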

Instantiate the image processor from the same checkpoint as the model you want to finetune.

Python
>>> from transformers import AutoImageProcessor\n\n>>> MAX_SIZE = IMAGE_SIZE\n\n>>> image_processor = AutoImageProcessor.from_pretrained(\n...     MODEL_NAME,\n...     do_resize=True,\n...     size={\"max_height\": MAX_SIZE, \"max_width\": MAX_SIZE},\n...     do_pad=True,\n...     pad_size={\"height\": MAX_SIZE, \"width\": MAX_SIZE},\n... )\n

Before passing the images to the image_processor, apply two preprocessing transformations to the dataset:

  • Augmenting images
  • Reformatting annotations to meet DETR expectations

First, to make sure the model does not overfit on the training data, you can apply image augmentation with any data augmentation library. Here we use Albumentations. This library ensures that transformations affect the image and update the bounding boxes accordingly. The \ud83e\udd17 Datasets library documentation has a detailed guide on how to augment images for object detection, and it uses the exact same dataset as an example. Apply some geometric and color transformations to the image. For additional augmentation options, explore the Albumentations Demo Space.

Python
>>> import albumentations as A\n\n>>> train_augment_and_transform = A.Compose(\n...     [\n...         A.Perspective(p=0.1),\n...         A.HorizontalFlip(p=0.5),\n...         A.RandomBrightnessContrast(p=0.5),\n...         A.HueSaturationValue(p=0.1),\n...     ],\n...     bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"category\"], clip=True, min_area=25),\n... )\n\n>>> validation_transform = A.Compose(\n...     [A.NoOp()],\n...     bbox_params=A.BboxParams(format=\"coco\", label_fields=[\"category\"], clip=True),\n... )\n

The image_processor expects the annotations to be in the following format: {'image_id': int, 'annotations': List[Dict]}, where each dictionary is a COCO object annotation. Let's add a function to reformat annotations for a single example:

Python
>>> def format_image_annotations_as_coco(image_id, categories, areas, bboxes):\n...     \"\"\"Format one set of image annotations to the COCO format\n\n...     Args:\n...         image_id (str): image id. e.g. \"0001\"\n...         categories (List[int]): list of categories/class labels corresponding to provided bounding boxes\n...         areas (List[float]): list of corresponding areas to provided bounding boxes\n...         bboxes (List[Tuple[float]]): list of bounding boxes provided in COCO format\n...             ([x_min, y_min, width, height] in absolute coordinates)\n\n...     Returns:\n...         dict: {\n...             \"image_id\": image id,\n...             \"annotations\": list of formatted annotations\n...         }\n...     \"\"\"\n...     annotations = []\n...     for category, area, bbox in zip(categories, areas, bboxes):\n...         formatted_annotation = {\n...             \"image_id\": image_id,\n...             \"category_id\": category,\n...             \"iscrowd\": 0,\n...             \"area\": area,\n...             \"bbox\": list(bbox),\n...         }\n...         annotations.append(formatted_annotation)\n\n...     return {\n...         \"image_id\": image_id,\n...         \"annotations\": annotations,\n...     }\n

Now you can combine the image and annotation transformations to use on a batch of examples:

Python
>>> def augment_and_transform_batch(examples, transform, image_processor, return_pixel_mask=False):\n...     \"\"\"Apply augmentations and format annotations in COCO format for object detection task\"\"\"\n\n...     images = []\n...     annotations = []\n...     for image_id, image, objects in zip(examples[\"image_id\"], examples[\"image\"], examples[\"objects\"]):\n...         image = np.array(image.convert(\"RGB\"))\n\n...         # apply augmentations\n...         output = transform(image=image, bboxes=objects[\"bbox\"], category=objects[\"category\"])\n...         images.append(output[\"image\"])\n\n...         # format annotations in COCO format\n...         formatted_annotations = format_image_annotations_as_coco(\n...             image_id, output[\"category\"], objects[\"area\"], output[\"bboxes\"]\n...         )\n...         annotations.append(formatted_annotations)\n\n...     # Apply the image processor transformations: resizing, rescaling, normalization\n...     result = image_processor(images=images, annotations=annotations, return_tensors=\"pt\")\n\n...     if not return_pixel_mask:\n...         result.pop(\"pixel_mask\", None)\n\n...     return result\n

Apply this preprocessing function to the entire dataset using \ud83e\udd17 Datasets [~datasets.Dataset.with_transform] method. This method applies transformations on the fly when you load an element of the dataset.

At this point, you can check what an example from the dataset looks like after the transformations. You should see a tensor with pixel_values, a tensor with pixel_mask, and labels.

Python
>>> from functools import partial\n\n>>> # Make transform functions for batch and apply for dataset splits\n>>> train_transform_batch = partial(\n...     augment_and_transform_batch, transform=train_augment_and_transform, image_processor=image_processor\n... )\n>>> validation_transform_batch = partial(\n...     augment_and_transform_batch, transform=validation_transform, image_processor=image_processor\n... )\n\n>>> cppe5[\"train\"] = cppe5[\"train\"].with_transform(train_transform_batch)\n>>> cppe5[\"validation\"] = cppe5[\"validation\"].with_transform(validation_transform_batch)\n>>> cppe5[\"test\"] = cppe5[\"test\"].with_transform(validation_transform_batch)\n\n>>> cppe5[\"train\"][15]\n{'pixel_values': tensor([[[ 1.9235,  1.9407,  1.9749,  ..., -0.7822, -0.7479, -0.6965],\n          [ 1.9578,  1.9749,  1.9920,  ..., -0.7993, -0.7650, -0.7308],\n          [ 2.0092,  2.0092,  2.0263,  ..., -0.8507, -0.8164, -0.7822],\n          ...,\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741],\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741],\n          [ 0.0741,  0.0741,  0.0741,  ...,  0.0741,  0.0741,  0.0741]],\n\n          [[ 1.6232,  1.6408,  1.6583,  ...,  0.8704,  1.0105,  1.1331],\n          [ 1.6408,  1.6583,  1.6758,  ...,  0.8529,  0.9930,  1.0980],\n          [ 1.6933,  1.6933,  1.7108,  ...,  0.8179,  0.9580,  1.0630],\n          ...,\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052],\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052],\n          [ 0.2052,  0.2052,  0.2052,  ...,  0.2052,  0.2052,  0.2052]],\n\n          [[ 1.8905,  1.9080,  1.9428,  ..., -0.1487, -0.0964, -0.0615],\n          [ 1.9254,  1.9428,  1.9603,  ..., -0.1661, -0.1138, -0.0790],\n          [ 1.9777,  1.9777,  1.9951,  ..., -0.2010, -0.1138, -0.0790],\n          ...,\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265],\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265],\n          [ 0.4265,  0.4265,  0.4265,  ...,  0.4265,  0.4265,  0.4265]]]),\n  'labels': {'image_id': tensor([688]), 'class_labels': tensor([3, 4, 2, 0, 0]), 'boxes': tensor([[0.4700, 0.1933, 0.1467, 0.0767],\n          [0.4858, 0.2600, 0.1150, 0.1000],\n          [0.4042, 0.4517, 0.1217, 0.1300],\n          [0.4242, 0.3217, 0.3617, 0.5567],\n          [0.6617, 0.4033, 0.5400, 0.4533]]), 'area': tensor([ 4048.,  4140.,  5694., 72478., 88128.]), 'iscrowd': tensor([0, 0, 0, 0, 0]), 'orig_size': tensor([480, 480])}}\n

You have successfully augmented the individual images and prepared their annotations. However, preprocessing isn't complete yet. In the final step, create a custom collate_fn to batch images together. Pad images (which are now pixel_values) to the largest image in a batch, and create a corresponding pixel_mask to indicate which pixels are real (1) and which are padding (0).

Python
>>> import torch\n\n>>> def collate_fn(batch):\n...     data = {}\n...     data[\"pixel_values\"] = torch.stack([x[\"pixel_values\"] for x in batch])\n...     data[\"labels\"] = [x[\"labels\"] for x in batch]\n...     if \"pixel_mask\" in batch[0]:\n...         data[\"pixel_mask\"] = torch.stack([x[\"pixel_mask\"] for x in batch])\n...     return data\n
"},{"location":"integrations/huggingface/object_detection/#preparing-function-to-compute-map","title":"Preparing function to compute mAP","text":"

Object detection models are commonly evaluated with a set of COCO-style metrics. We are going to use torchmetrics to compute mAP (mean average precision) and mAR (mean average recall) metrics and will wrap them in a compute_metrics function in order to use them in [Trainer] for evaluation.

The intermediate format of boxes used for training is YOLO (normalized), but we will compute metrics for boxes in Pascal VOC (absolute) format in order to correctly handle box areas. Let's define a function that converts bounding boxes to Pascal VOC format:

Python
>>> from transformers.image_transforms import center_to_corners_format\n\n>>> def convert_bbox_yolo_to_pascal(boxes, image_size):\n...     \"\"\"\n...     Convert bounding boxes from YOLO format (x_center, y_center, width, height) in range [0, 1]\n...     to Pascal VOC format (x_min, y_min, x_max, y_max) in absolute coordinates.\n\n...     Args:\n...         boxes (torch.Tensor): Bounding boxes in YOLO format\n...         image_size (Tuple[int, int]): Image size in format (height, width)\n\n...     Returns:\n...         torch.Tensor: Bounding boxes in Pascal VOC format (x_min, y_min, x_max, y_max)\n...     \"\"\"\n...     # convert center to corners format\n...     boxes = center_to_corners_format(boxes)\n\n...     # convert to absolute coordinates\n...     height, width = image_size\n...     boxes = boxes * torch.tensor([[width, height, width, height]])\n\n...     return boxes\n
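
As a quick sanity check (not part of the original guide), a box centered in a 480x480 image with normalized size 0.2x0.4 should map to the corners (192, 144, 288, 336):

Python
>>> boxes = torch.tensor([[0.5, 0.5, 0.2, 0.4]])  # (center_x, center_y, width, height), normalized\n>>> convert_bbox_yolo_to_pascal(boxes, (480, 480))  # expected corners: (192, 144, 288, 336)\n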

Then, in the compute_metrics function, we collect predicted and target bounding boxes, scores, and labels from the evaluation loop results and pass them to the scoring function.

Python
>>> import numpy as np\n>>> from dataclasses import dataclass\n>>> from torchmetrics.detection.mean_ap import MeanAveragePrecision\n\n\n>>> @dataclass\n>>> class ModelOutput:\n...     logits: torch.Tensor\n...     pred_boxes: torch.Tensor\n\n\n>>> @torch.no_grad()\n>>> def compute_metrics(evaluation_results, image_processor, threshold=0.0, id2label=None):\n...     \"\"\"\n...     Compute mean average mAP, mAR and their variants for the object detection task.\n\n...     Args:\n...         evaluation_results (EvalPrediction): Predictions and targets from evaluation.\n...         threshold (float, optional): Threshold to filter predicted boxes by confidence. Defaults to 0.0.\n...         id2label (Optional[dict], optional): Mapping from class id to class name. Defaults to None.\n\n...     Returns:\n...         Mapping[str, float]: Metrics in a form of dictionary {<metric_name>: <metric_value>}\n...     \"\"\"\n\n...     predictions, targets = evaluation_results.predictions, evaluation_results.label_ids\n\n...     # For metric computation we need to provide:\n...     #  - targets in a form of list of dictionaries with keys \"boxes\", \"labels\"\n...     #  - predictions in a form of list of dictionaries with keys \"boxes\", \"scores\", \"labels\"\n\n...     image_sizes = []\n...     post_processed_targets = []\n...     post_processed_predictions = []\n\n...     # Collect targets in the required format for metric computation\n...     for batch in targets:\n...         # collect image sizes, we will need them for predictions post processing\n...         batch_image_sizes = torch.tensor(np.array([x[\"orig_size\"] for x in batch]))\n...         image_sizes.append(batch_image_sizes)\n...         # collect targets in the required format for metric computation\n...         # boxes were converted to YOLO format needed for model training\n...         # here we will convert them to Pascal VOC format (x_min, y_min, x_max, y_max)\n...         for image_target in batch:\n...             boxes = torch.tensor(image_target[\"boxes\"])\n...             boxes = convert_bbox_yolo_to_pascal(boxes, image_target[\"orig_size\"])\n...             labels = torch.tensor(image_target[\"class_labels\"])\n...             post_processed_targets.append({\"boxes\": boxes, \"labels\": labels})\n\n...     # Collect predictions in the required format for metric computation,\n...     # model produce boxes in YOLO format, then image_processor convert them to Pascal VOC format\n...     for batch, target_sizes in zip(predictions, image_sizes):\n...         batch_logits, batch_boxes = batch[1], batch[2]\n...         output = ModelOutput(logits=torch.tensor(batch_logits), pred_boxes=torch.tensor(batch_boxes))\n...         post_processed_output = image_processor.post_process_object_detection(\n...             output, threshold=threshold, target_sizes=target_sizes\n...         )\n...         post_processed_predictions.extend(post_processed_output)\n\n...     # Compute metrics\n...     metric = MeanAveragePrecision(box_format=\"xyxy\", class_metrics=True)\n...     metric.update(post_processed_predictions, post_processed_targets)\n...     metrics = metric.compute()\n\n...     # Replace list of per class metrics with separate metric for each class\n...     classes = metrics.pop(\"classes\")\n...     map_per_class = metrics.pop(\"map_per_class\")\n...     mar_100_per_class = metrics.pop(\"mar_100_per_class\")\n...     for class_id, class_map, class_mar in zip(classes, map_per_class, mar_100_per_class):\n...         
class_name = id2label[class_id.item()] if id2label is not None else class_id.item()\n...         metrics[f\"map_{class_name}\"] = class_map\n...         metrics[f\"mar_100_{class_name}\"] = class_mar\n\n...     metrics = {k: round(v.item(), 4) for k, v in metrics.items()}\n\n...     return metrics\n\n\n>>> eval_compute_metrics_fn = partial(\n...     compute_metrics, image_processor=image_processor, id2label=id2label, threshold=0.0\n... )\n
"},{"location":"integrations/huggingface/object_detection/#training-the-detection-model","title":"Training the detection model","text":"

You have done most of the heavy lifting in the previous sections, so now you are ready to train your model! The images in this dataset are still quite large, even after resizing. This means that finetuning this model will require at least one GPU.

Training involves the following steps:

  1. Load the model with [AutoModelForObjectDetection] using the same checkpoint as in the preprocessing.
  2. Define your training hyperparameters in [TrainingArguments].
  3. Pass the training arguments to [Trainer] along with the model, dataset, image processor, and data collator.
  4. Call [~Trainer.train] to finetune your model.

When loading the model from the same checkpoint that you used for the preprocessing, remember to pass the label2id and id2label maps that you created earlier from the dataset's metadata. Additionally, we specify ignore_mismatched_sizes=True to replace the existing classification head with a new one.

Python
>>> from transformers import AutoModelForObjectDetection\n\n>>> model = AutoModelForObjectDetection.from_pretrained(\n...     MODEL_NAME,\n...     id2label=id2label,\n...     label2id=label2id,\n...     ignore_mismatched_sizes=True,\n... )\n

In the [TrainingArguments] use output_dir to specify where to save your model, then configure hyperparameters as you see fit. With num_train_epochs=30, training will take about 35 minutes on a Google Colab T4 GPU; increase the number of epochs to get better results.

Important notes:

  • Do not remove unused columns because this would drop the image column. Without the image column, you can't create pixel_values. For this reason, set remove_unused_columns to False.
  • Set eval_do_concat_batches=False to get proper evaluation results. Images have a different number of target boxes; if batches are concatenated, we will not be able to determine which boxes belong to a particular image.

If you wish to share your model by pushing to the Hub, set push_to_hub to True (you must be signed in to Hugging Face to upload your model).

Python
>>> from transformers import TrainingArguments\n\n>>> training_args = TrainingArguments(\n...     output_dir=\"detr_finetuned_cppe5\",\n...     num_train_epochs=30,\n...     fp16=False,\n...     per_device_train_batch_size=8,\n...     dataloader_num_workers=4,\n...     learning_rate=5e-5,\n...     lr_scheduler_type=\"cosine\",\n...     weight_decay=1e-4,\n...     max_grad_norm=0.01,\n...     metric_for_best_model=\"eval_map\",\n...     greater_is_better=True,\n...     load_best_model_at_end=True,\n...     eval_strategy=\"epoch\",\n...     save_strategy=\"epoch\",\n...     save_total_limit=2,\n...     remove_unused_columns=False,\n...     eval_do_concat_batches=False,\n...     push_to_hub=True,\n... )\n

Finally, bring everything together, and call [~transformers.Trainer.train]:

Python
>>> from transformers import Trainer\n\n>>> trainer = Trainer(\n...     model=model,\n...     args=training_args,\n...     train_dataset=cppe5[\"train\"],\n...     eval_dataset=cppe5[\"validation\"],\n...     processing_class=image_processor,\n...     data_collator=collate_fn,\n...     compute_metrics=eval_compute_metrics_fn,\n... )\n\n>>> trainer.train()\n
[3210/3210 26:07, Epoch 30/30] Epoch Training Loss Validation Loss Map Map 50 Map 75 Map Small Map Medium Map Large Mar 1 Mar 10 Mar 100 Mar Small Mar Medium Mar Large Map Coverall Mar 100 Coverall Map Face Shield Mar 100 Face Shield Map Gloves Mar 100 Gloves Map Goggles Mar 100 Goggles Map Mask Mar 100 Mask 1 No log 2.629903 0.008900 0.023200 0.006500 0.001300 0.002800 0.020500 0.021500 0.070400 0.101400 0.007600 0.106200 0.096100 0.036700 0.232000 0.000300 0.019000 0.003900 0.125400 0.000100 0.003100 0.003500 0.127600 2 No log 3.479864 0.014800 0.034600 0.010800 0.008600 0.011700 0.012500 0.041100 0.098700 0.130000 0.056000 0.062200 0.111900 0.053500 0.447300 0.010600 0.100000 0.000200 0.022800 0.000100 0.015400 0.009700 0.064400 3 No log 2.107622 0.041700 0.094000 0.034300 0.024100 0.026400 0.047400 0.091500 0.182800 0.225800 0.087200 0.199400 0.210600 0.150900 0.571200 0.017300 0.101300 0.007300 0.180400 0.002100 0.026200 0.031000 0.250200 4 No log 2.031242 0.055900 0.120600 0.046900 0.013800 0.038100 0.090300 0.105900 0.225600 0.266100 0.130200 0.228100 0.330000 0.191000 0.572100 0.010600 0.157000 0.014600 0.235300 0.001700 0.052300 0.061800 0.313800 5 3.889400 1.883433 0.089700 0.201800 0.067300 0.022800 0.065300 0.129500 0.136000 0.272200 0.303700 0.112900 0.312500 0.424600 0.300200 0.585100 0.032700 0.202500 0.031300 0.271000 0.008700 0.126200 0.075500 0.333800 6 3.889400 1.807503 0.118500 0.270900 0.090200 0.034900 0.076700 0.152500 0.146100 0.297800 0.325400 0.171700 0.283700 0.545900 0.396900 0.554500 0.043000 0.262000 0.054500 0.271900 0.020300 0.230800 0.077600 0.308000 7 3.889400 1.716169 0.143500 0.307700 0.123200 0.045800 0.097800 0.258300 0.165300 0.327700 0.352600 0.140900 0.336700 0.599400 0.442900 0.620700 0.069400 0.301300 0.081600 0.292000 0.011000 0.230800 0.112700 0.318200 8 3.889400 1.679014 0.153000 0.355800 0.127900 0.038700 0.115600 0.291600 0.176000 0.322500 0.349700 0.135600 0.326100 0.643700 0.431700 0.582900 0.069800 0.265800 0.088600 0.274600 0.028300 0.280000 0.146700 0.345300 9 3.889400 1.618239 0.172100 0.375300 0.137600 0.046100 0.141700 0.308500 0.194000 0.356200 0.386200 0.162400 0.359200 0.677700 0.469800 0.623900 0.102100 0.317700 0.099100 0.290200 0.029300 0.335400 0.160200 0.364000 10 1.599700 1.572512 0.179500 0.400400 0.147200 0.056500 0.141700 0.316700 0.213100 0.357600 0.381300 0.197900 0.344300 0.638500 0.466900 0.623900 0.101300 0.311400 0.104700 0.279500 0.051600 0.338500 0.173000 0.353300 11 1.599700 1.528889 0.192200 0.415000 0.160800 0.053700 0.150500 0.378000 0.211500 0.371700 0.397800 0.204900 0.374600 0.684800 0.491900 0.632400 0.131200 0.346800 0.122000 0.300900 0.038400 0.344600 0.177500 0.364400 12 1.599700 1.517532 0.198300 0.429800 0.159800 0.066400 0.162900 0.383300 0.220700 0.382100 0.405400 0.214800 0.383200 0.672900 0.469000 0.610400 0.167800 0.379700 0.119700 0.307100 0.038100 0.335400 0.196800 0.394200 13 1.599700 1.488849 0.209800 0.452300 0.172300 0.094900 0.171100 0.437800 0.222000 0.379800 0.411500 0.203800 0.397300 0.707500 0.470700 0.620700 0.186900 0.407600 0.124200 0.306700 0.059300 0.355400 0.207700 0.367100 14 1.599700 1.482210 0.228900 0.482600 0.187800 0.083600 0.191800 0.444100 0.225900 0.376900 0.407400 0.182500 0.384800 0.700600 0.512100 0.640100 0.175000 0.363300 0.144300 0.300000 0.083100 0.363100 0.229900 0.370700 15 1.326800 1.475198 0.216300 0.455600 0.174900 0.088500 0.183500 0.424400 0.226900 0.373400 0.404300 0.199200 0.396400 0.677800 0.496300 0.633800 0.166300 0.392400 0.128900 0.312900 0.085200 
0.312300 0.205000 0.370200 16 1.326800 1.459697 0.233200 0.504200 0.192200 0.096000 0.202000 0.430800 0.239100 0.382400 0.412600 0.219500 0.403100 0.670400 0.485200 0.625200 0.196500 0.410100 0.135700 0.299600 0.123100 0.356900 0.225300 0.371100 17 1.326800 1.407340 0.243400 0.511900 0.204500 0.121000 0.215700 0.468000 0.246200 0.394600 0.424200 0.225900 0.416100 0.705200 0.494900 0.638300 0.224900 0.430400 0.157200 0.317900 0.115700 0.369200 0.224200 0.365300 18 1.326800 1.419522 0.245100 0.521500 0.210000 0.116100 0.211500 0.489900 0.255400 0.391600 0.419700 0.198800 0.421200 0.701400 0.501800 0.634200 0.226700 0.410100 0.154400 0.321400 0.105900 0.352300 0.236700 0.380400 19 1.158600 1.398764 0.253600 0.519200 0.213600 0.135200 0.207700 0.491900 0.257300 0.397300 0.428000 0.241400 0.401800 0.703500 0.509700 0.631100 0.236700 0.441800 0.155900 0.330800 0.128100 0.352300 0.237500 0.384000 20 1.158600 1.390591 0.248800 0.520200 0.216600 0.127500 0.211400 0.471900 0.258300 0.407000 0.429100 0.240300 0.407600 0.708500 0.505800 0.623400 0.235500 0.431600 0.150000 0.325000 0.125700 0.375400 0.227200 0.390200 21 1.158600 1.360608 0.262700 0.544800 0.222100 0.134700 0.230000 0.487500 0.269500 0.413300 0.436300 0.236200 0.419100 0.709300 0.514100 0.637400 0.257200 0.450600 0.165100 0.338400 0.139400 0.372300 0.237700 0.382700 22 1.158600 1.368296 0.262800 0.542400 0.236400 0.137400 0.228100 0.498500 0.266500 0.409000 0.433000 0.239900 0.418500 0.697500 0.520500 0.641000 0.257500 0.455700 0.162600 0.334800 0.140200 0.353800 0.233200 0.379600 23 1.158600 1.368176 0.264800 0.541100 0.233100 0.138200 0.223900 0.498700 0.272300 0.407400 0.434400 0.233100 0.418300 0.702000 0.524400 0.642300 0.262300 0.444300 0.159700 0.335300 0.140500 0.366200 0.236900 0.384000 24 1.049700 1.355271 0.269700 0.549200 0.239100 0.134700 0.229900 0.519200 0.274800 0.412700 0.437600 0.245400 0.417200 0.711200 0.523200 0.644100 0.272100 0.440500 0.166700 0.341500 0.137700 0.373800 0.249000 0.388000 25 1.049700 1.355180 0.272500 0.547900 0.243800 0.149700 0.229900 0.523100 0.272500 0.415700 0.442200 0.256200 0.420200 0.705800 0.523900 0.639600 0.271700 0.451900 0.166300 0.346900 0.153700 0.383100 0.247000 0.389300 26 1.049700 1.349337 0.275600 0.556300 0.246400 0.146700 0.234800 0.516300 0.274200 0.418300 0.440900 0.248700 0.418900 0.705800 0.523200 0.636500 0.274700 0.440500 0.172400 0.349100 0.155600 0.384600 0.252300 0.393800 27 1.049700 1.350782 0.275200 0.548700 0.246800 0.147300 0.236400 0.527200 0.280100 0.416200 0.442600 0.253400 0.424000 0.710300 0.526600 0.640100 0.273200 0.445600 0.167000 0.346900 0.160100 0.387700 0.249200 0.392900 28 1.049700 1.346533 0.277000 0.552800 0.252900 0.147400 0.240000 0.527600 0.280900 0.420900 0.444100 0.255500 0.424500 0.711200 0.530200 0.646800 0.277400 0.441800 0.170900 0.346900 0.156600 0.389200 0.249600 0.396000 29 0.993700 1.346575 0.277100 0.554800 0.252900 0.148400 0.239700 0.523600 0.278400 0.420000 0.443300 0.256300 0.424000 0.705600 0.529600 0.647300 0.273900 0.439200 0.174300 0.348700 0.157600 0.386200 0.250100 0.395100 30 0.993700 1.346446 0.277400 0.554700 0.252700 0.147900 0.240800 0.523600 0.278800 0.420400 0.443300 0.256100 0.424200 0.705500 0.530100 0.646800 0.275600 0.440500 0.174500 0.348700 0.157300 0.386200 0.249200 0.394200

If you have set push_to_hub to True in the training_args, the training checkpoints are pushed to the Hugging Face Hub. Upon training completion, push the final model to the Hub as well by calling the [~transformers.Trainer.push_to_hub] method.

Python
>>> trainer.push_to_hub()\n
## Evaluate

Python
>>> from pprint import pprint\n\n>>> metrics = trainer.evaluate(eval_dataset=cppe5[\"test\"], metric_key_prefix=\"test\")\n>>> pprint(metrics)\n{'epoch': 30.0,\n  'test_loss': 1.0877351760864258,\n  'test_map': 0.4116,\n  'test_map_50': 0.741,\n  'test_map_75': 0.3663,\n  'test_map_Coverall': 0.5937,\n  'test_map_Face_Shield': 0.5863,\n  'test_map_Gloves': 0.3416,\n  'test_map_Goggles': 0.1468,\n  'test_map_Mask': 0.3894,\n  'test_map_large': 0.5637,\n  'test_map_medium': 0.3257,\n  'test_map_small': 0.3589,\n  'test_mar_1': 0.323,\n  'test_mar_10': 0.5237,\n  'test_mar_100': 0.5587,\n  'test_mar_100_Coverall': 0.6756,\n  'test_mar_100_Face_Shield': 0.7294,\n  'test_mar_100_Gloves': 0.4721,\n  'test_mar_100_Goggles': 0.4125,\n  'test_mar_100_Mask': 0.5038,\n  'test_mar_large': 0.7283,\n  'test_mar_medium': 0.4901,\n  'test_mar_small': 0.4469,\n  'test_runtime': 1.6526,\n  'test_samples_per_second': 17.548,\n  'test_steps_per_second': 2.42}\n
These results can be further improved by adjusting the hyperparameters in [TrainingArguments]. Give it a go!

## Inference

Now that you have finetuned a model, evaluated it, and uploaded it to the Hugging Face Hub, you can use it for inference.

Python
>>> import torch\n>>> import requests\n\n>>> from PIL import Image, ImageDraw\n>>> from transformers import AutoImageProcessor, AutoModelForObjectDetection\n\n>>> url = \"https://images.pexels.com/photos/8413299/pexels-photo-8413299.jpeg?auto=compress&cs=tinysrgb&w=630&h=375&dpr=2\"\n>>> image = Image.open(requests.get(url, stream=True).raw)\n
Load the model and image processor from the Hugging Face Hub (skip this step to use the model already trained in this session):

Python
>>> from accelerate.test_utils.testing import get_backend\n# automatically detects the underlying device type (CUDA, CPU, XPU, MPS, etc.)\n>>> device, _, _ = get_backend()\n>>> model_repo = \"qubvel-hf/detr_finetuned_cppe5\"\n\n>>> image_processor = AutoImageProcessor.from_pretrained(model_repo)\n>>> model = AutoModelForObjectDetection.from_pretrained(model_repo)\n>>> model = model.to(device)\n
And detect bounding boxes:

Python
>>> with torch.no_grad():\n...     inputs = image_processor(images=[image], return_tensors=\"pt\")\n...     outputs = model(**inputs.to(device))\n...     target_sizes = torch.tensor([[image.size[1], image.size[0]]])\n...     results = image_processor.post_process_object_detection(outputs, threshold=0.3, target_sizes=target_sizes)[0]\n\n>>> for score, label, box in zip(results[\"scores\"], results[\"labels\"], results[\"boxes\"]):\n...     box = [round(i, 2) for i in box.tolist()]\n...     print(\n...         f\"Detected {model.config.id2label[label.item()]} with confidence \"\n...         f\"{round(score.item(), 3)} at location {box}\"\n...     )\nDetected Gloves with confidence 0.683 at location [244.58, 124.33, 300.35, 185.13]\nDetected Mask with confidence 0.517 at location [143.73, 64.58, 219.57, 125.89]\nDetected Gloves with confidence 0.425 at location [179.15, 155.57, 262.4, 226.35]\nDetected Coverall with confidence 0.407 at location [307.13, -1.18, 477.82, 318.06]\nDetected Coverall with confidence 0.391 at location [68.61, 126.66, 309.03, 318.89]\n
Let's plot the result:

Python
>>> draw = ImageDraw.Draw(image)\n\n>>> for score, label, box in zip(results[\"scores\"], results[\"labels\"], results[\"boxes\"]):\n...     box = [round(i, 2) for i in box.tolist()]\n...     x, y, x2, y2 = tuple(box)\n...     draw.rectangle((x, y, x2, y2), outline=\"red\", width=1)\n...     draw.text((x, y), model.config.id2label[label.item()], fill=\"white\")\n\n>>> image\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/","title":"How to Train RT-DETR on Custom Dataset with Roboflow, HuggingFace and Albumentations","text":""},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#how-to-train-rt-detr-on-custom-dataset","title":"How to Train RT-DETR on Custom Dataset","text":"

RT-DETR, short for \"Real-Time DEtection TRansformer\", is a computer vision model developed by Peking University and Baidu. In their paper, \"DETRs Beat YOLOs on Real-time Object Detection\" the authors claim that RT-DETR can outperform YOLO models in object detection, both in terms of speed and accuracy. The model has been released under the Apache 2.0 license, making it a great option, especially for enterprise projects.

Recently, RT-DETR was added to the transformers library, significantly simplifying its fine-tuning process. In this tutorial, we will show you how to train RT-DETR on a custom dataset.

"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#setup","title":"Setup","text":""},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#configure-your-api-keys","title":"Configure your API keys","text":"

To fine-tune RT-DETR, you need to provide your HuggingFace Token and Roboflow API key. Follow these steps:

  • Open your HuggingFace Settings page. Click Access Tokens, then New Token to generate a new token.
  • Go to your Roboflow Settings page. Click Copy. This will place your private key in the clipboard.
  • In Colab, go to the left pane and click on Secrets (\ud83d\udd11).
    • Store HuggingFace Access Token under the name HF_TOKEN.
    • Store Roboflow API Key under the name ROBOFLOW_API_KEY.
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#select-the-runtime","title":"Select the runtime","text":"

Let's make sure that we have access to a GPU. We can use the nvidia-smi command to do that. In case of any problems, navigate to Edit -> Notebook settings -> Hardware accelerator, set it to L4 GPU, and then click Save.

Python
!nvidia-smi\n
Thu Jul 11 09:20:53 2024       \n+---------------------------------------------------------------------------------------+\n| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |\n|-----------------------------------------+----------------------+----------------------+\n| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |\n| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |\n|                                         |                      |               MIG M. |\n|=========================================+======================+======================|\n|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |\n| N/A   65C    P8              11W /  70W |      0MiB / 15360MiB |      0%      Default |\n|                                         |                      |                  N/A |\n+-----------------------------------------+----------------------+----------------------+\n\n+---------------------------------------------------------------------------------------+\n| Processes:                                                                            |\n|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |\n|        ID   ID                                                             Usage      |\n|=======================================================================================|\n|  No running processes found                                                           |\n+---------------------------------------------------------------------------------------+\n

NOTE: To make it easier for us to manage datasets, images, and models, we create a HOME constant.

Python
import os\nHOME = os.getcwd()\nprint(\"HOME:\", HOME)\n
HOME: /content\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#install-dependencies","title":"Install dependencies","text":"Python
!pip install -q git+https://github.com/huggingface/transformers.git\n!pip install -q git+https://github.com/roboflow/supervision.git\n!pip install -q accelerate\n!pip install -q roboflow\n!pip install -q torchmetrics\n!pip install -q \"albumentations>=1.4.5\"\n
  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n  Building wheel for transformers (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n  Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n  Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n  Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n  Building wheel for supervision (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m314.1/314.1 kB\u001b[0m \u001b[31m5.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m21.3/21.3 MB\u001b[0m \u001b[31m71.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m76.2/76.2 kB\u001b[0m \u001b[31m2.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m178.7/178.7 kB\u001b[0m \u001b[31m10.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m54.5/54.5 kB\u001b[0m \u001b[31m5.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m868.8/868.8 kB\u001b[0m \u001b[31m6.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m165.3/165.3 kB\u001b[0m \u001b[31m4.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m14.9/14.9 MB\u001b[0m \u001b[31m83.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     
\u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m13.4/13.4 MB\u001b[0m \u001b[31m92.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[2K     \u001b[90m\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u001b[0m \u001b[32m313.5/313.5 kB\u001b[0m \u001b[31m34.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n\u001b[?25h\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#imports","title":"Imports","text":"Python
import torch\nimport requests\n\nimport numpy as np\nimport supervision as sv\nimport albumentations as A\n\nfrom PIL import Image\nfrom pprint import pprint\nfrom roboflow import Roboflow\nfrom dataclasses import dataclass, replace\nfrom google.colab import userdata\nfrom torch.utils.data import Dataset\nfrom transformers import (\n    AutoImageProcessor,\n    AutoModelForObjectDetection,\n    TrainingArguments,\n    Trainer\n)\nfrom torchmetrics.detection.mean_ap import MeanAveragePrecision\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#inference-with-pre-trained-rt-detr-model","title":"Inference with pre-trained RT-DETR model","text":"Python
# @title Load model\n\nCHECKPOINT = \"PekingU/rtdetr_r50vd_coco_o365\"\nDEVICE = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\nmodel = AutoModelForObjectDetection.from_pretrained(CHECKPOINT).to(DEVICE)\nprocessor = AutoImageProcessor.from_pretrained(CHECKPOINT)\n
config.json:   0%|          | 0.00/5.11k [00:00<?, ?B/s]\n\n\n\nmodel.safetensors:   0%|          | 0.00/172M [00:00<?, ?B/s]\n\n\n\npreprocessor_config.json:   0%|          | 0.00/841 [00:00<?, ?B/s]\n
Python
# @title Run inference\n\nURL = \"https://media.roboflow.com/notebooks/examples/dog.jpeg\"\n\nimage = Image.open(requests.get(URL, stream=True).raw)\ninputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\nwith torch.no_grad():\n    outputs = model(**inputs)\n\nw, h = image.size\nresults = processor.post_process_object_detection(\n    outputs, target_sizes=[(h, w)], threshold=0.3)\n
Python
# @title Display result without NMS\n\ndetections = sv.Detections.from_transformers(results[0])\nlabels = [\n    model.config.id2label[class_id]\n    for class_id\n    in detections.class_id\n]\n\nannotated_image = image.copy()\nannotated_image = sv.BoundingBoxAnnotator().annotate(annotated_image, detections)\nannotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels=labels)\nannotated_image.thumbnail((600, 600))\nannotated_image\n
Python
# @title Display result with NMS\n\ndetections = sv.Detections.from_transformers(results[0]).with_nms(threshold=0.1)\nlabels = [\n    model.config.id2label[class_id]\n    for class_id\n    in detections.class_id\n]\n\nannotated_image = image.copy()\nannotated_image = sv.BoundingBoxAnnotator().annotate(annotated_image, detections)\nannotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels=labels)\nannotated_image.thumbnail((600, 600))\nannotated_image\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#fine-tune-rt-detr-on-custom-dataset","title":"Fine-tune RT-DETR on custom dataset","text":"Python
# @title Download dataset from Roboflow Universe\n\nROBOFLOW_API_KEY = userdata.get('ROBOFLOW_API_KEY')\nrf = Roboflow(api_key=ROBOFLOW_API_KEY)\n\nproject = rf.workspace(\"roboflow-jvuqo\").project(\"poker-cards-fmjio\")\nversion = project.version(4)\ndataset = version.download(\"coco\")\n
loading Roboflow workspace...\nloading Roboflow project...\n\n\nDownloading Dataset Version Zip in poker-cards-4 to coco:: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 39123/39123 [00:01<00:00, 27288.54it/s]\n\n\n\n\n\nExtracting Dataset Version Zip to poker-cards-4 in coco:: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 907/907 [00:00<00:00, 2984.59it/s]\n
Python
ds_train = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/train\",\n    annotations_path=f\"{dataset.location}/train/_annotations.coco.json\",\n)\nds_valid = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/valid\",\n    annotations_path=f\"{dataset.location}/valid/_annotations.coco.json\",\n)\nds_test = sv.DetectionDataset.from_coco(\n    images_directory_path=f\"{dataset.location}/test\",\n    annotations_path=f\"{dataset.location}/test/_annotations.coco.json\",\n)\n\nprint(f\"Number of training images: {len(ds_train)}\")\nprint(f\"Number of validation images: {len(ds_valid)}\")\nprint(f\"Number of test images: {len(ds_test)}\")\n
Number of training images: 811\nNumber of validation images: 44\nNumber of test images: 44\n
Python
# @title Display dataset sample\n\nGRID_SIZE = 5\n\ndef annotate(image, annotations, classes):\n    labels = [\n        classes[class_id]\n        for class_id\n        in annotations.class_id\n    ]\n\n    bounding_box_annotator = sv.BoundingBoxAnnotator()\n    label_annotator = sv.LabelAnnotator(text_scale=1, text_thickness=2)\n\n    annotated_image = image.copy()\n    annotated_image = bounding_box_annotator.annotate(annotated_image, annotations)\n    annotated_image = label_annotator.annotate(annotated_image, annotations, labels=labels)\n    return annotated_image\n\nannotated_images = []\nfor i in range(GRID_SIZE * GRID_SIZE):\n    _, image, annotations = ds_train[i]\n    annotated_image = annotate(image, annotations, ds_train.classes)\n    annotated_images.append(annotated_image)\n\ngrid = sv.create_tiles(\n    annotated_images,\n    grid_size=(GRID_SIZE, GRID_SIZE),\n    single_tile_size=(400, 400),\n    tile_padding_color=sv.Color.WHITE,\n    tile_margin_color=sv.Color.WHITE\n)\nsv.plot_image(grid, size=(10, 10))\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#preprocess-the-data","title":"Preprocess the data","text":"

To finetune a model, you must preprocess the data you plan to use to match precisely the approach used for the pre-trained model. AutoImageProcessor takes care of processing image data to create pixel_values, pixel_mask, and labels that a DETR model can train with. The image processor has some attributes that you won't have to worry about:

  • image_mean = [0.485, 0.456, 0.406]
  • image_std = [0.229, 0.224, 0.225]

These are the mean and standard deviation used to normalize images during the model pre-training. These values are crucial to replicate when doing inference or finetuning a pre-trained image model.

Instantiate the image processor from the same checkpoint as the model you want to finetune.

Python
IMAGE_SIZE = 480\n\nprocessor = AutoImageProcessor.from_pretrained(\n    CHECKPOINT,\n    do_resize=True,\n    size={\"width\": IMAGE_SIZE, \"height\": IMAGE_SIZE},\n)\n

Before passing the images to the processor, apply two preprocessing transformations to the dataset:

  • Augmenting images
  • Reformatting annotations to meet RT-DETR expectations

First, to make sure the model does not overfit on the training data, you can apply image augmentation with any data augmentation library. Here we use Albumentations. This library ensures that transformations applied to the image also update the bounding boxes accordingly.

Python
train_augmentation_and_transform = A.Compose(\n    [\n        A.Perspective(p=0.1),\n        A.HorizontalFlip(p=0.5),\n        A.RandomBrightnessContrast(p=0.5),\n        A.HueSaturationValue(p=0.1),\n    ],\n    bbox_params=A.BboxParams(\n        format=\"pascal_voc\",\n        label_fields=[\"category\"],\n        clip=True,\n        min_area=25\n    ),\n)\n\nvalid_transform = A.Compose(\n    [A.NoOp()],\n    bbox_params=A.BboxParams(\n        format=\"pascal_voc\",\n        label_fields=[\"category\"],\n        clip=True,\n        min_area=1\n    ),\n)\n
Python
# @title Visualize some augmented images\n\nIMAGE_COUNT = 5\n\nfor i in range(IMAGE_COUNT):\n    _, image, annotations = ds_train[i]\n\n    output = train_augmentation_and_transform(\n        image=image,\n        bboxes=annotations.xyxy,\n        category=annotations.class_id\n    )\n\n    augmented_image = output[\"image\"]\n    augmented_annotations = replace(\n        annotations,\n        xyxy=np.array(output[\"bboxes\"]),\n        class_id=np.array(output[\"category\"])\n    )\n\n    annotated_images = [\n        annotate(image, annotations, ds_train.classes),\n        annotate(augmented_image, augmented_annotations, ds_train.classes)\n    ]\n    grid = sv.create_tiles(\n        annotated_images,\n        titles=['original', 'augmented'],\n        titles_scale=0.5,\n        single_tile_size=(400, 400),\n        tile_padding_color=sv.Color.WHITE,\n        tile_margin_color=sv.Color.WHITE\n    )\n    sv.plot_image(grid, size=(6, 6))\n

The processor expects the annotations to be in the following format: {'image_id': int, 'annotations': List[Dict]}, where each dictionary is a COCO object annotation. Let's add a function to reformat annotations for a single example:

Python
class PyTorchDetectionDataset(Dataset):\n    def __init__(self, dataset: sv.DetectionDataset, processor, transform: A.Compose = None):\n        self.dataset = dataset\n        self.processor = processor\n        self.transform = transform\n\n    @staticmethod\n    def annotations_as_coco(image_id, categories, boxes):\n        annotations = []\n        for category, bbox in zip(categories, boxes):\n            x1, y1, x2, y2 = bbox\n            formatted_annotation = {\n                \"image_id\": image_id,\n                \"category_id\": category,\n                \"bbox\": [x1, y1, x2 - x1, y2 - y1],\n                \"iscrowd\": 0,\n                \"area\": (x2 - x1) * (y2 - y1),\n            }\n            annotations.append(formatted_annotation)\n\n        return {\n            \"image_id\": image_id,\n            \"annotations\": annotations,\n        }\n\n    def __len__(self):\n        return len(self.dataset)\n\n    def __getitem__(self, idx):\n        _, image, annotations = self.dataset[idx]\n\n        # Convert image to RGB numpy array\n        image = image[:, :, ::-1]\n        boxes = annotations.xyxy\n        categories = annotations.class_id\n\n        if self.transform:\n            transformed = self.transform(\n                image=image,\n                bboxes=boxes,\n                category=categories\n            )\n            image = transformed[\"image\"]\n            boxes = transformed[\"bboxes\"]\n            categories = transformed[\"category\"]\n\n\n        formatted_annotations = self.annotations_as_coco(\n            image_id=idx, categories=categories, boxes=boxes)\n        result = self.processor(\n            images=image, annotations=formatted_annotations, return_tensors=\"pt\")\n\n        # Image processor expands batch dimension, lets squeeze it\n        result = {k: v[0] for k, v in result.items()}\n\n        return result\n

Now you can combine the image and annotation transformations to use on a batch of examples:

Python
pytorch_dataset_train = PyTorchDetectionDataset(\n    ds_train, processor, transform=train_augmentation_and_transform)\npytorch_dataset_valid = PyTorchDetectionDataset(\n    ds_valid, processor, transform=valid_transform)\npytorch_dataset_test = PyTorchDetectionDataset(\n    ds_test, processor, transform=valid_transform)\n\npytorch_dataset_train[15]\n
{'pixel_values': tensor([[[0.0745, 0.0745, 0.0745,  ..., 0.2431, 0.2471, 0.2471],\n          [0.0745, 0.0745, 0.0745,  ..., 0.2510, 0.2549, 0.2549],\n          [0.0667, 0.0706, 0.0706,  ..., 0.2588, 0.2588, 0.2588],\n          ...,\n          [0.0118, 0.0118, 0.0118,  ..., 0.0510, 0.0549, 0.0510],\n          [0.0157, 0.0196, 0.0235,  ..., 0.0549, 0.0627, 0.0549],\n          [0.0235, 0.0275, 0.0314,  ..., 0.0549, 0.0627, 0.0549]],\n\n         [[0.0549, 0.0549, 0.0549,  ..., 0.3137, 0.3176, 0.3176],\n          [0.0549, 0.0549, 0.0549,  ..., 0.3216, 0.3255, 0.3255],\n          [0.0471, 0.0510, 0.0510,  ..., 0.3294, 0.3294, 0.3294],\n          ...,\n          [0.0000, 0.0000, 0.0000,  ..., 0.0353, 0.0392, 0.0353],\n          [0.0000, 0.0000, 0.0039,  ..., 0.0392, 0.0471, 0.0392],\n          [0.0000, 0.0039, 0.0078,  ..., 0.0392, 0.0471, 0.0392]],\n\n         [[0.0431, 0.0431, 0.0431,  ..., 0.3922, 0.3961, 0.3961],\n          [0.0431, 0.0431, 0.0431,  ..., 0.4000, 0.4039, 0.4039],\n          [0.0353, 0.0392, 0.0392,  ..., 0.4078, 0.4078, 0.4078],\n          ...,\n          [0.0000, 0.0000, 0.0000,  ..., 0.0314, 0.0353, 0.0314],\n          [0.0000, 0.0000, 0.0039,  ..., 0.0353, 0.0431, 0.0353],\n          [0.0000, 0.0039, 0.0078,  ..., 0.0353, 0.0431, 0.0353]]]),\n 'labels': {'size': tensor([480, 480]), 'image_id': tensor([15]), 'class_labels': tensor([36,  4, 44, 52, 48]), 'boxes': tensor([[0.7891, 0.4437, 0.2094, 0.3562],\n         [0.3984, 0.6484, 0.3187, 0.3906],\n         [0.5891, 0.4070, 0.2219, 0.3859],\n         [0.3484, 0.2812, 0.2625, 0.4094],\n         [0.1602, 0.5023, 0.2672, 0.4109]]), 'area': tensor([17185.5000, 28687.5000, 19729.1250, 24759.0000, 25297.3125]), 'iscrowd': tensor([0, 0, 0, 0, 0]), 'orig_size': tensor([640, 640])}}\n

You have successfully augmented the images and prepared their annotations. In the final step, create a custom collate_fn to batch images together.

Python
def collate_fn(batch):\n    data = {}\n    data[\"pixel_values\"] = torch.stack([x[\"pixel_values\"] for x in batch])\n    data[\"labels\"] = [x[\"labels\"] for x in batch]\n    return data\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#preparing-function-to-compute-map","title":"Preparing function to compute mAP","text":"Python
id2label = {id: label for id, label in enumerate(ds_train.classes)}\nlabel2id = {label: id for id, label in enumerate(ds_train.classes)}\n\n\n@dataclass\nclass ModelOutput:\n    logits: torch.Tensor\n    pred_boxes: torch.Tensor\n\n\nclass MAPEvaluator:\n\n    def __init__(self, image_processor, threshold=0.00, id2label=None):\n        self.image_processor = image_processor\n        self.threshold = threshold\n        self.id2label = id2label\n\n    def collect_image_sizes(self, targets):\n        \"\"\"Collect image sizes across the dataset as list of tensors with shape [batch_size, 2].\"\"\"\n        image_sizes = []\n        for batch in targets:\n            batch_image_sizes = torch.tensor(np.array([x[\"size\"] for x in batch]))\n            image_sizes.append(batch_image_sizes)\n        return image_sizes\n\n    def collect_targets(self, targets, image_sizes):\n        post_processed_targets = []\n        for target_batch, image_size_batch in zip(targets, image_sizes):\n            for target, (height, width) in zip(target_batch, image_size_batch):\n                boxes = target[\"boxes\"]\n                boxes = sv.xcycwh_to_xyxy(boxes)\n                boxes = boxes * np.array([width, height, width, height])\n                boxes = torch.tensor(boxes)\n                labels = torch.tensor(target[\"class_labels\"])\n                post_processed_targets.append({\"boxes\": boxes, \"labels\": labels})\n        return post_processed_targets\n\n    def collect_predictions(self, predictions, image_sizes):\n        post_processed_predictions = []\n        for batch, target_sizes in zip(predictions, image_sizes):\n            batch_logits, batch_boxes = batch[1], batch[2]\n            output = ModelOutput(logits=torch.tensor(batch_logits), pred_boxes=torch.tensor(batch_boxes))\n            post_processed_output = self.image_processor.post_process_object_detection(\n                output, threshold=self.threshold, target_sizes=target_sizes\n            )\n            post_processed_predictions.extend(post_processed_output)\n        return post_processed_predictions\n\n    @torch.no_grad()\n    def __call__(self, evaluation_results):\n\n        predictions, targets = evaluation_results.predictions, evaluation_results.label_ids\n\n        image_sizes = self.collect_image_sizes(targets)\n        post_processed_targets = self.collect_targets(targets, image_sizes)\n        post_processed_predictions = self.collect_predictions(predictions, image_sizes)\n\n        evaluator = MeanAveragePrecision(box_format=\"xyxy\", class_metrics=True)\n        evaluator.warn_on_many_detections = False\n        evaluator.update(post_processed_predictions, post_processed_targets)\n\n        metrics = evaluator.compute()\n\n        # Replace list of per class metrics with separate metric for each class\n        classes = metrics.pop(\"classes\")\n        map_per_class = metrics.pop(\"map_per_class\")\n        mar_100_per_class = metrics.pop(\"mar_100_per_class\")\n        for class_id, class_map, class_mar in zip(classes, map_per_class, mar_100_per_class):\n            class_name = id2label[class_id.item()] if id2label is not None else class_id.item()\n            metrics[f\"map_{class_name}\"] = class_map\n            metrics[f\"mar_100_{class_name}\"] = class_mar\n\n        metrics = {k: round(v.item(), 4) for k, v in metrics.items()}\n\n        return metrics\n\neval_compute_metrics_fn = MAPEvaluator(image_processor=processor, threshold=0.01, id2label=id2label)\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#training-the-detection-model","title":"Training the detection model","text":"

You have done most of the heavy lifting in the previous sections, so now you are ready to train your model! The images in this dataset are still quite large, even after resizing. This means that finetuning this model will require at least one GPU.

Training involves the following steps:

  • Load the model with AutoModelForObjectDetection using the same checkpoint as in the preprocessing.
  • Define your training hyperparameters in TrainingArguments.
  • Pass the training arguments to Trainer along with the model, dataset, image processor, and data collator.
  • Call train() to finetune your model.

When loading the model from the same checkpoint that you used for the preprocessing, remember to pass the label2id and id2label maps that you created earlier from the dataset's metadata. Additionally, we specify ignore_mismatched_sizes=True to replace the existing classification head with a new one.

Python
model = AutoModelForObjectDetection.from_pretrained(\n    CHECKPOINT,\n    id2label=id2label,\n    label2id=label2id,\n    anchor_image_size=None,\n    ignore_mismatched_sizes=True,\n)\n
Some weights of RTDetrForObjectDetection were not initialized from the model checkpoint at PekingU/rtdetr_r50vd_coco_o365 and are newly initialized because the shapes did not match:\n- model.decoder.class_embed.0.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.0.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.1.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.1.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.2.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.2.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.3.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.3.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.4.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.4.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.decoder.class_embed.5.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.decoder.class_embed.5.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\n- model.denoising_class_embed.weight: found shape torch.Size([81, 256]) in the checkpoint and torch.Size([54, 256]) in the model instantiated\n- model.enc_score_head.bias: found shape torch.Size([80]) in the checkpoint and torch.Size([53]) in the model instantiated\n- model.enc_score_head.weight: found shape torch.Size([80, 256]) in the checkpoint and torch.Size([53, 256]) in the model instantiated\nYou should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n

In TrainingArguments, use output_dir to specify where to save your model, then configure hyperparameters as you see fit. With num_train_epochs=10, training takes about 15 minutes on a Google Colab T4 GPU; increase the number of epochs to get better results.

Important notes:

  • Do not remove unused columns because this will drop the image column. Without the image column, you can't create pixel_values. For this reason, set remove_unused_columns to False.
  • Set eval_do_concat_batches=False to get proper evaluation results. Images have different numbers of target boxes; if batches are concatenated, we will not be able to determine which boxes belong to which image.
Python
training_args = TrainingArguments(\n    output_dir=f\"{dataset.name.replace(' ', '-')}-finetune\",\n    num_train_epochs=20,\n    max_grad_norm=0.1,\n    learning_rate=5e-5,\n    warmup_steps=300,\n    per_device_train_batch_size=16,\n    dataloader_num_workers=2,\n    metric_for_best_model=\"eval_map\",\n    greater_is_better=True,\n    load_best_model_at_end=True,\n    eval_strategy=\"epoch\",\n    save_strategy=\"epoch\",\n    save_total_limit=2,\n    remove_unused_columns=False,\n    eval_do_concat_batches=False,\n)\n

Finally, bring everything together, and call train():

Python
trainer = Trainer(\n    model=model,\n    args=training_args,\n    train_dataset=pytorch_dataset_train,\n    eval_dataset=pytorch_dataset_valid,\n    tokenizer=processor,\n    data_collator=collate_fn,\n    compute_metrics=eval_compute_metrics_fn,\n)\n\ntrainer.train()\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#evaluate","title":"Evaluate","text":"Python
# @title Collect predictions\n\ntargets = []\npredictions = []\n\nfor i in range(len(ds_test)):\n    path, sourece_image, annotations = ds_test[i]\n\n    image = Image.open(path)\n    inputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\n    with torch.no_grad():\n        outputs = model(**inputs)\n\n    w, h = image.size\n    results = processor.post_process_object_detection(\n        outputs, target_sizes=[(h, w)], threshold=0.3)\n\n    detections = sv.Detections.from_transformers(results[0])\n\n    targets.append(annotations)\n    predictions.append(detections)\n
Python
# @title Calculate mAP\nmean_average_precision = sv.MeanAveragePrecision.from_detections(\n    predictions=predictions,\n    targets=targets,\n)\n\nprint(f\"map50_95: {mean_average_precision.map50_95:.2f}\")\nprint(f\"map50: {mean_average_precision.map50:.2f}\")\nprint(f\"map75: {mean_average_precision.map75:.2f}\")\n
map50_95: 0.89\nmap50: 0.94\nmap75: 0.94\n
Python
# @title Calculate Confusion Matrix\nconfusion_matrix = sv.ConfusionMatrix.from_detections(\n    predictions=predictions,\n    targets=targets,\n    classes=ds_test.classes\n)\n\n_ = confusion_matrix.plot()\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#save-fine-tuned-model-on-hard-drive","title":"Save fine-tuned model on hard drive","text":"Python
model.save_pretrained(\"/content/rt-detr/\")\nprocessor.save_pretrained(\"/content/rt-detr/\")\n
['/content/rt-detr/preprocessor_config.json']\n
"},{"location":"integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/#inference-with-fine-tuned-rt-detr-model","title":"Inference with fine-tuned RT-DETR model","text":"Python
IMAGE_COUNT = 5\n\nfor i in range(IMAGE_COUNT):\n    path, sourece_image, annotations = ds_test[i]\n\n    image = Image.open(path)\n    inputs = processor(image, return_tensors=\"pt\").to(DEVICE)\n\n    with torch.no_grad():\n        outputs = model(**inputs)\n\n    w, h = image.size\n    results = processor.post_process_object_detection(\n        outputs, target_sizes=[(h, w)], threshold=0.3)\n\n    detections = sv.Detections.from_transformers(results[0]).with_nms(threshold=0.1)\n\n    annotated_images = [\n        annotate(sourece_image, annotations, ds_train.classes),\n        annotate(sourece_image, detections, ds_train.classes)\n    ]\n    grid = sv.create_tiles(\n        annotated_images,\n        titles=['ground truth', 'prediction'],\n        titles_scale=0.5,\n        single_tile_size=(400, 400),\n        tile_padding_color=sv.Color.WHITE,\n        tile_margin_color=sv.Color.WHITE\n    )\n    sv.plot_image(grid, size=(6, 6))\n
"},{"location":"introduction/image_augmentation/","title":"What is image augmentation and how it can improve the performance of deep neural networks","text":"

Deep neural networks require a lot of training data to obtain good results and prevent overfitting. However, it is often very difficult to get enough training samples. Multiple reasons could make it very hard or even impossible to gather enough data:

  • To make a training dataset, you need to obtain images and then label them. For example, you need to assign correct class labels if you have an image classification task. For an object detection task, you need to draw bounding boxes around objects. For a semantic segmentation task, you need to assign a correct class to each input image pixel. This process requires manual labor, and sometimes it could be very costly to label the training data. For example, to correctly label medical images, you need expensive domain experts.

  • Sometimes even collecting training images could be hard. There are many legal restrictions on working with healthcare data, and obtaining it requires a lot of effort. Sometimes getting the training images is more feasible, but it will cost a lot of money. For example, to get satellite images, you need to pay a satellite operator to take those photos. To get images for road scene recognition, you need an operator who will drive a car and collect the required data.

"},{"location":"introduction/image_augmentation/#image-augmentation-to-the-rescue","title":"Image augmentation to the rescue","text":"

Image augmentation is a process of creating new training examples from the existing ones. To make a new sample, you slightly change the original image. For instance, you could make a new image a little brighter; you could cut a piece from the original image; you could make a new image by mirroring the original one, etc.

Here are some examples of transformations of the original image that will create a new training sample.

By applying those transformations to the original training dataset, you could create an almost infinite number of new training samples.
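
As a minimal sketch, the transformations described above can be chained with Albumentations; the input file name is hypothetical, and the image is assumed to be at least 224 by 224 pixels.

Python
import albumentations as A
import cv2

image = cv2.imread("example.jpg")  # hypothetical input image

# Every call to the pipeline yields a slightly different training sample.
transform = A.Compose([
    A.RandomCrop(height=224, width=224),   # cut a piece from the original image
    A.HorizontalFlip(p=0.5),               # mirror the original image
    A.RandomBrightnessContrast(p=0.5),     # make the image a little brighter or darker
])

new_sample = transform(image=image)["image"]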

"},{"location":"introduction/image_augmentation/#how-much-does-image-augmentation-improves-the-quality-and-performance-of-deep-neural-networks","title":"How much does image augmentation improves the quality and performance of deep neural networks","text":"

Basic augmentation techniques were used in almost all papers that describe state-of-the-art models for image recognition.

AlexNet was the first model that demonstrated the exceptional capabilities of deep neural networks for image recognition. For training, the authors used a set of basic image augmentation techniques. They resized the original images to a fixed size of 256 by 256 pixels, and then they cropped patches of size 224 by 224 pixels, as well as their horizontal reflections, from those resized images. They also altered the intensities of the RGB channels in the images.

Successive state-of-the-art models such as Inception, ResNet, and EfficientNet also used image augmentation techniques for training.

In 2018 Google published a paper about AutoAugment - an algorithm that automatically discovers the best set of augmentations for the dataset. They showed that a custom set of augmentations improves the performance of the model.

Here is a comparison between a model that used only the base set of augmentations and a model that used a specific set of augmentations discovered by AutoAugment. The table shows Top-1 accuracy (%) on the ImageNet validation set; higher is better.

Model                 Base augmentations   AutoAugment augmentations
ResNet-50             76.3                 77.6
ResNet-200            78.5                 80.0
AmoebaNet-B (6,190)   82.2                 82.8
AmoebaNet-C (6,228)   83.1                 83.5

The table demonstrates that a diverse set of image augmentations improves the performance of neural networks compared to a base set with only a few of the most popular transformation techniques.

Augmentations help to fight overfitting and improve the performance of deep neural networks for computer vision tasks such as classification, segmentation, and object detection. The best part is that image augmentation libraries such as Albumentations make it possible to add image augmentations to any computer vision pipeline with minimal effort.

"},{"location":"introduction/why_albumentations/","title":"Why Albumentations","text":""},{"location":"introduction/why_albumentations/#a-single-interface-to-work-with-images-masks-bounding-boxes-and-key-points","title":"A single interface to work with images, masks, bounding boxes, and key points","text":"

Albumentations provides a single interface to work with different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, pose estimation, etc.
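
As a minimal sketch with dummy data (hypothetical shapes and labels), a single Compose call can process the image together with a mask, bounding boxes, and keypoints:

Python
import numpy as np
import albumentations as A

# Dummy sample standing in for real data.
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
mask = np.zeros((256, 256), dtype=np.uint8)
bboxes = [(32, 32, 128, 128)]   # pascal_voc format: x_min, y_min, x_max, y_max
bbox_labels = ["cat"]
keypoints = [(64, 64)]          # xy format

transform = A.Compose(
    [A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.3)],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["bbox_labels"]),
    keypoint_params=A.KeypointParams(format="xy"),
)

# One call handles all target types at once.
out = transform(
    image=image,
    mask=mask,
    bboxes=bboxes,
    bbox_labels=bbox_labels,
    keypoints=keypoints,
)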

"},{"location":"introduction/why_albumentations/#battle-tested","title":"Battle-tested","text":"

The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.

"},{"location":"introduction/why_albumentations/#high-performance","title":"High performance","text":"

Albumentations is optimized for maximum speed and performance. Under the hood, the library uses highly optimized functions from OpenCV and NumPy for data processing. We have a regularly updated benchmark that compares the speed of popular image augmentation libraries for the most common image transformations. Albumentations demonstrates the best performance in most cases.

"},{"location":"introduction/why_albumentations/#diverse-set-of-supported-augmentations","title":"Diverse set of supported augmentations","text":"

Albumentations supports more than 60 different image augmentations.

"},{"location":"introduction/why_albumentations/#extensibility","title":"Extensibility","text":"

Albumentations makes it easy to add new augmentations and use them in computer vision pipelines through a single interface, along with the built-in transformations.
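
As a minimal sketch (a toy transform, not part of the library), a custom augmentation can be defined by subclassing ImageOnlyTransform and used next to the built-in ones:

Python
import numpy as np
import albumentations as A

class InvertColors(A.ImageOnlyTransform):
    """Toy custom transform: invert the intensities of a uint8 image."""

    def apply(self, img, **params):
        return 255 - img

# The custom transform plugs into a pipeline alongside built-in transforms.
transform = A.Compose([A.HorizontalFlip(p=0.5), InvertColors(p=0.3)])

image = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]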

"},{"location":"introduction/why_albumentations/#rigorous-testing","title":"Rigorous testing","text":"

Bugs in the augmentation pipeline could silently corrupt the input data. They can easily go unnoticed, but the performance of the models trained with incorrect data will degrade. Albumentations has an extensive test suite that helps to discover bugs during development.

"},{"location":"introduction/why_albumentations/#it-is-open-source-and-mit-licensed","title":"It is open source and MIT licensed","text":"

You can find the source code on GitHub.

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/","title":"Why you need a dedicated library for image augmentation","text":"

At first glance, image augmentations look very simple; you apply basic transformations to an image: mirroring, cropping, changing brightness and contrast, etc.

There are a lot of libraries that could do such image transformations. Here is an example of how you could use Pillow, a popular image processing library for Python, to make simple augmentations.

Python
from PIL import Image, ImageEnhance\n\nimage = Image.open(\"parrot.jpg\")\n\nmirrored_image = image.transpose(Image.FLIP_LEFT_RIGHT)\n\nrotated_image = image.rotate(45)\n\nbrightness_enhancer = ImageEnhance.Brightness(image)\nbrighter_image = brightness_enhancer.enhance(factor=1.5)\n

However, this approach has many limitations, and it doesn't handle all the cases that arise in image augmentation. A dedicated image augmentation library such as Albumentations gives you a lot of advantages.

Here is a list of a few pitfalls that augmentation libraries handle very well.

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#the-need-to-apply-the-same-transform-to-an-image-and-for-labels-for-segmentation-object-detection-and-keypoint-detection-tasks","title":"The need to apply the same transform to an image and for labels for segmentation, object detection, and keypoint detection tasks.","text":"

For image classification, you need to modify only an input image and keep output labels intact because output labels are invariant to image modifications.

Note

There are some exceptions to this rule. For example, an image could contain a cat and have an assigned label cat. During image augmentation, if you crop a part of an image that doesn't have a cat on it, then the output label cat becomes wrong and misleading. Usually, you deal with those situations by deciding which augmentations you can apply to a dataset without the risk of producing incorrect labels.

For segmentation, you need to apply some transformations both to an input image and an output mask. You also have to use the same parameters both for the image transformation and the mask transformation.

Let's look at an example of a semantic segmentation task from the Inria Aerial Image Labeling dataset. The dataset contains aerial photos as well as masks for those photos. Each pixel of the mask is marked as 1 if the pixel belongs to the class building and as 0 otherwise.

There are two types of image augmentations: pixel-level augmentations and spatial-level augmentations.

Pixel-level augmentations change the values of the pixels of the original image, but they don't change the output mask. Image transformations such as changing the brightness or contrast, or adjusting the values of the RGB palette of the image, are pixel-level augmentations.

We modify the input image by adjusting its brightness, but we keep the output mask intact.

In contrast, spatial-level augmentations change both the image and the mask. When you apply image transformations such as mirroring, rotation, or cropping a part of the input image, you also need to apply the same transformation to the output label to preserve its correctness.

We rotate both the input image and the output mask. We use the same set of transformations with the same parameters, both for the image and the mask.

The same is true for object detection tasks. For pixel-level augmentations, you only need to change the input image. With spatial-level augmentations, you need to apply the same transformation not only to the image but to the bounding box coordinates as well. After applying spatial-level augmentations, you need to update the coordinates of the bounding boxes so that they represent the correct locations of the objects on the augmented image.

Pixel-level augmentations such as brightness adjustment change only the input image but not the coordinates of bounding boxes. Spatial-level augmentations such as mirroring and cropping a part of the image change both the input image and the bounding boxes' coordinates.

Albumentations knows how to correctly apply transformations both to the input data and to the output labels.
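
As a minimal sketch with dummy data (hypothetical shapes, pascal_voc boxes): the pixel-level RandomBrightnessContrast leaves the mask and the boxes untouched, while the spatial-level HorizontalFlip mirrors the mask and the box coordinates together with the image.

Python
import numpy as np
import albumentations as A

image = np.zeros((100, 200, 3), dtype=np.uint8)   # hypothetical image
mask = np.zeros((100, 200), dtype=np.uint8)       # segmentation mask
bboxes = [(10, 20, 60, 80)]                       # pascal_voc: x_min, y_min, x_max, y_max
labels = ["building"]

transform = A.Compose(
    [
        A.RandomBrightnessContrast(p=1.0),  # pixel-level: mask and boxes stay the same
        A.HorizontalFlip(p=1.0),            # spatial-level: mask and boxes are flipped too
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

out = transform(image=image, mask=mask, bboxes=bboxes, labels=labels)
# The x coordinates in out["bboxes"] are mirrored around the image center,
# and out["mask"] is flipped in the same way as the image.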

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#working-with-probabilities","title":"Working with probabilities","text":"

During training, you usually want to apply augmentations with a probability of less than 100% since you also need to have the original images in your training pipeline. Also, it is beneficial to be able to control the magnitude of image augmentation, that is, how much the augmentation changes the original image. If the original dataset is large, you could apply only the basic augmentations, with a probability of around 10-30% and a small magnitude of changes. If the dataset is small, you need to be more aggressive with augmentations to prevent overfitting of the neural network, so you usually need to increase the probability of applying each augmentation to 40-50% and increase the magnitude of the changes each augmentation makes to the image.

Image augmentation libraries allow you to set the required probability and magnitude for each transformation.
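
As a minimal sketch (the exact values are illustrative), the probability p and the magnitude limits are set per transform in Albumentations:

Python
import albumentations as A

# Conservative policy for a large dataset: low probability, small magnitude.
light_augmentations = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.1, contrast_limit=0.1, p=0.2),
    A.HorizontalFlip(p=0.3),
])

# Aggressive policy for a small dataset: higher probability, larger magnitude.
heavy_augmentations = A.Compose([
    A.RandomBrightnessContrast(brightness_limit=0.3, contrast_limit=0.3, p=0.5),
    A.Rotate(limit=30, p=0.5),
    A.HorizontalFlip(p=0.5),
])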

"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#declarative-definition-of-the-augmentation-pipeline-and-unified-interface","title":"Declarative definition of the augmentation pipeline and unified interface","text":"

Usually, you want to apply not a single augmentation, but a set of augmentations with specific parameters such as probability and magnitude of changes. Augmentation libraries allow you to declare such a pipeline in a single place and then use it for image transformation through a unified interface. Some libraries can store and load transformation parameters to formats such as JSON, YAML, etc.

Here is an example definition of an augmentation pipeline. This pipeline will first crop a random 512px x 512px part of the input image. Then with probability 30%, it will randomly change brightness and contrast of that crop. Finally, with probability 50%, it will horizontally flip the resulting image.

Python
import albumentations as A\n\ntransform = A.Compose([\n    A.RandomCrop(512, 512),\n    A.RandomBrightnessContrast(p=0.3),\n    A.HorizontalFlip(p=0.5),\n])\n
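
To illustrate the store-and-load point above, here is a minimal sketch using the Albumentations serialization helpers A.save and A.load; the file name is hypothetical, and the pipeline is the one defined in the previous snippet.

Python
import albumentations as A

transform = A.Compose([
    A.RandomCrop(512, 512),
    A.RandomBrightnessContrast(p=0.3),
    A.HorizontalFlip(p=0.5),
])

# Store the pipeline definition on disk and restore it later (or in another process).
A.save(transform, "transform.json")
restored_transform = A.load("transform.json")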
"},{"location":"introduction/why_you_need_a_dedicated_library_for_image_augmentation/#rigorous-testing","title":"Rigorous testing","text":"

A bug in the augmentation pipeline could easily go unnoticed. A buggy pipeline could silently corrupt input data. There won't be any exceptions or code failures, but the performance of the trained neural networks will degrade because they received garbage input during training. Augmentation libraries usually have large test suites that catch regressions during development. A large user base also helps to find unnoticed bugs and report them to the developers.

"}]} \ No newline at end of file diff --git a/docs/sitemap.xml b/docs/sitemap.xml index eebee092..517e522f 100644 --- a/docs/sitemap.xml +++ b/docs/sitemap.xml @@ -2,366 +2,366 @@ https://albumentations.ai/docs/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/CONTRIBUTING/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/benchmarking_results/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/faq/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/frameworks_and_libraries/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/full_reference/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/autoalbument/benchmarks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/channel_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/coarse_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/mask_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/xy_masking/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/rotate/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/mixing/ - 2024-12-23 + 2024-12-24 
https://albumentations.ai/docs/api_reference/augmentations/mixing/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/mixing/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/bbox_utils/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/composition/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/keypoints_utils/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/serialization/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/transforms_interface/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/pytorch/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/pytorch/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/benchmarks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/custom_model/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/docker/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/faq/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/how_to_use/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/installation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/introduction/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/metrics/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/search_algorithms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/tuning_parameters/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/cifar10/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/cityscapes/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/imagenet/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/list/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/pascal_voc/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/svhn/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/contributing/coding_guidelines/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/contributing/environment_setup/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/examples/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/blog_posts_podcasts_talks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/books/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/online_courses/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/augmentation_mapping/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/image_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/installation/ - 2024-12-23 + 
2024-12-24 https://albumentations.ai/docs/getting_started/keypoints_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/mask_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/setting_probabilities/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/simultaneous_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/transforms_and_targets/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/video_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/volumetric_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/fiftyone/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/image_classification_albumentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/object_detection/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/image_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/why_albumentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/why_you_need_a_dedicated_library_for_image_augmentation/ - 2024-12-23 + 2024-12-24 \ No newline at end of file diff --git a/docs/sitemap.xml.gz b/docs/sitemap.xml.gz index d6aa4c7c246f7a44fa55b89acfe81634f44ba5ca..17ef6d8a13f223f2e400c076c1d83b89546aff33 100644 GIT binary patch delta 290 zcmV+-0p0$`2ge5oABzYGfca^W2Oj}*ktd{oZT0Y2J-+2G=eAi${VL@@$out=4=;bd z|K;P$`wze0TzJDuA4c$7*1Pf;l8v|5KnH)_UZ#f;_vE)hfoqjDjR$Y=_CFpb!;0_F zvn8;w^e$N|1_n}p=_87zr4fu%(84q~iK6S0)Ah85qT zXG>sV>0Po`3=E|H(nl0aOCuPkpoM8}5=GY~r}G_HQCL0Fh|wD@iYj_=p(8)a@-Rq$ zk-mKG5~F1?Ha2;AKtjF$#3t0)w5LfTkx*04a6)YAlbumentations: fast and flexible image augmentations

Do more with less data

Albumentations is a computer vision tool that boosts the performance of deep convolutional neural networks.

The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.

Albumentations example

Community-Driven Project, Supported By

Albumentations thrives on developer contributions. We appreciate our sponsors who help sustain the project's infrastructure.

Become a Sponsor
View sponsorship tiers and benefits on GitHub Sponsors

Why Albumentations

The fastest and most flexible image augmentation library, trusted by thousands of AI engineers and researchers worldwide

Lightning Fast

Up to 10x faster than other libraries. See benchmarks

Versatile

Supports classification, segmentation, detection, and more tasks out of the box

Easy to Use

Simple, intuitive API with comprehensive documentation and examples

Albumentations is a Python library for image augmentations that provides:

  • Optimized performance for production environments
  • Rich variety of transform operations
  • Support for all major computer vision tasks
  • Seamless integration with PyTorch, TensorFlow, and other frameworks

Community Feedback

Community feedback
CEO of Datature
Community feedback
Kaggle Competitions Grandmaster. Top 1 in the world.
Community feedback
Computer Vision Engineer

Different tasks

Different tasks

Albumentations supports different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation.

Computer vision tasks

Different domains

Different domains

Albumentations works well with data from different domains: photos, medical images, satellite imagery, manufacturing and industrial applications, Generative Adversarial Networks.

Different domains

Seamless integration with deep learning frameworks

Seamless integration with deep learning frameworks

Albumentations can work with various deep learning frameworks such as PyTorch and Keras. The library is a part of the PyTorch ecosystem. MMDetection and YOLOv5 use Albumentations.

Deep learning frameworks

Getting started

Albumentations requires Python 3.9 or higher. To install the library from PyPI run

pip install -U albumentations
Open documentation

Support Open Source Development

Albumentations is a free, open-source project maintained by a dedicated team of developers

Your sponsorship helps us maintain high-quality code, provide timely updates, and develop new features

Individual Sponsors

Support open source with a monthly contribution of any size. Every dollar helps maintain and improve Albumentations.

Sponsor

Company Sponsorship

Companies using Albumentations can become official sponsors, getting their logo featured on our website and documentation.

View Sponsorship Tiers
100% of sponsorships go directly to supporting development and maintenance.

Citing

If you find this library useful for your research, please consider citing Albumentations: Fast and Flexible Image Augmentations:

@Article{info11020125,
+Albumentations: fast and flexible image augmentations

Do more with less data

Albumentations is a computer vision tool that boosts the performance of deep convolutional neural networks.

The library is widely used in industry, deep learning research, machine learning competitions, and open source projects.

Albumentations example

Community-Driven Project, Supported By

Albumentations thrives on developer contributions. We appreciate our sponsors who help sustain the project's infrastructure.

Become a Sponsor
View sponsorship tiers and benefits on GitHub Sponsors

Why Albumentations

The fastest and most flexible image augmentation library, trusted by thousands of AI engineers and researchers worldwide

Lightning Fast

Up to 10x faster than other libraries. See benchmarks

Versatile

Supports classification, segmentation, detection, and more tasks out of the box

Easy to Use

Simple, intuitive API with comprehensive documentation and examples

Albumentations is a Python library for image augmentations that provides:

  • Optimized performance for production environments
  • Rich variety of transform operations
  • Support for all major computer vision tasks
  • Seamless integration with PyTorch, TensorFlow, and other frameworks

Community Feedback

Community feedback
CEO of Datature
Community feedback
Kaggle Competitions Grandmaster. Top 1 in the world.
Community feedback
Computer Vision Engineer

Different tasks

Different tasks

Albumentations supports different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation.

Computer vision tasks

Different domains

Different domains

Albumentations works well with data from different domains: photos, medical images, satellite imagery, manufacturing and industrial applications, Generative Adversarial Networks.

Different domains

Seamless integration with deep learning frameworks

Seamless integration with deep learning frameworks

Albumentations can work with various deep learning frameworks such as PyTorch and Keras. The library is a part of the PyTorch ecosystem. MMDetection and YOLOv5 use Albumentations.

Deep learning frameworks

Getting started

Albumentations requires Python 3.9 or higher. To install the library from PyPI run

pip install -U albumentations
Open documentation

Support Open Source Development

Albumentations is a free, open-source project maintained by a dedicated team of developers

Your sponsorship helps us maintain high-quality code, provide timely updates, and develop new features

Individual Sponsors

Support open source with a monthly contribution of any size. Every dollar helps maintain and improve Albumentations.

Sponsor

Company Sponsorship

Companies using Albumentations can become official sponsors, getting their logo featured on our website and documentation.

View Sponsorship Tiers
100% of sponsorships go directly to supporting development and maintenance.

Citing

If you find this library useful for your research, please consider citing Albumentations: Fast and Flexible Image Augmentations:

@Article{info11020125,
     AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},
     TITLE = {Albumentations: Fast and Flexible Image Augmentations},
     JOURNAL = {Information},
@@ -9,4 +9,4 @@
     URL = {https://www.mdpi.com/2078-2489/11/2/125},
     ISSN = {2078-2489},
     DOI = {10.3390/info11020125}
-}
\ No newline at end of file +}
\ No newline at end of file diff --git a/index.txt b/index.txt index c9f480f2..17e31a41 100755 --- a/index.txt +++ b/index.txt @@ -12,7 +12,7 @@ e:I[6213,[],"MetadataBoundary"] 2:HL["/_next/static/media/463dafcda517f24f-s.p.woff","font",{"crossOrigin":"","type":"font/woff"}] 3:HL["/_next/static/css/8043a7c984777fb1.css","style"] 4:HL["/_next/static/css/4d3d9169b46fed63.css","style"] -0:{"P":null,"b":"ZNcQrWMYk7ymGN9y8PjnQ","p":"","c":["",""],"i":false,"f":[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",["$","$5","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L6",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L7",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L8",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L9","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L9","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L9","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Explore"}],["$","$L9","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 
transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$La",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["__PAGE__",["$","$5","c",{"children":["$Lb",[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/4d3d9169b46fed63.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","$Lc",null,{"children":"$Ld"}]]}],{},null]},null],["$","$5","h",{"children":[null,["$","$5","MBt7JfAd9ag4e5v5N-19O",{"children":[["$","$Le",null,{"children":"$Lf"}],["$","$L10",null,{"children":"$L11"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$12","$undefined"],"s":false,"S":true} +0:{"P":null,"b":"vJuFcpWvbN6zbAub4gcIZ","p":"","c":["",""],"i":false,"f":[[["",{"children":["__PAGE__",{}]},"$undefined","$undefined",true],["",["$","$5","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L6",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L7",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L8",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between 
items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L9","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L9","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L9","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Explore"}],["$","$L9","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$La",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["__PAGE__",["$","$5","c",{"children":["$Lb",[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/4d3d9169b46fed63.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","$Lc",null,{"children":"$Ld"}]]}],{},null]},null],["$","$5","h",{"children":[null,["$","$5","Xb-XWebX4_Ox_oTcMoDXa",{"children":[["$","$Le",null,{"children":"$Lf"}],["$","$L10",null,{"children":"$L11"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$12","$undefined"],"s":false,"S":true} 11:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}]] f:[["$","meta","0",{"charSet":"utf-8"}],["$","title","1",{"children":"Albumentations: fast and flexible image augmentations"}],["$","meta","2",{"name":"description","content":"Albumentations provides a comprehensive, high-performance framework for augmenting images to improve machine learning models."}],["$","meta","3",{"name":"robots","content":"index, follow"}],["$","link","4",{"rel":"canonical","href":"https://albumentations.ai/"}],["$","meta","5",{"property":"og:title","content":"Albumentations: fast and flexible image augmentations"}],["$","meta","6",{"property":"og:description","content":"Albumentations provides a comprehensive, high-performance framework for augmenting images to improve machine learning 
models."}],["$","meta","7",{"property":"og:url","content":"https://albumentations.ai/"}],["$","meta","8",{"property":"og:site_name","content":"Albumentations"}],["$","meta","9",{"property":"og:locale","content":"en_US"}],["$","meta","10",{"property":"og:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","meta","11",{"property":"og:image:width","content":"1200"}],["$","meta","12",{"property":"og:image:height","content":"630"}],["$","meta","13",{"property":"og:image:alt","content":"Albumentations"}],["$","meta","14",{"property":"og:type","content":"website"}],["$","meta","15",{"name":"twitter:card","content":"summary_large_image"}],["$","meta","16",{"name":"twitter:site","content":"@albumentations"}],["$","meta","17",{"name":"twitter:creator","content":"@viglovikov"}],["$","meta","18",{"name":"twitter:title","content":"Albumentations: fast and flexible image augmentations"}],["$","meta","19",{"name":"twitter:description","content":"Albumentations provides a comprehensive, high-performance framework for augmenting images to improve machine learning models."}],["$","meta","20",{"name":"twitter:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","link","21",{"rel":"icon","href":"/icon.svg?ed95530d83f93aed","type":"image/svg+xml","sizes":"any"}]] d:null @@ -22,4 +22,4 @@ d:null 16:I[9264,["565","static/chunks/565-7d6f0e76202f7b1c.js","396","static/chunks/396-4e8a41fcd07efacf.js","158","static/chunks/158-fba4c5b26e754598.js","974","static/chunks/app/page-01034d6f519d879c.js"],"default"] 17:I[4755,["565","static/chunks/565-7d6f0e76202f7b1c.js","396","static/chunks/396-4e8a41fcd07efacf.js","158","static/chunks/158-fba4c5b26e754598.js","974","static/chunks/app/page-01034d6f519d879c.js"],"default"] 18:I[7970,["565","static/chunks/565-7d6f0e76202f7b1c.js","396","static/chunks/396-4e8a41fcd07efacf.js","158","static/chunks/158-fba4c5b26e754598.js","974","static/chunks/app/page-01034d6f519d879c.js"],"Image"] -b:["$","main",null,{"children":[["$","$L13",null,{"starsCount":14419,"downloadsCount":5135449}],["$","$L14",null,{}],["$","$L15",null,{}],["$","$L16",null,{}],["$","$L17",null,{}],["$","div",null,{"className":"features","children":[["$","div","Different tasks",{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Different tasks"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center ","children":[["$","div",null,{"className":"md:pr-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Different tasks"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations supports different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/tasks.png","alt":"Computer vision tasks","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}],["$","div","Different domains",{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Different domains"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center 
md:flex-row-reverse","children":[["$","div",null,{"className":"md:pl-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Different domains"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations works well with data from different domains: photos, medical images, satellite imagery, manufacturing and industrial applications, Generative Adversarial Networks."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/domains.png","alt":"Different domains","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}],["$","div","Seamless integration with deep learning frameworks",{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Seamless integration with deep learning frameworks"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center ","children":[["$","div",null,{"className":"md:pr-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Seamless integration with deep learning frameworks"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations can work with various deep learning frameworks such as PyTorch and Keras. The library is a part of the PyTorch ecosystem. MMDetection and YOLOv5 use Albumentations."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/deep_learning_frameworks.png","alt":"Deep learning frameworks","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}]]}],["$","div",null,{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Getting started"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"max-w-3xl mx-auto text-center","children":[["$","p",null,{"className":"text-lg mb-4","children":"Albumentations requires Python 3.9 or higher. 
To install the library from PyPI run"}],["$","div",null,{"className":"bg-white border border-gray-400 rounded-lg p-4 mb-6 font-mono text-xl","children":"pip install -U albumentations"}],["$","$L9",null,{"href":"/docs","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md bg-blue-600 text-white hover:bg-blue-700 px-6 py-3 text-lg","children":[["$","i",null,{"className":"fas fa-angle-right mr-2"}],"Open documentation"]}]]}]}]]}]}],["$","div",null,{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Support Open Source Development"}],["$","div",null,{"className":"max-w-5xl mx-auto","children":["$","div",null,{"className":"space-y-8","children":[["$","div",null,{"className":"text-center max-w-3xl mx-auto","children":[["$","p",null,{"className":"text-xl text-gray-700 mb-4","children":"Albumentations is a free, open-source project maintained by a dedicated team of developers"}],["$","p",null,{"className":"text-gray-600","children":"Your sponsorship helps us maintain high-quality code, provide timely updates, and develop new features"}]]}],["$","div",null,{"className":"grid md:grid-cols-2 gap-8","children":[["$","div",null,{"className":"bg-white p-6 rounded-xl shadow-sm border border-gray-100","children":[["$","h3",null,{"className":"text-lg font-medium mb-3 flex items-center gap-2","children":[["$","i",null,{"className":"fas fa-heart text-pink-500"}],"Individual Sponsors"]}],["$","p",null,{"className":"text-gray-600 mb-4","children":"Support open source with a monthly contribution of any size. Every dollar helps maintain and improve Albumentations."}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":[["$","i",null,{"className":"fa fa-heart mr-2"}],"Sponsor"]}]]}],["$","div",null,{"className":"bg-white p-6 rounded-xl shadow-sm border border-gray-100","children":[["$","h3",null,{"className":"text-lg font-medium mb-3 flex items-center gap-2","children":[["$","i",null,{"className":"fas fa-building text-blue-500"}],"Company Sponsorship"]}],["$","p",null,{"className":"text-gray-600 mb-4","children":"Companies using Albumentations can become official sponsors, getting their logo featured on our website and documentation."}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","target":"_blank","rel":"noopener noreferrer","className":"inline-flex items-center gap-2 px-4 py-2 bg-blue-50 hover:bg-blue-100 text-blue-700 rounded-lg transition-colors","children":[["$","i",null,{"className":"fab fa-github"}],"View Sponsorship Tiers"]}]]}]]}],["$","div",null,{"className":"text-center text-sm text-gray-500","children":"100% of sponsorships go directly to supporting development and maintenance."}]]}]}]]}]}],["$","div",null,{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Citing"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"max-w-4xl mx-auto","children":[["$","p",null,{"className":"text-lg mb-4","children":["If you find this library useful for your research, please consider 
citing"," ",["$","a",null,{"href":"https://www.mdpi.com/2078-2489/11/2/125","target":"_blank","rel":"noopener noreferrer","className":"text-blue-600 hover:text-blue-800 underline","children":"Albumentations: Fast and Flexible Image Augmentations"}],":"]}],["$","pre",null,{"className":"bg-gray-50 border border-gray-300 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed","children":["$","code",null,{"children":"@Article{info11020125,\n AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},\n TITLE = {Albumentations: Fast and Flexible Image Augmentations},\n JOURNAL = {Information},\n VOLUME = {11},\n YEAR = {2020},\n NUMBER = {2},\n ARTICLE-NUMBER = {125},\n URL = {https://www.mdpi.com/2078-2489/11/2/125},\n ISSN = {2078-2489},\n DOI = {10.3390/info11020125}\n}"}]}]]}]}]]}]}]]}] +b:["$","main",null,{"children":[["$","$L13",null,{"starsCount":14422,"downloadsCount":5151795}],["$","$L14",null,{}],["$","$L15",null,{}],["$","$L16",null,{}],["$","$L17",null,{}],["$","div",null,{"className":"features","children":[["$","div","Different tasks",{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Different tasks"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center ","children":[["$","div",null,{"className":"md:pr-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Different tasks"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations supports different computer vision tasks such as classification, semantic segmentation, instance segmentation, object detection, and pose estimation."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/tasks.png","alt":"Computer vision tasks","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}],["$","div","Different domains",{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Different domains"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center md:flex-row-reverse","children":[["$","div",null,{"className":"md:pl-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Different domains"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations works well with data from different domains: photos, medical images, satellite imagery, manufacturing and industrial applications, Generative Adversarial Networks."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/domains.png","alt":"Different domains","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}],["$","div","Seamless integration with deep learning frameworks",{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Seamless integration with deep learning frameworks"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"grid grid-cols-1 md:grid-cols-2 gap-8 items-center 
","children":[["$","div",null,{"className":"md:pr-8","children":[["$","h2",null,{"className":"text-2xl font-semibold mb-4","children":"Seamless integration with deep learning frameworks"}],["$","p",null,{"className":"text-gray-600 text-lg","children":"Albumentations can work with various deep learning frameworks such as PyTorch and Keras. The library is a part of the PyTorch ecosystem. MMDetection and YOLOv5 use Albumentations."}]]}],["$","div",null,{"children":["$","$L18",null,{"src":"/assets/custom/deep_learning_frameworks.png","alt":"Deep learning frameworks","width":500,"height":300,"className":"w-full h-auto"}]}]]}]}]]}]}]]}],["$","div",null,{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Getting started"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"max-w-3xl mx-auto text-center","children":[["$","p",null,{"className":"text-lg mb-4","children":"Albumentations requires Python 3.9 or higher. To install the library from PyPI run"}],["$","div",null,{"className":"bg-white border border-gray-400 rounded-lg p-4 mb-6 font-mono text-xl","children":"pip install -U albumentations"}],["$","$L9",null,{"href":"/docs","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md bg-blue-600 text-white hover:bg-blue-700 px-6 py-3 text-lg","children":[["$","i",null,{"className":"fas fa-angle-right mr-2"}],"Open documentation"]}]]}]}]]}]}],["$","div",null,{"className":"bg-gray-50 py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Support Open Source Development"}],["$","div",null,{"className":"max-w-5xl mx-auto","children":["$","div",null,{"className":"space-y-8","children":[["$","div",null,{"className":"text-center max-w-3xl mx-auto","children":[["$","p",null,{"className":"text-xl text-gray-700 mb-4","children":"Albumentations is a free, open-source project maintained by a dedicated team of developers"}],["$","p",null,{"className":"text-gray-600","children":"Your sponsorship helps us maintain high-quality code, provide timely updates, and develop new features"}]]}],["$","div",null,{"className":"grid md:grid-cols-2 gap-8","children":[["$","div",null,{"className":"bg-white p-6 rounded-xl shadow-sm border border-gray-100","children":[["$","h3",null,{"className":"text-lg font-medium mb-3 flex items-center gap-2","children":[["$","i",null,{"className":"fas fa-heart text-pink-500"}],"Individual Sponsors"]}],["$","p",null,{"className":"text-gray-600 mb-4","children":"Support open source with a monthly contribution of any size. 
Every dollar helps maintain and improve Albumentations."}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":[["$","i",null,{"className":"fa fa-heart mr-2"}],"Sponsor"]}]]}],["$","div",null,{"className":"bg-white p-6 rounded-xl shadow-sm border border-gray-100","children":[["$","h3",null,{"className":"text-lg font-medium mb-3 flex items-center gap-2","children":[["$","i",null,{"className":"fas fa-building text-blue-500"}],"Company Sponsorship"]}],["$","p",null,{"className":"text-gray-600 mb-4","children":"Companies using Albumentations can become official sponsors, getting their logo featured on our website and documentation."}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","target":"_blank","rel":"noopener noreferrer","className":"inline-flex items-center gap-2 px-4 py-2 bg-blue-50 hover:bg-blue-100 text-blue-700 rounded-lg transition-colors","children":[["$","i",null,{"className":"fab fa-github"}],"View Sponsorship Tiers"]}]]}]]}],["$","div",null,{"className":"text-center text-sm text-gray-500","children":"100% of sponsorships go directly to supporting development and maintenance."}]]}]}]]}]}],["$","div",null,{"className":"bg-white py-12 md:py-16","children":["$","div",null,{"className":"container mx-auto px-4","children":[["$","h2",null,{"className":"text-2xl md:text-3xl font-medium text-center mb-8","children":"Citing"}],["$","div",null,{"className":"","children":["$","div",null,{"className":"max-w-4xl mx-auto","children":[["$","p",null,{"className":"text-lg mb-4","children":["If you find this library useful for your research, please consider citing"," ",["$","a",null,{"href":"https://www.mdpi.com/2078-2489/11/2/125","target":"_blank","rel":"noopener noreferrer","className":"text-blue-600 hover:text-blue-800 underline","children":"Albumentations: Fast and Flexible Image Augmentations"}],":"]}],["$","pre",null,{"className":"bg-gray-50 border border-gray-300 rounded-lg p-4 overflow-x-auto text-sm leading-relaxed","children":["$","code",null,{"children":"@Article{info11020125,\n AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},\n TITLE = {Albumentations: Fast and Flexible Image Augmentations},\n JOURNAL = {Information},\n VOLUME = {11},\n YEAR = {2020},\n NUMBER = {2},\n ARTICLE-NUMBER = {125},\n URL = {https://www.mdpi.com/2078-2489/11/2/125},\n ISSN = {2078-2489},\n DOI = {10.3390/info11020125}\n}"}]}]]}]}]]}]}]]}] diff --git a/people/index.html b/people/index.html index 9cfcc148..8f769e7d 100755 --- a/people/index.html +++ b/people/index.html @@ -1 +1 @@ -Albumentations: fast and flexible image augmentations

Core team

Vladimir Iglovikov

Vladimir Iglovikov

Honorary Developers. They were behind the creation of the library but are sadly no longer active.

Alexander Buslaev

Alexander Buslaev

Alex Parinov

Alex Parinov

Eugene Khvedchenya

Eugene Khvedchenya

Mikhail Druzhinin

Mikhail Druzhinin

Contributors

ternaus
creafz
Dipet
albu
BloodAxe
ayasyrev
vfdev-5
arsenyinfo
qubvel
ZFTurbo
i-aki-y
MichaelMonashev
zakajd
victor1cea
akarsakov
IlyaOvodov
LinaShiryaeva
IliaLarchenko
StrikerRUS
Arquestro
jangop
momincks
guillaume-rochette-oxb
bfialkoff
VirajBagal
5n7-sk
sergei3000
libfun
gavrin-s
SunQpark
alimbekovKZ
onurtore
ocourtin
Aloqeely
ajinkyakhadilkar
alekseynp
alexander-rakhlin
Andredance
toshiks
erikgaas
domef
4pygmalion
JonasKlotz
KiriLev
marcocaccin
mys007
PerchunPak
philipp-fischer
huuquan1994
ruancomelli
kaichoulyc
rsokl
bryant1410
bes-dev
matsumotosan
tashay
tmramalho
timgates42
tabula-rosa
rbu
ORippler
nathanhubens
NatanBagrov
namangup
b0nce
amirassov
deleomike
mennohofste
matejpekar
Matthew-J-Payne
gogetron
mrsmrynk
daisukelab
Diyago
thomaoc1
tatigabru
shyn
ryoryon66
oguz-hanoglu
notmatthancock
loicmagne
kmistry-wx
jovenwayfarer
jasonrock-a3
iRyoka
pomacanthidae
haarisr
gyaneshwar-sunkara
dmitrie-ai
dependabot[bot]
bonlime
aaroswings
zahragolpa
yisaienkov
WesleyYue
notplus
vladserkoff
johngull
Vcv85
t-hanya
dskkato
DBusAI
ChristofHenkel
Juphex
cdicle
Multihuntr
spsancti
Callidior
poke1024
bmabey
artyompal
sneddy
AndreyGurevich
Erlemar
agchang-cgl
alicangok
golunovas
alxndrkalinin
kinoooshnik
Alex-JG3
aidonchuk
alessiobonfiglio
belskikh
aaronzs
AaronPinto
ah651
maremun
InCogNiTo124
maksimovkonstantin
Kupchanski
kirillbobyrev
immortalCO
jveitchmichaelis
Erotemic
xiaoyuan0203
jaehyuck0103
cannon
ifeherva
Ingwar
hoel-bagard
henrique
hassiahk
steermomo
Gurubaseio
georgymironov
GalDude33
farizrahman4u
maruschin
ErlingLie
zetyquickly
plashchynski
cortwave
Datasciensyash

Community-Driven Project, Supported By

Albumentations thrives on developer contributions. We appreciate our sponsors who help sustain the project's infrastructure.

Become a Sponsor
View sponsorship tiers and benefits on GitHub Sponsors
\ No newline at end of file +Albumentations: fast and flexible image augmentations

Core team

Vladimir Iglovikov

Vladimir Iglovikov

Honorary Developers. They were behind the creation of the library but are sadly no longer active.

Alexander Buslaev

Alexander Buslaev

Alex Parinov

Alex Parinov

Eugene Khvedchenya

Eugene Khvedchenya

Mikhail Druzhinin

Mikhail Druzhinin

Contributors

ternaus
creafz
Dipet
albu
BloodAxe
ayasyrev
vfdev-5
arsenyinfo
qubvel
ZFTurbo
i-aki-y
MichaelMonashev
zakajd
victor1cea
akarsakov
IlyaOvodov
LinaShiryaeva
IliaLarchenko
StrikerRUS
Arquestro
jangop
momincks
guillaume-rochette-oxb
bfialkoff
VirajBagal
5n7-sk
sergei3000
libfun
gavrin-s
SunQpark
alimbekovKZ
onurtore
ocourtin
Aloqeely
ajinkyakhadilkar
alekseynp
alexander-rakhlin
Andredance
toshiks
erikgaas
domef
4pygmalion
JonasKlotz
KiriLev
marcocaccin
mys007
PerchunPak
philipp-fischer
huuquan1994
ruancomelli
kaichoulyc
rsokl
bryant1410
bes-dev
matsumotosan
tashay
tmramalho
timgates42
tabula-rosa
rbu
ORippler
nathanhubens
NatanBagrov
namangup
b0nce
amirassov
deleomike
mennohofste
matejpekar
Matthew-J-Payne
gogetron
mrsmrynk
daisukelab
Diyago
thomaoc1
tatigabru
shyn
ryoryon66
oguz-hanoglu
notmatthancock
loicmagne
kmistry-wx
jovenwayfarer
jasonrock-a3
iRyoka
pomacanthidae
haarisr
gyaneshwar-sunkara
dmitrie-ai
dependabot[bot]
bonlime
aaroswings
zahragolpa
yisaienkov
WesleyYue
notplus
vladserkoff
johngull
Vcv85
t-hanya
dskkato
DBusAI
ChristofHenkel
Juphex
cdicle
Multihuntr
spsancti
Callidior
poke1024
bmabey
artyompal
sneddy
AndreyGurevich
Erlemar
agchang-cgl
alicangok
golunovas
alxndrkalinin
kinoooshnik
Alex-JG3
aidonchuk
alessiobonfiglio
belskikh
aaronzs
AaronPinto
ah651
maremun
InCogNiTo124
maksimovkonstantin
Kupchanski
kirillbobyrev
immortalCO
jveitchmichaelis
Erotemic
xiaoyuan0203
jaehyuck0103
cannon
ifeherva
Ingwar
hoel-bagard
henrique
hassiahk
steermomo
Gurubaseio
georgymironov
GalDude33
farizrahman4u
maruschin
ErlingLie
zetyquickly
plashchynski
cortwave
Datasciensyash

Community-Driven Project, Supported By

Albumentations thrives on developer contributions. We appreciate our sponsors who help sustain the project's infrastructure.

Become a Sponsor
View sponsorship tiers and benefits on GitHub Sponsors
\ No newline at end of file diff --git a/people/index.txt b/people/index.txt index 47e2fa0e..ab237af1 100755 --- a/people/index.txt +++ b/people/index.txt @@ -11,7 +11,7 @@ f:I[6213,[],"ViewportBoundary"] 1:HL["/_next/static/media/4473ecc91f70f139-s.p.woff","font",{"crossOrigin":"","type":"font/woff"}] 2:HL["/_next/static/media/463dafcda517f24f-s.p.woff","font",{"crossOrigin":"","type":"font/woff"}] 3:HL["/_next/static/css/8043a7c984777fb1.css","style"] -0:{"P":null,"b":"ZNcQrWMYk7ymGN9y8PjnQ","p":"","c":["","people",""],"i":false,"f":[[["",{"children":["people",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",["$","$4","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L5",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L8","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L8","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L8","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 
transition-colors","children":"Explore"}],["$","$L8","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$L9",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["people",["$","$4","c",{"children":[null,["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children","people","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]]}],{"children":["__PAGE__",["$","$4","c",{"children":["$La",null,["$","$Lb",null,{"children":"$Lc"}]]}],{},null]},null]},null],["$","$4","h",{"children":[null,["$","$4","mvY_dJGdwDys125CocXmB",{"children":[["$","$Ld",null,{"children":"$Le"}],["$","$Lf",null,{"children":"$L10"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$11","$undefined"],"s":false,"S":true} +0:{"P":null,"b":"vJuFcpWvbN6zbAub4gcIZ","p":"","c":["","people",""],"i":false,"f":[[["",{"children":["people",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",["$","$4","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L5",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 
0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L8","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L8","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L8","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Explore"}],["$","$L8","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$L9",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["people",["$","$4","c",{"children":[null,["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children","people","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]]}],{"children":["__PAGE__",["$","$4","c",{"children":["$La",null,["$","$Lb",null,{"children":"$Lc"}]]}],{},null]},null]},null],["$","$4","h",{"children":[null,["$","$4","Ms9w_hju4wmo-CT1B5ygd",{"children":[["$","$Ld",null,{"children":"$Le"}],["$","$Lf",null,{"children":"$L10"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$11","$undefined"],"s":false,"S":true} 10:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}]] e:[["$","meta","0",{"charSet":"utf-8"}],["$","title","1",{"children":"Albumentations: fast and flexible image augmentations"}],["$","meta","2",{"name":"description","content":"Albumentations provides a comprehensive, high-performance framework for augmenting images to improve machine learning models."}],["$","meta","3",{"name":"robots","content":"index, follow"}],["$","link","4",{"rel":"canonical","href":"https://albumentations.ai/"}],["$","meta","5",{"property":"og:title","content":"Albumentations: fast and flexible image augmentations"}],["$","meta","6",{"property":"og:description","content":"Albumentations provides a comprehensive, high-performance framework for 
augmenting images to improve machine learning models."}],["$","meta","7",{"property":"og:url","content":"https://albumentations.ai/"}],["$","meta","8",{"property":"og:site_name","content":"Albumentations"}],["$","meta","9",{"property":"og:locale","content":"en_US"}],["$","meta","10",{"property":"og:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","meta","11",{"property":"og:image:width","content":"1200"}],["$","meta","12",{"property":"og:image:height","content":"630"}],["$","meta","13",{"property":"og:image:alt","content":"Albumentations"}],["$","meta","14",{"property":"og:type","content":"website"}],["$","meta","15",{"name":"twitter:card","content":"summary_large_image"}],["$","meta","16",{"name":"twitter:site","content":"@albumentations"}],["$","meta","17",{"name":"twitter:creator","content":"@viglovikov"}],["$","meta","18",{"name":"twitter:title","content":"Albumentations: fast and flexible image augmentations"}],["$","meta","19",{"name":"twitter:description","content":"Albumentations provides a comprehensive, high-performance framework for augmenting images to improve machine learning models."}],["$","meta","20",{"name":"twitter:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","link","21",{"rel":"icon","href":"/icon.svg?ed95530d83f93aed","type":"image/svg+xml","sizes":"any"}]] c:null diff --git a/sitemap.xml b/sitemap.xml index eb0ba97e..6e5ac086 100755 --- a/sitemap.xml +++ b/sitemap.xml @@ -1,378 +1,378 @@ https://albumentations.ai/docs/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/CONTRIBUTING/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/benchmarking_results/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/faq/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/frameworks_and_libraries/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/full_reference/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/autoalbument/benchmarks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/blur/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/crops/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/domain_adaptation/transforms/ - 2024-12-23 + 2024-12-24 
https://albumentations.ai/docs/api_reference/augmentations/dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/channel_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/coarse_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/grid_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/mask_dropout/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/dropout/xy_masking/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/resize/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/rotate/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/geometric/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/mixing/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/mixing/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/mixing/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/functional/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/augmentations/transforms3d/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/bbox_utils/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/composition/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/keypoints_utils/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/serialization/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/core/transforms_interface/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/pytorch/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/api_reference/pytorch/transforms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/benchmarks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/custom_model/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/docker/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/faq/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/how_to_use/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/installation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/introduction/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/metrics/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/search_algorithms/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/tuning_parameters/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/cifar10/ - 2024-12-23 + 
2024-12-24 https://albumentations.ai/docs/autoalbument/examples/cityscapes/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/imagenet/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/list/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/pascal_voc/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/autoalbument/examples/svhn/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/contributing/coding_guidelines/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/contributing/environment_setup/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/examples/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/blog_posts_podcasts_talks/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/books/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/external_resources/online_courses/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/augmentation_mapping/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/image_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/installation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/keypoints_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/mask_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/setting_probabilities/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/simultaneous_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/transforms_and_targets/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/video_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/getting_started/volumetric_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/fiftyone/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/image_classification_albumentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/huggingface/object_detection/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/integrations/roboflow/train-rt-detr-on-custom-dataset-with-transformers/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/image_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/why_albumentations/ - 2024-12-23 + 2024-12-24 https://albumentations.ai/docs/introduction/why_you_need_a_dedicated_library_for_image_augmentation/ - 2024-12-23 + 2024-12-24 https://albumentations.ai - 2024-12-23 + 2024-12-24 daily https://albumentations.ai/people - 2024-12-23 + 2024-12-24 daily https://albumentations.ai/testimonials - 2024-12-23 + 2024-12-24 daily diff --git a/testimonials/index.html b/testimonials/index.html index e479c1c4..09fdff11 100755 --- a/testimonials/index.html +++ b/testimonials/index.html @@ -1 +1 @@ -Community Testimonials | Albumentations

Community Feedback

Community feedback
\ No newline at end of file +Community Testimonials | Albumentations

Community Feedback

Community feedback
\ No newline at end of file diff --git a/testimonials/index.txt b/testimonials/index.txt index 75991013..c7bb124e 100755 --- a/testimonials/index.txt +++ b/testimonials/index.txt @@ -12,7 +12,7 @@ f:I[6213,[],"ViewportBoundary"] 1:HL["/_next/static/media/4473ecc91f70f139-s.p.woff","font",{"crossOrigin":"","type":"font/woff"}] 2:HL["/_next/static/media/463dafcda517f24f-s.p.woff","font",{"crossOrigin":"","type":"font/woff"}] 3:HL["/_next/static/css/8043a7c984777fb1.css","style"] -0:{"P":null,"b":"ZNcQrWMYk7ymGN9y8PjnQ","p":"","c":["","testimonials",""],"i":false,"f":[[["",{"children":["testimonials",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",["$","$4","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L5",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L8","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L8","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L8","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 
transition-colors","children":"Explore"}],["$","$L8","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$L9",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["testimonials",["$","$4","c",{"children":[null,["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children","testimonials","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]]}],{"children":["__PAGE__",["$","$4","c",{"children":[["$","div",null,{"className":"container mx-auto px-4 py-12","children":[["$","h1",null,{"className":"text-3xl md:text-4xl font-medium text-center mb-12","children":"Community Feedback"}],["$","$La",null,{}]]}],null,["$","$Lb",null,{"children":"$Lc"}]]}],{},null]},null]},null],["$","$4","h",{"children":[null,["$","$4","6qnBv18VhTET2FzHGeXfO",{"children":[["$","$Ld",null,{"children":"$Le"}],["$","$Lf",null,{"children":"$L10"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$11","$undefined"],"s":false,"S":true} +0:{"P":null,"b":"vJuFcpWvbN6zbAub4gcIZ","p":"","c":["","testimonials",""],"i":false,"f":[[["",{"children":["testimonials",{"children":["__PAGE__",{}]}]},"$undefined","$undefined",true],["",["$","$4","c",{"children":[[["$","link","0",{"rel":"stylesheet","href":"/_next/static/css/8043a7c984777fb1.css","precedence":"next","crossOrigin":"$undefined","nonce":"$undefined"}]],["$","html",null,{"lang":"en","children":[["$","head",null,{"children":[["$","link",null,{"rel":"shortcut icon","href":"/albumentations_logo.png"}],["$","link",null,{"rel":"stylesheet","href":"https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.9.0/css/all.min.css"}]]}],["$","body",null,{"className":"__variable_1e4310 __variable_c3aa02 font-sans antialiased","children":[["$","$L5",null,{}],["$","main",null,{"className":"pt-16","children":["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":[["$","title",null,{"children":"404: This page could not be found."}],["$","div",null,{"style":{"fontFamily":"system-ui,\"Segoe UI\",Roboto,Helvetica,Arial,sans-serif,\"Apple Color Emoji\",\"Segoe UI Emoji\"","height":"100vh","textAlign":"center","display":"flex","flexDirection":"column","alignItems":"center","justifyContent":"center"},"children":["$","div",null,{"children":[["$","style",null,{"dangerouslySetInnerHTML":{"__html":"body{color:#000;background:#fff;margin:0}.next-error-h1{border-right:1px solid rgba(0,0,0,.3)}@media (prefers-color-scheme:dark){body{color:#fff;background:#000}.next-error-h1{border-right:1px solid 
rgba(255,255,255,.3)}}"}}],["$","h1",null,{"className":"next-error-h1","style":{"display":"inline-block","margin":"0 20px 0 0","padding":"0 23px 0 0","fontSize":24,"fontWeight":500,"verticalAlign":"top","lineHeight":"49px"},"children":"404"}],["$","div",null,{"style":{"display":"inline-block"},"children":["$","h2",null,{"style":{"fontSize":14,"fontWeight":400,"lineHeight":"49px","margin":0},"children":"This page could not be found."}]}]]}]}]],"notFoundStyles":[]}]}],["$","footer",null,{"className":"border-t","children":["$","div",null,{"className":"container mx-auto px-4 py-8","children":["$","div",null,{"className":"flex flex-col md:flex-row justify-between items-start md:items-center gap-6","children":[["$","nav",null,{"className":"flex flex-wrap gap-x-6 gap-y-2","children":[["$","$L8","/",{"href":"/","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Home"}],["$","$L8","/docs",{"href":"/docs","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Documentation"}],["$","$L8","/whos_using",{"href":"/whos_using","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Who's using"}],["$","a","https://explore.albumentations.ai",{"href":"https://explore.albumentations.ai","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"Explore"}],["$","$L8","/people",{"href":"/people","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"People"}],["$","a","https://github.com/albumentations-team/albumentations",{"href":"https://github.com/albumentations-team/albumentations","target":"_blank","rel":"noopener noreferrer","className":"text-gray-600 hover:text-gray-900 transition-colors","children":"GitHub"}]]}],["$","a",null,{"href":"https://github.com/sponsors/albumentations-team","className":"inline-flex items-center justify-center font-medium transition-colors rounded-md border-2 border-green-600 text-green-600 hover:bg-green-50 px-4 py-2","target":"_blank","rel":"noopener noreferrer","children":["$undefined","Sponsor"]}]]}]}]}]]}],["$","$L9",null,{"gaId":"G-DCXRDR9HJ0"}]]}]]}],{"children":["testimonials",["$","$4","c",{"children":[null,["$","$L6",null,{"parallelRouterKey":"children","segmentPath":["children","testimonials","children"],"error":"$undefined","errorStyles":"$undefined","errorScripts":"$undefined","template":["$","$L7",null,{}],"templateStyles":"$undefined","templateScripts":"$undefined","notFound":"$undefined","notFoundStyles":"$undefined"}]]}],{"children":["__PAGE__",["$","$4","c",{"children":[["$","div",null,{"className":"container mx-auto px-4 py-12","children":[["$","h1",null,{"className":"text-3xl md:text-4xl font-medium text-center mb-12","children":"Community Feedback"}],["$","$La",null,{}]]}],null,["$","$Lb",null,{"children":"$Lc"}]]}],{},null]},null]},null],["$","$4","h",{"children":[null,["$","$4","w-XkCD44dyevLpF2uef-4",{"children":[["$","$Ld",null,{"children":"$Le"}],["$","$Lf",null,{"children":"$L10"}],["$","meta",null,{"name":"next-size-adjust"}]]}]]}]]],"m":"$undefined","G":["$11","$undefined"],"s":false,"S":true} 10:[["$","meta","0",{"name":"viewport","content":"width=device-width, initial-scale=1"}]] e:[["$","meta","0",{"charSet":"utf-8"}],["$","title","1",{"children":"Community Testimonials | Albumentations"}],["$","meta","2",{"name":"description","content":"Real feedback from the Albumentations community"}],["$","meta","3",{"name":"robots","content":"index, 
follow"}],["$","link","4",{"rel":"canonical","href":"https://albumentations.ai/"}],["$","meta","5",{"property":"og:title","content":"Community Testimonials"}],["$","meta","6",{"property":"og:description","content":"Real feedback from the Albumentations community"}],["$","meta","7",{"property":"og:url","content":"https://albumentations.ai/"}],["$","meta","8",{"property":"og:site_name","content":"Albumentations"}],["$","meta","9",{"property":"og:locale","content":"en_US"}],["$","meta","10",{"property":"og:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","meta","11",{"property":"og:image:width","content":"1200"}],["$","meta","12",{"property":"og:image:height","content":"630"}],["$","meta","13",{"property":"og:image:alt","content":"Albumentations"}],["$","meta","14",{"property":"og:type","content":"website"}],["$","meta","15",{"name":"twitter:card","content":"summary_large_image"}],["$","meta","16",{"name":"twitter:site","content":"@albumentations"}],["$","meta","17",{"name":"twitter:creator","content":"@viglovikov"}],["$","meta","18",{"name":"twitter:title","content":"Community Testimonials"}],["$","meta","19",{"name":"twitter:description","content":"Real feedback from the Albumentations community"}],["$","meta","20",{"name":"twitter:image","content":"https://albumentations.ai/assets/albumentations_card.png"}],["$","link","21",{"rel":"icon","href":"/icon.svg?ed95530d83f93aed","type":"image/svg+xml","sizes":"any"}]] c:null