Apache TVM v0.15.0
Introduction
NOTE: This is the last release before the unity branch becomes the main branch. It contains no unity features.
The TVM community has worked since the v0.14.0 release to deliver the following exciting new improvements! The main tags are listed below (bold text marks areas with a lot of progress):
- Community, RFCs
- Adreno, ArmComputeLibrary, Metal, cuda & cutlass & tensorrt, microNPU, Runtime
- Frontend & Relay
- Arith, TOPI, TIR, TVMScript
- Docs, CI, Misc, BugFix
Please visit the full listing of commits for a complete view: v0.14.0...v0.15.0.
Community
- #16172 - Yixin Dong -> Reviewer
- #16162 - Shuai Yuan -> Committer
- #16164 - Qiang Zhang -> Committer
- #16166 - Bohan Hou -> PMC
- #16165 - Ruihang Lai -> PMC
RFCs
- #105 - Add a new backend language——SYCL
Adreno
- #15991 - [CI] Enhancements to Adreno specific CI utils
- #15786 - [TOPI] Add conv2d transpose nchw texture schedule
Arith
- #16227 - Simplify nested if_then_else when constant is appearing in then_expr
ArmComputeLibrary
- #15990 - [ACL] Update Compute Library to v23.08
Metal
- #16192 - [Device] Fix metal warp size
- #16033 - [Codegen] Disable cross-function call in Metal codegen
cuda & cutlass & tensorrt
- #16061 - [CUDA] Add an option for profiling cuda kernels
microNPU
- #16003 - [microNPU][ETHOSU] Fix ConcatRewriter args processing
- #15929 - [microNPU][ETHOSU] Fix rounding mode in requantize operation
Runtime
- #15896 - [CLML] Fix for CLML ops and enable more test case
- #16133 - Parallel-for with threading backend
- #16066 - Support clear global memory allocators
- #16030 - Introduce `TVM_MODULE_VTABLE` Macros
BugFix
- #16269 - Update pillow usage
- #16272 - Fixed Inappropriate Logical Expression
- #16216 - [TIR] Fix dynamic smem merge leaf alloc
- #16190 - Fix the error of reloading the model library on the ROCm platform: "MIOpen Error: No invoker was registered for convolution forward."
- #16167 - [Relay][Pytorch] Fix missing `.dtype`
- #16091 - [Fix] Fix `topi.rms_norm` with float32 upscale
- #16081 - [Fix] Broken Windows Build with LLVM
- #16051 - [Fix][TIR] Fix dtype issues for match_buffer and ramp node
- #14655 - [VTA] Fix FSIM compile error on macOS
- #16021 - [FFI] Typo fix of IncRef to DecRef
- #16010 - [Fix][TIR] fix mul dtype mismatch
- #16000 - [Fix][TIR] fix symbolic strides lower
- #15970 - [Hotfix] Mark python-FFI handling with TVM_DLL
- #15965 - [CI] Better to pass the build folder
CI
- #16110 - Refactor unittest folder
- #16055 - Fix broken links about Jenkins
- #16062 - Use LLVM 17 for tests on `ci_arm`
- #16018 - [Tests] Fix work_dir location used by test_micro_tuning_with_meta_schedule
- #16019 - [Tests] Check int8+int32 testcases in test_estimate_peak_flops_cpu
- #16017 - [Tests] Fix str vs. int comparison in test_num_threads
Docs
- #16282 - [Doc] Fix minor error in doc (Add an operator to Relay)
- #16152 - [DOC] Add v0.14.0 docs to site
- #16127 - Revert "[#15157][Rust][Doc] Re-enable the Rust documentation build (#15213)"
- #16097 - Add missing backtick to contribute/code_guide.rst
- #16089 - Fix error on linting by adding `--rev` argument
- #16024 - Update release_process.rst about version number modification
Frontend & Relay
- #16243 - [TFLite] Add support for quantized mirror pad
- #15914 - [TFLite] Support quantized SQUARE
- #16159 - [KERAS] Fix bug concat convert for NCHW
- #16319 - [Torch] add aten:broadcast_to
- #16131 - [Pytorch] Add support for `aten::unflatten`
- #16105 - [Pytorch] Add support for `aten::bitwise_and`
- #16079 - [Pytorch] Add support for aten::swapaxes operator
- #15502 - [Pytorch] aten::copy_ support for pytorch
- #16180 - [Pytorch] Fix bug when converting models with torch.nn.ParameterList
- #16143 - [Pytorch] Add support for `aten::scaled_dot_product_attention`
- #16123 - [Pytorch] Add support for `aten::linalg_vector_norm`
- #16171 - [Frontend] Preserve Pytorch Span Names
- #16217 - [Frontend][QNN] fix access `param_debug_name_map` to node output name in fx-quantized graph node replacement
- #16199 - [Frontend] Add support for aten::concat
- #16151 - conv3d depthwise bug fix
- #15928 - Expose qnn ops directly from relay.qnn module
TOPI
- #16259 - Add support for group_conv3d_transpose_ncdhw for generic
- #16052 - Enhance `topi.nn.matmul`
- #16080 - Reduce code redundancy in conv2d weights transformation
- #16248 - [TOPI] Add support for group_conv1d_transpose_ncw for generic
- #16106 - [TOPI] Add conv2d NHWC hybrid schedule for `arm_cpu`
TIR
- #16239 - [Schedule] TileWithTensorIntrin skip incorrect ComputeInline for input-padding
- #16236 - ConvertSSA process entry func first
- #16070 - [Transform] Introduce new `InjectPermutedLayout` pass
- #16083 - Enhance Python Type Annotations for TIR Expr
- #16073 - Support more mma intrinsics and `get_mma_intrin_group` utility
- #16076 - Enhance Python Type Annotations for TIR stmt
- #16074 - Fix the thread binding iter_var dtype in `Bind` primitive
- #16063 - Fix pass RenewDefs error in gather/take case
- #16027 - Fix software pipeline with dynamic loop extent
TVMScript
- #16271 - Disable concise scoping when the scope stmt is explicitly annotated
- #16041 - Fix mismatched dtype of IterVar in `T.thread_binding`
- #15953 - [TIR] Pretty print TIR LLVM function name
- #15972 - delete print extra info at parsing
Misc
- #16279 - replace deprecated np.int with int to avoid crash
- #16262 - Update conv2d.py
- #16255 - [Support] Add Interrupt Handling in Pipe
- #16104 - [LoopPartition] Fix a bug of LoopPartition in single point scenarios
- #16231 - [Target] Add Jetson AGX Orin tags
- #16221 - remove deprecated np.int in slice converter (pytorch)
- #16214 - [Python] Fix setup.py for inplace build
- #16174 - Bump cryptography from 37.0.2 to 41.0.6 in /docker/python
- #16202 - Fix IRModule initialization with attrs
- #16176 - Enable ccache to accelerate contrib compilation
- #15968 - Add missing backtick
- #16034 - [Packaging] Include BYOC dynamic libraries into wheel
- #16087 - Add _ffi_api.py under script folder
- #16039 - [Target] Support obtain l2 cache size from target
- #16065 - [Pylint] fix pylint issues from test_random to test_tedd
- #16031 - [TRT] fix outdated module building method in tensorrt
- #16032 - [CMake] Use llvm-config to locate Findzstd.cmake
- #16023 - [Pylint] fix pylint issues for thrust&tflite_runtime&util
- #15998 - [Codegen] Add shuffle for cuda and metal
- #16015 - [Pylint] fix pylint issues for cblas
- #15955 - [FFI][Python] Handle error propagation when line number is missing
- #15982 - Bump werkzeug from 2.2.3 to 3.0.1 in /apps/microtvm
- #15966 - [CMake] Fix order of GNUInstallDirs module
- #15952 - Update ci_arm Docker tag
- #15940 - [Minor] Fix compilation warnings for clang
- #15947 - Bump urllib3 from 1.26.9 to 1.26.18 in /docker/python
- #15835 - [CodeGenC][Redo] Handle GlobalVar callee as internal function call
- #15945 - Bump urllib3 from 1.26.15 to 1.26.18 in /apps/microtvm