- NnxBuildFlow and CmakeBuildFlow
- Neureka V2 support
- GitHub Action for testing neureka
- add NnxMapping dictionary that maps accelerator name to the accelerator specific classes
- choice of data generation method (ones, incremented, or random)
- N-EUREKA accelerator support: 3x3, 1x1, and 3x3 depthwise convolution kernels
- Support for kernels without normalization and quantization for NE16
- isort check
- publication citation
- support for 32-bit scale
- cmake support
- const qualifier to `<acc>_dev_t` function arguments
- support for N-EUREKA's dedicated weight memory
- wmem is no longer a test configuration argument but a command line argument
- neureka is now tested with a more recent gcc version
- python requirements are changed into requirements-pip and requirements-conda
- conftest now passes only strings to test.py to improve readability of pytest logs
- NnxMemoryLayout is now NnxWeight and also has a method for source generation
- the `wmem` field in the test configurations is now required
- `ne16_task_init` got split into smaller parts: `ne16_task_init`, `ne16_task_set_op_to_conv`, `ne16_task_set_weight_offset`, `ne16_task_set_bits`, `ne16_task_set_norm_quant`
- strides in `ne16_task_set_strides`, `ne16_task_set_dims`, and `ne16_task_set_ptrs` are now strides between consecutive elements in that dimension
- `ne16_task_queue_size` is now `NE16_TASK_QUEUE_SIZE`
- `ne16_task_set_ptrs` split into `ne16_task_set_ptrs_conv` and `ne16_task_set_ptrs_norm_quant`
- removed `k_in_stride`, `w_in_stride`, `k_out_stride`, and `w_out_stride` from `ne16_nnx_dispatch_stride2x2`
- removed `mode` attribute from `ne16_quant_t` structure
- global shift should have been of type uint8 not int32
- type conversion compiler warning
- New Hardware Processing Engine (HWPE) device in `util/hwpe.h`
- A device structure for ne16 `ne16_dev_t` in `ne16/hal/ne16.h` which extends the hwpe device
- Test app Makefile now has an `ACCELERATOR` variable to specify which accelerator is used
- Library functions no longer start with a generic `nnx_` prefix but with an `<accelerator>_nnx_` prefix to allow for usage of multiple kinds of accelerators in the same system
- Decoupled board-specific functionality into `ne16/bsp` which also contains constant global structures to the implementations of the `ne16_dev_t` structure
- Moved all task-related functions (`nnx_task_set_dims*`) into `ne16/hal/ne16_task.c`
- Tests adjusted for the new interface
- Test data generation moved into source files with extern declarations to check the output from the main
- pyright errors
- formatting errors
- Strided 2x2 mode needed to propagate `padding_bottom` when input height is smaller than 5
- Test requirements were missing the toml dependency and
- Added timeout parameter to conftest.py
- Added stride arguments to `nnx_task_set_dims`, `nnx_task_set_dims_stride2x2`, and `nnx_dispatch_task_stride2x2`
- Initial release of the repository.
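
One entry above states that strides are now "strides between consecutive elements in that dimension". The following is a minimal Python sketch of what that convention could look like for a densely packed tensor. The helper names (`Strides`, `hwc_strides`, `byte_offset`), the height-width-channel layout, and the use of byte units are assumptions made for illustration and are not taken from the library.

```python
from typing import NamedTuple


class Strides(NamedTuple):
    """Hypothetical container: step between consecutive elements per dimension."""

    h: int  # step from one row to the next
    w: int  # step from one pixel to the next within a row
    c: int  # step from one channel to the next within a pixel


def hwc_strides(width: int, channels: int, elem_bytes: int = 1) -> Strides:
    """Strides for a densely packed height-width-channel tensor (assumed layout)."""
    c = elem_bytes        # consecutive channels are adjacent in memory
    w = channels * c      # the next pixel starts after all channels of this one
    h = width * w         # the next row starts after all pixels of this one
    return Strides(h=h, w=w, c=c)


def byte_offset(strides: Strides, y: int, x: int, k: int) -> int:
    """Offset of element (y, x, k): each index times its dimension's stride."""
    return y * strides.h + x * strides.w + k * strides.c


# Example: 4 pixels per row, 8 channels per pixel, 1 byte per element.
strides = hwc_strides(width=4, channels=8)
offset = byte_offset(strides, y=1, x=2, k=3)
```

With this convention, each index is simply multiplied by the stride of its own dimension, which also covers non-dense layouts: a caller can pass larger strides when rows or pixels are padded.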