forked from pulp-platform/pulp-trainlib
Merge pull request pulp-platform#6 from pulp-platform/main
Rebase main branch to updated main of original repo
Showing 22 changed files with 1,238 additions and 117 deletions.
@@ -88,6 +88,11 @@ The `tools/` folder contains useful tools which ease the usage of PULP-TrainLib,

The `assets/` folder contains useful support files for PULP-TrainLib. Inside [CI_test_suite](assets/CI_test_suite/), users can find a testing environment that can be used to verify PULP-TrainLib's primitives for Continuous Integration.

# Tutorials

To learn how to generate code with our TrainLib_Deployer, and for more details about the optimizations used in this library, a [tutorial repository](https://github.com/dnadalini/PULP-TrainLib-Tutorial) is available online. This repository contains tutorials and a guide to easily install a conda environment with all the requirements needed to run PULP-TrainLib.

# Installation and requirements
@@ -137,6 +142,7 @@ To add new functionalities, users can follow the naming convention of PULP-Train

PULP-TrainLib's repository is organized with these branches:
- `main`: main branch, targeting PULP architectures.
- `trainlib-tutorial`: branch reserved for tutorial purposes (see [https://github.com/dnadalini/PULP-TrainLib-Tutorial](https://github.com/dnadalini/PULP-TrainLib-Tutorial)).
- `pulp-trainlib-paper`: branch to reproduce the results provided in the paper ["PULP-TrainLib: Enabling On-Device Training for RISC-V Multi-Core MCUs through Performance-Driven Autotuning"](https://www.samos-conference.com/Resources_Samos_Websites/Proceedings_Repository_SAMOS/2022/Papers/Paper_14.pdf).
- `pulp-trainlib-stm32`: a PULP-TrainLib port compatible with STM32 and other MCUs (FP32 format only).
@@ -177,7 +183,6 @@ PULP-TrainLib's repository is organized with these branches:

- Performance bugs in im2col/im2row with DMA loading (performance tends to be lower than im2col/im2row with cores)
- Missing integration for RNN / MHSA in TrainLib_Deployer
- FP32 MHSA primitives (Input Grad)
- FP32 and FP16 InstanceNorm's outputs do not perfectly match PyTorch's (need bugfixing)
- Missing integration of the sigmoid function in TrainLib_Deployer
- Performance of the FP16 sigmoid may need to be optimized with an FP16 exponential (e.g., https://github.com/0xBYTESHIFT/fp16/blob/master/include/half/half.hpp)
@@ -191,6 +196,7 @@ PULP-TrainLib's repository is organized with these branches:

- Manuele Rusci ([email protected])
- Francesco Conti ([email protected])
- Cristian Cioflan ([email protected])
- Luca Bompani ([email protected])

## Past Contributors
@@ -0,0 +1,90 @@

/*
 * Copyright (C) 2021-2022 ETH Zurich and University of Bologna
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/**
 * Authors: Davide Nadalini
 */


/**
 * Batch Norm layer configuration structure
 */

/**
 * @brief Structure for Batch Norm Training in FP32
 * @param input input feature maps for the BatchNorm layer
 * @param output output feature maps for the BatchNorm layer
 * @param coeff coefficients to compute the normalization; biases are included
 * @param batch_size size of the batch to be processed by the BatchNorm layer
 * @param running_mean array of running means computed during the forward step
 * @param running_var array of running variances computed during the forward step
 * @param running_stdev array of running standard deviations computed during the forward step
 * @param freeze_running_params if 1, freezes running mean and variance
 * @param skip_wg_grad skips the computation of the weight grad
 * @param skip_in_grad skips the computation of the input grad (1st DNN layer)
 */
struct BatchNorm_args {
    struct blob * input;
    struct blob * output;
    struct blob * coeff;
    int batch_size;
    float * running_mean;
    float * running_var;
    float * running_stdev;
    int freeze_running_params;
    int skip_wg_grad;
    int skip_in_grad;
};

/**
 * @brief Forward function that calls the parallelized version
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_fp32_fw_cl( void * BatchNorm_args );

/**
 * @brief Function that calls both input and param gradient functions
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_fp32_bw_cl( void * BatchNorm_args );

/**
 * @brief Backward param gradient function that calls the parallelized version
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_fp32_bw_param_grads_cl( void * BatchNorm_args );

/**
 * @brief Backward input gradient function that calls the parallelized version
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_fp32_bw_input_grads_cl( void * BatchNorm_args );

/**
 * @brief Forward backend function parallelized on multicore
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_parallelized_fp32_fw_cl( void * BatchNorm_args );

/**
 * @brief Backward backend function for input gradients parallelized on multicore
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_parallelized_fp32_bw_input_grads_cl( void * BatchNorm_args );

/**
 * @brief Backward backend function for parameter gradients parallelized on multicore
 * @param (void *) (struct BatchNorm_args void_args)
 */
void pulp_batchnorm_parallelized_fp32_bw_param_grads_cl( void * BatchNorm_args );