[New operator] - linalg.lu operator development #1007
Comments
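@Chuancysun please post a progress update.
We are currently optimizing specifically for tall-skinny matrix shapes such as (65536, 30), and have refactored the code logic of the factorization kernel. For other shapes, performance already meets the within-10x target.
The PR link is as follows:
Non-pivoted LU factorization for a single real-valued batch is done. The documentation and code have been posted in PR #1019.
After talking with @Chuancysun: the design document and code in the current PR cover only the non-pivoted case. The documentation and code for the pivoted case are still in development.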
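For readers following along, the non-pivoted factorization discussed above can be sketched in NumPy as below. This is only a reference sketch, not the MLU kernel from the PR: the function name `lu_nopivot` is hypothetical, and it assumes every pivot is nonzero (for example, a diagonally dominant input). It accepts rectangular inputs, so a tall-skinny shape like the (65536, 30) case mentioned in the thread works the same way.

```python
import numpy as np


def lu_nopivot(A):
    """Reference LU without pivoting for an m x n matrix (hypothetical helper).

    Returns L (m x r, unit diagonal) and U (r x n) with r = min(m, n),
    such that L @ U reproduces A up to rounding.
    """
    A = np.asarray(A)
    dtype = np.complex128 if np.iscomplexobj(A) else np.float64
    U = A.astype(dtype)  # astype returns a copy, so the input is untouched
    m, n = U.shape
    r = min(m, n)
    L = np.eye(m, r, dtype=dtype)
    for k in range(r):
        if U[k, k] == 0:
            raise ZeroDivisionError(f"zero pivot at step {k}; pivoting is required")
        # Multipliers for column k: these are the entries of L below the diagonal.
        L[k + 1:, k] = U[k + 1:, k] / U[k, k]
        # Eliminate column k below the diagonal and update the trailing block.
        U[k + 1:, k:] -= np.outer(L[k + 1:, k], U[k, k:])
    # Drop rows below r and clear rounding residue under the diagonal.
    return L, np.triu(U[:r, :])
```

A quick check is to compare `L @ U` against the input with `np.allclose`. Without pivoting the result is only trustworthy when the pivots stay well away from zero, which is why a diagonally dominant test matrix is the assumption here.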
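I suggest moving the schedule up by half a week to leave a few more days for review and revisions. Otherwise the July 15 deadline is at high risk.
We are finishing single-batch complex support and its performance tuning. The current focus is performance for tall-skinny shapes.
Performance optimization for tall-skinny matrices is done and the non-pivoted LU factorization is complete. The pivoted LU factorization is now in development.
The pivoted LU factorization is now functional and tuned for smaller sizes. The blocked implementation for larger sizes is being developed and debugged.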
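The split described above (an unblocked path for small sizes, a blocked path for large ones) can be sketched in NumPy as follows. This is a sketch of the standard right-looking blocked scheme with partial pivoting, and it is an assumption that the operator follows the same structure. The names `getrf_unblocked` and `getrf_blocked`, the `block` parameter, and the 0-based pivot indices are all illustrative. A general `np.linalg.solve` stands in for the triangular solve a real kernel would use.

```python
import numpy as np


def getrf_unblocked(A):
    """In-place partial-pivoting LU of an m x n panel; returns 0-based pivot rows."""
    m, n = A.shape
    r = min(m, n)
    ipiv = np.zeros(r, dtype=np.int64)
    for k in range(r):
        # Pick the largest-magnitude entry of column k on or below the diagonal.
        p = k + int(np.argmax(np.abs(A[k:, k])))
        ipiv[k] = p
        if p != k:
            A[[k, p], :] = A[[p, k], :]
        if A[k, k] != 0:
            A[k + 1:, k] /= A[k, k]
            A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
    return ipiv


def getrf_blocked(A, block=32):
    """Right-looking blocked LU; returns packed LU and 0-based pivots."""
    LU = np.array(A, dtype=np.float64)  # work on a real-valued copy
    m, n = LU.shape
    r = min(m, n)
    ipiv = np.zeros(r, dtype=np.int64)
    for j in range(0, r, block):
        jb = min(block, r - j)
        # 1. Factor the current panel in place (the slice is a view into LU).
        ipiv[j:j + jb] = getrf_unblocked(LU[j:, j:j + jb]) + j
        # 2. Replay the panel's row swaps on the columns left and right of it.
        for k in range(j, j + jb):
            p = ipiv[k]
            if p != k:
                LU[[k, p], :j] = LU[[p, k], :j]
                LU[[k, p], j + jb:] = LU[[p, k], j + jb:]
        if j + jb < n:
            # 3. U12 = inv(L11) @ A12, with L11 unit lower triangular.
            L11 = np.tril(LU[j:j + jb, j:j + jb], k=-1) + np.eye(jb)
            LU[j:j + jb, j + jb:] = np.linalg.solve(L11, LU[j:j + jb, j + jb:])
            # 4. Trailing update A22 -= L21 @ U12 (the matrix-multiply-heavy step).
            LU[j + jb:, j + jb:] -= LU[j + jb:, j:j + jb] @ LU[j:j + jb, j + jb:]
    return LU, ipiv
```

The point of blocking is that step 4 moves most of the arithmetic into one large matrix multiply per panel, while only the narrow panel goes through the column-by-column path.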
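__nram__ float uint8_t test[512*1024]; /// This line is wrong: it declares two types. Writing code like this is also not recommended, because NRAM_SIZE differs between boards, and on the 590 the NRAM available to operator development is less than 512 KB.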
Could you post the test code and the JSON?
mannul_shape_1.json
compute.py is as follows (function bodies omitted):
@registerTensorList("sgetrf2")
def print_matrix(A):
def set_complex_data(data_node, complex_tensor):
def set_values_below_threshold(input_tensor, threshold=1e-3, new_value=1e-6):
def set_diag_imag_one(input_tensor):
def matrix_multiply(A, B):
def extract_LU(LU, pivots):
def make_diagonally_dominant(input_data):
# Function to swap two rows of a matrix
def swap_rows(matrix, row1, row2):
# Function to apply row swaps to matrix A using ipiv
def apply_row_swaps(A, ipiv):
@registerOp("sgetrf2")
@registerProtoWriter("sgetrf2")
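Since the bodies of the helpers are not shown in the thread, here is a minimal NumPy sketch of what a check built on helpers like `extract_LU` and `apply_row_swaps` could look like. It is not the author's compute.py. The names `extract_lu` and `apply_row_swaps`, the `one_based` flag, and the LAPACK-style pivot convention (row k is swapped with row ipiv[k], applied in order, 1-based) are assumptions for illustration.

```python
import numpy as np


def extract_lu(LU):
    """Split a packed LU array into unit-lower L and upper U."""
    m, n = LU.shape
    r = min(m, n)
    L = np.tril(LU[:, :r], k=-1) + np.eye(m, r, dtype=LU.dtype)
    U = np.triu(LU[:r, :])
    return L, U


def apply_row_swaps(A, ipiv, one_based=True):
    """Apply sequential row interchanges (row k <-> row ipiv[k]) to a copy of A."""
    PA = np.array(A, copy=True)
    for k, p in enumerate(ipiv):
        p = int(p) - 1 if one_based else int(p)
        if p != k:
            PA[[k, p]] = PA[[p, k]]
    return PA


# Hand-checked 2 x 2 case: the rows are swapped once, then l21 = 1/3.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
LU = np.array([[3.0, 4.0], [1.0 / 3.0, 2.0 / 3.0]])
ipiv = np.array([2, 2])  # 1-based pivots (assumed convention)

L, U = extract_lu(LU)
max_err = np.max(np.abs(L @ U - apply_row_swaps(A, ipiv)))
print(max_err)  # should be at rounding-error level
```

Comparing `L @ U` against the row-swapped input, rather than comparing packed LU arrays entry by entry, keeps the check valid even when two implementations pick different but equally valid pivots.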
The development plan can follow these milestones: