bad_alloc error when set epoch to 417 #3

Open
lukey936a opened this issue May 31, 2021 · 6 comments

Comments

@lukey936a

Hi, when I set the epoch to 417 I get a lot of errors. It seems that a bad_alloc error occurs when the DAG size is larger than 4 GB. Do you know how to fix this?
./build/xleth 417 Xilinx ./xleth/xclbin/ethash.hw.xclbin
.......
Trying to program device xilinx_u50_gen3x16_xdma_201920_3
INFO: Reading ./xleth/xclbin/ethash.hw.xclbin
Loading: './xleth/xclbin/ethash.hw.xclbin'
Device program successful.
DEV: jinlili Global mem size 0 GB
KNL: L_WORKSIZE 128
KNL: MULTIPLIER 65536
KNL: G_WORKSIZE 16384
KNL: FASTEXIT 0
DEV: Global mem size 0 GB
DEV: Max alloc size 4096 MB
DEV: Max W Group size 4294967295
DEV: Max W Item size 4294967295/4294967295/4294967295
DEV: Max compute unit 2

Generating DAG ...

DAG: generating for epoch 417 ...
XRT build version: 2.9.317
Build hash: b0230e59e22351fb957dc46a6e68d7560e5f630c
Build date: 2021-03-13 05:10:45
Git branch: 2020.2_PU1
PID: 1940
UID: 1000
[Mon May 31 04:29:43 2021 GMT]
HOST: jin-HP-Z220-SFF-Workstation
EXE: /home/jin/work/ethash/xilinx-ethash/build/xleth
[XRT] ERROR: std::bad_alloc
[XRT] ERROR: std::bad_alloc
DAG: epoch 417 lightSize 71432512 dagSize 4571790208
[XRT] ERROR: Kernel arg '_DAG0' is not set
DAG: item 0 chunk 1280000, took 0.00s
[XRT] ERROR: Kernel arg '_DAG0' is not set
DAG: item 1280000 chunk 1280000, took 0.00s
......
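
The log above reports `dagSize 4571790208` against `Max alloc size 4096 MB`, which suggests the DAG no longer fits in a single allocation. As a rough check, the sketch below recomputes the Ethash cache and dataset sizes for an epoch using the constants from the public Ethash specification and compares the dataset size with a 4096 MB limit. The function names are illustrative and are not part of xleth.

```python
# Ethash size constants, taken from the public Ethash specification.
DATASET_BYTES_INIT = 2**30
DATASET_BYTES_GROWTH = 2**23
CACHE_BYTES_INIT = 2**24
CACHE_BYTES_GROWTH = 2**17
MIX_BYTES = 128
HASH_BYTES = 64


def is_prime(n: int) -> bool:
    """Trial division; fast enough for the magnitudes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def cache_size(epoch: int) -> int:
    """Light cache size in bytes for the given epoch."""
    size = CACHE_BYTES_INIT + CACHE_BYTES_GROWTH * epoch - HASH_BYTES
    while not is_prime(size // HASH_BYTES):
        size -= 2 * HASH_BYTES
    return size


def dataset_size(epoch: int) -> int:
    """Full DAG size in bytes for the given epoch."""
    size = DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch - MIX_BYTES
    while not is_prime(size // MIX_BYTES):
        size -= 2 * MIX_BYTES
    return size


def fits_single_alloc(epoch: int, max_alloc_mb: int = 4096) -> bool:
    """True if the DAG fits in one allocation of max_alloc_mb megabytes."""
    return dataset_size(epoch) <= max_alloc_mb * 1024 * 1024


if __name__ == "__main__":
    epoch = 417
    print("lightSize", cache_size(epoch))
    print("dagSize  ", dataset_size(epoch))
    print("fits in 4096 MB:", fits_single_alloc(epoch))
```

The printed values can be compared with the `lightSize` and `dagSize` fields in the log. Whether the 4096 MB limit is the actual cause of the `std::bad_alloc` is an assumption based on the log alone.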

@Ed-Yang
Owner

Ed-Yang commented May 31, 2021

I am completely new to the Xilinx toolchain and do not have a U50 on hand. Maybe you could try to extend the DAG size in:

https://github.com/Ed-Yang/xilinx-ethash/blob/main/xleth/config/connectivity_u50.ini

@RezaAhmadi0117

You can do this in two ways:
1. Pass the data to the host when the program creates the context (no HBM), or
2. Use more HBM banks by increasing the bank range in the config file, as Ed said. Each bank has 2 Gb (256 MB) of space and 32 banks are available in total, so you need to assign 20 HBM banks to the m_dag port. I tested this on epoch 430 and it works for me.
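
The bank arithmetic in the second option can be sketched as follows, assuming 256 MB (2 Gb) per HBM bank and 32 banks in total, as stated in the comment. The helper names and the compute unit name `ethash_1` are hypothetical; only the `m_dag` port name comes from the comment. The `sp=` line follows the general form used in Vitis connectivity files, but the real entry should be checked against connectivity_u50.ini.

```python
HBM_BANK_BYTES = 256 * 1024 * 1024  # 2 Gb per bank (assumption from the comment)
HBM_BANK_COUNT = 32                 # total banks (assumption from the comment)


def banks_needed(dag_bytes: int) -> int:
    """Smallest number of HBM banks whose combined size holds the DAG."""
    if dag_bytes <= 0:
        raise ValueError("dag_bytes must be positive")
    banks = -(-dag_bytes // HBM_BANK_BYTES)  # integer ceiling division
    if banks > HBM_BANK_COUNT:
        raise ValueError("DAG does not fit in the available HBM banks")
    return banks


def sp_line(compute_unit: str, port: str, banks: int) -> str:
    """Format a connectivity entry mapping a kernel port to banks 0..banks-1."""
    if not 1 <= banks <= HBM_BANK_COUNT:
        raise ValueError("banks out of range")
    bank_range = "HBM[0]" if banks == 1 else f"HBM[0:{banks - 1}]"
    return f"sp={compute_unit}.{port}:{bank_range}"


if __name__ == "__main__":
    dag_bytes = 4571790208  # dagSize reported in the log for epoch 417
    print(banks_needed(dag_bytes))
    # 'ethash_1' is a hypothetical compute unit name.
    print(sp_line("ethash_1", "m_dag", 20))
```

By this arithmetic the epoch 417 DAG needs at least 18 banks, so the 20 banks suggested above leave some headroom for later epochs.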

@lukey936a
Author

This problem was solved by increasing the number of HBM banks for the DAG in the ini file, but I am confused about why running on HW is slower than on SW_EMU.

@RezaAhmadi0117

In SW_EMU you are using the CPU to create the DAG, but on real hardware the logic is implemented from HLS code. The PL part of an FPGA runs at a lower clock frequency than a CPU, and the HLS code also needs to be optimized with Xilinx attributes (or pragmas). You can search for this problem in HLS tools; many researchers are working on it.
To be faster than the CPU you need to optimize this part. Another way is to create the dataset on the CPU.
I posted a file in another issue that you can test. (It is not complete because of the buffer copy, but DAG creation is OK.)

@lukey936a
Author

Hi jackwatson01234,
I downloaded your file and ran it on my hardware, but got errors:
./build/xleth 0 4 ./xleth/kernel/ethash.cl ./xleth/xclbin/ethash.hw.xclbin
.......
Found Platform
Platform Name: Xilinx
platform intel not found, kernel is not loaded

I think DAG generation is not the major problem. The DAG file can be generated on the host CPU with the 'geth makedag blockheight' command and then migrated to the global memory of the U50.
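
One way to picture the migration step is to split the host-side DAG into pieces that each stay under the device allocation limit and copy them one at a time. The sketch below only plans the (offset, length) pairs; it does not call XRT or OpenCL. The function name and the 256 MB chunk size are assumptions chosen for illustration, not something xleth is known to do.

```python
from typing import Iterator, Tuple


def plan_chunks(total_bytes: int, chunk_bytes: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) pairs that cover total_bytes without overlap."""
    if total_bytes < 0 or chunk_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and chunk_bytes > 0")
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


if __name__ == "__main__":
    dag_bytes = 4571790208           # dagSize reported in the log for epoch 417
    chunk = 256 * 1024 * 1024        # hypothetical chunk size (one 256 MB piece)
    for offset, length in plan_chunks(dag_bytes, chunk):
        # A real host program would copy dag[offset:offset + length]
        # to the device buffer here.
        print(offset, length)
```

Each pair could then drive one host-to-device copy, so no single transfer has to exceed the 4096 MB maximum allocation shown in the log.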

@RezaAhmadi0117

RezaAhmadi0117 commented Jun 25, 2021

Did you install the Intel OpenCL driver? Your Intel driver was not found, so you need to install it. Check the link below:
https://software.intel.com/content/www/us/en/develop/articles/opencl-drivers.html#cpu-section
And yes, there is no difference in that respect.
If you have this device (U50 or U280), please send me a message on Discord (J_Watson#4036).
I'm working on porting Ethminer to the U50, but I don't have that device. (I have one with a lot of problems :) )
