IO error: No space left on device: Zone allocation failure during fillseq benchmark with ZenFS #284

Open
Zizhao-Wang opened this issue Nov 30, 2023 · 1 comment

Zizhao-Wang commented Nov 30, 2023

Issue Description
I encountered an issue while conducting a large-scale data write test using ZenFS with RocksDB. The test failed after writing around 200 million entries, with an error message “IO error: No space left on device: Zone allocation failure”.

Environment Setup

  • RocksDB Version: 8.10.0
  • ZenFS Version: latest version
  • Operating System: Ubuntu 22.04
  • Hardware Configuration: Emulated 500GB ZNS SSD using QEMU

Steps to Reproduce

  1. Configured and launched the virtual environment using the following QEMU command:
qemu-system-x86_64 --enable-kvm \
    -name cs-exp-zns \
    -m 50G \
    -nographic \
    -cpu host -smp 16 \
    -hda ./virtualdisks/ubuntu.qcow2 \
    -net user,hostfwd=tcp::8081-:22 -net nic \
    -drive file=./virtualdisks/zns.raw,id=mynvme,format=raw,if=none \
    -device nvme,serial=baz,id=nvme2 \
    -device nvme-ns,id=ns2,drive=mynvme,nsid=2,logical_block_size=4096,physical_block_size=4096,zoned=true,zoned.zone_size=1024M,zoned.zone_capacity=1000M,zoned.max_open=0,zoned.max_active=0,bus=nvme2 \
    -drive file=./virtualdisks/nvmessd.raw,id=mynvme2,format=raw,if=none \
    -device nvme,serial=foo,id=nvme3 \
    -device nvme-ns,id=ns3,drive=mynvme2,nsid=3,bus=nvme3 \
    -fsdev local,id=fsdev0,path=./work/,security_model=none \
    -device virtio-9p-pci,id=fs0,fsdev=fsdev0,mount_tag=hostshare
  2. Set the I/O scheduler and created the ZenFS file system:
echo deadline > /sys/class/block/nvme0n1/queue/scheduler

../rocksdb/plugin/zenfs/util/zenfs mkfs \
    --zbd=nvme0n1 \
    --force \
    --aux_path=/root/logs
  3. Ran the following db_bench script to write data through ZenFS; the error occurred after around 200 million entries:
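#!/bin/bash

NUM_ENTRIES=500000000                      # key-value pairs to write
VALUE_SIZE=100                             # value size in bytes
COMPRESSION_TYPE="none"                    # compression disabled
WRITE_BUFFER_SIZE=67108864                 # memtable size: 64 MiB
MAX_WRITE_BUFFER_NUMBER=3                  # maximum memtables held in memory
MIN_WRITE_BUFFER_NUMBER_TO_MERGE=1         # memtables merged before a flush
CACHE_SIZE=8388608                         # block cache size: 8 MiB
MAX_BACKGROUND_JOBS=7                      # concurrent flush and compaction jobs
OPEN_FILES=40000                           # maximum open files
STATS_PER_INTERVAL=$(($NUM_ENTRIES / 10))  # passed to --stats_per_interval
HISTOGRAM=true                             # print latency histograms
BLOOM_BITS=10                              # bloom filter bits per key
DISABLE_WAL=true                           # write-ahead log disabled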

../../rocksdb/db_bench \
    --fs_uri=zenfs://dev:nvme0n1 \
    --use_direct_io_for_flush_and_compaction \
    --benchmarks=fillseq,stats \
    --num="$NUM_ENTRIES" \
    --value_size="$VALUE_SIZE" \
    --compression_type="$COMPRESSION_TYPE" \
    --write_buffer_size="$WRITE_BUFFER_SIZE" \
    --max_write_buffer_number="$MAX_WRITE_BUFFER_NUMBER" \
    --min_write_buffer_number_to_merge="$MIN_WRITE_BUFFER_NUMBER_TO_MERGE" \
    --cache_size="$CACHE_SIZE" \
    --max_background_jobs="$MAX_BACKGROUND_JOBS" \
    --open_files="$OPEN_FILES" \
    --stats_per_interval="$STATS_PER_INTERVAL" \
    --histogram="$HISTOGRAM" \
    --bloom_bits="$BLOOM_BITS" \
    --disable_wal="$DISABLE_WAL" \
    | tee zns_kv_log.log

Expected vs. Actual Results

  • Expected: Successful write of 500 million entries without running out of space.
  • Actual: Encountered “IO error: No space left on device: Zone allocation failure” after writing around 200 million entries (around 20 GB of data), despite having approximately 497,000 MB of reported available space.
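
As a rough sanity check on the figures above, the amount of user data written before the failure can be estimated from the entry count and the entry size. This is only a sketch: the 16-byte key size is an assumption (the db_bench default, since the script does not set --key_size), the reported 497,000 MB is treated as MiB, and metadata and per-file overhead are ignored.

```shell
#!/bin/bash
# Estimate how much user data was written before the error appeared.
ENTRIES_WRITTEN=200000000   # "around 200 million entries" from the report
VALUE_SIZE=100              # bytes, from the benchmark script
KEY_SIZE=16                 # bytes; ASSUMPTION: db_bench default, not set in the script
REPORTED_FREE_MIB=497000    # reported available space; unit assumed to be MiB

BYTES_WRITTEN=$(( ENTRIES_WRITTEN * (KEY_SIZE + VALUE_SIZE) ))
MIB_WRITTEN=$(( BYTES_WRITTEN / 1024 / 1024 ))
PERCENT_USED=$(( MIB_WRITTEN * 100 / REPORTED_FREE_MIB ))

echo "bytes written:            $BYTES_WRITTEN"
echo "MiB written:              $MIB_WRITTEN"
echo "percent of reported free: $PERCENT_USED"
```

Under these assumptions the write stops after roughly 22 GiB, which matches the "around 20 GB" figure and is only a few percent of the reported capacity.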

Additional Information

  • The ZenFS file system was successfully created on the ZNS SSD, and the available space was reported to be around 497,000 MB.
  • The error occurred much earlier than expected, given the amount of reported available space.
  • Is there a potential configuration issue with ZenFS or the emulated ZNS SSD that could lead to early space exhaustion?

Any insights or assistance in addressing this issue would be greatly appreciated.

Collaborator

yhr commented Dec 6, 2023

I think the default target file size ended up fragmenting the zones and causing the issue. This can happen with the fillseq workload, which skips the normal write flow.

I suggest you set --target_file_size_base=$(( 1000 * 1024 * 1024 ))

There is a script in the zenfs tests directory called get_good_db_bench_params_for_zenfs.sh; it will generate a decent set of parameters for your device geometry.
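
For the device in this report, the suggested value lines up with the zone capacity given in the QEMU command (zoned.zone_capacity=1000M). A minimal sketch of deriving the flag from that number, assuming the capacity is 1000 MiB; the variable names are illustrative and not part of ZenFS or db_bench:

```shell
#!/bin/bash
# Derive --target_file_size_base from the zone capacity of the emulated device.
ZONE_CAPACITY_MIB=1000   # from zoned.zone_capacity=1000M in the QEMU command

# One SST file per zone: target file size equals the zone capacity in bytes.
TARGET_FILE_SIZE_BASE=$(( ZONE_CAPACITY_MIB * 1024 * 1024 ))

# Extra argument to append to the db_bench command line from the report.
EXTRA_ARG="--target_file_size_base=${TARGET_FILE_SIZE_BASE}"
echo "$EXTRA_ARG"
```

The printed argument can be added alongside the other db_bench flags in the benchmark script before re-running fillseq.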
