btrfs-balance-least-used: Error during balancing, there may be more info in dmesg: ENOENT, state none #51

Open
Massimo-B opened this issue Jan 24, 2025 · 5 comments


@Massimo-B

I'm running btrfs-balance-least-used to optimize free space (Zygo/bees#298):

# btrfs-balance-least-used -u 80 /mnt/local/data/
Loading block group objects with used_pct <= 80 ... found 62
Balance block group vaddr 1303708893184 used_pct 1 ... duration 17 sec total 655
Balance block group vaddr 555222040576 used_pct 32 ...Error during balancing, there may be more info in dmesg: ENOENT, state none

After some work it fails as shown above.
The syslog (grep BTRFS) shows little beyond this:

[kernel] BTRFS info (device dm-2): balance: start -dvrange=1303708893184..1303708893185
[kernel] BTRFS info (device dm-2): relocating block group 1303708893184 flags data
[kernel] BTRFS info (device dm-2): found 1 extents, stage: move data extents
[kernel] BTRFS info (device dm-2): found 1 extents, stage: update data pointers
[kernel] BTRFS info (device dm-2): balance: ended with status: 0
[kernel] BTRFS info (device dm-2): balance: start -dvrange=555222040576..555222040577
[kernel] BTRFS info (device dm-2): relocating block group 555222040576 flags data
[kernel] BTRFS info (device dm-2): found 6650 extents, stage: move data extents
[kernel] BTRFS info (device dm-2): balance: ended with status: -2

A regular balance also fails after a while:

btrfs balance start -dusage=80 /mnt/local/data/
ERROR: error during balancing '/mnt/local/data/': No such file or directory

Kernel: 6.6.30-gentoo
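
For readers matching the numeric statuses in the kernel log to the error names in the tool output, here is a minimal sketch that decodes a negative kernel-style status into its errno name. It assumes Linux errno numbering; the helper name `describe_kernel_status` is made up for illustration and is not part of btrfs-progs or python-btrfs.

```python
import errno
import os


def describe_kernel_status(status: int) -> str:
    """Decode a kernel-style status such as -2 into 'ENOENT: <message>'.

    Hypothetical helper for illustration; assumes Linux errno numbering.
    """
    if status == 0:
        return "OK"
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


# Statuses that appear in the "balance: ended with status" lines of this thread.
for status in (0, -2, -5, -30):
    print(status, describe_kernel_status(status))
```

Under that assumption, -2 is ENOENT, which matches both the "ENOENT, state none" message from btrfs-balance-least-used and the "No such file or directory" text from btrfs balance. The same helper applies to the -5 and -30 statuses in the later logs.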

@Massimo-B

On Libera I was told...

if the filesystem is old enough to have been mounted with kernel 5.1 to 5.3, it might be a leftover metadata corruption. btrfs check --repair might be able to fix it, but if not, then mkfs and restore

The repair has now been running for 4 days and is still working on a 1 TiB HDD...

@Massimo-B

It keeps repeating the same line over and over:
super bytes used 631386423296 mismatches actual used 631386390528
I guess it is stuck in some kind of infinite loop that I need to cancel.
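
To put the repeated line in perspective, the gap between the two numbers can be worked out directly. This is only arithmetic on the values quoted above. The 16384-byte unit is an assumption taken from the `num_bytes 16384` values in the kernel log later in this thread, and whether the two are related is not established here.

```python
# Values copied from the repeated "super bytes used ... mismatches actual used ..." line.
super_bytes_used = 631386423296
actual_bytes_used = 631386390528

# Assumed unit for comparison: the num_bytes value seen in the later kernel log.
ASSUMED_UNIT = 16384

difference = super_bytes_used - actual_bytes_used
units = difference // ASSUMED_UNIT

print(f"difference: {difference} bytes ({difference // 1024} KiB)")
print(f"that is {units} units of {ASSUMED_UNIT} bytes")
```

The mismatch is 32768 bytes (32 KiB), which is tiny compared with the roughly 631 GB in use.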

@Massimo-B

I stopped the --repair. Both balance and btrfs-balance-least-used still stop with the same failure.
I started bees again on the device, but it failed and the filesystem was remounted read-only:

Jan 27 14:07:41 [kernel] BTRFS info (device dm-0: state M): force zstd compression, level 3
Jan 27 14:07:41 [kernel] BTRFS info (device dm-0: state M): turning off async discard
Jan 27 14:08:57 [kernel] BTRFS info (device dm-2): first mount of filesystem 1d577e4b-27c1-4729-8787-cd20ebfda91d
Jan 27 14:08:57 [kernel] BTRFS info (device dm-2): using crc32c (crc32c-intel) checksum algorithm
Jan 27 14:08:57 [kernel] BTRFS info (device dm-2): force zstd compression, level 15
Jan 27 14:08:57 [kernel] BTRFS info (device dm-2): using free space tree
Jan 27 14:09:06 [kernel] BTRFS info (device dm-2): checking UUID tree
Jan 27 14:09:11 [kernel] BTRFS info (device dm-2): balance: start -dvrange=1304782635008..1304782635009
Jan 27 14:09:11 [kernel] BTRFS info (device dm-2): relocating block group 1304782635008 flags data
Jan 27 14:09:35 [kernel] BTRFS info (device dm-2): found 1 extents, stage: move data extents
Jan 27 14:09:36 [kernel] BTRFS info (device dm-2): found 1 extents, stage: update data pointers
Jan 27 14:09:37 [kernel] BTRFS info (device dm-2): balance: ended with status: 0
Jan 27 14:09:37 [kernel] BTRFS info (device dm-2): balance: start -dvrange=555222040576..555222040577
Jan 27 14:09:37 [kernel] BTRFS info (device dm-2): relocating block group 555222040576 flags data
Jan 27 14:12:23 [kernel] BTRFS info (device dm-2): found 6650 extents, stage: move data extents
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2): incorrect extent count for 213705031680; counted 4327, expected 1337
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2: state A): Transaction aborted (error -5)
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state A) in convert_free_space_to_extents:471: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS info (device dm-2: state EA): forced readonly
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in add_to_free_space_tree:1057: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in do_free_extent_accounting:2870: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2: state EA): failed to run delayed ref for logical 213930115072 num_bytes 16384 type 176 action 2 ref_mod 1: -5
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in btrfs_run_delayed_refs:2168: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2: state EA): incorrect extent count for 213705031680; counted 4429, expected 1439
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in convert_free_space_to_bitmaps:338: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in add_to_free_space_tree:1057: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in do_free_extent_accounting:2870: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2: state EA): failed to run delayed ref for logical 213940502528 num_bytes 16384 type 176 action 2 ref_mod 1: -5
Jan 27 14:14:50 [kernel] BTRFS: error (device dm-2: state EA) in btrfs_run_delayed_refs:2168: errno=-5 IO failure
Jan 27 14:14:50 [kernel] BTRFS info (device dm-2: state EA): balance: ended with status: -30
Jan 27 14:34:06 [kernel] BTRFS info (device dm-2: state EA): last unmount of filesystem 1d577e4b-27c1-4729-8787-cd20ebfda91d
Jan 27 14:36:40 [kernel] BTRFS: device label local_data devid 1 transid 152823 /dev/mapper/localdata_crypt scanned by mount (23504)
Jan 27 14:36:40 [kernel] BTRFS info (device dm-2): first mount of filesystem 1d577e4b-27c1-4729-8787-cd20ebfda91d
Jan 27 14:36:40 [kernel] BTRFS info (device dm-2): using crc32c (crc32c-intel) checksum algorithm
Jan 27 14:36:40 [kernel] BTRFS info (device dm-2): force zstd compression, level 15
Jan 27 14:36:40 [kernel] BTRFS info (device dm-2): using free space tree
Jan 27 14:36:49 [kernel] BTRFS error (device dm-2): incorrect extent count for 213705031680; counted 3633, expected 2758
Jan 27 14:36:49 [kernel] BTRFS info (device dm-2): checking UUID tree
Jan 27 14:37:11 [kernel] BTRFS info (device dm-2): balance: resume -dusage=90,vrange=555222040576..555222040577
Jan 27 14:37:11 [kernel] BTRFS info (device dm-2): relocating block group 555222040576 flags data
Jan 27 14:43:09 [kernel] BTRFS info (device dm-2): found 6650 extents, stage: move data extents
Jan 27 14:51:03 [kernel] BTRFS error (device dm-2): incorrect extent count for 213705031680; counted 5706, expected 1337
Jan 27 14:51:03 [kernel] BTRFS error (device dm-2: state A): Transaction aborted (error -5)
Jan 27 14:51:03 [kernel] BTRFS: error (device dm-2: state A) in convert_free_space_to_extents:471: errno=-5 IO failure
Jan 27 14:51:03 [kernel] BTRFS info (device dm-2: state EA): forced readonly
Jan 27 14:51:03 [kernel] BTRFS: error (device dm-2: state EA) in add_to_free_space_tree:1057: errno=-5 IO failure
Jan 27 14:51:03 [kernel] BTRFS: error (device dm-2: state EA) in do_free_extent_accounting:2870: errno=-5 IO failure
Jan 27 14:51:03 [kernel] BTRFS error (device dm-2: state EA): failed to run delayed ref for logical 213998551040 num_bytes 16384 type 176 action 2 ref_mod 1: -5
Jan 27 14:51:03 [kernel] BTRFS: error (device dm-2: state EA) in btrfs_run_delayed_refs:2168: errno=-5 IO failure
Jan 27 14:51:03 [kernel] BTRFS info (device dm-2: state EA): balance: ended with status: -30
Jan 27 14:51:04 [kernel] BTRFS error (device dm-2: state EA): incorrect extent count for 213705031680; counted 5808, expected 1439
Jan 27 14:51:04 [kernel] BTRFS: error (device dm-2: state EA) in convert_free_space_to_bitmaps:338: errno=-5 IO failure
Jan 27 14:51:04 [kernel] BTRFS: error (device dm-2: state EA) in add_to_free_space_tree:1057: errno=-5 IO failure
Jan 27 14:51:04 [kernel] BTRFS: error (device dm-2: state EA) in do_free_extent_accounting:2870: errno=-5 IO failure
Jan 27 14:51:04 [kernel] BTRFS error (device dm-2: state EA): failed to run delayed ref for logical 214012198912 num_bytes 16384 type 176 action 2 ref_mod 1: -5
Jan 27 14:51:04 [kernel] BTRFS: error (device dm-2: state EA) in btrfs_run_delayed_refs:2168: errno=-5 IO failure
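
When preparing a report from logs like the above, it can help to pull out just the "incorrect extent count" lines and compare the counted and expected values. The following is a rough sketch using a regular expression over sample lines copied from this log. The function name `parse_extent_count_errors` is hypothetical, and reading the number after "for" as a block group start address is my understanding rather than something stated in this thread.

```python
import re

# Matches the kernel message format seen in the log above.
PATTERN = re.compile(
    r"incorrect extent count for (\d+); counted (\d+), expected (\d+)"
)


def parse_extent_count_errors(log_text):
    """Return (address, counted, expected) tuples for each matching log line."""
    results = []
    for match in PATTERN.finditer(log_text):
        address, counted, expected = (int(g) for g in match.groups())
        results.append((address, counted, expected))
    return results


# Sample lines copied verbatim from the log in this comment.
SAMPLE = """\
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2): incorrect extent count for 213705031680; counted 4327, expected 1337
Jan 27 14:14:50 [kernel] BTRFS error (device dm-2: state EA): incorrect extent count for 213705031680; counted 4429, expected 1439
Jan 27 14:36:49 [kernel] BTRFS error (device dm-2): incorrect extent count for 213705031680; counted 3633, expected 2758
"""

for address, counted, expected in parse_extent_count_errors(SAMPLE):
    print(address, counted, expected, counted - expected)
```

In these sample lines the same address, 213705031680, appears every time, while the block group being relocated is 555222040576.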

@kakra

kakra commented Jan 27, 2025

It is better to report such issues to the btrfs kernel mailing list. If your report is specific enough, you will usually get a quick answer, maybe even a fix to patch the broken bytes, or a patch for your kernel. Nothing indicates an obvious bitflip, but is this a btrfs raid1, or is it md raid1 with btrfs on top?

@Massimo-B

https://lore.kernel.org/linux-btrfs/[email protected]/T/#t
How can I get more debug verbosity from dmesg?
