Grant,
Thank you so much for posting! And for debugging and suggesting a potential solution. I really like that you unblocked yourself, and I also want to get to the root cause.
From looking at the code, it seemed like a real issue, and after some time of debugging, I think this is a real issue. But the big question for me was: why no one noticed before? For context, this code path has been enabled by default since May 2023 , and since that time we have had lots of contributions and new development, and I was thinking how could it be that no one hit the issue before?
I can't see your values in your debugger on erasure_layout.c line 55, but I did my best to reproduce. For debugging I used Test case #8 from tests/erase_func_algo.c (set args 'Erase test case #8' in gdb in case someone wants to run) and I think I reproed for eraseblock of size 8. subedata went out of bounds! it was pointing to a memory it shouldn't be , like this
{start_addr = 4025479151, end_addr = 4025479151, selected = 239, block_num = 0, first_sub_block_index = 5249, last_sub_block_index = 0}
but it wasn't segfaulting , and those random large numbers in start_addr and end_addr made the loop condition false and the program kept running. without segfaults!
I think your patch may be valid, and please don't throw it away, maybe you will send it as a patch.
Can it be the difference between Linux and OpenBSD? It's true that most developers work on Linux, also from what I remember OpenBSD has more strict memory protections. Is it possible that OpenBSD segfaults in the same code where Linux keeps running?
On Fri, Sep 6, 2024 at 3:58 AM Grant Pannell via flashrom flashrom@flashrom.org wrote:
Hi,
Trying to report a bug in flashrom 1.4. I'm not subject matter expert, but I've done my best to try and debug and fix the issue. Seeking guidance from the experts on whether this is the solution :-)
I found that in 1.4, it looks like the erase and write logic was refactored. Since then, I get segfaults when trying to flash my coreboot image.
The chip is a Macronix MX25U6435E in a Protectli FW4B. I'm running on OpenBSD 7.5.
The segfault occurs in erasure_layout.c: init_eraseblock line 55. I am no expert on the internals of flashrom, or what's going on here...but I've narrowed it down to segfaulting at the last iteration of this while loop. Code in question looks like:
edata->first_sub_block_index = *sub_block_index; struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index]; while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr && *sub_block_index < layout[idx-1].block_count) { (*sub_block_index)++; subedata++; }
In my case, it seems that the last iteration looks like:
layout[idx-1].block_count == 2048 *sub_block_index == 2047 subedata->end_addr == end_addr
What then happens is, the variable "subedata" is incremented and the while condition is checked, but subedata is now out of bounds and the application segfaults. I'm pretty sure the while loop shouldn't iterate again because the next iteration would fail the *sub_block_index < layout[idx-1].block_count check (2048 < 2028). I solved this by short circuiting the while condition and checking that condition first, so that subedata is not accessed and flashrom successfully flashes my coreboot image. Patch included below. Is this an appropriate fix?
Thank you,
Grant
--- erasure_layout.c.orig +++ erasure_layout.c @@ -52,8 +52,8 @@
edata->first_sub_block_index = *sub_block_index; struct eraseblock_data *subedata = &layout[idx - 1].layout_list[*sub_block_index];
while (subedata->start_addr >= start_addr && subedata->end_addr <= end_addr &&
*sub_block_index < layout[idx-1].block_count) {
while (*sub_block_index < layout[idx-1].block_count &&
subedata->start_addr >= start_addr && subedata->end_addr <= end_addr) { (*sub_block_index)++; subedata++; }
flashrom mailing list -- flashrom@flashrom.org To unsubscribe send an email to flashrom-leave@flashrom.org