On 01/20/17 20:39, Dr. David Alan Gilbert wrote:
- Kevin O'Connor (kevin@koconnor.net) wrote:
On Fri, Jan 20, 2017 at 06:40:44PM +0000, Dr. David Alan Gilbert wrote:
Hi, I turned the debug level up to 4 on our smaller (128k) ROM downstream build and seem to have hit a case where it's been laid out so that the 'ExtraStack' is at the same location as some code (display_uuid), which was causing some very random behaviour;
from an objdump disassembly of the rom.o that was produced:
  ef79d: f3 ab                 rep stos %eax,%es:(%edi)
  ef79f: 8d 43 08              lea 0x8(%ebx),%eax
  ef7a2: b1 10                 mov $0x10,%cl
  ef7a4: 8d 54 24 74           lea 0x74(%esp),%edx
  ef7a8: e8 50 46 ff ff        call e3dfd <memcmp>
  ef7ad: 85 c0                 test %eax,%eax
  ef7af: 0f 84 33 01 00 00     je ef8e8 <ExtraStack+0x110>
  ef7b5: 8b 15 0c 23 0f 00     mov 0xf230c,%edx
  ef7bb: 80 7a 06 02           cmpb $0x2,0x6(%edx)
  ef7bf: 0f b6 43 17           movzbl 0x17(%ebx),%eax
  ef7c3: 77 0c                 ja ef7d1 <StackPos+0x1>
  ef7c5: 0f 85 80 00 00 00     jne ef84b <ExtraStack+0x73>
  ef7cb: 80 7a 07 05           cmpb $0x5,0x7(%edx)
  ef7cf: 76 7a                 jbe ef84b <ExtraStack+0x73>
Note the 'ExtraStack+...' jump targets, whereas a few lines earlier the other jumps are shown relative to maininit. Then, looking at a sorted output of the rom.o.objdump:
  000eddf2 l     F .text  000000e0 virtio_scsi_add_lun.constprop.113
  000eded2 l     F .text  0000080d device_hardware_setup
  000ee6df l     F .text  00001a17 maininit            <-------------
  000ef790 g       *ABS*  00000000 final_varlow_start
  000ef790 g     O *ABS*  00000004 BootSequence
  000ef794 g     O *ABS*  00000001 FloppyDOR
  000ef798 g     O *ABS*  00000008 LastUSBkey
  000ef7a0 g     O *ABS*  00000001 Ps2ctr
  000ef7a4 g     O *ABS*  00000004 RTCusers
  000ef7a8 g     O *ABS*  00000004 TimerLast
  000ef7ac g     O *ABS*  00000001 HaveAttemptedReboot
  000ef7ad g     O *ABS*  00000001 Century
  000ef7b0 g     O *ABS*  00000010 CDRom_locks
  000ef7c0 g     O *ABS*  00000010 DefaultDPTE
  000ef7d0 g     O *ABS*  00000004 StackPos
  000ef7d8 g     O *ABS*  00000801 ExtraStack          <-------------
  000effdc g     O *ABS*  00000018 Call16Data
  000f0000 g       *ABS*  00000000 final_readonly_start
  000f0000 g       *ABS*  00000000 zonefseg_start
  000f00f6 g     F .text  0000038b dopost
  000f0481 g     F .text  000001c2 handle_pmm
  000f0644 l     O .text  00000014 CSWTCH.1353
  000f0658 l     O .text  00000014 __func__.14607
  000f066c l     O .text  00000011 __func__.14624
  000f0680 l     O .text  00000010 __func__.14549
  000f0690 l     O .text  0000000b __func__.14497
What I think you're seeing here is an artifact of seabios' code self-relocation. The objdump stores the final location of "varlow" variables, and not the location of their pre-relocation initial values. After the code is self-relocated (in post.c:reloc_preinit() ) it's malloc.c:malloc_init() (see memmove call) that copies over that area of memory.
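For anyone wanting to reproduce the observation mechanically, here is a minimal sketch that scans `objdump -t` style symbol lines and lists symbols from different sections whose address ranges intersect. The function names and the trimmed sample are illustrative assumptions and are not part of SeaBIOS or scripts/layoutrom.py. As explained above, the varlow symbols are listed at their final, post-relocation addresses, so an overlap reported here does not by itself prove a runtime conflict.

```python
# Hypothetical helper, not part of SeaBIOS: report symbols from different
# sections whose [start, start+size) ranges intersect in `objdump -t` output.

SAMPLE = """\
000ee6df l     F .text  00001a17 maininit
000ef790 g     O *ABS*  00000004 BootSequence
000ef7d8 g     O *ABS*  00000801 ExtraStack
000f00f6 g     F .text  0000038b dopost
"""


def parse_symbols(text):
    """Return (start, size, section, name) tuples for parsable lines."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        try:
            start = int(fields[0], 16)
            size = int(fields[-2], 16)
        except ValueError:
            continue  # not a symbol line (or it carries trailing notes)
        # The flag column may be one or two tokens, so index from the end.
        symbols.append((start, size, fields[-3], fields[-1]))
    return symbols


def find_cross_section_overlaps(symbols):
    """Return (name_a, name_b) pairs that overlap and sit in different sections."""
    sized = sorted(s for s in symbols if s[1] > 0)
    overlaps = []
    for i, (a_start, a_size, a_sec, a_name) in enumerate(sized):
        a_end = a_start + a_size
        for b_start, _b_size, b_sec, b_name in sized[i + 1:]:
            if b_start >= a_end:
                break  # sorted by start, so nothing later can overlap
            if a_sec != b_sec:
                overlaps.append((a_name, b_name))
    return overlaps


if __name__ == "__main__":
    for a, b in find_cross_section_overlaps(parse_symbols(SAMPLE)):
        print(f"{a} overlaps {b}")
```

With the four sample lines taken from the table above, maininit (0xee6df + 0x1a17 = 0xf00f6) spans both BootSequence and ExtraStack, which matches the 'ExtraStack+...' labels in the disassembly.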
OK, I'll try and trace that.
What's supposed to stop that happening?
The code in scripts/layoutrom.py is supposed to lay out the ROM without conflicts. It's not clear to me if that's malfunctioning or if the underlying issue is something else - what is the "very random behaviour" you are seeing?
Hangs, typically after/in display_uuid, or KVM entry exceptions where the EIP is totally bogus; they only happen sometimes on reboot, and adding some debug can make them totally disappear. So the thought of the code being scribbled over by a stack sounded like a reasonable explanation.
I'd chosen a debug level of 4 since that was the largest it would go without the build complaining it wouldn't fit, so I thought I was safe since something did complain if it got way too big.
It should have been safe - something must not be right.
Hmm OK.
Would this be consistent with a stack overflow?
See commit 46b82624c95b951e8825fab117d9352faeae0ec8. Perhaps BUILD_EXTRA_STACK_SIZE (2KB) is too small now?
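One generic way to test the "too small" hypothesis is stack painting: fill the stack area with a known byte before use, then later count how much of the pattern survives. The sketch below only models the idea on a byte buffer; the fill value, the function names and the downward-growing-stack assumption are illustrative, and nothing here is an existing SeaBIOS facility.

```python
# Hypothetical model of stack painting; not SeaBIOS code.
FILL = 0xA5  # assumed paint byte


def paint(size):
    """Return a buffer of `size` bytes filled with the paint pattern."""
    return bytearray([FILL]) * size


def high_water_mark(stack):
    """Bytes used, assuming the stack grows down from the end of the buffer.

    Counts the untouched paint bytes at the low end; a used byte that happens
    to equal FILL at the boundary would make this under-report slightly.
    """
    untouched = 0
    for byte in stack:
        if byte != FILL:
            break
        untouched += 1
    return len(stack) - untouched


if __name__ == "__main__":
    stack = paint(2048)              # 2KB, as mentioned above
    stack[-100:] = bytes(100)        # pretend 100 bytes were pushed
    print(high_water_mark(stack), "bytes used of", len(stack))
```

If the reported usage equals the full buffer size, the paint was consumed entirely and an overflow is at least plausible.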
Thanks
Laszlo