On Thu, May 14, 2020 at 2:46 PM Mike Banon <mikebdp2@gmail.com> wrote:
Unfortunately it seems a lot of boards are affected by this. A88XM-E
and Lenovo G505S (AMD fam15h) are also broken: they rarely succeed at
booting and, when they do, no boot devices are available (not even
virtual floppies, for some reason). Only the coreinfo/tint secondary
payloads are available, and they have become prone to freezing. I attach
the A88XM-E logs I've been able to obtain with a USB FT232H:

1) ok_e6fb1344ed9188e19be4b54bdf1a76680b8c4523.txt - the last revision
of the coreboot repo where everything works
2) fail_1_3b02006afe8a85477dafa1bd149f1f0dba02afc7.txt - the first
commit that broke the boards
3) fail_2_6b95507ec5b087658178a325bdc68570bc48bb20.txt - a log for the
tip of coreboot's master

For some reason logs for 2) and 3) always stop after "PCI: 00:12.2
EHCI Debug Port hook triggered".

I hope these commits can be reverted until we figure out what's
going on with them. Good thing we noticed it quickly.


Thanks, Mike. The AMD chipset code (all of it, from what I can tell) is fundamentally broken and at odds with the resource allocation flow. These boards worked previously only because dynamic resources were assigned by an algorithm that simply assumed there were no collisions, and that assignment was made without the information needed to make proper decisions about dynamic resource allocation.

I landed the other chipsets' fixes, but the AMD chipset code is going to take a lot more work to fix. Would you be willing to test patches as they are crafted? Given the size of the problem and how gnarly the AMD chipset code is, it's going to take some time, so I think we do need to revert the allocator changes until we can do some housekeeping.
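
To illustrate the collision problem described above, here is a minimal sketch in Python. It is not coreboot's allocator: the helper names `subtract_fixed` and `allocate` are hypothetical, and the bottom-up, first-fit placement is an assumption chosen for brevity. The window value is taken from the log quoted further down in this thread, and RAM is assumed to occupy the first 768 MiB.

```python
def subtract_fixed(window, fixed):
    """Return the parts of window (base, size) not covered by any fixed (base, size) range."""
    free = [(window[0], window[0] + window[1])]  # work with [start, end) pairs
    for fbase, fsize in fixed:
        fend = fbase + fsize
        remaining = []
        for start, end in free:
            if fend <= start or fbase >= end:
                remaining.append((start, end))  # no overlap, keep as-is
                continue
            if start < fbase:
                remaining.append((start, fbase))  # piece below the fixed range
            if fend < end:
                remaining.append((fend, end))  # piece above the fixed range
        free = remaining
    return [(start, end - start) for start, end in free]


def allocate(free, size, align):
    """First-fit: return the lowest aligned base that fits in a free range, else None."""
    for base, length in free:
        aligned = (base + align - 1) & ~(align - 1)
        if aligned + size <= base + length:
            return aligned
    return None


window = (0x0, 0xFF800000)   # memory window below 4 GiB, from the log
ram = (0x0, 768 << 20)       # assumption: RAM occupies [0, 768 MiB)
size = align = 0x10000000    # a 256 MiB request, like the prefmem resource in the log

naive = allocate([window], size, align)                        # ignores RAM
checked = allocate(subtract_fixed(window, [ram]), size, align)  # avoids RAM
print(hex(naive), hex(checked))
```

The naive call places the resource at address 0, on top of RAM. Once the fixed range is removed from the window first, the same request lands at 0x30000000, just above the assumed end of RAM.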

-Aaron
Best regards,
Mike Banon

On Thu, May 14, 2020 at 8:47 PM Keith Hui <buurin@gmail.com> wrote:
>
> Hi guys,
>
> 31ab7de51a is CB:41368, cherry picked into my local repo.
>
> Turns out I have to back out all four of Furquan's patches
> (CB:39486~39489) for my board to boot normally again.
>
> Thoughts?
>
> I'll now get a log with everything in at SPEW.
>
>
> On Thu, May 14, 2020 at 1:05 PM Aaron Durbin <adurbin@google.com> wrote:
> >
> > Keith, is it possible to have the console log level set to SPEW? I'm not seeing the full logs to piece it all together.
> >
> > Allocating resources...
> > Reading resources...
> > Setting RAM size to 768 MB
> > PNP: 03f0.8 missing read_resources
> > Done reading resources.
> > Resource allocator: DOMAIN: 0000 - Pass 1 (gathering requirements)
> > Resource allocator: DOMAIN: 0000 - Pass 2 (allocating resources)
> > Resource ranges:
> > Base: 1000, Size: d000, Tag: 100
> > Base: f000, Size: 1000, Tag: 100
> > Resource ranges:
> > Base: 0, Size: ff800000, Tag: 200
> > Base: 100000000, Size: f00000000, Tag: 100200
> > Resource ranges:
> > Base: 10000000, Size: 8000000, Tag: 1200
> > Resource ranges:
> > Base: 18000000, Size: 1100000, Tag: 200
> >
> > This is the memory address space:
> > Base: 0, Size: ff800000, Tag: 200
> > Base: 100000000, Size: f00000000, Tag: 100200
> >
> > Those are valid ranges to choose dynamic resources from.
> >
> > PCI: 00:00.0 10 <- [0x0000000000 - 0x000fffffff] size 0x10000000 gran 0x1c prefmem
> >
> > I see 'Setting RAM size to 768 MB' which means I would expect to see a hole in the ranges representing 768MiB.
> >
> > that would be bad. I don't know what commit '31ab7de51a' is, but it might not contain CB:41368. Having SPEW logs would be helpful.
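
The missing hole can be checked mechanically. The following is a small Python sketch, not part of coreboot: the `overlap` helper is hypothetical, the range values are copied from the log above, and it assumes RAM occupies [0, 768 MiB).

```python
MIB = 1 << 20


def overlap(a_base, a_size, b_base, b_size):
    """Return the (base, size) overlap of two ranges, or None if they are disjoint."""
    start = max(a_base, b_base)
    end = min(a_base + a_size, b_base + b_size)
    return (start, end - start) if end > start else None


ram = (0x0, 768 * MIB)  # assumption: RAM starts at 0 and is 768 MiB long

# "Resource ranges" reported for the memory address space in the log
mem_ranges = [(0x0, 0xFF800000), (0x100000000, 0xF00000000)]

# PCI: 00:00.0 resource 10, assigned [0x0000000000 - 0x000fffffff]
prefmem = (0x0, 0x10000000)

for base, size in mem_ranges:
    hit = overlap(base, size, *ram)
    if hit:
        print(f"range {base:#x}+{size:#x} still covers RAM at {hit[0]:#x}+{hit[1]:#x}")

hit = overlap(*prefmem, *ram)
if hit:
    print(f"prefmem resource collides with RAM at {hit[0]:#x}+{hit[1]:#x}")
```

Under those assumptions, the first memory range still covers all of RAM, and the prefmem resource assigned at address 0 sits inside it.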
> >
> > Also, what mainboard Kconfig are you selecting for p3bf? src/mainboard/asus/p2b ?
> >
> >
> >
> > On Thu, May 14, 2020 at 10:42 AM Keith Hui <buurin@gmail.com> wrote:
> >>
> >> (Temporarily leaving the list out)
> >>
> >> Hi Aaron,
> >>
> >> Here is a log with everything, including CB:41368, applied. I'll get
> >> this log out to you first, while I try a build with all problem
> >> commits left out.
> >>
> >> Thanks
> >> Keith
> >>
> >> On Thu, May 14, 2020 at 12:53 AM Aaron Durbin <adurbin@google.com> wrote:
> >> >
> >> >
> >> >
> >> > On Wed, May 13, 2020 at 10:51 PM Keith Hui <buurin@gmail.com> wrote:
> >> >>
> >> >> Hi guys,
> >> >>
> >> >> I tested these fixes on my board, and I have to say there's still
> >> >> something wrong. They did address the hang or reset in SeaBIOS I first
> >> >> described, but now either my ATA hard drive fails to boot (it tries
> >> >> to hand off to GRUB on my drive, but doesn't get there), or it can't
> >> >> find the option ROM of my video card, meaning no display.
> >> >>
> >> >> Now I want to try the other way, testing a build with all changes
> >> >> related to the problem backed out instead. So besides the one I first
> >> >> identified, what other related patches should I try backing out?
> >> >
> >> >
> >> > Just go to the parent of the identified patch.  As for the other symptoms you are seeing, I'd love to see logs with the patches we identified so we can root cause.
> >> >
> >> > Thanks.
> >> >
> >> > -Aaron
> >> >
> >> >>
> >> >> On Wed, May 13, 2020 at 11:54 PM Furquan Shaikh
> >> >> <furquan.m.shaikh@gmail.com> wrote:
> >> >> >
> >> >> > Similar fix for i440x: https://review.coreboot.org/c/coreboot/+/41368
> >> >> >
> >> >> > On Wed, May 13, 2020 at 11:29 AM Aaron Durbin <adurbin@google.com> wrote:
> >> >> > >
> >> >> > > The i440x chipset code is doing things the wrong way, like sandybridge. I uploaded this fix for sandy: https://review.coreboot.org/c/coreboot/+/41364 We'll need to do the equivalent for i440x.
> >> >> > >
> >> >> > > On Wed, May 13, 2020 at 11:13 AM Aaron Durbin <adurbin@google.com> wrote:
> >> >> > >>
> >> >> > >> OK. I'll take a look at your logs and see what's going on. The patch link I sent was based off of someone else's mainboard logs.
> >> >> > >>
> >> >> > >> On Wed, May 13, 2020 at 10:59 AM Keith Hui <buurin@gmail.com> wrote:
> >> >> > >>>
> >> >> > >>> Hi Aaron,
> >> >> > >>>
> >> >> > >>> It didn't help. There is still a way-out-of-whack entry in the coreboot
> >> >> > >>> table and an e820 entry ending at 000003ffffffffff, which I think goes
> >> >> > >>> beyond the scope of 41363.
> >> >> > >>>
> >> >> > >>> Keith
> >> >> > >>>
> >> >> > >>> On Wed, May 13, 2020 at 12:24 PM Aaron Durbin <adurbin@google.com> wrote:
> >> >> > >>> >
> >> >> > >>> > I think the following patch will fix things up: https://review.coreboot.org/c/coreboot/+/41363 Please let me know.
> >> >> > >>> >
> >> >> > >>> > On Wed, May 13, 2020 at 8:43 AM Keith Hui <buurin@gmail.com> wrote:
> >> >> > >>> >>
> >> >> > >>> >> Thanks Furquan.
> >> >> > >>> >>
> >> >> > >>> >> Here are 3 logs. Log 1 is at the commit just before the problem. Log 2
> >> >> > >>> >> is at the problem commit. Log 3 is at the current master, if that's
> >> >> > >>> >> what you meant by ToT.
> >> >> > >>> >>
> >> >> > >>> >> I'm using SeaBIOS 1.13.0, compiled once using the attached .config
> >> >> > >>> >> before taking these logs. All 3 runs are taken using the same SeaBIOS
> >> >> > >>> >> binary.
> >> >> > >>> >>
> >> >> > >>> >> Then I recompiled SeaBIOS with CONFIG_RELOCATE_INIT off, replaced the
> >> >> > >>> >> payload used in run 3, and took an extra run. In this case the board
> >> >> > >>> >> reset on its own at "Scanning option roms", looping infinitely.
> >> >> > >>> >>
> >> >> > >>> >> Hope this helps
> >> >> > >>> >> Keith
> >> >> > >>> >>
> >> >> > >>> >> On Wed, May 13, 2020 at 7:38 AM Furquan Shaikh
> >> >> > >>> >> <furquan.m.shaikh@gmail.com> wrote:
> >> >> > >>> >> >
> >> >> > >>> >> > Thanks for the report Keith!
> >> >> > >>> >> >
> >> >> > >>> >> > On Wed, May 13, 2020 at 3:42 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
> >> >> > >>> >> > >
> >> >> > >>> >> > > Dear Keith,
> >> >> > >>> >> > >
> >> >> > >>> >> > >
> >> >> > >>> >> > > Am 13.05.20 um 05:21 schrieb Keith Hui:
> >> >> > >>> >> > >
> >> >> > >>> >> > > > I am still refining the P2B family of boards, now including the
> >> >> > >>> >> > > > infamous P3B-F with an unusual appetite for hacks to make work.
> >> >> > >>> >> > > >
> >> >> > >>> >> > > > That said, I'm now finding that, on P3B-F, SeaBIOS hangs when it tries
> >> >> > >>> >> > > > to relocate itself as part of its usual chores. Having just learned
> >> >> > >>> >> > > > git bisect, I decided to try it out.
> >> >> > >>> >> > > >
> >> >> > >>> >> > > > It was commit 3b02006afe8a85477dafa1bd149f1f0dba02afc7 [1] that broke
> >> >> > >>> >> > > > my SeaBIOS. It doesn't affect my newer toy the P8Z77-M as much as
> >> >> > >>> >> > > > P3B-F, but I still want to blame that, and probably the very next
> >> >> > >>> >> > > > commit as well, as they both deal with some very modern aspects of PCI
> >> >> > >>> >> > > > that the 440BX well predates.
> >> >> > >>> >> > > >
> >> >> > >>> >> > > > Is there anything we can do to fix 3b02006afe?
> >> >> > >>> >> > >
> >> >> > >>> >> > > I commented in the change-set [1] to make the author and reviewers aware
> >> >> > >>> >> > > of this issue, referenced your list message, and asked them to comment here.
> >> >> > >>> >> > >
> >> >> > >>> >> > > Could you please provide the debug log of coreboot and SeaBIOS?
> >> >> > >>> >> >
> >> >> > >>> >> > As Paul mentioned, can you please provide the debug logs for coreboot
> >> >> > >>> >> > and SeaBIOS both with ToT coreboot and with HEAD set before the change
> >> >> > >>> >> > 3b02006afe where it does not hang? Thanks!
> >> >> > >>> >> >
> >> >> > >>> >> > >
> >> >> > >>> >> > >
> >> >> > >>> >> > > > Meanwhile I ported the P3B-F board enable to flashrom [2], which got a
> >> >> > >>> >> > > > heavy workout during this bisect, through vendor firmware and both
> >> >> > >>> >> > > > good and bad builds of coreboot. In all cases I can flash internal, no
> >> >> > >>> >> > > > longer having to haul out my P2B-LS just to use it as a flasher.
> >> >> > >>> >> > > >
> >> >> > >>> >> > > > Enjoy this long overdue board enable. If it gets submitted, I'll
> >> >> > >>> >> > > > retract the ramstage hack[3] doing the same as redundant.
> >> >> > >>> >> > >
> >> >> > >>> >> > > Very nice! It’s always amazing, how after so many years, when the vendor
> >> >> > >>> >> > > already stopped supporting the device, the community still supports the
> >> >> > >>> >> > > device and improves the firmware showing that Free Software is the more
> >> >> > >>> >> > > sustainable way.
> >> >> > >>> >> > >
> >> >> > >>> >> > >
> >> >> > >>> >> > > Kind regards,
> >> >> > >>> >> > >
> >> >> > >>> >> > > Paul
> >> >> > >>> >> > >
> >> >> > >>> >> > >
> >> >> > >>> >> > > > [1] https://review.coreboot.org/c/coreboot/+/39486
> >> >> > >>> >> > > > [2] https://review.coreboot.org/c/flashrom/+/41354
> >> >> > >>> >> > > > [3] https://review.coreboot.org/c/coreboot/+/41224
> >> >> > >>> >> > > _______________________________________________
> >> >> > >>> >> > > coreboot mailing list -- coreboot@coreboot.org
> >> >> > >>> >> > > To unsubscribe send an email to coreboot-leave@coreboot.org