On Sun, Feb 21, 2010 at 04:18:38PM -0700, Brandon Bennett wrote:
> > On Sat, Feb 20, 2010 at 9:05 PM, Kevin O'Connor <kevin(a)koconnor.net> wrote:
> >> Should a kernel fail during boot, I'd suspect it doesn't like one of
> >> the apm/pcibios callbacks, or it doesn't like one of the
> >> smbios/mptable/acpi tables. You could try compiling the SeaBIOS code
> >> (see http://seabios.org/Download ) and increasing the debugging by
> >> modifying src/config.h. Specifically, you could increase
> >> CONFIG_DEBUG_LEVEL, and set DEBUG_HDL_pcibios32 and DEBUG_HDL_apm to
> >> 1. Also, you could try disabling some of the features to see if that
> >> prevents the fault (eg, disabling CONFIG_ACPI / CONFIG_SMBIOS /
> >> CONFIG_MPTABLE).
> >
>
> I have narrowed it down to SMBIOS. If I disable CONFIG_SMBIOS the
> image boots up fine.
Gleb, have you seen this thread?
Some of the recent changes to smbios that look like possible culprits
are:
Make SMBIOS table pass MS SVVP test
Use MaxCountCPUs during building of per cpu tables.
Add malloc_high/fseg() and rework bios table creation to use them.
There were other changes, but the comments indicate they were only
ports of changes already in bochs. I suppose it's also possible the
lack of smbios is turning off some other feature in the guest (eg,
acpi) that's the real culprit.
-Kevin
Added Cc: seabios(a)seabios.org
On Wed, Jul 21, 2010 at 06:31:01AM +0300, Michael S. Tsirkin wrote:
> On Tue, Jul 20, 2010 at 06:52:23PM +0900, Isaku Yamahata wrote:
> > On Wed, Jul 14, 2010 at 09:10:28AM -0600, Cam Macdonell wrote:
> > > On Tue, Jul 13, 2010 at 8:52 PM, Isaku Yamahata <yamahata(a)valinux.co.jp> wrote:
> > > > On Tue, Jul 13, 2010 at 04:48:19PM -0600, Cam Macdonell wrote:
> > > >> On Tue, Jul 13, 2010 at 2:41 PM, Isaku Yamahata <yamahata(a)valinux.co.jp> wrote:
> > > >> > On Tue, Jul 13, 2010 at 02:05:51PM -0600, Cam Macdonell wrote:
> > > >> >> >> > Seabios completely ignore the 64-bitness of the BAR. ?Looks like it also
> > > >> >> >> > thinks the second half of the BAR is an I/O region instead of memory (hence
> > > >> >> >> > the c200, that's part of the pci portio region.
> > > >> >> >
> > > >> >> > I've sent the patches to address it. But they haven't been merged yet.
> > > >> >> > seabios doesn't map BARs beyond 4GB.
> > > >> >> > If bar is mapped beyond 4GB, guest BIOS does it.
> > > >> >>
> > > >> >> Have those patches been merged yet?
> > > >> >
> > > >> > They have been merged into seabios upstream now.
> > > >> > qemu seabios fork hasn't pulled for a while, though.
> > > >> >
> > > >> >
> > > >> >> > To see how seabios works, it would help to increase CONFIG_DEBUG_LEVEL
> > > >> >> > in config.h of seabios
> > > >> >>
> > > >> >> Where does the output from seabios end up? ?Inside dmesg?
> > > >> >
> > > >> > It outputs them to the serial console which qemu emulates.
> > > >> > seabios is out of kernel control, so dmesg doesn't show it.
> > > >> >
> > > >> >
> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
> > > >> >> >> pci_read_config: (val) 0xffffffff <- 0x1c (addr)
> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
> > > >> >> >> pci_read_config: (val) 0x0 <- 0x1c (addr)
> > > >> >> >> pci_write_config: (val) 0x0 -> 0x1c (addr)
> > > >> >> >
> > > >> >> > seabios BAR3. Not sure how it is mapped from this
> > > >> >> > message.
> > > >> >>
> > > >> >> Isn't the BAR3 from the fact that a 64-bit BAR would use both BAR2 and
> > > >> >> BAR3 to store all 64-bits?
> > > >> >
> > > >> > Yes. Seabios misbehaves. 64bit bar is(was) a missing feature.
> > > >> > --
> > > >> > yamahata
> > > >> >
> > > >> >
> > > >>
> > > >> With the latest seabios git passed via -bios, I no longer see the
> > > >> 48-bit address, but instead a 32-bit address and then
> > > >> ffffffff00000000. ?This guest has 1gb of RAM so the address isn't be
> > > >> mapped beyond 4g.
> > > >
> > > > Can I see the debug log like before?
> > > > (hopefully seabios with CONFIG_DEBUG_LEVEL enabled.)
> > >
> > > Here's the dump from SeaBIOS in the region related to the PCI devices.
> > > The SeaBIOS output is identical whether the BAR is 32-bit or 64-bit.
> > >
> > > PCI: bus=0 devfn=0x10: vendor_id=0x1013 device_id=0x00b8
> > > region 0: 0xf0000000
> > > region 1: 0xf2000000
> > > region 6: 0xf2010000
> > > PCI: bus=0 devfn=0x18: vendor_id=0x1af4 device_id=0x1000
> > > region 0: 0x0000c020
> > > region 1: 0xf2020000
> > > region 6: 0xf2030000
> > > PCI: bus=0 devfn=0x20: vendor_id=0x1af4 device_id=0x1110
> > > region 0: 0xf2040000
> > > region 1: 0xf2041000
> > > region 2: 0x00000000
> >
> > Is this region (region 2 of devfn=0x20: vendor_id=0x1af4 device_id=0x1110)
> > the BAR in quistion?
> > The value 0 seems odd. Probably BAR address calculation overflowed.
> > Currently seabios doesn't check overflow. I attached the patch.
> >
> >
> > > > Do you know who sets the BAR to ffffffff00000000?
> > >
> > > Here are the config reads/writes related to the 0x18/1c, the 'IVSHMEM'
> > > lines are from the map function passed to pci_register_bar(). It
> > > looks like SeaBIOS sets the address to 0 and then the potentially
> > > useful e0000000 address gets mangled into ffffffff000000.
> >
> > There is something wrong with the debug message of write case, I suppose.
> > All written value are 0, but the resulted effect doesn't seems so.
> >
> > >
> > > IVSHMEM: guest pci addr = 0, guest h/w addr = 1090912256, size = 536870912
> > >
> > > ...snip...
> > >
> > > pci_read_config: (val) 0x4 <- 0x18 (addr)
> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
> > > IVSHMEM: guest pci addr = e0000000, guest h/w addr = 1090912256, size = 20000000
> >
> > If 0 is written to 0x18, the bar address should be 0, but it says e0000000.
> >
> > > pci_read_config: (val) 0xe0000004 <- 0x18 (addr)
> >
> > The read value isn't 0. and so on...
> >
> > > pci_write_config: (val) 0x0 -> 0x18 (addr)
> > > pci_read_config: (val) 0x0 <- 0x1c (addr)
> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
> > > IVSHMEM: guest pci addr = ffffffff00000000, guest h/w addr =
> > > 1090912256, size = 20000000
> > > pci_read_config: (val) 0xffffffff <- 0x1c (addr)
> > > pci_write_config: (val) 0x0 -> 0x1c (addr)
> > >
> > > and with the 64-bit guest I get this error as well (recall the guest
> > > fails to boot on 64-bit)
> > >
> > > BUG: kvm_dirty_pages_log_change: invalid parameters
> > > 00000000f0000000-00000000f0ffffff
> >
> >
> > diff --git a/src/pciinit.c b/src/pciinit.c
> > index b110531..6eca2ce 100644
> > --- a/src/pciinit.c
> > +++ b/src/pciinit.c
> > @@ -90,7 +90,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
> > /* If pci_bios_prefmem_addr == 0, keep old behaviour */
> > pci_bios_prefmem_addr != 0) {
> > paddr = &pci_bios_prefmem_addr;
> > - if (ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
> > + if (ALIGN(*paddr, size) + size < *paddr ||
> > + ALIGN(*paddr, size) + size >= BUILD_PCIPREFMEM_END) {
> > dprintf(1,
> > "prefmem region of (bdf 0x%x bar %d) can't be mapped. "
> > "decrease BUILD_PCIMEM_SIZE and recompile. size %x\n",
> > @@ -99,7 +100,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
> > }
> > } else {
> > paddr = &pci_bios_mem_addr;
> > - if (ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
> > + if (ALIGN(*paddr, size) + size < *paddr ||
> > + ALIGN(*paddr, size) + size >= BUILD_PCIMEM_END) {
> > dprintf(1,
> > "mem region of (bdf 0x%x bar %d) can't be mapped. "
> > "increase BUILD_PCIMEM_SIZE and recompile. size %x\n",
>
> Looking at the source, all of the values like pci_bios_prefmem_addr seem to be
> 32 bit. Since in the spec prefetcheable memory is up to 64 bit,
> can't the math overflow, here and elsewhere?
> Maybe we should switch to 64 bit values all over ...
Make sense. I'll create a patch to convert them into u64.
>
> > @@ -116,12 +118,8 @@ static int pci_bios_allocate_region(u16 bdf, int region_num)
> >
> > int is_64bit = !(val & PCI_BASE_ADDRESS_SPACE_IO) &&
> > (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == PCI_BASE_ADDRESS_MEM_TYPE_64;
> > - if (is_64bit) {
> > - if (size > 0) {
> > - pci_config_writel(bdf, ofs + 4, 0);
> > - } else {
> > - pci_config_writel(bdf, ofs + 4, ~0);
> > - }
> > + if (is_64bit && size > 0) {
> > + pci_config_writel(bdf, ofs + 4, 0);
> > }
> > return is_64bit;
> > }
>
>
> Was there any reason we wrote all-ones there on size 0?
> BAR sizing?
No reason. It's just left over from debugging.
So I'd like to remove it.
--
yamahata
SeaBIOS is coming close to exceeding 128K in size. Right now, it's
approximately 107K on qemu with default options and 120K with
full debugging and all options enabled.
SeaBIOS currently contends with option roms for space. Every byte
that SeaBIOS uses is one less byte available for option roms. It's
technically possible for SeaBIOS to exceed 128K, but it's likely that
we'll start seeing option roms not fitting around that point.
It would also be nice to be able to do malloc_low allocations from the
e-segment. This too requires freeing up space that SeaBIOS is
currently using to store code.
I've been thinking of a few different ways to free up space. The most
promising approach seems to be to relocate all or part of SeaBIOS'
32bit "flat" code. This 32bit code is larger than the segmented code
(67K vs 40K) and it's the area that has been growing the most.
To relocate the code, the build can store a list of code that depends
on a fixed address and then at runtime SeaBIOS can modify the code
with the new address.
Some different permutations of this idea that I've come up with are:
1 - On boot, have SeaBIOS relocate all of it's 32bit code to permanent
high ram.
CONS: Reserves memory that then can't be used by the OS (67K
today).
CONS: SeaBIOS code is still limited to 256K size (c+d+e+f
segments)
2 - Have the build separate out the POST code from other 32bit code
(eg, boot & resume) and store the POST code in a separate
CBFS/fw_cfg "file". During boot SeaBIOS would extract the
separate POST code blob into temporary high ram and run it.
Currently, the POST code is the bulk of the 32bit "flat" code (57K
vs 10K).
CONS: It adds complexity during deployment as both the main
SeaBIOS blob and the CBFS/fw_cfg POST blob would need to be
copied.
3 - On boot, relocate only POST code to temporary high ram, and have
SeaBIOS reset the machine if the POST code is ever rerun.
CONS: Requires a reliable way of resetting the machine.
CONS: SeaBIOS code is still limited to 256K size (c+d+e+f
segments)
Any thoughts? I kinda prefer option 3. Resetting the machine on a
rerun of POST would be a good thing to do anyway, as trying to re-post
the machine has never really worked well. It does, however, make the
build a bit more complex.
-Kevin