This series updates the generic code so that either "low" memory or
the f-segment can be used for storing drive mapping information. It
also updates the virtio code to use "low" memory for its allocations.
I think there is a good chance this series will, in practice, avoid
running out of memory when a large number of virtio drives are
available.
There are other ways to tackle this problem (eg, more of the virtio
storage could be moved into "high" memory), but I think this approach
has the benefit of avoiding a "big bang" patch series.
Comments welcome.
-Kevin
Kevin O'Connor (4):
boot: Rename drive_g to drive
disk: Don't require the 'struct drive_s' to be in the f-segment
block: Rename disk_op_s->drive_gf to drive_fl
virtio: Allocate drive_s storage in low memory
src/block.c | 36 ++++-----
src/block.h | 2 +-
src/boot.c | 16 ++--
src/cdrom.c | 6 +-
src/disk.c | 218 +++++++++++++++++++++++++--------------------------
src/hw/ahci.c | 4 +-
src/hw/ata.c | 26 +++---
src/hw/blockcmd.c | 10 +--
src/hw/esp-scsi.c | 2 +-
src/hw/floppy.c | 20 ++---
src/hw/lsi-scsi.c | 2 +-
src/hw/megasas.c | 2 +-
src/hw/mpt-scsi.c | 2 +-
src/hw/nvme.c | 2 +-
src/hw/pvscsi.c | 2 +-
src/hw/ramdisk.c | 2 +-
src/hw/sdcard.c | 2 +-
src/hw/usb-msc.c | 4 +-
src/hw/usb-uas.c | 2 +-
src/hw/virtio-blk.c | 14 ++--
src/hw/virtio-ring.c | 1 -
src/hw/virtio-scsi.c | 5 +-
22 files changed, 189 insertions(+), 191 deletions(-)
--
2.9.4
The first patch fixes a regression where many disks on a single SCSI
target could prevent LUN0 from booting on another target.
The second patch then exploits the new scanning algorithm to speed up
the bus scan on virtio-scsi, which has a higher limit for the target
than the other adaptors and therefore was slowed down the most.
Paolo Bonzini (2):
scsi: ensure LUN0 is added first
virtio-scsi: speed up SCSI bus scan
src/hw/blockcmd.c | 6 ++++--
src/hw/esp-scsi.c | 10 ++++++++--
src/hw/lsi-scsi.c | 10 ++++++++--
src/hw/mpt-scsi.c | 10 ++++++++--
src/hw/usb-uas.c | 1 +
src/hw/virtio-scsi.c | 46 ++++++++++++++++++++++++++++++++++++++--------
6 files changed, 67 insertions(+), 16 deletions(-)
--
2.13.0
Hi Gerd,
I was curious to see what the "sercon" stuff would look like in
SeaVGABIOS instead of in SeaBIOS. So, I put together a quick
prototype. The code is also at:
https://github.com/KevinOConnor/seabios/tree/testing
This is just a proof-of-concept thing - I didn't implement any of the
useful features you have in your series. Specifically, it doesn't
unclutter the serial output, doesn't implement cp437 translations, and
doesn't handle multi-byte input. It does does have basic input,
output, and split-output handling though.
I'm not sure if this is better than a SeaBIOS implementation. I
suspect it will be easier to handle vga quirks this way. However,
handling exotic serial outputs (eg, mmio and usb based) would be
better suited in SeaBIOS.
Thoughts?
-Kevin
Kevin O'Connor (3):
sercon: Initial support for VGA emulation over serial port
sercon: Add basic keyboard input support
sercon: Support "split output" mode
Makefile | 2 +-
vgasrc/Kconfig | 6 +
vgasrc/sercon.c | 356 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
vgasrc/vgabios.c | 15 ++-
vgasrc/vgabios.h | 5 +
vgasrc/vgaentry.S | 7 ++
vgasrc/vgahw.h | 72 ++++++++++-
vgasrc/vgainit.c | 12 +-
vgasrc/vgautil.h | 24 ++++
9 files changed, 482 insertions(+), 17 deletions(-)
create mode 100644 vgasrc/sercon.c
--
2.5.5
On 24/07/2017 7:53, Kinsella, Ray wrote:
>
Hi Ray,
Thank you for the details,
> So as it turns out at 512 devices, it is nothing to do SeaBIOS, it was the Kernel again.
> It is taking quite a while to startup, a little over two hours (7489 seconds).
> The main culprits appear to be enumerating/initializing the PCI Express ports and enabling interrupts.
>
> The PCI Express Root Ports are taking a long time to enumerate/ initializing.
> 42 minutes in total=2579/60=64 ports in total, 40 seconds each.
>
Even if I am not aware of how much time would take to init a bare-metal
PCIe Root Port, it seems too much.
> [ 50.612822] pci_bus 0000:80: root bus resource [bus 80-c1]
> [ 172.345361] pci 0000:80:00.0: PCI bridge to [bus 81]
> ...
> [ 2724.734240] pci 0000:80:08.0: PCI bridge to [bus c1]
> [ 2751.154702] ACPI: Enabled 2 GPEs in block 00 to 3F
>
> I assume the 1 hour (3827 seconds) below is being spent enabling interrupts.
>
Assuming you are referring to legacy interrupts, maybe is possible
to disable them and use only MSI/MSI-X for PCIe Root Ports
(based on user input, we can't disable INTx for all the ports)
> [ 2899.394288] ACPI: PCI Interrupt Link [GSIG] enabled at IRQ 22
> [ 2899.531324] ACPI: PCI Interrupt Link [GSIH] enabled at IRQ 23
> [ 2899.534778] ACPI: PCI Interrupt Link [GSIE] enabled at IRQ 20
> [ 6726.914388] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> [ 6726.937932] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
> [ 6726.964699] Linux agpgart interface v0.103
>
> There finally there is another 20 minutes to find in the boot.
>
> [ 7489.202589] virtio_net virtio515 enp193s0f0: renamed from eth513
>
> Poky (Yocto Project Reference Distro) 2.3 qemux86-64 ttyS0
>
> qemux86-64 login: root
>
> I will remove the virtio-net-pci devices and hotplug them instead.
> In theory it should improve boot time, at expense of incurring some of these costs at runtime.
>
I would appreciate if you can share the results.
Thanks,
Marcel
> Ray K
>
> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Sunday, July 23, 2017 1:05 PM
> To: Marcel Apfelbaum <marcel(a)redhat.com>; Kinsella, Ray <ray.kinsella(a)intel.com>
> Cc: qemu-devel(a)nongnu.org; seabios(a)seabios.org; Gerd Hoffmann <kraxel(a)redhat.com>; Michael Tsirkin <mst(a)redhat.com>
> Subject: Re: >256 Virtio-net-pci hotplug Devices
>
> On Sun, Jul 23, 2017 at 07:28:01PM +0300, Marcel Apfelbaum wrote:
>> On 22/07/2017 2:57, Kinsella, Ray wrote:
>>> When scaling up to 512 Virtio-net devices SeaBIOS appears to really
>>> slow down when configuring PCI Config space - haven't manage to get
>>> this to work yet.
>
> If there is a slowdown in SeaBIOS, it would help to produce a log with timing information - see:
> https://www.seabios.org/Debugging#Timing_debug_messages
>
> It may also help to increase the debug level in SeaBIOS to get more fine grained timing reports.
>
> -Kevin
>
This series introduces a new device - generic PCI Express to PCI bridge,
and also makes all necessary changes to enable hotplug of the bridge itself
and any device into the bridge.
Changes v2->v3:
(0). 'do_not_use' capability field flag is still _not_ in here since we haven't come to consesus on it yet.
1. Merge commits 5 (bus_reserve property creation) and 6 (property usage) together - addresses Michael's comment.
2. Add 'bus_reserve' property and QEMU PCI capability only to Generic PCIE Root Port - addresses Michael's and Marcel's comments.
3. Change 'bus_reserve' property's default value to 0 - addresses Michael's comment.
4. Rename QEMU bridge-specific PCI capability creation function - addresses Michael's comment.
5. Init the whole QEMU PCI capability with zeroes - addresses Michael's and Laszlo's comments.
6. Change QEMU PCI capability layout (SeaBIOS side has the same changes)
- add 'type' field to distinguish multiple
RedHat-specific capabilities - addresses Michael's comment
- do not mimiс PCI Config space register layout, but use mutually exclusive differently
sized fields for IO and prefetchable memory limits - addresses Laszlo's comment
7. Correct error handling in PCIE-PCI bridge realize function.
8. Replace a '2' constant with PCI_CAP_FLAGS in the capability creation function - addresses Michael's comment.
9. Remove a comment on _OSC which isn't correct anymore - address Marcel's comment.
10. Add documentation for the Generic PCIE-PCI Bridge and QEMU PCI capability - addresses Michael's comment.
Changes v1->v2:
1. Enable SHPC for the bridge.
2. Enable SHPC support for the Q35 machine (ACPI stuff).
3. Introduce PCI capability to help firmware on the system init.
This allows the bridge to be hotpluggable. Now it's supported
only for pcie-root-port. Now it's supposed to used with
SeaBIOS only, look at the SeaBIOS corresponding series
"".
Aleksandr Bezzubikov (5):
hw/i386: allow SHPC for Q35 machine
hw/pci: introduce pcie-pci-bridge device
hw/pci: introduce bridge-only vendor-specific capability to provide
some hints to firmware
hw/pci: add QEMU-specific PCI capability to Generic PCI Express Root
Port
docs: update documentation considering PCIE-PCI bridge
docs/pcie.txt | 46 ++++----
docs/pcie_pci_bridge.txt | 121 ++++++++++++++++++++
hw/i386/acpi-build.c | 4 +-
hw/pci-bridge/Makefile.objs | 2 +-
hw/pci-bridge/gen_pcie_root_port.c | 23 ++++
hw/pci-bridge/pcie_pci_bridge.c | 220 +++++++++++++++++++++++++++++++++++++
hw/pci-bridge/pcie_root_port.c | 2 +-
hw/pci/pci_bridge.c | 37 +++++++
include/hw/pci/pci.h | 1 +
include/hw/pci/pci_bridge.h | 28 +++++
include/hw/pci/pcie_port.h | 2 +
11 files changed, 462 insertions(+), 24 deletions(-)
create mode 100644 docs/pcie_pci_bridge.txt
create mode 100644 hw/pci-bridge/pcie_pci_bridge.c
--
2.7.4
Old operating systems would like to have a rev1 (ACPI 1.0) FADT, but
new operating systems would like to have rev3 (ACPI 2.0).
Since old operating systems do not know about XSDTs, the
solution is to point the RSDT to a v1 FADT and the XSDT to a
rev3 FADT.
Paolo Bonzini (2):
seabios: build RSDT from XSDT
seabios: create rev1 FADT in compatibility RSDT
src/fw/paravirt.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
src/std/acpi.h | 11 +++++++++
2 files changed, 78 insertions(+), 1 deletion(-)
--
2.13.3
On Thu, Jul 27, 2017 at 12:54:07AM +0300, Alexander Bezzubikov wrote:
> 2017-07-26 22:43 GMT+03:00 Michael S. Tsirkin <mst(a)redhat.com>:
> > On Sun, Jul 23, 2017 at 01:15:41AM +0300, Aleksandr Bezzubikov wrote:
> >> On PCI init PCI bridges may need some
> >> extra info about bus number to reserve, IO, memory and
> >> prefetchable memory limits. QEMU can provide this
> >> with special
> >
> > with a special
> >
> >> vendor-specific PCI capability.
> >>
> >> Sizes of limits match ones from
> >> PCI Type 1 Configuration Space Header,
> >> number of buses to reserve occupies only 1 byte
> >> since it is the size of Subordinate Bus Number register.
> >>
> >> Signed-off-by: Aleksandr Bezzubikov <zuban32s(a)gmail.com>
> >> ---
> >> hw/pci/pci_bridge.c | 27 +++++++++++++++++++++++++++
> >> include/hw/pci/pci_bridge.h | 18 ++++++++++++++++++
> >> 2 files changed, 45 insertions(+)
> >>
> >> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> >> index 720119b..8ec6c2c 100644
> >> --- a/hw/pci/pci_bridge.c
> >> +++ b/hw/pci/pci_bridge.c
> >> @@ -408,6 +408,33 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
> >> br->bus_name = bus_name;
> >> }
> >>
> >> +
> >> +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset,
> >
> > help? should be qemu_cap_init?
> >
> >> + uint8_t bus_reserve, uint32_t io_limit,
> >> + uint16_t mem_limit, uint64_t pref_limit,
> >> + Error **errp)
> >> +{
> >> + size_t cap_len = sizeof(PCIBridgeQemuCap);
> >> + PCIBridgeQemuCap cap;
> >
> > This leaks info to guest. You want to init all fields here:
> >
> > cap = {
> > .len = ....
> > };
>
> I surely can do this for len field, but as Laszlo proposed
> we can use mutually exclusive fields,
> e.g. pref_32 and pref_64, the only way I have left
> is to use ternary operator (if we surely need this
> big initializer). Keeping some if's would look better,
> I think.
>
> >
> >> +
> >> + cap.len = cap_len;
> >> + cap.bus_res = bus_reserve;
> >> + cap.io_lim = io_limit & 0xFF;
> >> + cap.io_lim_upper = io_limit >> 8 & 0xFFFF;
> >> + cap.mem_lim = mem_limit;
> >> + cap.pref_lim = pref_limit & 0xFFFF;
> >> + cap.pref_lim_upper = pref_limit >> 16 & 0xFFFFFFFF;
> >
> > Please use pci_set_word etc or cpu_to_leXX.
> >
>
> Since now we've decided to avoid fields separation into <field> + <field_upper>,
> this bitmask along with pci_set_word are no longer needed.
>
> > I think it's easiest to replace struct with a set of macros then
> > pci_set_word does the work for you.
> >
>
> I don't really want to use macros here because structure
> show us the whole capability layout and this can
> decrease documenting efforts. More than that,
> memcpy usage is very convenient here, and I wouldn't like
> to lose it.
>
> >
> >> +
> >> + int offset = pci_add_capability(dev, PCI_CAP_ID_VNDR,
> >> + cap_offset, cap_len, errp);
> >> + if (offset < 0) {
> >> + return offset;
> >> + }
> >> +
> >> + memcpy(dev->config + offset + 2, (char *)&cap + 2, cap_len - 2);
> >
> > +2 is yacky. See how virtio does it:
> >
> > memcpy(dev->config + offset + PCI_CAP_FLAGS, &cap->cap_len,
> > cap->cap_len - PCI_CAP_FLAGS);
> >
> >
>
> OK.
>
> >> + return 0;
> >> +}
> >> +
> >> static const TypeInfo pci_bridge_type_info = {
> >> .name = TYPE_PCI_BRIDGE,
> >> .parent = TYPE_PCI_DEVICE,
> >> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> >> index ff7cbaa..c9f642c 100644
> >> --- a/include/hw/pci/pci_bridge.h
> >> +++ b/include/hw/pci/pci_bridge.h
> >> @@ -67,4 +67,22 @@ void pci_bridge_map_irq(PCIBridge *br, const char* bus_name,
> >> #define PCI_BRIDGE_CTL_DISCARD_STATUS 0x400 /* Discard timer status */
> >> #define PCI_BRIDGE_CTL_DISCARD_SERR 0x800 /* Discard timer SERR# enable */
> >>
> >> +typedef struct PCIBridgeQemuCap {
> >> + uint8_t id; /* Standard PCI capability header field */
> >> + uint8_t next; /* Standard PCI capability header field */
> >> + uint8_t len; /* Standard PCI vendor-specific capability header field */
> >> + uint8_t bus_res;
> >> + uint32_t pref_lim_upper;
> >
> > Big endian? Ugh.
> >
>
> Agreed, and this's gonna to disappear with
> the new layout.
>
> >> + uint16_t pref_lim;
> >> + uint16_t mem_lim;
> >
> > I'd say we need 64 bit for memory.
> >
>
> Why? Non-prefetchable MEMORY_LIMIT register is 16 bits long.
Hmm ok, but e.g. for io there are bridges that have extra registers
to specify non-standard non-aligned registers.
> >> + uint16_t io_lim_upper;
> >> + uint8_t io_lim;
> >> + uint8_t padding;
> >
> > IMHO each type should have a special "don't care" flag
> > that would mean "I don't know".
> >
> >
>
> Don't know what? Now 0 is an indicator to do nothing with this field.
In that case how do you say "don't allocate any memory"?
> >> +} PCIBridgeQemuCap;
> >
> > You don't really need this struct in the header. And pls document all fields.
> >
> >> +
> >> +int pci_bridge_help_cap_init(PCIDevice *dev, int cap_offset,
> >> + uint8_t bus_reserve, uint32_t io_limit,
> >> + uint16_t mem_limit, uint64_t pref_limit,
> >> + Error **errp);
> >> +
> >> #endif /* QEMU_PCI_BRIDGE_H */
> >> --
> >> 2.7.4
>
>
>
> --
> Alexander Bezzubikov
On Mon, Jul 31, 2017 at 09:54:55PM +0300, Alexander Bezzubikov wrote:
> 2017-07-31 17:09 GMT+03:00 Marcel Apfelbaum <marcel(a)redhat.com>:
> > On 31/07/2017 17:00, Michael S. Tsirkin wrote:
> >>
> >> On Sat, Jul 29, 2017 at 02:34:31AM +0300, Aleksandr Bezzubikov wrote:
> >>>
> >>> On PCI init PCI bridge devices may need some
> >>> extra info about bus number to reserve, IO, memory and
> >>> prefetchable memory limits. QEMU can provide this
> >>> with special vendor-specific PCI capability.
> >>>
> >>> This capability is intended to be used only
> >>> for Red Hat PCI bridges, i.e. QEMU cooperation.
> >>>
> >>> Signed-off-by: Aleksandr Bezzubikov <zuban32s(a)gmail.com>
> >>> ---
> >>> src/fw/dev-pci.h | 62
> >>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> 1 file changed, 62 insertions(+)
> >>> create mode 100644 src/fw/dev-pci.h
> >>>
> >>> diff --git a/src/fw/dev-pci.h b/src/fw/dev-pci.h
> >>> new file mode 100644
> >>> index 0000000..fbd49ed
> >>> --- /dev/null
> >>> +++ b/src/fw/dev-pci.h
> >>> @@ -0,0 +1,62 @@
> >>> +#ifndef _PCI_CAP_H
> >>> +#define _PCI_CAP_H
> >>> +
> >>> +#include "types.h"
> >>> +
> >>> +/*
> >>> +
> >>> +QEMU-specific vendor(Red Hat)-specific capability.
> >>> +It's intended to provide some hints for firmware to init PCI devices.
> >>> +
> >>> +Its is shown below:
> >>> +
> >>> +Header:
> >>> +
> >>> +u8 id; Standard PCI Capability Header field
> >>> +u8 next; Standard PCI Capability Header field
> >>> +u8 len; Standard PCI Capability Header field
> >>> +u8 type; Red Hat vendor-specific capability type:
> >>> + now only REDHAT_QEMU_CAP 1 exists
> >>> +Data:
> >>> +
> >>> +u16 non_prefetchable_16; non-prefetchable memory limit
> >>> +
> >>> +u8 bus_res; minimum bus number to reserve;
> >>> + this is necessary for PCI Express Root Ports
> >>> + to support PCIE-to-PCI bridge hotplug
> >>> +
> >>> +u8 io_8; IO limit in case of 8-bit limit value
> >>> +u32 io_32; IO limit in case of 16-bit limit value
> >>> + io_8 and io_16 are mutually exclusive, in other words,
> >>> + they can't be non-zero simultaneously
> >>> +
> >>> +u32 prefetchable_32; non-prefetchable memory limit
> >>> + in case of 32-bit limit value
> >>> +u64 prefetchable_64; non-prefetchable memory limit
> >>> + in case of 64-bit limit value
> >>> + prefetachable_32 and prefetchable_64 are
> >>> + mutually exclusive, in other words,
> >>> + they can't be non-zero simultaneously
> >>> +If any field in Data section is 0,
> >>> +it means that such kind of reservation
> >>> +is not needed.
>
> I really don't like this 'mutually exclusive' fields approach because
> IMHO it increases confusion level when undertanding this capability structure.
> But - if we came to consensus on that, then IO fields should be used
> in the same way,
> because as I understand, this 'mutual exclusivity' serves to distinguish cases
> when we employ only *_LIMIT register and both *_LIMIT an UPPER_*_LIMIT
> registers.
> And this is how both IO and PREFETCHABLE works, isn't it?
I would just use simeple 64 bit registers. PCI spec has an ugly format
with fields spread all over the place but that is because of
compatibility concerns. It makes not sense to spend cycles just
to be similarly messy.
> >>
> >>
> >
> > Hi Michael,
> >
> >> We also want a way to say "no hint for this type".
> >>
> >> One way to achive this would be to have instead multiple
> >> vendor specific capabilities, one for each of
> >> bus#/io/mem/prefetch. 0 would mean do not reserve anything,
> >> absence of capability would mean "no info, up to firmware".
> >>
> >
> > First version of the series was implemented exactly like you propose,
> > however Gerd preferred only one capability with multiple fields.
> >
> > I personally like the simplicity of vendor cap per io/mem/bus,
> > even if it is on the expense of the limited PCI Config space.
> >
>
> Personally I agree with Marcel since all this fields express
> reservations of some objects.
>
> > We need a consensus here :)
> >
>
> Absolutely :)
>
> > Thanks,
> > Marcel
> >
> >
> >>
> >>> +
> >>> +*/
> >>> +
> >>> +/* Offset of vendor-specific capability type field */
> >>> +#define PCI_CAP_VNDR_SPEC_TYPE 3
> >>
> >>
> >> This is a QEMU specific thing. Please name it as such.
> >>
> >>> +
> >>> +/* List of valid Red Hat vendor-specific capability types */
> >>> +#define REDHAT_CAP_TYPE_QEMU 1
> >>> +
> >>> +
> >>> +/* Offsets of QEMU capability fields */
> >>> +#define QEMU_PCI_CAP_NON_PREF 4
> >>> +#define QEMU_PCI_CAP_BUS_RES 6
> >>> +#define QEMU_PCI_CAP_IO_8 7
> >>> +#define QEMU_PCI_CAP_IO_32 8
> >>> +#define QEMU_PCI_CAP_PREF_32 12
> >>> +#define QEMU_PCI_CAP_PREF_64 16
> >>> +#define QEMU_PCI_CAP_SIZE 24
> >>> +
> >>> +#endif /* _PCI_CAP_H */
> >>> --
> >>> 2.7.4
> >
> >
>
>
>
> --
> Alexander Bezzubikov