On Mon, Jan 25, 2010 at 05:46:42PM +0100, Jes Sorensen wrote:
Hi,
Right now KVM/QEMU relies on hard coded values in Seabios for the reserved area for the TSS pages and the EPT page.
I'd like to suggest we change this to pass the value from QEMU via qemu-cfg making it possible to move it around dynamically in the future.
Attached is a patch to Seabios for this, which defaults to the current hard coded value if no value is provided by qemu-cfg. We can remove the backwards compatibility later.
I'll post the QEMU patches for upstream QEMU and QEMU-KVM in a minute.
Comments most welcome!
I like the idea, but I think it would be better to pass a list of e820 entries explicitly. That is, pass an array of:
struct e820entry { u64 start; u64 size; u32 type; };
where 'type' uses the standard e820 definitions. That way, SeaBIOS can just walk through the list and add them to its e820 map. BTW, this is what SeaBIOS does when running under coreboot (coreboot passes a memory map as part of the "coreboot tables"). SeaBIOS is already smart enough to not use any high memory addresses marked as reserved.
-Kevin
Hi,
Based on the feedback I received over the e820 reserve patch, I have changed it to have QEMU pass in a list of entries that can cover more than just the TSS/EPT range. This should provide the flexibility that people were asking for.
The Seabios portion should allow for unlimited sized tables in theory, whereas for QEMU I have set a fixed limit for now, but it can easily be extended.
Please let me know what you think of this version!
Cheers, Jes
Hi,
This is the QEMU-KVM part of the patch. If we can agree on this approach, I will do a version for upstream QEMU as well.
Cheers, Jes
On 26.01.2010, at 22:53, Jes Sorensen wrote:
Hi,
This is the QEMU-KVM part of the patch. If we can agree on this approach, I will do a version for upstream QEMU as well.
It shows as attachment again :(.
Alex
Cheers, Jes
<0011-qemu-kvm-e820-table.patch>
On Tue, Jan 26, 2010 at 10:52:12PM +0100, Jes Sorensen wrote:
Read optional table of e820 entries from qemu_cfg
[...]
--- seabios.orig/src/paravirt.c +++ seabios/src/paravirt.c @@ -132,6 +132,23 @@ u16 qemu_cfg_smbios_entries(void) return cnt; }
+u32 qemu_cfg_e820_entries(void) +{
- u32 cnt;
- if (!qemu_cfg_present)
return 0;
- qemu_cfg_read_entry(&cnt, QEMU_CFG_E820_TABLE, sizeof(cnt));
- return cnt;
+}
+void* qemu_cfg_e820_load_next(void *addr) +{
- qemu_cfg_read(addr, sizeof(struct e820_entry));
- return addr;
+}
I think defining accessor functions for every piece of data passed through qemu-cfg interface is going to get tiring. I'd prefer to extend the existing qemu-cfg "file" interface for new content.
For example, add a helper with something like:
int qemu_cfg_get_file(const char *name, void *dest, int maxsize);
- if (kvm_para_available())
// 4 pages before the bios, 3 pages for vmx tss pages, the
// other page for EPT real mode pagetable
add_e820(0xfffbc000, 4*4096, E820_RESERVED);
- if (kvm_para_available()) {
u32 count;
count = qemu_cfg_e820_entries();
if (count) {
struct e820_entry entry;
int i;
for (i = 0; i < count; i++) {
qemu_cfg_e820_load_next(&entry);
add_e820(entry.address, entry.length, entry.type);
}
and then this becomes:
struct e820entry map[128]; int len = qemu_cfg_get_file("e820map", &map, sizeof(map)); if (len >= 0) for (i=0; i<len / sizeof(map[0]); i++) add_e820(map[i].start, map[i].size, map[i].type);
The advantage being that it should be possible to write one set of helper functions in both qemu and seabios that can then be used to pass arbitrary content.
As a side note, it should probably do the e820 map check even for qemu users (ie, not just kvm).
-Kevin
On 01/28/10 05:39, Kevin O'Connor wrote:
I think defining accessor functions for every piece of data passed through qemu-cfg interface is going to get tiring. I'd prefer to extend the existing qemu-cfg "file" interface for new content.
For example, add a helper with something like:
int qemu_cfg_get_file(const char *name, void *dest, int maxsize);
Hi Kevin,
I think switching qemu_cfg to use a file name based interface would be a nice feature, but I think it should be independent of this patch. I am CC'ing Gleb on this as he did the original design I believe.
- if (kvm_para_available())
// 4 pages before the bios, 3 pages for vmx tss pages, the
// other page for EPT real mode pagetable
add_e820(0xfffbc000, 4*4096, E820_RESERVED);
- if (kvm_para_available()) {
u32 count;
count = qemu_cfg_e820_entries();
if (count) {
struct e820_entry entry;
int i;
for (i = 0; i< count; i++) {
qemu_cfg_e820_load_next(&entry);
add_e820(entry.address, entry.length, entry.type);
}
and then this becomes:
struct e820entry map[128]; int len = qemu_cfg_get_file("e820map",&map, sizeof(map)); if (len>= 0) for (i=0; i<len / sizeof(map[0]); i++) add_e820(map[i].start, map[i].size, map[i].type);
The advantage being that it should be possible to write one set of helper functions in both qemu and seabios that can then be used to pass arbitrary content.
The only issue here is that I designed the Seabios portion to not rely on the size of the struct, to avoid having to statically reserve it like in your example. Having the qemu_cfg_get_file() function return a pointer to a file descriptor and then have a qemu_cfg_read() helper that takes the descriptor as it's first argument would avoid this problem.
As a side note, it should probably do the e820 map check even for qemu users (ie, not just kvm).
Ah I didn't realize Seabios would try to use the fw_cfg interface if it wasn't running on top of QEMU. That would be good to do.
Cheers, Jes
On Fri, Jan 29, 2010 at 10:03:55AM +0100, Jes Sorensen wrote:
On 01/28/10 05:39, Kevin O'Connor wrote:
I think defining accessor functions for every piece of data passed through qemu-cfg interface is going to get tiring. I'd prefer to extend the existing qemu-cfg "file" interface for new content.
For example, add a helper with something like:
int qemu_cfg_get_file(const char *name, void *dest, int maxsize);
Hi Kevin,
I think switching qemu_cfg to use a file name based interface would be a nice feature, but I think it should be independent of this patch. I am CC'ing Gleb on this as he did the original design I believe.
There is already file like interface on top of fw_cfg. Look for qemu_cfg_read_file(). I am not sure this is a good idea to start using it for something that is not actually a file. I have no problem with adding accessors for each new data time. As you noted below this way we don't need to load the whole e820 map into the memory, but can do entry by entry.
-- Gleb.
On Fri, Jan 29, 2010 at 10:03:55AM +0100, Jes Sorensen wrote:
On 01/28/10 05:39, Kevin O'Connor wrote:
The advantage being that it should be possible to write one set of helper functions in both qemu and seabios that can then be used to pass arbitrary content.
The only issue here is that I designed the Seabios portion to not rely on the size of the struct, to avoid having to statically reserve it like in your example. Having the qemu_cfg_get_file() function return a pointer to a file descriptor and then have a qemu_cfg_read() helper that takes the descriptor as it's first argument would avoid this problem.
SeaBIOS already has a maximum size for the e820 map (32) - see CONFIG_MAX_E820.
As a side note, it should probably do the e820 map check even for qemu users (ie, not just kvm).
Ah I didn't realize Seabios would try to use the fw_cfg interface if it wasn't running on top of QEMU. That would be good to do.
Your patch only used it for kvm. SeaBIOS will use fw_cfg on both qemu and kvm.
-Kevin
On 01/28/10 05:39, Kevin O'Connor wrote:
As a side note, it should probably do the e820 map check even for qemu users (ie, not just kvm).
Hi Kevin,
Here is an updated version of the patch which does the e820 read unconditionally of the return from kvm_para_available() so it should work for coreboot too.
I haven't touched the file descriptor issue as I find it's a different discussion.
Let me know what you think.
Cheers, Jes
PS: I tried to subscribe to the seabios list but I never got an ack for it :(
On Mon, Feb 08, 2010 at 11:31:40AM +0100, Jes Sorensen wrote:
On 01/28/10 05:39, Kevin O'Connor wrote:
As a side note, it should probably do the e820 map check even for qemu users (ie, not just kvm).
Hi Kevin,
Here is an updated version of the patch which does the e820 read unconditionally of the return from kvm_para_available() so it should work for coreboot too.
I haven't touched the file descriptor issue as I find it's a different discussion.
I'd prefer to use the "file" method - but I wont hold up your patch for it. If the host part of your patch is committed to qemu, I'll commit the SeaBIOS part.
[...]
+struct e820_entry {
- u64 address;
- u64 length;
- u32 type;
+};
I find this struct to be easily confused with 'struct e820entry' in memmap.h. Both code should use the same struct, or the new struct should clearly indicate it's for qemu (eg, "qemu_e820_entry").
Thanks, -Kevin