Re: [SeaBIOS] Problem with Debug lvl under XEN

Ian Campbell

13 Feb 2012

9:50 p.m.

On Mon, 2012-02-13 at 23:21 +0900, Daniel Castro wrote:

...

Hello,

I have encountered something a little strange: if I set the debug level to 3 or more, I get a triple fault on the VCPU. If I set it to 1, the BIOS runs normally, but I lose a lot of information that I need for debugging. Sometimes, if I try to print char * variables, I still get the fault regardless of the debug level.

Any ideas why?

My guess is that there is a debug print at lvl>=3 which ends up dereferencing a NULL pointer in one of its arguments (probably a %s) and this leads to a page fault. This in turn leads to a double fault because SeaBIOS does not install a page fault handler and then a triple fault because it also does not install a double fault handler. Likewise when you are printing "char * variables regardless of the debug level".

You could test this by adding an explicit check for null in the bit of bvprintf which handles %s, perhaps putc()ing "(null)" instead.
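The NULL check suggested above can be sketched roughly as follows. This is a standalone illustration, not SeaBIOS's actual bvprintf: `struct sink`, `sink_putc`, `sink_puts` and `mini_printf` are hypothetical names invented for the example, and the formatter only understands `%s` and `%%`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output sink: collects characters in a fixed buffer. */
struct sink {
    char buf[128];
    size_t len;
};

/* Append one character, always keeping the buffer NUL-terminated. */
static void sink_putc(struct sink *s, char c)
{
    assert(s != NULL);
    if (s->len + 1 < sizeof(s->buf)) {
        s->buf[s->len++] = c;
        s->buf[s->len] = '\0';
    }
}

/* Append a NUL-terminated string one character at a time. */
static void sink_puts(struct sink *s, const char *str)
{
    while (*str)
        sink_putc(s, *str++);
}

/* Tiny formatter (assumption: only %s and %% are supported). */
static void mini_printf(struct sink *s, const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            sink_putc(s, *fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0')
            break;              /* stray '%' at end of format string */
        if (*fmt == 's') {
            const char *str = va_arg(args, const char *);
            /* The explicit check: never walk a NULL string. */
            sink_puts(s, str ? str : "(null)");
        } else {
            sink_putc(s, *fmt); /* "%%" emits '%'; unknown specifiers emit the letter */
        }
    }
    va_end(args);
}
```

With a zero-initialised `struct sink s`, calling `mini_printf(&s, "dev=%s", (const char *)NULL)` leaves `dev=(null)` in `s.buf` instead of reading through the NULL pointer. Note that this only catches NULL; any other invalid pointer would still be dereferenced.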

...

Thanks for the ideas/pointers/anything!

Daniel

Daniel Castro

14 Feb 2012

12:21 a.m.

New subject: Problem with Debug lvl under XEN

On Tue, Feb 14, 2012 at 5:50 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:

...

Thanks for the response, I will try what you suggest.

--
This space intentionally blank for notetaking.
Daniel Castro,
Consultant/Programmer.
U Andes

Kevin O'Connor

1:33 a.m.

On Mon, Feb 13, 2012 at 08:50:56PM +0000, Ian Campbell wrote:

...

My guess is that there is a debug print at lvl>=3 which ends up dereferencing a NULL pointer in one of its arguments (probably a %s) and this leads to a page fault. This in turn leads to a double fault because SeaBIOS does not install a page fault handler and then a triple fault because it also does not install a double fault handler. Likewise when you are printing "char * variables regardless of the debug level".

SeaBIOS doesn't have paging enabled, so it should not need to install a page fault handler. SeaBIOS needs to write the real-mode interrupt descriptor table to address 0, so it should definitely have read/write access to the memory there. Thus, a null pointer dereference shouldn't cause a fault. Indeed, I can't think of much that should cause a fault (other than read/write to IO memory incorrectly, divide by zero, invalid opcode, etc.).
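To illustrate the point about address 0, here is a small simulation (not SeaBIOS code) of the first 1 KiB of memory as the real-mode interrupt vector table: 256 entries of 4 bytes each, stored as a 16-bit offset followed by a 16-bit segment. `struct sim_mem`, `ivt_set` and `sim_read8` are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IVT_ENTRIES    256  /* real-mode IVT has 256 vectors */
#define IVT_ENTRY_SIZE 4    /* each vector is offset:segment, 2 bytes each */

/* Simulated physical memory covering 0x000-0x3FF only. */
struct sim_mem {
    uint8_t bytes[IVT_ENTRIES * IVT_ENTRY_SIZE];
};

/* Store a far pointer (seg:off) for the given vector, little-endian. */
static void ivt_set(struct sim_mem *m, unsigned vec, uint16_t seg, uint16_t off)
{
    assert(vec < IVT_ENTRIES);
    uint8_t *e = &m->bytes[vec * IVT_ENTRY_SIZE];
    e[0] = (uint8_t)(off & 0xff);
    e[1] = (uint8_t)(off >> 8);
    e[2] = (uint8_t)(seg & 0xff);
    e[3] = (uint8_t)(seg >> 8);
}

/* What a "NULL dereference" amounts to here: an ordinary byte read. */
static uint8_t sim_read8(const struct sim_mem *m, uint32_t addr)
{
    assert(addr < sizeof(m->bytes));
    return m->bytes[addr];
}
```

In this model, reading address 0 simply returns the low byte of vector 0's offset. A `%s` handed a NULL pointer would therefore print whatever bytes the table holds up to the first zero byte, which is garbage but not a fault.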

...

You could test this by adding an explicit check for null in the bit of bvprintf which handles %s, perhaps putc()ing "(null)" instead.

If you think it is specific to the Xen handling, one could also try running the same code on qemu to verify it.

-Kevin

Ian Campbell

11:08 a.m.

On Tue, 2012-02-14 at 00:33 +0000, Kevin O'Connor wrote:

...

SeaBIOS doesn't have paging enabled, so it should not need to install a page fault handler.

Doh, yes you are obviously right!

In my defence, when running virtualised, paging may actually be enabled, contrary to what the guest thinks is going on (I think this is needed in order to run real-mode code on EPT with a 1-1 map).

Really the hypervisor should completely hide this from the guest. I'm not actually sure what Xen does but it may well take the easy way out and rely on the BIOS not faulting... It still ought to print at least the faulting address and IP on triple fault though. It may be useful for Daniel to patch xen/arch/x86/hvm/hvm.c:hvm_triple_fault to add this information.

...

SeaBIOS needs to write the real-mode interrupt descriptor table to address 0, so it should definitely have read/write access to the memory there. Thus, a null pointer dereference shouldn't cause a fault. Indeed, I can't think of much that should cause a fault (other than read/write to IO memory incorrectly, divide by zero, invalid opcode, etc.).

An invalid pointer other than NULL might also do it, e.g. I think Xen scrubs memory (in a debug build) to something like 0xcc.

In that case a NULL check won't work but I suppose one could use a patch which treats %s as %p for the purposes of debugging it...
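The idea of treating %s as %p while debugging amounts to printing the pointer's value rather than the bytes it points at. A minimal sketch, assuming a hypothetical helper `format_ptr` (not part of SeaBIOS) that never dereferences its argument:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Write the address held in ptr as "0x" followed by fixed-width lowercase
 * hex into out. The pointer itself is never dereferenced, so a scrubbed or
 * otherwise bogus value is safe to print.
 */
static void format_ptr(char *out, size_t outlen, const void *ptr)
{
    static const char digits[] = "0123456789abcdef";
    uintptr_t v = (uintptr_t)ptr;
    size_t ndigits = sizeof(v) * 2;     /* two hex digits per byte */

    assert(out != NULL);
    assert(outlen >= ndigits + 3);      /* "0x" + digits + NUL */

    out[0] = '0';
    out[1] = 'x';
    for (size_t i = 0; i < ndigits; i++) {
        unsigned shift = (unsigned)(4 * (ndigits - 1 - i));
        out[2 + i] = digits[(v >> shift) & 0xf];
    }
    out[2 + ndigits] = '\0';
}
```

A pointer filled with a scrub pattern such as 0xcccccccc would then show up in the log as a recognisable run of `c` digits, which a NULL check alone would not reveal.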

...

...
You could test this by adding an explicit check for null in the bit of bvprintf which handles %s, perhaps putc()ing "(null)" instead.

If you think it is specific to the Xen handling, one could also try running the same code on qemu to verify it.

Also trying the underlying SeaBIOS version without any local patches would be a good idea if you haven't already.

Ian.

Daniel Castro

11:38 a.m.

On Tue, Feb 14, 2012 at 7:08 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:

...

Well, I suspected some limitation on the stack or something like that, so I decided to divide the code into a succession of function calls, for example:

int share_vbd(char *device);
int share_vbd2(char *device, char *state);
int share_vbd3(char *device, char *state, char *back_end_path);

etc...

Anyway, the fault is no longer present. It is the same code, just called as a succession of functions... So my best guess is that I was overrunning the stack.

Thank you all for the suggestions; I will implement Ian's.
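A stack-overrun guess like this one can be checked more directly by painting the stack region with a known pattern and later counting how much of it is still untouched. The sketch below works on an ordinary byte array standing in for the stack; `STACK_FILL`, `stack_paint` and `stack_headroom` are hypothetical names, and nothing here reflects how SeaBIOS actually lays out its stacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5  /* arbitrary marker byte (assumption) */

/* Fill the whole region with the marker before the code under test runs. */
static void stack_paint(uint8_t *base, size_t size)
{
    assert(base != NULL);
    memset(base, STACK_FILL, size);
}

/*
 * Assuming the stack grows downward from base + size toward base, count
 * the bytes at the low end that still hold the marker. A result of 0
 * suggests the stack was used up completely and may have been overrun.
 */
static size_t stack_headroom(const uint8_t *base, size_t size)
{
    size_t n = 0;
    assert(base != NULL);
    while (n < size && base[n] == STACK_FILL)
        n++;
    return n;
}
```

If the headroom reported after running the single large function is zero while the split-up version leaves some margin, that would support the overrun explanation; a stack frame that happens to contain the marker byte at its lowest address can make the count slightly optimistic.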

Daniel

