When using seabios with Xen and qemu, I frequently get the following message:
WARNING - Timeout at i8042_flush:69!
I do have a slightly unusual configuration compared to the typical Xen configuration. We are running with no hvmloader. However I don't think that's related to this issue....
The flush routine attempts to read the queue depth's worth of data from the port and then checks to see if the queue is empty. There are two issues with this. First is that qemu and seabios disagree about the queue depth. Qemu's ps2 buffer has 256 entries, and I8042_BUFFER_SIZE is 16 for seabios. The second is that a command may arrive between the start of the flush loop and the completion.
I believe a more robust way to do this would be to first issue a RESET command, and then flush. That resolves the issue that I am seeing.
Signed-off By: John Baboval john.baboval@virtualcomputer.com --- diff --git a/src/ps2port.c b/src/ps2port.c index 58335af..4ed25bf 100644 --- a/src/ps2port.c +++ b/src/ps2port.c @@ -20,7 +20,7 @@ // Timeout value. #define I8042_CTL_TIMEOUT 10000
-#define I8042_BUFFER_SIZE 16 +#define I8042_BUFFER_SIZE 256
static int i8042_wait_read(void) @@ -391,7 +391,7 @@ handle_09(void) } v = inb(PORT_PS2_DATA);
- if (!(GET_EBDA(ps2ctr) & I8042_CTR_KBDINT)) + if (!(GET_EBDA(ps2ctr) & I8042_CTR_KBDINT)) // Interrupts not enabled. goto done;
@@ -409,13 +409,17 @@ done: static void keyboard_init(void *data) { + int ret; + u8 param[2]; /* flush incoming keys */ - int ret = i8042_flush(); + ret = ps2_kbd_command(ATKBD_CMD_RESET_BAT, param); + if (ret) + return; + ret = i8042_flush(); if (ret) return;
// Controller self-test. - u8 param[2]; ret = i8042_command(I8042_CMD_CTL_TEST, param); if (ret) return;
I somehow let a whitespace-only change sneak into that diff.... This one should be better:
On Thu, Oct 27, 2011 at 04:55:15PM -0400, John Baboval wrote:
When using seabios with Xen and qemu, I frequently get the following message:
WARNING - Timeout at i8042_flush:69!
I do have a slightly unusual configuration compared to the typical Xen configuration. We are running with no hvmloader. However I don't think that's related to this issue....
The flush routine attempts to read the queue depth's worth of data from the port and then checks to see if the queue is empty. There are two issues with this. First is that qemu and seabios disagree about the queue depth. Qemu's ps2 buffer has 256 entries, and I8042_BUFFER_SIZE is 16 for seabios. The second is that a command may arrive between the start of the flush loop and the completion.
I believe a more robust way to do this would be to first issue a RESET command, and then flush. That resolves the issue that I am seeing.
The ps2 port code is used on real hardware as well as on QEmu/Xen. The goal of the flush is two fold - to remove bytes in the i8042 buffer from a previous session and to be sure the i8042 is actually present and working. So, we can't issue a reset prior to flushing the i8042.
As for the queue length - this seems like a virtual machine quirk with timing. The goal of the code is to clear the i8042 buffer which shouldn't be more than a few bytes. On a real machine, the keyboard wouldn't have enough time to fill the i8042 before the flush would remove any characters.
It's possible to increase the timeout, but the increased boot time can be annoying for users on real hardware without a ps2 port. Another option may be to change the code so that it reads and discards continuosly for 1ms - that way it can clear however many bytes are present, but still move on quickly if the port isn't responding.
-Kevin
On 10/27/2011 07:56 PM, Kevin O'Connor wrote:
On Thu, Oct 27, 2011 at 04:55:15PM -0400, John Baboval wrote:
I believe a more robust way to do this would be to first issue a RESET command, and then flush. That resolves the issue that I am seeing.
As for the queue length - this seems like a virtual machine quirk with timing. The goal of the code is to clear the i8042 buffer which shouldn't be more than a few bytes. On a real machine, the keyboard wouldn't have enough time to fill the i8042 before the flush would remove any characters.
It's possible to increase the timeout, but the increased boot time can be annoying for users on real hardware without a ps2 port. Another option may be to change the code so that it reads and discards continuosly for 1ms - that way it can clear however many bytes are present, but still move on quickly if the port isn't responding.
Since the flush breaks out of the loop when the buffer is empty, increasing the count shouldn't increase boot times except in cases where the keyboard would have been broken anyway.
I like the time based mechanism, though. The reset and then looping for <buffer size> iterations was still a race between the loop and the key repeat or angry user.
I will re-code it as a time-based loop instead of a count based loop and re-submit it.
I'm also trying to figure out how my buffer is getting filled so early.
On Fri, Oct 28, 2011 at 09:50:18AM -0400, John Baboval wrote:
On 10/27/2011 07:56 PM, Kevin O'Connor wrote:
On Thu, Oct 27, 2011 at 04:55:15PM -0400, John Baboval wrote:
I believe a more robust way to do this would be to first issue a RESET command, and then flush. That resolves the issue that I am seeing.
As for the queue length - this seems like a virtual machine quirk with timing. The goal of the code is to clear the i8042 buffer which shouldn't be more than a few bytes. On a real machine, the keyboard wouldn't have enough time to fill the i8042 before the flush would remove any characters.
It's possible to increase the timeout, but the increased boot time can be annoying for users on real hardware without a ps2 port. Another option may be to change the code so that it reads and discards continuosly for 1ms - that way it can clear however many bytes are present, but still move on quickly if the port isn't responding.
Since the flush breaks out of the loop when the buffer is empty, increasing the count shouldn't increase boot times except in cases where the keyboard would have been broken anyway.
Many real machines use USB keyboards or don't have a keyboard at all. Granted, they usually emulate the i8042 anyway, but we've run into enough boards without a functional i8042 that this needs to work correctly.
I like the time based mechanism, though. The reset and then looping for <buffer size> iterations was still a race between the loop and the key repeat or angry user.
The goal of the flush isn't to get rid of keyboard keys, but to get rid of lingering responses from commands sent to the i8042. A spurious key event won't cause an issue during init, but a stale response to an i8042 command from a previous session will confuse things.
I will re-code it as a time-based loop instead of a count based loop and re-submit it.
I'm also trying to figure out how my buffer is getting filled so early.
Okay - thanks.
-Kevin
Attached is a re-write of my previous patch that changes i8042_flush to be timeout based rather than buffer size based.... Which turns out to not be significantly different, since the udelay in the loop is the basis for the timing. I created the I8042_FLUSH_TIMEOUT delay, and used it similarly to the I8042_CTL_TIMEOUT define; however the flush timeout is much shorter.
I was tempted to make the "50" that is hard coded into the udelays into a define to make the timeout defines more readable, but it seemed like unnecessary refactoring that is outside of the scope of this change.
As a tangentially related aside, I found my bug in the device model that was causing the queue to be full so early as well. So the keyboard is working for me even without this patch now.
-John
On Tue, Nov 01, 2011 at 02:24:48PM -0400, John Baboval wrote:
Attached is a re-write of my previous patch that changes i8042_flush to be timeout based rather than buffer size based.... Which turns out to not be significantly different, since the udelay in the loop is the basis for the timing. I created the I8042_FLUSH_TIMEOUT delay, and used it similarly to the I8042_CTL_TIMEOUT define; however the flush timeout is much shorter.
I was tempted to make the "50" that is hard coded into the udelays into a define to make the timeout defines more readable, but it seemed like unnecessary refactoring that is outside of the scope of this change.
As a tangentially related aside, I found my bug in the device model that was causing the queue to be full so early as well. So the keyboard is working for me even without this patch now.
Increasing the timeout from 800us to 15ms is annoying for those without a ps2 port. I was thinking something more like what's below. However, now that I think about it, the dprintf() could cause a delay when writing to the serial port which could skew the timing. Maybe just leave it alone if you're no longer seeing the issue (or alternatively the dprintf() could be removed from the loop).
-Kevin
--- a/src/ps2port.c +++ b/src/ps2port.c @@ -19,8 +19,7 @@
// Timeout value. #define I8042_CTL_TIMEOUT 10000 - -#define I8042_BUFFER_SIZE 16 +#define I8042_FLUSH_TIMEOUT 1 // ms
static int i8042_wait_read(void) @@ -56,14 +55,15 @@ static int i8042_flush(void) { dprintf(7, "i8042_flush\n"); - int i; - for (i=0; i<I8042_BUFFER_SIZE; i++) { + u64 end = calc_future_tsc(I8042_FLUSH_TIMEOUT); + for (;;) { u8 status = inb(PORT_PS2_STATUS); if (! (status & I8042_STR_OBF)) return 0; - udelay(50); u8 data = inb(PORT_PS2_DATA); dprintf(7, "i8042 flushed %x (status=%x)\n", data, status); + if (check_tsc(end)) + break; }
warn_timeout();