[LinuxBIOS] V3 fallback, failover, and the normal boot flag.


Marc Jones

18 Jul 2007

12:11 a.m.

How should this work in V3? The current implementation doesn't really make sense to me. See do_normal_boot() in V2.

I think that stage0+1 is the equivalent of V2 failover. In stage1 the normal boot flag cmos byte is checked to see if the normal or the fallback image should be loaded.

The cmos byte is not documented well. I think it is as follows:

[7-4] - boot count
[3-2] - not defined
[1] - last boot flag
[0] - normal boot flag
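As a minimal sketch of the layout guessed above, the byte could be unpacked and repacked as follows. The names `BootByte`, `decode_boot_byte` and `encode_boot_byte` are illustrative only and are not taken from the V2 or V3 source; the bit positions are the ones guessed in this mail.

```python
from typing import NamedTuple


class BootByte(NamedTuple):
    """Fields of the CMOS boot byte as guessed in this mail (hypothetical names)."""
    boot_count: int      # bits 7-4
    last_boot: bool      # bit 1
    normal_boot: bool    # bit 0


def decode_boot_byte(value: int) -> BootByte:
    """Split a raw CMOS byte into the guessed fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("CMOS byte out of range")
    return BootByte(
        boot_count=(value >> 4) & 0x0F,
        last_boot=bool(value & 0x02),
        normal_boot=bool(value & 0x01),
    )


def encode_boot_byte(fields: BootByte) -> int:
    """Pack the fields back into a byte; the undefined bits 3-2 are written as zero."""
    return (
        ((fields.boot_count & 0x0F) << 4)
        | (int(fields.last_boot) << 1)
        | int(fields.normal_boot)
    )
```

For example, `decode_boot_byte(0x31)` gives a boot count of 3 with only the normal boot flag set.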

Boot count increments until reset if normal boot flag is set.

Last boot flag is set anytime boot count is less than max boot count.

Normal boot flag is cleared if boot count exceeds max boot count. It doesn't seem that the normal boot flag can ever be set?

Currently in V3, if the CMOS checksum is invalid (CMOS is cleared/not set up) then the boot count is maxed and the fallback image is forced. I don't think this is the correct thing to do. If the checksum is bad we should clear CMOS, set the boot count to 0, and try to do a normal boot. If the normal boot fails, the CMOS checksum will be valid the next time around. The boot count should be incremented and the normal boot retried until max boot count is exceeded. Then the normal boot flag should be cleared and the fallback image used. The last item is about the last boot flag. I don't think that it is needed.
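The proposal above can be sketched as a small state machine. This is only an illustration under stated assumptions: `MAX_BOOT_COUNT` is an arbitrary value, `choose_image` is a hypothetical name, and setting the normal boot flag after clearing CMOS is an assumption made so that the "try a normal boot" step is possible at all.

```python
MAX_BOOT_COUNT = 3  # assumed limit; the real value is not given in this mail


def choose_image(checksum_ok: bool, boot_count: int, normal_flag: bool):
    """Return (image, new_boot_count, new_normal_flag) for one boot attempt."""
    if not checksum_ok:
        # Proposed handling of a bad checksum: start from a clean CMOS,
        # zero the boot count and give the normal image a chance.
        boot_count, normal_flag = 0, True
    if normal_flag:
        boot_count += 1
        if boot_count > MAX_BOOT_COUNT:
            # Too many failed attempts: clear the flag and use fallback.
            return "fallback", boot_count, False
        return "normal", boot_count, True
    return "fallback", boot_count, normal_flag
```

Starting from an invalid checksum and a normal image that never completes, this tries the normal image `MAX_BOOT_COUNT` times and then stays on fallback.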

Thoughts and comments? Marc

-- Marc Jones Senior Software Engineer (970) 226-9684 Office mailto:Marc.Jones@amd.com http://www.amd.com/embeddedprocessors


Marc Jones

18 Jul 2007

12:20 a.m.


Also, the boot count needs to be cleared at the end of LinuxBIOS. I think that this is what the last boot flag was trying to do.

Marc


Stefan Reinauer

1:09 a.m.

* Marc Jones marc.jones@amd.com [070718 00:20]:

...

Also, the boot count needs to be cleared at the end of LinuxBIOS. I think that this is what the last boot flag was trying to do.

with v3 we can potentially have n:m mappings for all modules. is a single byte like we have it in v2 still appropriate?

Or should we just leave the mechanism unchanged for now, even though we could make it more complex^Wflexible?

We should document this in the design paper. I remember I added some hack to lxbios to make switching between normal and fallback work.

Stefan

-- coresystems GmbH • Brahmsstr. 16 • D-79104 Freiburg i. Br. Tel.: +49 761 7668825 • Fax: +49 761 7664613 Email: info@coresystems.de • http://www.coresystems.de/

ron minnich

1:16 a.m.

I think the key thing that v2 missed was in letting linuxbios set the 'boot success' flag. That flag should only be set by a payload or even linux. It's not really useful to know that linuxbios came up. What matters is that the payload came up. It was a big headache to me, esp. on 1000 nodes, when linuxbios came up, said "boot normal is fine", and loaded a payload which locked up.

So, at this point, I'd argue that linuxbios can CLEAR the boot normal, but the only thing that should set it is a payload. This is very conservative behaviour, but I think it's correct.
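A toy simulation shows why it matters who sets the flag. It is a sketch only: `simulate` is a hypothetical helper, the normal payload is assumed to lock up on every boot, and the fallback payload is assumed never to set the flag itself.

```python
def simulate(boots: int, firmware_sets_flag: bool) -> list:
    """Boot `boots` times with a normal payload that always locks up.

    Returns the image chosen on each boot. The flag starts out set.
    """
    flag = True
    images = []
    for _ in range(boots):
        image = "normal" if flag else "fallback"
        images.append(image)
        flag = False          # firmware clears the flag before handing over
        if firmware_sets_flag:
            flag = True       # v2-style: firmware declares success itself
        # Otherwise only a payload that really came up would set the flag
        # again. The normal payload hangs and the fallback payload is
        # assumed not to touch it, so the flag stays cleared.
    return images
```

With `firmware_sets_flag=True` every boot picks the broken normal image again; with `firmware_sets_flag=False` the second boot already drops to fallback.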

And, yes, if CMOS is messed up, zero it, try a normal boot, come around if that fails.

n:m mappings for all modules, well, we don't have lots of bits. Let's not get too fancy. We're not even coming up yet.

ron

Stefan Reinauer

1:35 a.m.

* ron minnich rminnich@gmail.com [070718 01:16]:

...

So, at this point, I'd argue that linuxbios can CLEAR the boot normal, but the only thing that should set it is a payload. This is very conservative behaviour, but I think it's correct.

It definitely is, since linuxbios itself has no shell or other interface to recover easily from such a case.

But since many payloads don't do this correctly, should this be a config option?

...

n:m mappings for all modules, well, we don't have lots of bits. Let's not get too fancy. We're not even coming up yet.

So let's use our modules and try to emulate a normal/fallback behavior for now.


Peter Stuge

3:05 a.m.

On Wed, Jul 18, 2007 at 01:35:49AM +0200, Stefan Reinauer wrote:

...

...
linuxbios can CLEAR the boot normal, but the only thing that should set it is a payload. This is very conservative behaviour, but I think it's correct.

It definitely is, since linuxbios itself has no shell or other interface to recover easily from such a case.

It's the only correct behavior.

...

But since many payloads don't do this correctly, should this be a config option?

Not payload, the OS. Maybe even the application. I say teach lxbios how to set the completed flag if it doesn't already know and move the problem to where it belongs - userspace and init scripts.

...

So let's use our modules and try to emulate a normal/fallback behavior for now.

That feels broken, I'd rather have LB boot fallback forever.

Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

//Peter

Corey Osgood

5:06 a.m.

Peter Stuge wrote:

...

Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

//Peter

Agreed

-Corey

ron minnich

5:16 a.m.

On 7/17/07, Corey Osgood corey.osgood@gmail.com wrote:

...

Peter Stuge wrote:

...
Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

//Peter

Agreed

-Corey

So I guess the original logic is right.

- CMOS csum bad? fallback boot
- last boot normal not set? fallback boot

but we still need the ability to clear one bit, so we can drop to fallback. That bit is the 'last boot normal' bit. We can't get away from clearing one bit. But we can avoid messing with cmos otherwise.

ron

Marc Jones

5:45 p.m.

ron minnich wrote:

...

On 7/17/07, Corey Osgood corey.osgood@gmail.com wrote:

...
Peter Stuge wrote:

...
Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

//Peter

Agreed

-Corey

So I guess the original logic is right.

CMOS csum bad? fallback boot

last boot normal not set? fallback boot

but we still need the ability to clear one bit, so we can drop to fallback. That bit is the 'last boot normal' bit. We can't get away from clearing one bit. But we can avoid messing with cmos otherwise.

ron

It seems that it is the behavior that most people want.

Marc


Stefan Reinauer

11:30 a.m.

* Corey Osgood corey.osgood@gmail.com [070718 05:06]:

...

Peter Stuge wrote:

...
Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

Agreed

This won't work, or it will render lots of features unusable. LinuxBIOS relies on having the cmos available for its settings. Fallback does not use any CMOS settings though. (Does it erase the CMOS anyways? If so, this should probably be fixed)

Instead, if you switch between flash contents, switch between cmos contents, too by using lxbios (now in the tree)


Carl-Daniel Hailfinger

9:17 a.m.

On 18.07.2007 03:05, Peter Stuge wrote:

...

Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

What about writing to flash instead? IIRC some factory BIOSes write to flash instead of writing to cmos. Of course that makes sense only if you are able to flash single pages.

Regards, Carl-Daniel

ron minnich

6:26 p.m.

On 7/18/07, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:

...

On 18.07.2007 03:05, Peter Stuge wrote:

...
Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

What about writing to flash instead? IIRC some factory BIOSes write to flash instead of writing to cmos. Of course that makes sense only if you are able to flash single pages.

It's a high overhead operation. I still prefer using one bit of CMOS.

ron

Stefan Reinauer

11:27 a.m.

* Peter Stuge peter@stuge.se [070718 03:05]:

...

...
But since many payloads don't do this correctly, should this be a config option?

Not payload, the OS. Maybe even the application. I say teach lxbios how to set the completed flag if it doesn't already know and move the problem to where it belongs - userspace and init scripts.

Yes, the OS should set it. So we should add an option to lxbios: lxbios --set-boot-complete or something.
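What such an option might do to the boot byte can be sketched as a pure function. Everything here is an assumption: the option does not exist yet, `mark_boot_complete` is a hypothetical name, the bit positions are the ones Marc guessed at the start of the thread, and zeroing the boot count follows his remark that the count needs to be cleared at the end of a successful boot. Whether the CMOS checksum would also need updating depends on whether this byte lies inside the checksummed range, which this thread does not say.

```python
NORMAL_BOOT_FLAG = 0x01   # bit 0 in the guessed layout
BOOT_COUNT_MASK = 0xF0    # bits 7-4 in the guessed layout


def mark_boot_complete(boot_byte: int) -> int:
    """Return the boot byte with the normal boot flag set and the count zeroed."""
    if not 0 <= boot_byte <= 0xFF:
        raise ValueError("CMOS byte out of range")
    return (boot_byte & ~BOOT_COUNT_MASK & 0xFF) | NORMAL_BOOT_FLAG
```

An init script would call the tool once the system is known to be up, so that a hang anywhere before that point leaves the flag cleared.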

...

...
So let's use our modules and try to emulate a normal/fallback behavior for now.

That feels broken, I'd rather have LB boot fallback forever.

Ok. This makes sense to some extent.

...

Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

lxbios can back up the cmos for you. So when you call flashrom, you should also call lxbios to put the cmos values for your bios in place.

It would be nice if flashrom could call lxbios externally to automatically do that, or to have a wrapper script.

One of my customers uses 2 small scripts

switch_to_linuxbios.sh
switch_to_legacybios.sh

that flash a given bios image and write the corresponding cmos contents along with it.

...

Writing LB CMOS bytes would be done after the final switch.

Final switch?


Carl-Daniel Hailfinger

12:43 p.m.

On 18.07.2007 11:27, Stefan Reinauer wrote:

...

Peter Stuge peter@stuge.se [070718 03:05]:

...
...
But since many payloads don't do this correctly, should this be a config option?

Not payload, the OS. Maybe even the application. I say teach lxbios how to set the completed flag if it doesn't already know and move the problem to where it belongs - userspace and init scripts.

Yes, the OS should set it. So we should add an option to lxbios: lxbios --set-boot-complete or something.

Hm. Can we abuse ACPI to do that? Like accessing SystemCMOS from a _INI function?

Regards, Carl-Daniel

Stefan Reinauer

1:15 p.m.

* Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [070718 12:43]:

...

Hm. Can we abuse ACPI to do that? Like accessing SystemCMOS from a _INI function?

Possibly. But ACPI is running very early in the game. Where would we hook it up?

This would also establish ACPI as a pretty hard requirement. There should be an alternative, too.

Using ACPI here would be nicely transparent though, hiding firmware specifics in the firmware code. I like that approach.


Carl-Daniel Hailfinger

4:13 p.m.

On 18.07.2007 13:15, Stefan Reinauer wrote:

...

Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [070718 12:43]:

...
Hm. Can we abuse ACPI to do that? Like accessing SystemCMOS from a _INI function?

Possibly. But ACPI is running very early in the game. Where would we hook it up?

This would also establish ACPI as a pretty hard requirement. There should be an alternative, too.

It seems there is even a specification FAQ on how to do it with Windows: http://www.microsoft.com/whdc/resources/respec/specs/simp_bios.mspx For the real spec, see http://www.microsoft.com/whdc/resources/respec/specs/simp_boot.mspx or google for "sbf21.doc". I have not read the real spec because I didn't want to agree to their LA.

...

Using ACPI here would be nicely transparent though, hiding firmware specifics in the firmware code. I like that approach.

Since I just discovered the Microsoft approach, I doubt we would want to invent our own mechanism.

Regards, Carl-Daniel

Peter Stuge

5:32 p.m.

On Wed, Jul 18, 2007 at 01:15:50PM +0200, Stefan Reinauer wrote:

...

Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net [070718 12:43]:

...
Hm. Can we abuse ACPI to do that? Like accessing SystemCMOS from a _INI function?

Possibly. But ACPI is running very early in the game. Where would we hook it up?

This would also establish ACPI as a pretty hard requirement. There should be an alternative, too.

Using ACPI here would be nicely transparent though, hiding firmware specifics in the firmware code. I like that approach.

We want to use less ACPI, not more, right? Tempting as it may be, can't we find a better way?

On Wed, Jul 18, 2007 at 04:13:24PM +0200, Carl-Daniel Hailfinger wrote:

...

http://www.microsoft.com/whdc/resources/respec/specs/simp_bios.mspx For the real spec, see http://www.microsoft.com/whdc/resources/respec/specs/simp_boot.mspx or google for "sbf21.doc". I have not read the real spec because I didn't want to agree to their LA.

Also see Dell's patent:

http://www.patentstorm.us/patents/6640316.html

or just google simple boot flag. This article is informative:

http://www.rtcmagazine.com/home/article.php?id=100333

It seems that the simple boot flag is merely intended to control what parts of POST are performed. Not a perfect fit for us..

...

Since I just discovered the Microsoft approach, I doubt we would want to invent our own mechanism.

But they are two different problems.

We could use the simple boot flag to do clever things during init (like cache a list of register writes) but normal vs. fallback is AFAIK a new concept at least in PC firmware so there's no existing mechanism that really fits.

//Peter

Stefan Reinauer

6:03 p.m.

* Peter Stuge peter@stuge.se [070718 17:32]:

...

...
Using ACPI here would be nicely transparent though, hiding firmware specifics in the firmware code. I like that approach.

We want to use less ACPI, not more, right?

Not sure. I think the use of ACPI has to and will greatly increase in the (near) future for us. At some point all boards should use ACPI. Or we end up patching the kernel with power management code for each and every board. Or we continue to ignore the power management issue as we did until now, but then we will not support laptops.

...

Tempting as it may be, can't we find a better way?

Sure, do it manually. I think whether LinuxBIOS or ACPI or the OS does this should be configurable (or detectable by inspecting the payload?)

...

Also see Dell's patent:

http://www.patentstorm.us/patents/6640316.html

or just google simple boot flag. This article is informative:

http://www.rtcmagazine.com/home/article.php?id=100333

It seems that the simple boot flag is merely intended to control what parts of POST are performed. Not a perfect fit for us..

Did you see the patent on "fast booting"? If we are scared we have to boot slow, because fast booting is patented. Engineers, let's go home, we're not allowed to do a better job anymore.

Stefan


Peter Stuge

5:06 p.m.

On Wed, Jul 18, 2007 at 09:17:14AM +0200, Carl-Daniel Hailfinger wrote:

...

What about writing to flash instead?

Yes, that is definitely where I want to go eventually.

But for last_boot_normal, this one bit toggles on every boot; assuming one reboot per second, a flash chip rated for 100k writes is destroyed in about one day. :( Perhaps a clever algorithm can fix that by using a full 4k page rather than a single bit.
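To put numbers on the estimate above, here is a small sketch of the arithmetic together with a toy model of the "full 4k page" idea. It assumes that single bits can be programmed from 1 to 0 without an erase (typical for NOR flash, but to be checked for any given part) and that each boot consumes one bit. `BitLog` and `lifetime_seconds` are hypothetical names.

```python
FLASH_ENDURANCE = 100_000   # erase cycles the chip survives (figure from the mail)
SECONDS_PER_BOOT = 1        # worst case assumed in the mail
PAGE_BITS = 4096 * 8        # one 4k page used as a log of single bits


def lifetime_seconds(toggles_per_erase: int) -> int:
    """Seconds until the endurance is used up, at one toggle per boot."""
    return FLASH_ENDURANCE * toggles_per_erase * SECONDS_PER_BOOT


class BitLog:
    """Toy model: each toggle clears the next unused bit of an erased page."""

    def __init__(self, bits: int = PAGE_BITS):
        self.bits = bits      # must be even so the parity survives an erase
        self.used = 0         # bits already programmed to 0
        self.erases = 0       # erase cycles spent so far

    def toggle(self) -> None:
        if self.used == self.bits:
            self.used = 0     # page is full: erase it once and start over
            self.erases += 1
        self.used += 1

    @property
    def value(self) -> bool:
        # The flag starts out False and flips with every programmed bit.
        return self.used % 2 == 1
```

Under these assumptions the single-bit scheme lasts 100,000 seconds (roughly 28 hours), while the page log stretches the same endurance by a factor of 32,768.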

On Tue, Jul 17, 2007 at 08:16:34PM -0700, ron minnich wrote:

...

...
...
Also re CMOS checksum, I would like if LB could avoid clobbering the CMOS, for when switching back and forth with another BIOS.

Writing LB CMOS bytes would be done after the final switch.

Agreed

So I guess the original logic is right.

CMOS csum bad? fallback boot

last boot normal not set? fallback boot

Yep:

    if checksumok and last_boot_normal:
        clear last_boot_normal
        boot normal
    else:
        boot fallback
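The same decision written out as runnable Python, as a sketch only. CMOS is modelled as a plain dict and the checksum test is passed in as a boolean, since the real checksum algorithm and byte offsets are not part of this discussion; `select_image` is a hypothetical name. The point it illustrates is that nothing is written unless the checksum is already valid.

```python
def select_image(checksum_ok: bool, cmos: dict) -> str:
    """Pick the image to boot; touch CMOS only when the checksum is valid."""
    if checksum_ok and cmos.get("last_boot_normal", False):
        # Clear the flag first: if this boot never completes, the next
        # reset finds it cleared and drops to fallback.
        cmos["last_boot_normal"] = False
        return "normal"
    # Bad checksum or flag not set: boot fallback and leave CMOS untouched.
    return "fallback"
```

With a factory-BIOS CMOS (bad checksum from the LinuxBIOS point of view) the function returns "fallback" and the dict is unchanged, which is the non-clobbering behaviour asked for earlier in the thread.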

...

but we still need the ability to clear one bit, so we can drop to fallback. That bit is the 'last boot normal' bit. We can't get away from clearing one bit. But we can avoid messing with cmos otherwise.

As long as LB checksum does not collide with the factory checksum this will just work<tm> because the checksum is always "bad" until lxbios has been run. This is exactly what I would like.

LB can complain quite loudly that the CMOS is bad and that it should be upgraded with lxbios in order to actually do a normal boot, but I think having this backwards compatibility is really important.

On Wed, Jul 18, 2007 at 11:30:09AM +0200, Stefan Reinauer wrote:

...

...
...
Writing LB CMOS bytes would be done after the final switch.

Agreed

This won't work, or it will render lots of features unusable. LinuxBIOS relies on having the cmos available for its settings. Fallback does not use any CMOS settings though.

That's fine. Until the CMOS has been LB:ified the best you'll get is fallback. I think this is a great deal if we can make sure that LB never clobbers the CMOS until after lxbios has been run.

...

(Does it erase the CMOS anyways? If so, this should probably be fixed)

All I can say for sure is that LB does change CMOS bytes even if lxbios has not been run. :)

...

Instead, if you switch between flash contents, switch between cmos contents, too by using lxbios (now in the tree)

Yes, good point. But I will forget to do it, and others may not want to do it. I think it would be very nice to not have this requirement.

On Wed, Jul 18, 2007 at 11:27:43AM +0200, Stefan Reinauer wrote:

...

Yes, the OS should set it. So we should add an option to lxbios: lxbios --set-boot-complete or something.

Yes! Perfect.

...

lxbios can back up the cmos for you. So when you call flashrom, you should also call lxbios to put the cmos values for your bios in place.

It would be nice if flashrom could call lxbios externally to automatically do that, or to have a wrapper script.

I thought about that too. Perhaps store original CMOS in the lar?

...

One of my customers uses 2 small scripts

switch_to_linuxbios.sh
switch_to_legacybios.sh

that flash a given bios image and write the corresponding cmos contents along with it.

But that doesn't make much sense when we already have these nice tools. Something is wrong if the tools aren't enough to do a good job IMHO.

...

...
Writing LB CMOS bytes would be done after the final switch.

Final switch?

Ah, sorry, should have been more clear.

I think it will be common for people to switch back and forth between LB and the factory BIOS when first getting started with LB. Present situation is that LB writes to CMOS and on next factory BIOS boot there's a checksum error, and some or all of the factory BIOS settings may have been lost.

I wanted to avoid this by making LB completely unobtrusive if the CMOS has not been prepared by the user in advance. Running in fallback-only is a fair price to pay for that I think.

Once the user has switched to LB the last time (final switch) she then runs lxbios in order to LB:ify the CMOS contents.

//Peter

coreboot@coreboot.org
