[coreboot] [PATCH] Halt TCO timer on Intel 3100 chipset

ron minnich rminnich at gmail.com
Fri Apr 18 17:46:35 CEST 2008


Stuff Learned The Hard Way: experience I have had with fallback/normal
and timers suggests this:

bios should set the watchdog timer, very early, to fire in, say, 5 minutes

the final payload (linux) should clear the timer, but not until it
hits a runlevel where it is network reachable (and hence manageable),
and even then it is the remote network manager system that initiates
the watchdog reset. The decision should not be made locally.

So, you have to look at the system -- at what point is the system
reachable from remote?
- not when the bios tries to boot a payload (could be a bug in bios
that breaks payload)
- not when payload tries to boot os (ditto)
- not when os is booting (ditto)
- not when os is in /etc/rc or equivalent

but only when os is "up", and presumed working. In fact, the best way
to reset the watchdog? From remote:
ssh node watchdog-reset
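
A minimal sketch of that rule, as a toy model rather than real watchdog
code. The stage names, the function names, and the "normal"/"fallback"
return values are made up for illustration; the 5 minutes is the "say,
5 minutes" from above.

```python
from enum import IntEnum

# Assumption: the timeout is the "say, 5 minutes" suggested in the text.
WATCHDOG_TIMEOUT_S = 5 * 60


class Stage(IntEnum):
    """Hypothetical names for the boot stages listed above."""
    BIOS_BOOTING_PAYLOAD = 0
    PAYLOAD_BOOTING_OS = 1
    OS_BOOTING = 2
    OS_INIT_SCRIPTS = 3   # /etc/rc or equivalent
    OS_UP = 4             # "up", and presumed working


def may_clear_watchdog(stage, requested_from_remote):
    """Only a remote request to a fully booted node may clear the timer."""
    return stage == Stage.OS_UP and requested_from_remote


def boot_outcome(stage_reached, remote_reset_at_s=None):
    """Return "normal" if the remote reset lands in time, else "fallback".

    remote_reset_at_s is the number of seconds after the bios armed the
    timer at which the remote reset arrives, or None if it never arrives.
    """
    if remote_reset_at_s is None or remote_reset_at_s >= WATCHDOG_TIMEOUT_S:
        return "fallback"   # the timer fired first
    if not may_clear_watchdog(stage_reached, requested_from_remote=True):
        return "fallback"   # node was not really up, so nothing clears it
    return "normal"
```

With this model, boot_outcome(Stage.OS_UP, 120) is "normal", while a
node stuck in OS_BOOTING, or one that never hears from the remote
manager, ends up in "fallback".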

Why is this? Because the bios, payload, kernel, and kernel runtime
have to be taken as a whole. The system is not really
considered viable until it's totally booted. The kernel might be on
flash, and you might have just tweaked it, and broken it -- which I
have done, on 1024 machines, and thanked my lucky stars for fallback!

If you can't do "reset the watchdog" from a remote node, over the
network, the node is not up in any useful sense. Let it crash into
fallback.
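
On the manager side this could look roughly like the sketch below. It
assumes a "watchdog-reset" command exists on the node (the name is
taken from the ssh line above, it is not a standard tool) and that
key-based ssh login is set up. The function name, the timeout, and the
injectable "run" parameter are my own choices for illustration.

```python
import subprocess


def remote_watchdog_reset(node, run=subprocess.run, timeout_s=30):
    """Try to reset the watchdog on `node` over ssh.

    Returns True only if the remote command ran and exited with 0.
    On any failure we return False and do nothing else: the node's
    own timer will fire and take it into fallback.
    """
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    cmd = ["ssh", "-o", "BatchMode=yes", node, "watchdog-reset"]
    try:
        result = run(cmd, timeout=timeout_s)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0
```

The "run" parameter is only there so the logic can be exercised without
a real node. In normal use you would call remote_watchdog_reset("node")
and leave the rest at the defaults.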

Obviously this logic does not apply to standalone nodes :-) (well, not
completely: for standalone, don't clear the timer until you hit
runlevel 3 or equivalent)
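
For the standalone case, a rough sketch of the local check. It assumes
the two-field output of the sysvinit "runlevel" command (previous and
current runlevel, e.g. "N 3"), and it assumes that runlevels 3 and 5
are what counts as "runlevel 3 or equivalent"; adjust that set for
your distribution. The function name is hypothetical.

```python
# Assumption: these are the runlevels treated as "3 or equivalent".
MULTI_USER_RUNLEVELS = {"3", "5"}


def standalone_may_clear(runlevel_output):
    """Decide from `runlevel` output whether the timer may be cleared.

    Expects "<previous> <current>", e.g. "N 3". Anything else
    (empty output, "unknown") means: leave the timer running.
    """
    fields = runlevel_output.split()
    return len(fields) == 2 and fields[1] in MULTI_USER_RUNLEVELS
```

So "N 3" allows the clear, while "N 1" (single user) or "unknown"
leaves the watchdog armed.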

ron



