Hi all

On most modern x86 systems a lot of the silicon init happens as part of blob (FSP, binaryPI)
or some reference code (AGESA). Very often that silicon init enables or hides
PCI devices. This means that this code needs to run before coreboot device enumeration code.
With coreboot's current ramstage state machine running this binary can only happen as part of
chip_ops->init or with a bootstate hook.

As a quick reminder about the coreboot ramstage state machine:
 *        start
 *          |
 *    BS_PRE_DEVICE
 *          |
 *    BS_DEV_INIT_CHIPS
 *          |
 *    BS_DEV_ENUMERATE
 *          |
         the rest (device resource allocation, device init, ...)

BS_DEV_INIT_CHIPS:
- Call the .init from all the chip_ops from each "chip" once.
  FSP-S loading, configuring and calling is currently set up
  as a monolithic call inside such chip_ops->init. It has a lot of
  callbacks for soc and mainboard and consumes the devicetree in one big
  soc specific function, with little reuse between socs.
  All the silicon configuration can only be put into a monolithic soc chip_info and is translated
  into FSP UPDs.

BS_DEV_ENUMERATE:
- call static chip_ops->enable_dev on each static device on the devicetree. On older
  non-RC using code enabling or hiding PCI devices is done here.
- scan the actual hardware (e.g. PCI bus) for what is there. What will be
  found is depends on the params passed to the reference code which will hide/enable
  devices

Both the monolithic configuration, one big chip_info, and code flow don't follow coreboot's
state machine philosophy, where the devicetree nodes drive the configuration process and codeflow.
It would be nicer if device enabling and other configuration would be structured on a per node basis.
So for instance all the USB configuration options would be below the xhci device node and
not globally available to all nodes.

Based on a discussion on IRC there are a few options, which are generally not mutually exclusive or share similar ideas:
1) Split up the DEV_ENUMERATE into a static devicetree part and a scan part.
  In combination with calling FSP between those two parts this would allow
  the chip_ops->enable_dev to be called on each device and so UPD setup could
  be modularised. This is still rather monolithic as the enable_dev is the same
  function callback for each device but it could be more modularised because the
  struct device is passed as an argument.
  So something like:

  switch (dev->path.pci.devfn) {
  case PCI_DEV(0, 0):
         do_something()
  case PCI_DEV(0x1f, 0):
         do_something_else()
  }
  is possible. So this at least removes all the current callbacks.

2) Move devices below separate chips and split up FSP loading, configuring and calling.
  Then then devicetree looks like this

  chip soc/intel/.../usb
     device ref xhci on end
     register "config_option_1" = "A"
     register "config_option_2" = "B"
  end

  If the FSP loading is separated from the calling, for instance by moving it earlier in
 a cbmem init hook or bootstate ENTRY hook, then each chip_ops .init could configure FSP UPDs based.  The calling of FSP can then happen as part of bootstate EXIT hook.
  A disadvantage here would be that this would add a lot of "chips" which is not recommended
  for things that are not a separate physical chip.

3) Allow for a per device configuration.
  https://review.coreboot.org/c/coreboot/+/41745 implements this.
  This would easily allow per device configuration without introducing new chips.

4) Since it is now possible to hook up device ops directly in the devicetree
  we could add a new device specific entry to ops that a new bootstate executes
  before scanning buses. A enable_dev, but not on the chip_ops level but device level.
  One thing to note is that scan_bus will not give default PCI ops to devices anymore
  if ops is !NULL. This can be worked around by setting ops to NULL after calling this
  new ops on devices where the default pci ops are desirable. Another way
  would be to allow more finegrained control when writing the devicetree about which ops can be set directly in the devicetree.

It looks like 3 + 4 would be the cleanest, but it should maybe not block trying out 1 and 2.
The splitting of device specific configuration is possible in 1 and 2 and migrating to
solutions of 3 and 4 would be rather trivial.

Any thoughts on this?

Arthur