Thanks for the details. I agree that adding 48-bit support is better than PA!=VA, but I really think you should implement the L0 table instead of messing with the granule size. This should really not be hard to do since our paging code is already recursive... just add an L0_ADDR_SHIFT and use it instead of L1_ADDR_SHIFT in mmu_init() and get_pte(), and then duplicate the code block handling L1 in init_xlat_table() for L0. That should be everything you need.
You can see that L1 is essentially already implemented as optional right now through those BITS_PER_VA > L1_ADDR_SHIFT tests, even though that never matters at the moment with BITS_PER_VA hardcoded to 33. You should be able to do the same thing for L0 (and maybe make L1 non-optional instead, since I don't think we anticipate ever not needing it at this point).
Changing the granule size would not only break the assumption that pages are 4K (which we probably rely on in a bunch of code), it would also make it impossible to map at smaller boundaries, which you sometimes need to do (e.g. SRAM areas don't always start or end at 64K boundaries). On top of that, it makes your page tables incredibly bloated (64K *each*, and you usually need at least three), which would essentially make this unusable on SRAM-constrained systems. Adding L0 only costs you one extra 4K table and is much more flexible.
It would be nice if you could make this change in both coreboot and libpayload, since we're trying to keep the two implementations in sync.