[coreboot] native video init question

Wed Nov 16 13:32:07 CET 2016

On 16.11.2016 06:08, Charlotte Plusplus wrote:
> Hello
> 
> On Tue, Nov 15, 2016 at 6:46 PM, Nico Huber <nico.h at gmx.de> wrote:
> 
>> I've seen a garbled image, too, lately, when I happened to build with
>> native raminit but with completely different gfx init code
>> (3rdparty/libgfxinit). So it might still be some problem in the
>> raminit. This was also on an Ivy Bridge system with four DIMMs,
>> btw. I suspected that the native raminit just wasn't tested in that
>> configuration.
>>
> 
> Interesting, I noticed some patterns with raminit too. Most of my problems
> with the native raminit happen with 4x 8GB DIMMs. The more sticks I have,
> the less likely native ram training is to succeed. I have logged a few of
> the failed attempts in case anyone else is interested (attached).
> 
> Basically, the training can fail several times with the exact same
> parameters that it later succeeds with. Also, failure is a function of
> speed. All the attempts I have done but not attached can be summed up like
> this: failure of the native ram training is also more likely with a MCU
> over 666MHz. But whenever the raminit succeeds, the memory is stable in
> memtests (I did several passes to check).
> 
> Now that I can use Windows 10 with Coreboot, I decided to experiment a bit
> more. First, I tried changing the SPD settings with Thaiphoon Burner. The
> sticks I have advertise support for both 1.35V and 1.5V profiles (SPD:
> 006=03), which I feared might cause issues. Changing that to 1.5V only (SPD:
> 006=00) did not help, even if it did help with another computer that I
> borrowed to do some comparisons with (I was afraid my sticks were at fault).
> 
> Then I tried manually filling XMP profile 1 with known-working values
> (either published, or found during native raminit training). It seemed to
> help but the results were inconsistent. Between my tests for different
> values, I was clearing the MRC cache.
> 
> Then I had a weird idea: what if the ram training or the MRC cache clearing
> was the cause of the problem? I changed my protocol to do: clear cache,
> train after changing the SPD/MCU frequency/etc, then when it succeeds
> disable the MRC cache clearing hook, and do a few reboots or a power off
> before doing the memtest. And this was sufficient to get stability at
> 666MHz and frequencies above without having to tweak the SPD anymore (like
> adding some latency to the detected values).
> 
> Currently 800MHz is validated, I haven't tested 933 MHz because ram
> training success seems to be a probability that goes down with the
> frequency, and pressing on the power button every time it fails quickly
> gets boring!
> 
> I have no way to prove that, but I believe that the ram training by itself
> interferes with the stability of the RAM, or that there is some non
> deterministic part in the code. I don't know why or how, it's a lot of code
> that I haven't read in detail, but it's what my tests suggest. I would
> love to compare these results to ones the blob raminit would give. But blob
> raminit causes video issues. So I'm trying to focus on native video init
> first, in case it helps me to use mrc.bin.

I can't recall if Patrick or you mentioned this already, there's a
related patch up for review:
  https://review.coreboot.org/#/c/17389/
Do you have that already applied locally?

> 
>> Well, I don't see any setting that could really break something. The
>> code might just be buggy. I'll go through the settings anyway, taking
>> src/mainboard/lenovo/t520/devicetree.cb as example:
>>
> 
> I used this code as a source, as I figured the screen was likely to be the
> same.
> 
>> That's about ACPI, let's not care (the last appendix in the ACPI spec if
>> you want to have a look).
>>
> 
> I will check that too because Fn+Home and Fn+End (brightness + and -) don't
> work in memtest86, even if they work in Linux and Windows.

That's expected. coreboot does the brightness control through ACPI.
And ACPI is something for bloated OSes, not for payloads (it comes with
a whole bunch of code including a byte-code interpreter).

A workaround would be to do the EC event handling in SMM until ACPI
takes over. But that's usually discouraged as SMM has security issues.
So we want as little code as possible in SMM.

> 
> 
>> Those are real register settings, you can dump the whole GMA MMIO
>> space with `inteltool -f` (hoping that your system doesn't hang). The
>> registers are described in [1, chapter 2.4].
>>
> 
> Sorry, but it did hang.  I couldn't even ssh to it. Any alternative to get
> the info?

Look up the base address of the gfx MMIO space:
  $ lspci -vs 00:02.0 | grep Memory
or
  $ head -1 /sys/devices/pci0000:00/0000:00:02.0/resource

Grab a copy of iotools: https://github.com/adurbin/iotools.git
You should be able to read single registers with the mem_read32 command:
  # ./iotools mem_read32 $((base+register_offset))
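
If you want to script it, the two steps above could be combined roughly
like this. It's only a sketch: gma_reg_addr is a made-up helper name (not
part of iotools), 0x1000 is a placeholder offset and not a real register,
and I'm assuming the usual sysfs "start end flags" line format.

```shell
#!/bin/sh
# Sketch: compute the absolute address of one GMA register from the
# first line of the sysfs "resource" file (fields: start end flags).
# gma_reg_addr is a hypothetical helper, not an iotools command.
gma_reg_addr() {
    resource_line=$1     # first line of .../0000:00:02.0/resource
    register_offset=$2   # offset into the MMIO space, e.g. 0x1000 (placeholder)
    base=${resource_line%% *}   # keep only the first field (BAR start address)
    # Expand both values textually so the shell parses them as hex constants.
    printf '0x%x\n' $(($base + $register_offset))
}

# Usage on the target (the read itself needs root):
#   addr=$(gma_reg_addr "$(head -1 /sys/devices/pci0000:00/0000:00:02.0/resource)" 0x1000)
#   ./iotools mem_read32 "$addr"
```

Reading one register at a time this way should avoid touching whatever
part of the MMIO space made the full dump hang.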

Nico