On Sat, 2010-09-11 at 23:36 +0300, Juhana Helovuo wrote:
On Sat, 2010-09-11 at 07:03 -0600, Myles Watson wrote:
So it works with my updated patch, but not with the uma & reserved patches? Or does it not work at all? Does it still work with your previous patch set?
Could you send a full log with the last working version and the first broken version from this list?
1. Your original patches
2. 5799 + my board patch
3. 2 + uma.diff
4. 3 + reserved.diff
uma.diff & reserved.diff should be independent, so you could try 2+reserved.diff also
Ok, now I have some test results:
- 5792 + my original patches --> boots ok
- 5792 + your simplified patches --> boots ok
- 5792 + your simplified patches + uma.diff --> boots ok (log attached)
- 5792 + your simplified patches + uma.diff + reserved.diff --> does not build, patch seems to be for newer trunk version
- 5799 + my original patches --> SATA failure (log attached)
- 5799 + your simplified patches --> SATA failure
- 5802 + your simplified patch for 5800 --> SATA failure
My conclusion from this is that something in the trunk has broken between 5792 and 5799.
Sorry about spamming the list, but I managed to test a few more cases:
- 5795 + your simplified patches --> boots ok
- 5795 + your simplified patches + uma.diff + reserved.diff --> boots ok
- 5796 + your simplified patches --> SATA failure
It seems to me that the patches are ok, but some trunk change from 5795 to 5796 is causing the problems.
Best regards, Juhana Helovuo
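The narrowing from "between 5792 and 5799" down to "5795 to 5796" is a bisection over revisions. A minimal sketch in C of that search, assuming a single good-to-bad transition in the range; the function names and the stand-in predicate are hypothetical (in practice each probe is a manual build and boot test):

```c
#include <stdbool.h>

/* Hypothetical predicate type: returns true if the given revision boots. */
typedef bool (*boots_fn)(int rev);

/* Stand-in for real boot tests, encoding the results reported above:
 * revisions up to 5795 boot, 5796 and later fail on SATA. */
static bool boots_before_5796(int rev)
{
	return rev < 5796;
}

/* Returns the first failing revision, assuming `good` boots, `bad`
 * fails, and there is exactly one transition in between. */
static int first_bad_revision(int good, int bad, boots_fn boots)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;
		if (boots(mid))
			good = mid;	/* breakage is later than mid */
		else
			bad = mid;	/* breakage is at mid or earlier */
	}
	return bad;
}
```

Called as first_bad_revision(5792, 5799, boots_before_5796), the search probes 5795, 5797 and 5796 before stopping.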
On Sat, Sep 11, 2010 at 3:47 PM, Juhana Helovuo juhe@iki.fi wrote:
- 5795 + your simplified patches --> boots ok
- 5795 + your simplified patches + uma.diff + reserved.diff --> boots ok
Thanks for tracking it down. I committed the patches as
Rev 5805 (PCI & SB700)
Rev 5806 (Multiboot tables)
Rev 5807 (fam10 UMA)
Rev 5808 (fam10 reserved regions)
Rev 5809 (Asus M4A785-M)
- 5796 + your simplified patches --> SATA failure
It seems to me that the patches are ok, but some trunk change from 5795 to 5796 is causing the problems.
There seems to be a problem with CONFIG_MMCONF_SUPPORT vs. CONFIG_MMCONF_SUPPORT_DEFAULT.
This should work for you until it's fixed. If it doesn't, I need to dig a little deeper.
Index: src/northbridge/amd/amdfam10/Kconfig
===================================================================
--- src/northbridge/amd/amdfam10/Kconfig	(revision 5804)
+++ src/northbridge/amd/amdfam10/Kconfig	(working copy)
@@ -23,7 +23,6 @@
 	select HAVE_DEBUG_SMBUS
 	select HYPERTRANSPORT_PLUGIN_SUPPORT
 	select NORTHBRIDGE_AMD_AMDFAM10_ROOT_COMPLEX
-	select MMCONF_SUPPORT
 
 config AGP_APERTURE_SIZE
 	hex
Thanks, Myles
On Mon, 2010-09-13 at 09:08 -0600, Myles Watson wrote:
Thanks for tracking it down. I committed the patches as
Rev 5805 (PCI & SB700)
Rev 5806 (Multiboot tables)
Rev 5807 (fam10 UMA)
Rev 5808 (fam10 reserved regions)
Rev 5809 (Asus M4A785-M)
Thanks a lot for all your troubles and advice!
- 5796 + your simplified patches --> SATA failure
It seems to me that the patches are ok, but some trunk change from 5795 to 5796 is causing the problems.
There seems to be a problem with CONFIG_MMCONF_SUPPORT vs. CONFIG_MMCONF_SUPPORT_DEFAULT.
This should work for you until it's fixed. If it doesn't, I need to dig a little deeper.
Hmm. I just checked out 5810 and tested. Configuration and building go smoothly (although menuconfig gives wrong ROM size by default, should be 1 MByte), but Linux still fails on SATA and USB.
I removed the MMCONF_SUPPORT select before configuring, as you suggested, but it does not help.
Also, I ran into another problem:
When I upgrade RAM from 2 GB to 4 GB (or 6 GB), executing the VGA BIOS gets stuck. Coreboot jumps into the BIOS code, but it never returns.
With more RAM, the UMA gets mapped to 0xd0000000 and PCI iomem areas get mapped over real RAM, starting at 0xe0000000. In the boot log I found the following line:
Warning: Can't set up MTRR hole for UMA due to pre-existing MTRR hole.
This message does not appear with 2 GB RAM.
More test results with 6 GB RAM:
* If I disable the GFXUMA option, do not include VGA BIOS on CBFS, and install another VGA card in a PCI slot, I get the same results. Coreboot locates the BIOS on the external card, copies to low RAM, jumps into it, and gets stuck.
* If I hardwire the UMA start address to 0x70000000 and the UMA size to 64M in mainboard.c, the BIOS does not get stuck and Coreboot launches Grub2. However, Grub2 is very unstable and crashes randomly. I also sometimes get random garbage on VGA. Loading Linux does not work; it causes the machine to get stuck or reboot.
Apparently there is some address conflict, but where should I start looking?
Best Regards, Juhana Helovuo
On Mon, Sep 13, 2010 at 10:08 AM, Juhana Helovuo juhe@iki.fi wrote:
Apparently there is some address conflict, but where should I start looking?
The problem seems to be that the mainboard code gets its information on the size of memory from TOP_MEM, but that hasn't been set correctly with respect to the PCI resources yet.
m4a785m_enable, TOP MEM: msr.lo = 0xe0000000, msr.hi = 0x00000000
m4a785m_enable, TOP MEM2: msr2.lo = 0xa0000000, msr2.hi = 0x00000001
m4a785m_enable: uma size 0x10000000, memory start 0xd0000000
...
Root Device assign_resources, bus 0 link: 0
split: 64K table at =cfff0000
0: mmio_basek=00300000, basek=00400000, limitk=00680000
Adding UMA memory area
So even though there are PCI resources located at 0xc0000000, RAM gets used for UMA at 0xd0000000 and tables get placed at 0xcfff0000.
You could test that theory by hard coding the top mem logic in mainboard.c.
Because the mainboard device is at the root of the tree, the northbridge initialization hasn't been done yet, so the values it wants haven't been calculated.
Thanks, Myles
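The addresses in the log above are consistent with simple subtraction from TOP_MEM. A minimal sketch in C, assuming UMA is carved out directly below the TOP_MEM value and the 64K table area sits directly below UMA; the helper names are hypothetical and are not coreboot functions:

```c
#include <stdint.h>

/* Hypothetical helper, not coreboot's: UMA is assumed to occupy the
 * uma_size bytes directly below top_mem (the TOP_MEM value). */
static uint32_t uma_base_below(uint32_t top_mem, uint32_t uma_size)
{
	return top_mem - uma_size;
}

/* Assumption taken from the log line "split: 64K table at =cfff0000":
 * a 64K table area sits directly below the UMA region. */
static uint32_t table_base_below(uint32_t uma_base)
{
	return uma_base - 0x10000u;
}
```

With top_mem = 0xe0000000 and uma_size = 0x10000000 this gives the "memory start 0xd0000000" and "table at =cfff0000" values printed in the log, both above the PCI resources starting at 0xc0000000.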
On Mon, 2010-09-13 at 10:29 -0600, Myles Watson wrote:
The problem seems to be that the mainboard code gets its information on the size of memory from TOP_MEM, but that hasn't been set correctly with respect to the PCI resources yet.
m4a785m_enable, TOP MEM: msr.lo = 0xe0000000, msr.hi = 0x00000000
m4a785m_enable, TOP MEM2: msr2.lo = 0xa0000000, msr2.hi = 0x00000001
m4a785m_enable: uma size 0x10000000, memory start 0xd0000000
...
Root Device assign_resources, bus 0 link: 0
split: 64K table at =cfff0000
0: mmio_basek=00300000, basek=00400000, limitk=00680000
Adding UMA memory area
So even though there are PCI resources located at 0xc0000000, RAM gets used for UMA at 0xd0000000 and tables get placed at 0xcfff0000.
You could test that theory by hard coding the top mem logic in mainboard.c.
Because the mainboard device is at the root of the tree, the northbridge initialization hasn't been done yet, so the values it wants haven't been calculated.
I tried with the following:
/* TOP_MEM: the top of DRAM below 4G */
msr = rdmsr(TOP_MEM);
//hardwire this for testing
if (msr.lo > 0x80000000) {
	msr.lo = 0x80000000;
}
This code manages to set UMA to a lower address, but the effect is the same as with hardwiring UMA address. Boot proceeds past VGA BIOS, but results in random crashes and reboots.
Or did you mean some other hardwiring?
Since the variable "msr" here is local to the routine, I don't see how it could have an effect on anything other than the UMA location and size. The value read from TOP_MEM2 isn't even used for anything but printing.
Best regards, Juhana Helovuo
On Mon, Sep 13, 2010 at 11:26 AM, Juhana Helovuo juhe@iki.fi wrote:
I tried with the following:
/* TOP_MEM: the top of DRAM below 4G */
msr = rdmsr(TOP_MEM);
//hardwire this for testing
if (msr.lo > 0x80000000) {
	msr.lo = 0x80000000;
}
This code manages to set UMA to a lower address, but the effect is the same as with hardwiring UMA address. Boot proceeds past VGA BIOS, but results in random crashes and reboots.
Or did you mean some other hardwiring?
I thought more about the problem than the solution :)
Since the variable "msr" here is local to the routine, I don't see how it could have an effect on anything other than the UMA location and size. The value read from TOP_MEM2 isn't even used for anything but printing.
I was hoping that changing the uma location would be enough to affect the rest.
Could you try
if (msr.lo > 0xc0000000) { msr.lo = 0xc0000000; }
I'm thinking that would place uma at 0xb0000000 & coreboot tables at 0xafff0000.
Then could you send the log if that's still unstable.
Thanks, Myles
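The expected placement follows from capping the TOP_MEM value and subtracting. A minimal sketch in C, assuming the 256M UMA size and 64K table size printed in the earlier log and that both regions are placed directly below the capped value; the helper name and macros are hypothetical, not coreboot's:

```c
#include <stdint.h>

#define UMA_SIZE   0x10000000u	/* 256M, as printed in the earlier log */
#define TABLE_SIZE 0x00010000u	/* 64K table area, as printed in the log */

/* Hypothetical helper: cap the TOP_MEM value used for placement,
 * mirroring the "if (msr.lo > limit) msr.lo = limit;" test patch. */
static uint32_t clamp_top(uint32_t top_mem, uint32_t limit)
{
	return top_mem > limit ? limit : top_mem;
}
```

For top_mem = 0xe0000000 and limit = 0xc0000000, clamp_top() returns 0xc0000000, so UMA starts at 0xc0000000 - UMA_SIZE = 0xb0000000 and the tables at 0xb0000000 - TABLE_SIZE = 0xafff0000. The earlier 0x80000000 cap would place UMA at 0x70000000 by the same arithmetic.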
On Mon, 2010-09-13 at 11:36 -0600, Myles Watson wrote:
I was hoping that changing the uma location would be enough to affect the rest.
Could you try
if (msr.lo > 0xc0000000) { msr.lo = 0xc0000000; }
I'm thinking that would place uma at 0xb0000000 & coreboot tables at 0xafff0000.
Then could you send the log if that's still unstable.
Yes, the placement is exactly as you said.
This version is very unstable. On most attempts it does not get past the "raminit_amdmct" phase before it starts booting from the start.
If it gets to the payload, then VGA output is just random pixel noise and usually there is a spontaneous reboot in a few seconds.
The log is attached.
Best regards, Juhana Helovuo
On Mon, Sep 13, 2010 at 1:12 PM, Juhana Helovuo juhe@iki.fi wrote:
Yes, the placement is exactly as you said.
I can't see the conflict. I'll have to think about it.
Marc: Do you have any insight?
This version is very unstable. On most attempts it does not get past the "raminit_amdmct" phase before it starts booting from the start.
That's especially interesting because we haven't changed anything that early.
If it gets to the payload, then VGA output is just random pixel noise and usually there is a spontaneous reboot in a few seconds.
Thanks for the info, Myles
Myles Watson mylesgw@gmail.com writes:
So even though there are PCI resources located at 0xc0000000, RAM gets used for UMA at 0xd0000000 and tables get placed at 0xcfff0000.
PCI resources at 0xd0000000? Doesn't this conflict with the setting of NV_BottomIO in src/northbridge/amd/amdmct/wrappers/mcti_d.c?
On Mon, Sep 13, 2010 at 1:35 PM, Arne Georg Gleditsch arne.gleditsch@numascale.com wrote:
Myles Watson mylesgw@gmail.com writes:
So even though there are PCI resources located at 0xc0000000, RAM gets used for UMA at 0xd0000000 and tables get placed at 0xcfff0000.
PCI resources at 0xd0000000? Doesn't this conflict with the setting of NV_BottomIO in src/northbridge/amd/amdmct/wrappers/mcti_d.c?
That could be. I'm totally ignorant of the fam10 code outside of northbridge/amd/amdfam10.
Looking at the boot log, it doesn't seem unreasonable to have PCI resources starting at 0xc0000000.
It seems like we have a couple of options:
1. Reclaim the area used for MMCONF on this board
2. Move NV_BottomIO
Probably 2 is the best. It's too bad to have that hard coded.
Thanks, Myles
On Mon, Sep 13, 2010 at 2:31 PM, Myles Watson mylesgw@gmail.com wrote:
It seems like we have a couple of options:
1. Reclaim the area used for MMCONF on this board
2. Move NV_BottomIO
Probably 2 is the best. It's too bad to have that hard coded.
Yes, option 2 is best. These options may need to move to the mainboard directory now that they need to be changed.
Marc
On Mon, 2010-09-13 at 14:31 -0600, Myles Watson wrote:
It seems like we have a couple of options:
1. Reclaim the area used for MMCONF on this board
2. Move NV_BottomIO
Probably 2 is the best. It's too bad to have that hard coded.
Hello,
How should I proceed with finding the proper memory setup? I tried reading through the amdfam10 code and the RS780 southbridge code, but they are quite incomprehensible without a hardware programming manual. Is such detailed documentation freely available somewhere, or is it under NDA only?
It appears that the northbridge is quite clever in mapping memory, since Linux booted with Asus BIOS reports in /proc/iomem that there is usable RAM up to 0x19fffffff, i.e. addresses up to 6.5 GB, even though there is only 6 GB RAM installed.
The "extra" addresses apparently come from the fact that there is no real RAM mapped to some addresses below 4G, and these addresses are used for PCI memory-mapped IO.
Here are some parts of /proc/iomem:
00100000-cff8ffff : System RAM
d0000000-dfffffff : PCI Bus 0000:01
d0000000-dfffffff : 0000:01:05.0
e0000000-efffffff : PCI MMCONFIG 0 [00-ff]
e0000000-efffffff : pnp 00:0d
fdf00000-fdffffff : PCI Bus 0000:02
fe900000-feafffff : PCI Bus 0000:01
feb00000-febfffff : PCI Bus 0000:02
100000000-19fffffff : System RAM
My interpretation of this is roughly the following:
00000000-cfffffff is 3.25 GB RAM.
d0000000-dfffffff is not RAM, but is 256M memory-mapped IO addresses to VGA
e0000000-efffffff is 256M RAM, but not usable to OS, because it is the UMA
f0000000-ffffffff is not RAM, but IO addresses for more PCI, APICs, and other devices
100000000-19fffffff is 2.5 GB RAM
Does this look correct? If yes, then the northbridge is creating holes in RAM, in order to have PCI memory-mapped IO in 32-bit addresses.
Now, if I want Coreboot to cope with large memory and holes, what manuals do I need in order to understand and modify the code that manages this?
Best regards, Juhana Helovuo
On Thu, Sep 16, 2010 at 12:42 PM, Juhana Helovuo juhe@iki.fi wrote:
Hello,
How should I proceed with finding the proper memory setup? I tried reading through the amdfam10 code and the RS780 southbridge code, but they are quite incomprehensible without a hardware programming manual. Is such detailed documentation freely available somewhere, or is it under NDA only?
It appears that the northbridge is quite clever in mapping memory, since Linux booted with Asus BIOS reports in /proc/iomem that there is usable RAM up to 0x19fffffff, i.e. addresses up to 6.5 GB, even though there is only 6 GB RAM installed.
The "extra" addresses apparently come from the fact that there is no real RAM mapped to some addresses below 4G, and these addresses are used for PCI memory-mapped IO.
Here are some parts of /proc/iomem:
00100000-cff8ffff : System RAM
d0000000-dfffffff : PCI Bus 0000:01
d0000000-dfffffff : 0000:01:05.0
e0000000-efffffff : PCI MMCONFIG 0 [00-ff]
e0000000-efffffff : pnp 00:0d
fdf00000-fdffffff : PCI Bus 0000:02
fe900000-feafffff : PCI Bus 0000:01
feb00000-febfffff : PCI Bus 0000:02
100000000-19fffffff : System RAM
My interpretation of this is roughly the following:
00000000-cfffffff is 3.25 GB RAM.
d0000000-dfffffff is not RAM, but is 256M memory-mapped IO addresses to VGA
This is UMA. UMA is RAM that's been reserved for the display (Unified Memory Architecture)
e0000000-efffffff is 256M RAM, but not usable to OS, because it is the UMA
This is MMCONFIG, not UMA. It allows the OS to use memory mapped accesses for PCI configuration accesses.
f0000000-ffffffff is not RAM, but IO addresses for more PCI, APICs, and other devices
memory mapped IO.
100000000-19fffffff is 2.5 GB RAM
Does this look correct? If yes, then the northbridge is creating holes in RAM, in order to have PCI memory-mapped IO in 32-bit addresses.
That's right.
Now, if I want Coreboot to cope with large memory and holes, what manuals do I need in order to understand and modify the code that manages this?
BKDG (BIOS and Kernel Developers' Guide)
Here's the link for fam10h. The other ones are there too. http://support.amd.com/us/Processor_TechDocs/31116.pdf
Thanks, Myles
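As a cross-check, the numbers quoted in this thread add up to the installed 6 GB. A minimal sketch in C, assuming TOP_MEM marks the end of DRAM below 4G (UMA included, since UMA is RAM) and TOP_MEM2 marks the end of DRAM remapped above 4G; the function name is hypothetical:

```c
#include <stdint.h>

#define FOUR_GIB 0x100000000ull

/* DRAM below 4G is assumed to end at top_mem; DRAM remapped above 4G
 * is assumed to span [4G, top_mem2). Both are full 64-bit addresses,
 * i.e. ((uint64_t)msr.hi << 32) | msr.lo. */
static uint64_t total_dram(uint64_t top_mem, uint64_t top_mem2)
{
	uint64_t above = top_mem2 > FOUR_GIB ? top_mem2 - FOUR_GIB : 0;
	return top_mem + above;
}
```

With the values from the earlier coreboot log (TOP_MEM = 0xe0000000, TOP_MEM2 = 0x1a0000000), total_dram() returns 0x180000000, which is 6 GB. The 512M between 0xe0000000 and 4G holds no RAM, which is why RAM addresses reach 0x19fffffff (6.5 GB) in /proc/iomem.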