I am going to continue on this, but it will be slow; I have a lot going on.
The basic problem as I see it is that there is a little more magic in the build process than it can stand, and specific platforms are breaking.
In c_start.S and the constructed crt0.S there is a symbol called _start. Same name. There are also, in these files, a set of symbols that on some builds are needed for that build. So we need the .o from both files in some builds. But the conflicting _start symbols are causing trouble.
Simple attempts to change the name of the start symbol (e.g. change _start in crt0.S to _machine_reset_start) lead to a host of other undefined symbols, since once the _start is undefined, that file is not pulled in and the symbols in that file are not resolved.
I would rather not get into 'weak' symbols in assembly code files.
Anyway, if anyone gets a chance to look at this and has some ideas ... let me know. But we need to fix this situation.
ron
On Tue, 11 Feb 2003, Ronald G. Minnich wrote:
In c_start.S and the constructed crt0.S there is a symbol called _start. Same name. There are also, in these files, a set of symbols that on some builds are needed for that build. So we need the .o from both files in some builds. But the conflicting _start symbols are causing trouble.
more summary: the problem is limited to those cases where someone wants to use SERIAL_POST.
Seems pretty fixable.
ron
In c_start.S and the constructed crt0.S there is a symbol called _start. Same name. There are also, in these files, a set of symbols that on some builds are needed for that build. So we need the .o from both files in some builds. But the conflicting _start symbols are causing trouble.
more summary: the problem is limited to those cases where someone wants to use SERIAL_POST.
Seems pretty fixable.
ron
Any more ideas on this? I chased it around a bit but there sure seems to be a symbol conflict on _start.
-Steve
more summary: the problem is limited to those cases where someone wants to use SERIAL_POST.
Seems pretty fixable.
Any more ideas on this? I chased it around a bit but there sure seems to be a symbol conflict on _start.
Okay I see the point, these are separate pieces of code. Looks like the post is inlined unless SERIAL_POST is set, at which point console_tx_al is called in crt0. The address of console_tx_al is not fixed at the point c_start.o is linked. Not easy to fix, maybe someone else has some ideas.
-Steve
"Steve M. Gehlbach" steve@nexpath.com writes:
more summary: the problem is limited to those cases where someone wants to use SERIAL_POST.
Seems pretty fixable.
Any more ideas on this? I chased it around a bit but there sure seems to be a symbol conflict on _start.
Okay I see the point, these are separate pieces of code. Looks like the post is inlined unless SERIAL_POST is set, at which point console_tx_al is called in crt0. The address of console_tx_al is not fixed at the point c_start.o is linked. Not easy to fix, maybe someone else has some ideas.
O.k. I believe I see what is going on.
When CONFIG_COMPRESS was introduced, intel_chip_post simply yielded a post code. So it was safe to call it from both binaries.
Then someone modified intel_chip_post to change its behavior when SERIAL_POST is set, so that it outputs the post code to the serial port.
That is so bad, and broken.
It breaks the compile of c_start.S because it attempts to call code in another binary. It potentially breaks all callers of intel_chip_post because the set of registers stomped has gone from %al to %eax, %esp, and %edx.
Extremely unexpected.
If we want post codes to show up on the serial port even from the assembly code, we need to introduce another macro and carefully replace the users of intel_chip_post, rather than change the calling conventions of intel_chip_post.
So for the short term the proper fix is to revert the recent change to intel_chip_post....
Eric
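A rough sketch in C of the "another macro" idea may help make the proposal concrete. It is only a model under stated assumptions: the real intel_chip_post is an assembly macro, and here fake_outb stands in for the outb instruction so the code can run without touching hardware. The name intel_chip_post_serial, the SERIAL_PORT value, and the idea of writing the raw byte to the serial port (rather than formatted hex digits) are all hypothetical simplifications, not taken from the tree.

```c
#include <assert.h>
#include <stddef.h>

#define POST_PORT   0x80    /* conventional POST code port */
#define SERIAL_PORT 0x3f8   /* assumed COM1 data register, for illustration */

struct io_write { unsigned short port; unsigned char value; };

static struct io_write io_log[16];
static size_t io_count;

/* Stand-in for outb: record the write instead of touching hardware. */
static void fake_outb(unsigned char value, unsigned short port)
{
    assert(io_count < sizeof(io_log) / sizeof(io_log[0]));
    io_log[io_count].port = port;
    io_log[io_count].value = value;
    io_count++;
}

/* Existing macro, left unchanged: a single write to the POST port. */
#define intel_chip_post(value) fake_outb((value), POST_PORT)

/*
 * Hypothetical second macro: POST port plus serial port.
 * Only code that can afford the extra work would use this one.
 */
#define intel_chip_post_serial(value) do {      \
        unsigned char v_ = (value);             \
        intel_chip_post(v_);                    \
        fake_outb(v_, SERIAL_PORT);             \
    } while (0)

/* Caller that must keep the original, minimal behavior. */
static void early_code(void) { intel_chip_post(0x10); }

/* Caller that has been deliberately switched to the new macro. */
static void late_code(void) { intel_chip_post_serial(0x20); }
```

The point of the split is that existing callers of intel_chip_post keep exactly the behavior they were written against, and each move to the serial variant is an explicit, reviewable change.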
On 12 Feb 2003, Eric W. Biederman wrote:
Then someone modified intel_chip_post to change its behavior when SERIAL_POST is set, so that it outputs the post code to the serial port.
OK, will the guilty party please confess to me in private; no need to bother the list :-)
If we want post codes to show up on the serial port even from the assembly code, we need to introduce another macro and carefully replace the users of intel_chip_post, rather than change the calling conventions of intel_chip_post.
ok. I'll do that.
But I'm not sure it is a good idea to have a common linuxbios.a for both the compressed image and the initial image. The error was very hard for people to understand. At least this is what I think is going on, since symbols from one image were referenced from another, and the only way I could see this happening was that the 1st and 2nd stage symbols were all in the one .a
comments?
ron
"Ronald G. Minnich" rminnich@lanl.gov writes:
On 12 Feb 2003, Eric W. Biederman wrote:
Then someone modified intel_chip_post to change its behavior when SERIAL_POST is set, so that it outputs the post code to the serial port.
OK, will the guilty party please confess to me in private; no need to bother the list :-)
If we want post codes to show up on the serial port even from the assembly code, we need to introduce another macro and carefully replace the users of intel_chip_post, rather than change the calling conventions of intel_chip_post.
ok. I'll do that.
But I'm not sure it is a good idea to have a common linuxbios.a for both the compressed image and the initial image. The error was very hard for people to understand. At least this is what I think is going on, since symbols from one image were referenced from another, and the only way I could see this happening was that the 1st and 2nd stage symbols were all in the one .a
comments?
I am not certain this is happening. I guess we need to track down the multiple _start issue and see how that was triggered. In general everything in the 1st stage is included into crt0.S, so it has no need to look at linuxbios.a. So if something needs to happen it should just be a small tweak of the rules.
Ron you added a global to crt0.S and crt0.o was then linked with the 2nd stage. I guess the bug is that somehow crt0.o is being added to linuxbios.a. That looks like a leftover from before CONFIG_COMPRESS.
O.k. I have tracked the problem down. crt0.o is being manually added to OBJECTS-1 by util/config/NLBConfig.py, and OBJECTS-1 is used to populate linuxbios.a. So we should be able to change one line of code and the multiple _start issue will go away.
Part of the problem was that someone was using 1st stage defines in the 2nd stage, which is quite unexpected. intel_chip_post is possibly the one define that is safe to use in both places.
Eric
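For readers following the diagnosis, here is a minimal sketch in C of the archive behavior being described. It is a toy model, not the real linker: it assumes a member is pulled out of an archive only when it defines a symbol that an already-linked object still needs. The struct object type and the functions link_objects, count_definitions, and start_definitions are hypothetical names made up for this illustration; the file and symbol names (c_start.o, crt0.o, _start, console_tx_al) come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One object file: symbols it defines, symbols it needs (NULL-terminated). */
struct object {
    const char *name;
    const char *defines[MAX_SYMS];
    const char *needs[MAX_SYMS];
    int linked;                 /* 1 once the object is part of the link */
};

static int defines_symbol(const struct object *o, const char *sym)
{
    for (size_t i = 0; i < MAX_SYMS && o->defines[i]; i++)
        if (strcmp(o->defines[i], sym) == 0)
            return 1;
    return 0;
}

/* How many linked objects define sym; 2 or more means a duplicate symbol. */
static int count_definitions(const struct object *objs, size_t n, const char *sym)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        if (objs[i].linked && defines_symbol(&objs[i], sym))
            count++;
    return count;
}

/*
 * Pull in unlinked members only when they satisfy a pending reference.
 * Returns the number of references that remain undefined.
 */
static int link_objects(struct object *objs, size_t n)
{
    int progress = 1;
    int undefined = 0;

    assert(objs != NULL);
    while (progress) {
        progress = 0;
        for (size_t i = 0; i < n; i++) {
            if (!objs[i].linked)
                continue;
            for (size_t s = 0; s < MAX_SYMS && objs[i].needs[s]; s++) {
                const char *sym = objs[i].needs[s];
                if (count_definitions(objs, n, sym) > 0)
                    continue;
                for (size_t m = 0; m < n; m++) {
                    if (!objs[m].linked && defines_symbol(&objs[m], sym)) {
                        objs[m].linked = 1;   /* whole member comes along */
                        progress = 1;
                        break;
                    }
                }
            }
        }
    }
    for (size_t i = 0; i < n; i++)
        for (size_t s = 0; objs[i].linked && s < MAX_SYMS && objs[i].needs[s]; s++)
            if (count_definitions(objs, n, objs[i].needs[s]) == 0)
                undefined++;
    return undefined;
}

/* Number of _start definitions in the modeled 2nd-stage link. */
static int start_definitions(int serial_post)
{
    struct object objs[2] = {
        /* c_start.o is named on the link line, so it is always linked. */
        { "c_start.o", { "_start" }, { serial_post ? "console_tx_al" : NULL }, 1 },
        /* crt0.o sits in linuxbios.a and is pulled in only on demand. */
        { "crt0.o", { "_start", "console_tx_al" }, { NULL }, 0 },
    };
    link_objects(objs, 2);
    return count_definitions(objs, 2, "_start");
}
```

Under these assumptions, start_definitions(0) finds one _start and start_definitions(1) finds two, which is consistent with the conflict showing up only in SERIAL_POST builds. Keeping crt0.o out of linuxbios.a removes the second definition in this model.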
On 12 Feb 2003, Eric W. Biederman wrote:
Part of the problem was that someone was using 1st stage defines in the 2nd stage, which is quite unexpected. intel_chip_post is possibly the one define that is safe to use in both places.
no argument that it was a mistake in what the person did, my only concern was that so many (me too!) got so confused :-)
I love the compressed 2nd stage ... you were right, the shrinkage is very useful. My fallback now does log level 9 (!) and still fits in 40k!
ron
Ron:
I sent some diffs to you the other day for vga etc. that don't seem to be in the repository yet. Should I send them to you again?
The changes expanded the vga splash screen file formats and also added STATUS files to the two motherboards I support.
-Steve
yes, please resend, I'm sorry.
ron
On 12 Feb 2003, Eric W. Biederman wrote:
Part of the problem was that someone was using 1st stage defines in the 2nd stage, which is quite unexpected. intel_chip_post is possibly the one define that is safe to use in both places.
no argument that it was a mistake in what the person did, my only concern was that so many (me too!) got so confused :-)
I love the compressed 2nd stage ... you were right, the shrinkage is very useful. My fallback now does log level 9 (!) and still fits in 40k!
ron
Maybe we could set a #define STAGE2 at the top of c_start.S, and ifdef in intel.h on STAGE2. The serial posts are useful to me since I am a holdout on getting a PCI post board.
-Steve
On Wed, 12 Feb 2003, Steve M. Gehlbach wrote:
Maybe we could set a #define STAGE2 at the top of c_start.S, and ifdef in intel.h on STAGE2. The serial posts are useful to me since I am a holdout on getting a PCI post board.
that and the fact that POST cards won't even work in some new motherboards.
ron
"Steve M. Gehlbach" steve@nexpath.com writes:
On 12 Feb 2003, Eric W. Biederman wrote:
Part of the problem was that someone was using 1st stage defines in the 2nd stage, which is quite unexpected. intel_chip_post is possibly the one define that is safe to use in both places.
no argument that it was a mistake in what the person did, my only concern was that so many (me too!) got so confused :-)
I love the compressed 2nd stage ... you were right, the shrinkage is very useful. My fallback now does log level 9 (!) and still fits in 40k!
ron
Maybe we could set a #define STAGE2 at the top of c_start.S, and ifdef in intel.h on STAGE2. The serial posts are useful to me since I am a holdout on getting a PCI post board.
And I have one but have never actually used it.
intel_chip_post cannot be modified to output to the serial port. That requires more registers, making it an unsafe transformation.
The result would likely be mysterious breakage instead of the obvious breakage we see now.
Having an alternative macro that we use where it is safe is a reasonable solution. And for that we can simply not place it in c_start.S
Eric
Maybe we could set a #define STAGE2 at the top of c_start.S, and ifdef in intel.h on STAGE2. The serial posts are useful to me since I am a holdout on getting a PCI post board.
And I have one but have never actually used it.
intel_chip_post cannot be modified to output to the serial port. That requires more registers, making it an unsafe transformation.
The result would likely be mysterious breakage instead of the obvious breakage we see now.
Having an alternative macro that we use where it is safe is a reasonable solution. And for that we can simply not place it in c_start.S
Eric
Right, it doesn't seem like we will get a serial post output in c_start.S without a lot of work. But it is nice to have it everywhere else.
I thought a STAGE2 define in intel.h and c_start.S would help document the fact that stage2 was "different" and be a minimal change.
I suppose we should also keep in mind that the post macro is only two lines of code...
-Steve
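The STAGE2 guard suggested above could look roughly like the following C sketch. This is an illustration under assumptions, not the actual intel.h: POST, record_write, and the port constants are hypothetical names, the real macro is assembly, and the writes are recorded in an array instead of going to hardware. It does not address the register-usage objection raised earlier; it only shows how the guard would select the plain variant for second-stage code.

```c
#include <assert.h>
#include <stddef.h>

/* Build settings: serial POST requested, and this file is 2nd-stage code. */
#define SERIAL_POST 1
#define STAGE2 1            /* would sit at the top of c_start.S */

#define POST_PORT   0x80    /* conventional POST code port */
#define SERIAL_PORT 0x3f8   /* assumed COM1 data register, for illustration */

struct io_write { unsigned short port; unsigned char value; };

static struct io_write writes[8];
static size_t write_count;

/* Stand-in for port output: record the write instead of doing I/O. */
static void record_write(unsigned short port, unsigned char value)
{
    assert(write_count < sizeof(writes) / sizeof(writes[0]));
    writes[write_count].port = port;
    writes[write_count].value = value;
    write_count++;
}

/* What the header might do: serial output only outside stage 2. */
#if defined(SERIAL_POST) && !defined(STAGE2)
#define POST(v) do {                            \
        record_write(POST_PORT, (v));           \
        record_write(SERIAL_PORT, (v));         \
    } while (0)
#else
#define POST(v) record_write(POST_PORT, (v))
#endif

/* Second-stage code gets the plain POST even though SERIAL_POST is set. */
static void stage2_entry(void) { POST(0x13); }
```

With both defines set as shown, stage2_entry produces a single write to the POST port and nothing on the serial port; removing the STAGE2 define would select the two-write variant.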
On Tue, 11 Feb 2003, Steve M. Gehlbach wrote:
more summary: the problem is limited to those cases where someone wants to use SERIAL_POST.
Any more ideas on this? I chased it around a bit but there sure seems to be a symbol conflict on _start.
ideas but not time, but I have a kludgy fix that will work for now; I'll try to commit today.
ron
"Ronald G. Minnich" rminnich@lanl.gov writes:
I am going to continue on this, but it will be slow; I have a lot going on.
The basic problem as I see it is that there is a little more magic in the build process than it can stand, and specific platforms are breaking.
In c_start.S and the constructed crt0.S there is a symbol called _start. Same name.
Yes but in different binaries. c_start.S and crt0.S should never be linked together.
There is one binary that is all of the code before ram is initialized. And another that is code after ram is initialized.
There are also, in these files, a set of symbols that on some builds are needed for that build. So we need the .o from both files in some builds. But the conflicting _start symbols are causing trouble.
How? What links them together?
Simple attempts to change the name of the start symbol (e.g. change _start in crt0.S to _machine_reset_start) lead to a host of other undefined symbols, since once the _start is undefined, that file is not pulled in and the symbols in that file are not resolved.
I would rather not get into 'weak' symbols in assembly code files.
Anyway, if anyone gets a chance to look at this and has some ideas ... let me know. But we need to fix this situation.
I need to see what the real problem is that links two different binaries together before I can give a good suggestion.
Eric
On 12 Feb 2003, Eric W. Biederman wrote:
Yes but in different binaries. c_start.S and crt0.S should never be linked together.
agreed. But they seem to be ending up in linuxbios.a, and that's part of the problem.
There is one binary that is all of the code before ram is initialized. And another that is code after ram is initialized.
understood.
How? What links them together?
the makefile.
I need to see what the real problem is that links two different binaries together before I can give a good suggestion.
This will happen for the ms7308 mainboards if SERIAL_POST is defined.
ron