Hi, all
I am developing the EFI payload for LB (see http://code.google.com/soc/2007/coresystems/appinfo.html?csaid=83B941F46A422F1A).
I am also unifying the resource management of LB and EFI. In more detail, I am parsing the LB table and converting it to the HOB structures of Tiano EFI.
I described this idea several days ago. (http://www.linuxbios.org/pipermail/linuxbios/2007-June/022340.html)
But there is a problem:
Tiano EFI describes system resources by type and attribute, as in the following definitions:
//*********************************************************
// EFI_RESOURCE_TYPE (define the resource type)
//*********************************************************
typedef UINT32 EFI_RESOURCE_TYPE;
#define EFI_RESOURCE_SYSTEM_MEMORY     0x00000000  // system memory resource
#define EFI_RESOURCE_MEMORY_MAPPED_IO  0x00000001  // memory-mapped IO
#define EFI_RESOURCE_IO                0x00000002  // Processor IO space
#define EFI_RESOURCE_FIRMWARE_DEVICE   0x00000003  // Memory-mapped firmware device
#define EFI_RESOURCE_MEMORY_RESERVED   0x00000005  // Reserved memory address space
#define EFI_RESOURCE_IO_RESERVED       0x00000006  // Reserved IO address space
//*********************************************************
// EFI_RESOURCE_ATTRIBUTE_TYPE (define the resource attributes)
//*********************************************************
typedef UINT32 EFI_RESOURCE_ATTRIBUTE_TYPE;
// These types can be ORed together as needed.
#define EFI_RESOURCE_ATTRIBUTE_PRESENT      0x00000001  // the memory region exists
#define EFI_RESOURCE_ATTRIBUTE_INITIALIZED  0x00000002  // the memory region has been initialized
#define EFI_RESOURCE_ATTRIBUTE_TESTED       0x00000004  // the memory region has been tested
There are similar definitions in LB's /src/include/device/resource.h file.
But in the LB tables I can only find the
IORESOURCE_MEM | IORESOURCE_CACHEABLE (system memory) resource types and the
LB_MEM_RAM and LB_MEM_TABLE attributes.
So how do I identify the other resource types and attributes in LB?
Or, if the current LB implementation describes only these types and attributes,
do I need to construct the remaining resource types and attributes in the payload itself, as required?
By the way, I built LB for QEMU emulation and have only traced the source code of the QEMU target.
Thanks & Best regards,
Xiong Yi
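A minimal sketch of the conversion asked about above, assuming a payload that maps each LB memory type to an EFI resource type plus ORed attributes. The EFI values are the ones quoted in the message; the LB_MEM_* numbers, the struct efi_resource_desc holder, the lb_mem_to_efi() helper and the choice to report table memory as reserved are all assumptions made for illustration, not the LB or Tiano definitions.

```c
#include <stdint.h>

typedef uint32_t EFI_RESOURCE_TYPE;
typedef uint32_t EFI_RESOURCE_ATTRIBUTE_TYPE;

/* Values copied from the definitions quoted in the message. */
#define EFI_RESOURCE_SYSTEM_MEMORY         0x00000000
#define EFI_RESOURCE_MEMORY_RESERVED       0x00000005
#define EFI_RESOURCE_ATTRIBUTE_PRESENT     0x00000001
#define EFI_RESOURCE_ATTRIBUTE_INITIALIZED 0x00000002
#define EFI_RESOURCE_ATTRIBUTE_TESTED      0x00000004

/* ASSUMPTION: placeholder values; take the real ones from the LB table header. */
#define LB_MEM_RAM   1
#define LB_MEM_TABLE 16

/* Hypothetical holder for the two fields a resource descriptor needs. */
struct efi_resource_desc {
	EFI_RESOURCE_TYPE type;
	EFI_RESOURCE_ATTRIBUTE_TYPE attr;
};

/* Returns 0 on success, -1 for an LB memory type this sketch does not map. */
static int lb_mem_to_efi(uint32_t lb_type, struct efi_resource_desc *out)
{
	switch (lb_type) {
	case LB_MEM_RAM:
		out->type = EFI_RESOURCE_SYSTEM_MEMORY;
		out->attr = EFI_RESOURCE_ATTRIBUTE_PRESENT |
			    EFI_RESOURCE_ATTRIBUTE_INITIALIZED |
			    EFI_RESOURCE_ATTRIBUTE_TESTED;
		return 0;
	case LB_MEM_TABLE:
		/* ASSUMPTION: memory holding the tables is reported as reserved. */
		out->type = EFI_RESOURCE_MEMORY_RESERVED;
		out->attr = EFI_RESOURCE_ATTRIBUTE_PRESENT |
			    EFI_RESOURCE_ATTRIBUTE_INITIALIZED;
		return 0;
	default:
		return -1;
	}
}
```

Types that the LB tables do not describe at all (MMIO, I/O space) would still have to be constructed by the payload from another source; this sketch only covers the two types named in the question.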
Hi all, the diff for the latest winflashrom code is attached. There are still a few issues with this code.
1. I've tested it on an ICH-5 board. However, the BIOS binary that it reads from the board is 514KB. It should be only 512KB. I'm still trying to find the bug. Any info from the ICH-x code maintainer?
2. The PCI library is still a "brute force" approach. But, it works so far. I'll remove it soon enough.
More updates to come.
Cheers, Darmawan
Hi,
On Mon, Jul 02, 2007 at 04:17:51PM +0700, Darmawan Salihun wrote:
The diff for the latest winflashrom code is attached.
The diff is reversed, so a bit hard to read. Please do svn diff -r1:HEAD next time.
Also, instead of commenting out non-Windows code and adding new Windows-only code, please suggest good ways to abstract the code.
There are still a few issues with this code.
- I've tested it on ICH-5 board. However, the BIOS binary that it
reads from it is 514KB. It should be only 512KB. I'm still trying to find out the bug. Any info from ICH-x code maintainer?
This is probably because Win32 does LF->CR+LF translation when the fopen() mode is "r" or "w" - Win32 introduces "rb" and "wb" for binary files.
- The PCI library is still a "brute force" approach. But, it works
so far. I'll remove it soon enough.
Please make this a high priority. Until this is fixed, the driver is a bluescreen waiting to happen and we can't really responsibly ask people to test it that way. :\
More updates to come.
This is good stuff!
On another note, is it possible to load the driver directly from a resource without writing it out to a file first?
//Peter
Hi, Peter Stuge wrote:
Hi,
On Mon, Jul 02, 2007 at 04:17:51PM +0700, Darmawan Salihun wrote:
The diff for the latest winflashrom code is attached.
The diff is reversed, so a bit hard to read. Please do svn diff -r1:HEAD next time.
ok :)
Also, instead of commenting out non-Windows code and adding new Windows-only code, please suggest good ways to abstract the code.
Basically, I'm in the process of designing the abstraction right now ;-). I'll post it when I'm done. Nonetheless, I need a "working" code as a comparison to know the details of the "incompatible" parts between the windows version and the *NIX version.
There are still a few issues with this code.
- I've tested it on ICH-5 board. However, the BIOS binary that it
reads from it is 514KB. It should be only 512KB. I'm still trying to find out the bug. Any info from ICH-x code maintainer?
This is probably because Win32 does LF->CR+LF translation when the fopen() mode is "r" or "w" - Win32 introduces "rb" and "wb" for binary files.
aha.., I see. I forgot about it :-/. I've fixed such a bug when porting flashrom version 1.23 back then ;-). Thanks for the hint.
- The PCI library is still a "brute force" approach. But, it works
so far. I'll remove it soon enough.
Please make this a high priority. Until this is fixed, the driver is a bluescreen waiting to happen and we can't really responsibly ask people to test it that way. :\
yes, it's on the top priority list right now.
More updates to come.
This is good stuff!
On another note, is it possible to load the driver directly from a resource without writing it out to a file first?
I don't know that for sure. But, I'll try to find out. Maybe using memory-mapped file would allow us to do such a thing.
Anyway, I'll be off for about 4-5 days. So, updates will come only by next week.
Cheers, Darmawan
Peter Stuge wrote:
Hi,
On Mon, Jul 02, 2007 at 04:17:51PM +0700, Darmawan Salihun wrote:
The diff for the latest winflashrom code is attached.
The diff is reversed, so a bit hard to read. Please do svn diff -r1:HEAD next time.
Also, instead of commenting out non-Windows code and adding new Windows-only code, please suggest good ways to abstract the code.
There are a few things that need to be abstracted in order to unify the Unix-based code and the Windows version. After thinking about the solution for a while, I found that it may be better to make a kind of abstraction layer for the "incompatible" parts that imposes only minor changes on the current flashrom code base, before doing a "redesign" of the overall source code to address the Windows-*NIX "incompatibilities". This is only a temporary workaround. The most obvious parts are:
1. Libpci abstraction for Windows. In this case the libpci logic in the flashrom code base need not be changed. I will make a "libpci for Windows" that doesn't change the logic within the current flashrom code. Even after the redesign, we might choose to preserve this part.
2. Direct I/O access abstraction. I think, in the short term, I will just provide a simple direct I/O "compatibility layer" for inX,outX family of functions in the winflashrom.
3. File I/O abstraction. We might need this because fopen(..) does not behave exactly the same on Windows and *NIX.
Remember that this is only a short term solution. Anyway, do you think it's good to make a branch for the current flashrom code in order to develop the unified (redesigned) flashrom that will host a single code base for both the *NIX version and winflashrom? I mean the branch will be merged into the trunk once we have a well tested unified version of flashrom/winflashrom.
Another note that I have difficulty in limiting the direct I/O access in the current driver because I don't know exactly which port to give access to and which one to block. Below is what I've found from the current flashrom code so far. I/O port usage:
0x2E (Winbond W836_INDEX port)
0x2F (Winbond W836_DATA port)
0x4E
0x4F
0xCD6
0xCD7
0xCFC - 0xCFF (PCI I/O port on x86)
0xC6F
a "base + 0x4D" in Via Epia motherboard
0xE1
0xE800 (what port is this?)
0xE801
0xE802
0xE803
0xE804
0xE807
0xEB
0xFF
I couldn't conclude the I/O port ranges to open from the port list above because there is still an unknown (I think it's dynamically relocatable) I/O port such as the one used by the EPIA board. Any explanation on this issue?
That's all for now.
Regards, Darmawan Salihun
On Thu, Jul 12, 2007 at 02:19:47PM +0700, Darmawan Salihun wrote:
- Libpci abstraction for Windows. In this case the libpci logic in
the flashrom code base need not be changed. I will make a "libpci for Windows" that doesn't change the logic within the current flashrom code. Even after the redesign, we might choose to preserve this part.
Perhaps you can make a small libpci-win32 package out of it? I'm sure others would appreciate it as well.
- Direct I/O access abstraction. I think, in the short term, I
will just provide a simple direct I/O "compatibility layer" for inX,outX family of functions in the winflashrom.
This can just be some #defines, use some functions in Windows and the normal in()/out() stuff otherwise.
Put this in some header:
#ifndef __WIN32__
#define my_inb(a,b) inb(a,b)
#define my_inw(a,b) inw(a,b)
#define my_inl(a,b) inl(a,b)
#endif /* __WIN32__ */
..and then implement my_in[bwl]() in a .c file together with all the other stuff needed for Windows. Have a conditional in the Makefile to build and link that file on Windows only. Voila, done.
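A minimal sketch of such a header, with two caveats. First, glibc's inb() takes only the port and outb() takes (value, port), so the wrappers below use that arity rather than the two-argument form in the macros above. Second, the USE_REAL_PORT_IO switch, the simulated fake_ports array and the set_bits() helper are assumptions added so the sketch compiles and runs without I/O privileges; they are not part of flashrom.

```c
#include <stdint.h>

#if defined(_WIN32)
/* Windows: only declared here; a Windows-only .c file would implement them. */
uint8_t my_inb(uint16_t port);
void my_outb(uint8_t val, uint16_t port);
#elif defined(USE_REAL_PORT_IO)
/* Linux/x86 with glibc: forward to the real port I/O functions. */
#include <sys/io.h>
#define my_inb(port)       inb(port)
#define my_outb(val, port) outb((val), (port))
#else
/* Default for this sketch: a simulated port space, so no privileges are needed. */
static uint8_t fake_ports[0x10000];
static inline uint8_t my_inb(uint16_t port) { return fake_ports[port]; }
static inline void my_outb(uint8_t val, uint16_t port) { fake_ports[port] = val; }
#endif

/* Caller code looks the same on every platform: read, set bits, write back. */
static uint8_t set_bits(uint16_t port, uint8_t mask)
{
	uint8_t val = my_inb(port);
	val |= mask;
	my_outb(val, port);
	return val;
}
```

The point of the layout is that only the header knows which platform it is on; code such as set_bits() never needs an #ifdef.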
- File I/O abstraction. We might need this because fopen(..)
behaves not exactly the same in Windows and *NIX.
C89 has the b mode so just change all fopen calls to use "rb", "wb", "ab", "rb+", "wb+" and "ab+" respectively.
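As a small illustration of the binary-mode advice, the sketch below writes a buffer with "wb" and measures the file with "rb". The helper names write_image() and image_size() are hypothetical. With text mode on Win32, every 0x0A byte written would grow into two bytes, which is one plausible way a 512KB image could come out larger on disk.

```c
#include <stdio.h>

/* Writes len bytes to path in binary mode; returns 0 on success, -1 on error. */
static int write_image(const char *path, const unsigned char *buf, size_t len)
{
	FILE *f = fopen(path, "wb");   /* "b": no LF -> CR+LF translation */
	size_t written;

	if (!f)
		return -1;
	written = fwrite(buf, 1, len, f);
	if (fclose(f) != 0 || written != len)
		return -1;
	return 0;
}

/* Returns the size of the file in bytes, or -1 on error. */
static long image_size(const char *path)
{
	FILE *f = fopen(path, "rb");
	long size = -1;

	if (!f)
		return -1;
	if (fseek(f, 0, SEEK_END) == 0)
		size = ftell(f);
	fclose(f);
	return size;
}
```

On *NIX the "b" flag has no effect, so the same calls can be used unchanged on both platforms.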
do you think it's good to make a branch for the current flashrom code in order to develop the unified (redesigned) flashrom that will host a single code base for both the *NIX version and winflashrom?
Is that really needed? Just submit nice and neat patches against trunk to get them reviewed, acked and applied.
Another note that I have difficulty in limiting the direct I/O access in the current driver because I don't know exactly which port to give access to and which one to block. Below is what I've found from the current flashrom code so far. I/O port usage:
0x2E (Winbond W836_INDEX port)
0x2F (Winbond W836_DATA port)
0x4E
0x4F
0xCD6
0xCD7
0xCFC - 0xCFF (PCI I/O port on x86)
0xC6F
a "base + 0x4D" in Via Epia motherboard
0xE1
0xE800 (what port is this?)
0xE801
0xE802
0xE803
0xE804
0xE807
0xEB
0xFF
I couldn't conclude the I/O port ranges to open from the port list above because there is still an unknown (I think it's dynamically relocatable) I/O port such as the one used by the EPIA board. Any explanation on this issue?
This list is still much better than allowing everything!
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
//Peter
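One way the driver could restrict access to a known list is a small table of port ranges plus one optional range discovered at run time (for example the EPIA "base + 0x4D" port). This is only a sketch: the table holds a subset of the ports listed above, and struct port_range and port_allowed() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct port_range {
	uint16_t first;
	uint16_t last;   /* inclusive */
};

/* Fixed ports taken from the list in this thread (subset, for illustration). */
static const struct port_range allowed_ports[] = {
	{ 0x2E, 0x2F },     /* Winbond W836 index/data */
	{ 0x4E, 0x4F },
	{ 0xCD6, 0xCD7 },
	{ 0xCFC, 0xCFF },   /* PCI I/O port on x86 */
};

/*
 * Returns 1 if the port is in the fixed table or in the optional
 * dynamically discovered range, 0 otherwise. extra may be NULL.
 */
static int port_allowed(uint16_t port, const struct port_range *extra)
{
	size_t i;

	for (i = 0; i < sizeof(allowed_ports) / sizeof(allowed_ports[0]); i++)
		if (port >= allowed_ports[i].first && port <= allowed_ports[i].last)
			return 1;
	if (extra && port >= extra->first && port <= extra->last)
		return 1;
	return 0;
}
```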
Peter Stuge wrote:
On Thu, Jul 12, 2007 at 02:19:47PM +0700, Darmawan Salihun wrote:
- Libpci abstraction for Windows. In this case the libpci logic in
the flashrom code base need not be changed. I will make a "libpci for Windows" that doesn't change the logic within the current flashrom code. Even after the redesign, we might choose to preserve this part.
Perhaps you can make a small libpci-win32 package out of it? I'm sure others would appreciate it as well.
It should be trivial to do that.
- Direct I/O access abstraction. I think, in the short term, I
will just provide a simple direct I/O "compatibility layer" for inX,outX family of functions in the winflashrom.
This can just be some #defines, use some functions in Windows and the normal in()/out() stuff otherwise.
Put this in some header:
#ifndef __WIN32__
#define my_inb(a,b) inb(a,b)
#define my_inw(a,b) inw(a,b)
#define my_inl(a,b) inl(a,b)
#endif /* __WIN32__ */
..and then implement my_in[bwl]() in a .c file together with all the other stuff needed for Windows. Have a conditional in the Makefile to build and link that file on Windows only. Voila, done.
- File I/O abstraction. We might need this because fopen(..)
behaves not exactly the same in Windows and *NIX.
C89 has the b mode so just change all fopen calls to use "rb", "wb", "ab", "rb+", "wb+" and "ab+" respectively.
OK
do you think it's good to make a branch for the current flashrom code in order to develop the unified (redesigned) flashrom that will host a single code base for both the *NIX version and winflashrom?
Is that really needed? Just submit nice and neat patches against trunk to get them reviewed, acked and applied.
I see.
Another note that I have difficulty in limiting the direct I/O access in the current driver because I don't know exactly which port to give access to and which one to block. Below is what I've found from the current flashrom code so far. I/O port usage:
0x2E (Winbond W836_INDEX port)
0x2F (Winbond W836_DATA port)
0x4E
0x4F
0xCD6
0xCD7
0xCFC - 0xCFF (PCI I/O port on x86)
0xC6F
a "base + 0x4D" in Via Epia motherboard
0xE1
0xE800 (what port is this?)
0xE801
0xE802
0xE803
0xE804
0xE807
0xEB
0xFF
I couldn't conclude the I/O port ranges to open from the port list above because there is still an unknown (I think it's dynamically relocatable) I/O port such as the one used by the EPIA board. Any explanation on this issue?
This list is still much better than allowing everything!
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
The base is in a PCI config register. Relevant code as follows:
------------------- board_enable.c ----------------------------------------
static int board_via_epia_m(const char *name)
{
	...
	/* Get Power Management IO address. */
	base = pci_read_word(dev, 0x88) & 0xFF80;

	/* enable GPIO15 which is connected to write protect. */
	val = inb(base + 0x4D);
	val |= 0x80;
	outb(val, base + 0x4D);
	...
}
Cheers,
Darmawan
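The arithmetic in the quoted board_via_epia_m() can be checked in isolation. The sketch below assumes nothing beyond the constants in that code (config offset 0x88, mask 0xFF80, register offset 0x4D, bit 0x80); the macro and function names are made up for illustration, and no hardware is touched.

```c
#include <stdint.h>

#define PM_IO_BASE_MASK  0xFF80   /* mask applied to the word read from config offset 0x88 */
#define GPIO_REG_OFFSET  0x4D     /* register offset used in the quoted code */
#define GPIO15_ENABLE    0x80     /* bit set by the quoted code */

/* Computes the I/O port that the quoted code reads and writes. */
static uint16_t epia_gpio_port(uint16_t pm_base_cfg_word)
{
	return (uint16_t)((pm_base_cfg_word & PM_IO_BASE_MASK) + GPIO_REG_OFFSET);
}

/* Returns the register value with the GPIO15 bit set, other bits untouched. */
static uint8_t epia_gpio_enable(uint8_t current)
{
	return (uint8_t)(current | GPIO15_ENABLE);
}
```

Because the low seven bits of the config word are masked off, the resulting port depends entirely on what the firmware programmed into that register, which is why it cannot appear in a fixed port list.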
On Thu, Jul 12, 2007 at 05:13:38PM +0700, Darmawan Salihun wrote:
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
The base is in a PCI config register. Relevant code as follows:
base = pci_read_word(dev, 0x88) & 0xFF80;
val = inb(base + 0x4D);
val |= 0x80;
outb(val, base + 0x4D);
Ok! Put all of board_enable.c in the kernel driver and add some way for the app to call any function in the board_pciid_enables list, in the kernel driver. Sort of RPC but over a app/kernel split instead of over a network.
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
On the other hand, perhaps the app will need to do some PCI accesses to choose the right board_enable function and then it's better to do most of them in the app.
What do you think?
Another general matter, make sure the app and driver can exchange version numbers. That way both the driver and the app could have compatibility code in order to be backwards compatible.
I would like the app to exit with an error message if it can't agree with the kernel driver on a "protocol version" that both of them support. Need not be fancy right now, a simple two-way version check is plenty good.
//Peter
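The "RPC by index" idea can be sketched as a table in which every entry carries a unique index; the app sends only the index, and the driver side looks it up and calls the function. Everything below is hypothetical (the struct layout, the board names, the two stub functions and driver_call_board_enable()); the real board_pciid_enables struct has different fields.

```c
#include <stddef.h>

typedef int (*board_enable_fn)(const char *name);

struct board_enable_entry {
	unsigned int index;       /* unique id that app and driver agree on */
	const char *name;
	board_enable_fn enable;   /* only ever called on the driver side */
};

/* Records which entry ran last, so the sketch has an observable effect. */
static const char *last_enabled;

static int enable_board_a(const char *name) { last_enabled = name; return 0; }
static int enable_board_b(const char *name) { last_enabled = name; return 0; }

/* The same table would be compiled into both the app and the driver. */
static const struct board_enable_entry board_table[] = {
	{ 1, "hypothetical board A", enable_board_a },
	{ 2, "hypothetical board B", enable_board_b },
};

/* "Driver side": run the entry with the given index; -1 if the index is unknown. */
static int driver_call_board_enable(unsigned int index)
{
	size_t i;

	for (i = 0; i < sizeof(board_table) / sizeof(board_table[0]); i++)
		if (board_table[i].index == index)
			return board_table[i].enable(board_table[i].name);
	return -1;
}
```

Using an explicit index field rather than the array position means entries can be reordered or removed without silently changing which function an old app selects.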
Peter Stuge wrote:
On Thu, Jul 12, 2007 at 05:13:38PM +0700, Darmawan Salihun wrote:
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
The base is in a PCI config register. Relevant code as follows:
base = pci_read_word(dev, 0x88) & 0xFF80;
val = inb(base + 0x4D);
val |= 0x80;
outb(val, base + 0x4D);
Ok! Put all of board_enable.c in the kernel driver and add some way for the app to call any function in the board_pciid_enables list, in the kernel driver. Sort of RPC but over a app/kernel split instead of over a network.
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
I'm thinking about creating a board_enable.h file to hold the board_pciid_enables list. That may be easier because both the app and the driver will refer to it, so whenever the list changes, both pick up the change immediately. Of course board_enable.c will exist in both the app and the driver as well. Is that acceptable? Or does it open up too much access to an entity that should be private (in this case the board_pciid_enables list)?
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
On the other hand, perhaps the app will need to do some PCI accesses to choose the right board_enable function and then it's better to do most of them in the app.
What do you think?
I think it's better to move it to the kernel driver and only provide a function call to it in the app. I think that's the way to go. However, that would mean a huge #ifdef at the beginning of the transition to a new unified architecture :-(.
Another general matter, make sure the app and driver can exchange version numbers. That way both the driver and the app could have compatibility code in order to be backwards compatible.
It's quite easy to add this capability in the driver initialization.
I would like the app to exit with an error message if it can't agree with the kernel driver on a "protocol version" that both of them support. Need not be fancy right now, a simple two-way version check is plenty good.
roger that ;-).
--Darmawan Salihun
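A simple two-way version check could look like the sketch below: each side advertises the range of protocol versions it supports, and the app exits with an error if the ranges do not overlap. The struct version_range type and negotiate_version() are hypothetical, and version numbers are assumed to start at 1 so that 0 can mean "no common version".

```c
/* Range of protocol versions one side supports, both ends inclusive. */
struct version_range {
	unsigned int min;
	unsigned int max;
};

/*
 * Returns the highest protocol version supported by both sides,
 * or 0 if the two ranges do not overlap.
 */
static unsigned int negotiate_version(struct version_range app,
				      struct version_range drv)
{
	unsigned int lo = app.min > drv.min ? app.min : drv.min;
	unsigned int hi = app.max < drv.max ? app.max : drv.max;

	return lo <= hi ? hi : 0;
}
```

In the app, a result of 0 would be the point to print an error message and exit before any hardware access is attempted.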
Peter Stuge wrote:
On Thu, Jul 12, 2007 at 05:13:38PM +0700, Darmawan Salihun wrote:
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
The base is in a PCI config register. Relevant code as follows:
base = pci_read_word(dev, 0x88) & 0xFF80;
val = inb(base + 0x4D);
val |= 0x80;
outb(val, base + 0x4D);
Ok! Put all of board_enable.c in the kernel driver and add some way for the app to call any function in the board_pciid_enables list, in the kernel driver. Sort of RPC but over a app/kernel split instead of over a network.
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
On the other hand, perhaps the app will need to do some PCI accesses to choose the right board_enable function and then it's better to do most of them in the app.
What do you think?
Another general matter, make sure the app and driver can exchange version numbers. That way both the driver and the app could have compatibility code in order to be backwards compatible.
I would like the app to exit with an error message if it can't agree with the kernel driver on a "protocol version" that both of them support. Need not be fancy right now, a simple two-way version check is plenty good.
Anyway, I'm currently working on a version that compiles on both Linux and Windows. The last one (the one I posted as a diff) wouldn't compile on Linux because of the commented-out Makefile and other commented-out parts.
Any ideas how to detect whether we are in MinGW or true Linux's bash in the Makefile?
TIA, Darmawan Salihun
* Darmawan Salihun darmawan.salihun@gmail.com [070713 11:56]:
Any ideas how to detect whether we are in MinGW or true Linux's bash in the Makefile?
OS = $(shell uname -s)
* Peter Stuge peter@stuge.se [070712 17:27]:
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
On the other hand, perhaps the app will need to do some PCI accesses to choose the right board_enable function and then it's better to do most of them in the app.
What do you think?
Application logic belongs into the application, not into the driver.
Another general matter, make sure the app and driver can exchange version numbers. That way both the driver and the app could have compatibility code in order to be backwards compatible.
The driver should be stupid enough so it does not need to be changed. There is no need for a driver in Linux, either, it's merely a makeshift.
I would like the app to exit with an error message if it can't agree with the kernel driver on a "protocol version" that both of them support. Need not be fancy right now, a simple two-way version check is plenty good.
Why do we need a protocol? I think this is overengineering the problem.
Stefan Reinauer wrote:
* Peter Stuge peter@stuge.se [070712 17:27]:
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
On the other hand, perhaps the app will need to do some PCI accesses to choose the right board_enable function and then it's better to do most of them in the app.
What do you think?
Application logic belongs into the application, not into the driver.
This makes me wonder if the current driver is good enough :-/
Another general matter, make sure the app and driver can exchange version numbers. That way both the driver and the app could have compatibility code in order to be backwards compatible.
The driver should be stupid enough so it does not need to be changed. There is no need for a driver in Linux, either, it's merely a makeshift.
The reality is that the current driver is already that stupid: it simply carries out every direct I/O function (inb, outb, etc.) it is asked to perform and maps the requested physical address range, just as we would expect from mmap'ing /dev/mem.
However, I'm thinking about and experimenting with moving the board_enable and chipset_enable support into the driver, so that all low-level accesses are kept in the driver.
Which implementation approach do you think is best?
Honestly, judging by the "design paradigm" that I can read from flashrom, the simplistic approach is the one that fits the original flashrom. But I leave the decision to all of us here.
-Darmawan Salihun
* Darmawan Salihun darmawan.salihun@gmail.com [070714 05:13]:
Application logic belongs into the application, not into the driver.
This makes me wonder if the current driver is good enough :-/
Take it as a suggestion. If there is no good way of doing it differently, then the way you are doing it in your driver is fine.
The reality is the current driver is as stupid as it is. Because, it responds to every direct I/O functions (inb, outb, etc.) in the correct manner and mapped the requested physical address range as we expect from mmap'ing /dev/mem.
However, I'm thinking and experimenting about moving the board_enable and chipset_enable support to the driver, so that every low-level accesses is kept to the driver.
Which implementation approach do you think is best ?
good question. There is certainly an advantage to both approaches.
Having a simple driver would allow updating the userspace program while the driver always stays the same.
On the other hand, letting userspace do arbitrary io accesses is not good.
Can anyone else except flashrom easily use the driver? By accident? On purpose, to break into the system?
Stefan Reinauer wrote:
* Darmawan Salihun darmawan.salihun@gmail.com [070714 05:13]:
Application logic belongs into the application, not into the driver.
This makes me wonder if the current driver is good enough :-/
Take it as a suggestion. If there is no good way of doing it different, than the way you are doing it in your driver is fine.
The reality is the current driver is as stupid as it is. Because, it responds to every direct I/O functions (inb, outb, etc.) in the correct manner and mapped the requested physical address range as we expect from mmap'ing /dev/mem.
However, I'm thinking and experimenting about moving the board_enable and chipset_enable support to the driver, so that every low-level accesses is kept to the driver.
Which implementation approach do you think is best ?
good question. There is certainly an advantage to both approaches.
Having a simple driver would allow to update the userspace program while the driver always stays the same.
Nonetheless, this would require driver recompilation every time we have new board support. It won't be a big problem once everything can be handled by MinGW alone (by means of Winpooch -- http://sourceforge.net/docman/?group_id=122629). Anyway, I haven't finished moving board_enable.c and chipset_enable.c to the driver.
On the other hand, letting userspace do arbitrary io accesses is not good.
Can anyone else except flashrom easily use the driver? By accident? On purpose, to break into the system?
I don't think it's easy to use the flashrom driver by accident because it's loaded and unloaded dynamically. It only runs while flashrom is executing. However, I think it's best to keep the direct I/O routines in the driver and have the application exchange data with the driver through the so-called IRP (I/O request packet). Currently, IRPs are in use but not efficiently, because sometimes one exchanges only one or two bytes for an I/O access. That is wasteful considering that we could move almost all of the I/O accesses into the driver and do them there.
Anyway, there's an issue regarding the PCI library that I became aware of just recently. Which of the following approaches is better?
1. Let Windows detect the PCI devices in the system and just parse the data that it presents.
or
2. Detect the PCI devices in the system ourselves by doing direct I/O (PCI detection) in the driver. This way, the code will detect the devices using libpci ported to a kernel-mode driver.
That's all for now. More to come ;-).
Regards, Darmawan Salihun
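One way to avoid a request per byte is to let the app queue several port operations and hand them to the driver in a single request. The sketch below simulates that in user space only: sim_ports stands in for the real port space, driver_run_batch() stands in for whatever handler the driver would run for one request, and the struct io_op layout is an assumption. It does not use or model the actual Windows IRP structures.

```c
#include <stddef.h>
#include <stdint.h>

enum io_op_kind { IO_OP_READ8, IO_OP_WRITE8, IO_OP_OR8 };

/* One queued port operation; value is an input for writes, an output for reads. */
struct io_op {
	enum io_op_kind kind;
	uint16_t port;
	uint8_t value;
};

/* Simulated port space so the sketch runs without hardware access. */
static uint8_t sim_ports[0x10000];

/*
 * Stand-in for the driver-side handler: executes n operations in one call
 * and returns how many were executed (fewer than n on an unknown kind).
 */
static size_t driver_run_batch(struct io_op *ops, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		switch (ops[i].kind) {
		case IO_OP_READ8:
			ops[i].value = sim_ports[ops[i].port];
			break;
		case IO_OP_WRITE8:
			sim_ports[ops[i].port] = ops[i].value;
			break;
		case IO_OP_OR8:   /* read-modify-write, result returned in value */
			sim_ports[ops[i].port] |= ops[i].value;
			ops[i].value = sim_ports[ops[i].port];
			break;
		default:
			return i;
		}
	}
	return i;
}
```

A read-modify-write such as the EPIA GPIO enable then costs one round trip instead of two.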
* Darmawan Salihun darmawan.salihun@gmail.com [070716 06:52]:
Anyway, there's an issue regarding the PCI library that I'm aware just recently. Which one is the following approach is better?
- Let windows detect the PCI devices in the system and just parse the
data that it presents.
or
- Detect the PCI devices in the system ourself by doing direct I/O (PCI
detection) in the driver. This way, the code will detect the device using libpci that's ported to kernel mode driver.
In general, letting the OS do the detection sounds more solid. So if there's no big advantage in doing it yourself, Windows should do it.
Peter Stuge wrote:
On Thu, Jul 12, 2007 at 05:13:38PM +0700, Darmawan Salihun wrote:
As for the EPIA board, well, where is that base specified? In a PCI config register or where?
The base is in a PCI config register. Relevant code as follows:
base = pci_read_word(dev, 0x88) & 0xFF80;
val = inb(base + 0x4D);
val |= 0x80;
outb(val, base + 0x4D);
Ok! Put all of board_enable.c in the kernel driver and add some way for the app to call any function in the board_pciid_enables list, in the kernel driver. Sort of RPC but over a app/kernel split instead of over a network.
Then compile board_enable.c also into the app, but not to call any of the functions (which would fail anyway) but only to get the same board_pciid_enables list. The struct may have to be extended to have a unique index for each entry so that the driver and app can agree on which function is which.
All of the device detection should be done where it is easiest to do it. Since the driver will need to do safe PCI accesses for board enables perhaps it makes sense to contain all PCI accesses in the driver.
After reading through the majority of the flashrom code, I found that the direct I/O port accesses reside in two files:
- board_enable.c
- chipset_enable.c
I'll try to find a way to "move" the routines in these files into the kernel driver and provide a way to call them indirectly through the driver interface (the so-called I/O Request Packet).
Everything other than the direct I/O port accesses is accomplished through memory-mapped I/O (MMIO) R/W transactions. Because the current code is already capable of mapping the MMIO ranges that we need into the app, we can consider this problem solved.
--Darmawan Salihun
* Darmawan Salihun darmawan.salihun@gmail.com [070712 09:19]:
Another note that I have difficulty in limiting the direct I/O access in the current driver because I don't know exactly which port to give access to and which one to block. Below is what I've found from the current flashrom code so far. I/O port usage:
0x2E (Winbond W836_INDEX port) 0x2F (Winbond W836_DATA port) 0x4E 0x4F
4e and 4f are superio ports, too
0xC6F 0xCD6 0xCD7
not sure. acpi something?
0xCFC - 0xCFF (PCI I/O port on x86)
a "base + 0x4D" in Via Epia motherboard
base is the interesting thing here.. It's determined dynamically.
0xE800 (what port is this? ) 0xE801 0xE802 0xE803 0xE804 0xE807
I think these are hard codes of some legacy bios setting and should be read dynamically, too..
0xE1 0xEB 0xFF
Wow, what are these?
I don't think you can easily limit this. I am working on support for a board where the final IO address is read dynamically from PCI register space. This might be required on more boards in the future.
I couldn't conclude the I/O port ranges to open from the port
list above because there is still an unknown (I think it's dynamically relocatable) I/O port such as the one used by the EPIA board. Any explanation on this issue?
I'm currently completing the MMIO mapping routine in the device driver, but I have a problem:
- The current driver doesn't prevent the application from mapping an MMIO physical address range even if it's already mapped by another routine in the same application. I plan to "lock" the mapped MMIO physical address range so that once it's mapped, it must be unmapped before another routine can map that range. This is to keep two routines from manipulating the same area at the same time.
My question is, do you think we really need the locking mechanism in the driver? I just think it is a potential problem if the MMIO access is not locked.
Another note: the current driver limits the application's ability to map MMIO to the low 1MB address range and the (4GB-20MB) to 4GB address range.
Any suggestion is welcomed.
Regards, Darmawan
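The two permitted windows work out to 0x00000000-0x000FFFFF (low 1MB) and 0xFEC00000-0xFFFFFFFF (4GB minus 20MB up to 4GB). A minimal sketch of the check follows; the macro names and mmio_range_allowed() are hypothetical, and a request is assumed to be valid only if it lies entirely inside one window.

```c
#include <stdint.h>

#define LOW_WINDOW_END     0x00100000ULL                  /* 1MB, exclusive */
#define HIGH_WINDOW_END    0x100000000ULL                 /* 4GB, exclusive */
#define HIGH_WINDOW_START  (HIGH_WINDOW_END - 0x1400000ULL) /* 4GB - 20MB */

/* Returns 1 if [start, start + len) lies entirely in one permitted window. */
static int mmio_range_allowed(uint64_t start, uint64_t len)
{
	uint64_t end;

	if (len == 0 || start > UINT64_MAX - len)
		return 0;                       /* empty or overflowing request */
	end = start + len;                      /* exclusive end */

	if (end <= LOW_WINDOW_END)
		return 1;
	if (start >= HIGH_WINDOW_START && end <= HIGH_WINDOW_END)
		return 1;
	return 0;
}
```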
* Darmawan Salihun darmawan.salihun@gmail.com [070724 06:11]:
I'm currently completing the MMIO mapping routine in the device driver, but I have a problem as follows:
The current driver doesn't limit the application to map certain MMIO phy address range even if it's already mapped by another routine in the same application. I actually plan to "lock" the mapped MMIO phy address range so that once it's mapped, it must be unmapped first before another routine can map that range. Actually, this is to avoid two routine from manipulating the same area at the same time.
My question is, do you think we really need the "lock-ing" mechanism in the driver? I just think it is a potential problem if the MMIO access is not "lock-ed".
Since the places where mmap happens are only a few I'd say it is not really "needed". It's a good idea though as it is a safety measure against application bugs.
Another note is: The current driver limits the application capability to map MMIO to the low_1MB address range and (4GB-20MB) to 4GB address range.
This sounds reasonable.
Stefan
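The two permitted windows mentioned above (the low 1 MB, and 4 GB - 20 MB up to 4 GB) can be expressed as a small range check. This is only a sketch under the assumption that a request must lie entirely inside one window; the function name mmio_range_allowed and the macro names are hypothetical, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOW_1MB_END    0x00100000ULL                    /* end of the low 1 MB window */
#define TOP_4GB        0x100000000ULL                   /* 4 GB */
#define HIGH_WIN_START (TOP_4GB - 20ULL * 1024 * 1024)  /* 4 GB - 20 MB */

/* True if [base, base+len) lies entirely inside one of the two windows. */
bool mmio_range_allowed(uint64_t base, uint64_t len)
{
    if (len == 0 || base > UINT64_MAX - len)
        return false;           /* empty or wrapping range */

    uint64_t end = base + len;  /* exclusive end of the request */

    if (end <= LOW_1MB_END)
        return true;            /* fits in the low 1 MB */
    return base >= HIGH_WIN_START && end <= TOP_4GB;
}
```

For example, a 64 KB mapping at 0xF0000 passes, while a request that starts below 0xFEC00000 and reaches into the high window is rejected.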
* Stefan Reinauer stepan@coresystems.de [070725 11:12]:
Since the places where mmap happens are only a few I'd say it is not really "needed". It's a good idea though as it is a safety measure against application bugs.
Just stumbled upon "WinIO" which seems to do something similar
http://www.internals.com/utilities_main.htm
Stefan Reinauer wrote:
- Stefan Reinauer stepan@coresystems.de [070725 11:12]:
Since the places where mmap happens are only a few I'd say it is not really "needed". It's a good idea though as it is a safety measure against application bugs.
Just stumbled upon "WinIO" which seems to do something similar
I'll have a look. Nice catch ;-)
Cheers, Darmawan
Stefan Reinauer wrote:
- Stefan Reinauer stepan@coresystems.de [070725 11:12]:
Since the places where mmap happens are only a few I'd say it is not really "needed". It's a good idea though as it is a safety measure against application bugs.
Just stumbled upon "WinIO" which seems to do something similar
As I couldn't find any better replacement for the PCI direct I/O functions at the moment (which are needed in chipset_enable and board_enable), I will try to implement them using a rather old kernel-mode API, i.e. HalGetBusData and HalSetBusData. I hope the kernel "emulates" these functions so that they work as in NT4 and Windows 2000. These functions may not be portable to Windows Vista. However, this is more stable than directly probing the bus with assembler routines. I'm still looking for a better solution because Windows is very strict about accessing the PCI/PCIe/HyperTransport bus: only the driver for the corresponding PCI device can "interrogate" the bus. I don't know yet if we can "fool" this mechanism with a "fake PCI bus filter driver", which right now is __not documented__ at best. As for the other ports, it will be direct I/O in the driver because there is no specific support in Windows for them.
Darmawan Salihun wrote:
As I couldn't find any better replacement to the PCI direct I/O function at the moment (which is needed in chipset_enable and board_enable), I will try to implement them using a quite old kernel mode API, i.e. HalGetBusData and HalSetBusData. I hope the kernel "emulates" these functions so that it works like in NT4 and Windows 2000.
A more "portable" way:
Make all CPUs spin inside a DPC function: KeQueryActiveProcessors() is used to get all processors. KeSetTargetProcessorDpc() is used to tie the DPC function to a particular SMP processor. When all CPUs are spinning inside the function, one of them does the I/O (CF8/CFC) to the PCI configuration space. After the I/O completes, let all DPC functions finish.
Roman
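For reference, the "CF8/CFC" access mentioned above is PCI configuration mechanism #1: a 32-bit address is written to port 0xCF8 and the data is then read or written through port 0xCFC. The sketch below only shows how that address word is built; it performs no port I/O, and the function name pci_cf8_address is hypothetical.

```c
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0x0CF8  /* address port */
#define PCI_CONFIG_DATA    0x0CFC  /* data port */

/*
 * Build the value written to PCI_CONFIG_ADDRESS:
 *   bit 31     enable bit
 *   bits 23-16 bus number
 *   bits 15-11 device number
 *   bits 10-8  function number
 *   bits 7-2   dword-aligned register offset
 */
uint32_t pci_cf8_address(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t reg)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1F) << 11)
         | ((uint32_t)(fn & 0x07) << 8)
         | (uint32_t)(reg & 0xFC);
}
```

Because the address write and the data access are two separate port operations, another CPU touching 0xCF8 in between would redirect the access, which is why the thread discusses parking the other CPUs first.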
On 7/30/07, Roman Kononov kononov@dls.net wrote:
Darmawan Salihun wrote:
As I couldn't find any better replacement to the PCI direct I/O function at the moment (which is needed in chipset_enable and board_enable), I will try to implement them using a quite old kernel mode API, i.e. HalGetBusData and HalSetBusData. I hope the kernel "emulates" these functions so that it works like in NT4 and Windows 2000.
A more "portable" way:
Make all CPUs spinning inside a DPC function: KeQueryActiveProcessors() is used to get all processors. KeSetTargetProcessorDpc() is used to tie the DPC function and particular SMP processors. When all CPUs are spinning inside the function, one of them does IO (CF8/CFC) to the PCI configuration space. After the IO completes, let all DPC functions finish.
Roman
Nice idea ;-).
I'll try to combine that with the Windows HAL functions for the PCI direct I/O, because doing it through CF8/CFC is very risky. I'll let the OS do it ;-).
Roman Kononov wrote:
Darmawan Salihun wrote:
As I couldn't find any better replacement to the PCI direct I/O function at the moment (which is needed in chipset_enable and board_enable), I will try to implement them using a quite old kernel mode API, i.e. HalGetBusData and HalSetBusData. I hope the kernel "emulates" these functions so that it works like in NT4 and Windows 2000.
A more "portable" way:
Make all CPUs spinning inside a DPC function: KeQueryActiveProcessors() is used to get all processors. KeSetTargetProcessorDpc() is used to tie the DPC function and particular SMP processors. When all CPUs are spinning inside the function, one of them does IO (CF8/CFC) to the PCI configuration space. After the IO completes, let all DPC functions finish.
Roman
After discussing the access method at length with more experienced Windows driver developers, I think that if HalGetBusData does not work at all, then direct I/O port access with your "multiprocessor-aware kernel thread protection" is the "safest" method.
--Darmawan Salihun
On Fri, Aug 10, 2007 at 03:08:26PM +0700, Darmawan Salihun wrote:
Roman Kononov wrote:
Make all CPUs spinning inside a DPC function:
After discussing the access method at length with more experienced Windows driver developers, I think that if HalGetBusData does not work at all, then direct I/O port access with your "multiprocessor-aware kernel thread protection" is the "safest" method.
But PCI config accesses are not atomic operations. Is there a guarantee that the other CPUs are not in the middle of doing a PCI access already?
And even if they are actually doing something else, perhaps they (erroneously? but we don't want to break them anyway) rely on 0xcfc being what they set it to in the last PCI config access?
I really prefer using whatever API Windows offers.
//Peter
Peter Stuge wrote:
On Fri, Aug 10, 2007 at 03:08:26PM +0700, Darmawan Salihun wrote:
Roman Kononov wrote:
Make all CPUs spinning inside a DPC function:
After discussing the access method at length with more experienced Windows driver developers, I think that if HalGetBusData does not work at all, then direct I/O port access with your "multiprocessor-aware kernel thread protection" is the "safest" method.
But PCI config accesses are not atomic operations. Is there a guarantee that the other CPUs are not in the middle of doing a PCI access already?
And even if they are actually doing something else, perhaps they (erroneously? but we don't want to break them anyway) rely on 0xcfc being what they set it to in the last PCI config access?
I really prefer using whatever API Windows offers.
That's why I'm trying this Hal***BusDataByOffset right now. Some developers say that these functions work on some platforms and not on others. That's why I want to test it first hand ;-). I'm completing/reviewing the code right now before testing it on my testbed machines.
Stay tuned :-)
Darmawan Salihun
On Fri, Aug 10, 2007 at 05:21:11PM +0700, Darmawan Salihun wrote:
I really prefer using whatever API Windows offers.
That's why I'm trying this Hal***BusDataByOffset right now.
Aye.
Some developers say that they work in some platform and not in others.
It's marked as deprecated in MSDN since Windows 2000 and it is suggested to use the PnP manager instead. These code snippets look useful:
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
I also found a code snippet from PLX (who make PCI interface chips) saying that sending IRPs crashes Windows 98. I would easily sacrifice Win98/ME in favor of Vista or even XP if necessary.
That's why I want to test it first hand ;-). I'm completing/reviewing the code right now before testing it in my testbed machines.
Stay tuned :-)
Great! I'm looking forward to the results! :)
//Peter
Peter Stuge wrote:
On Fri, Aug 10, 2007 at 05:21:11PM +0700, Darmawan Salihun wrote:
I really prefer using whatever API Windows offers.
That's why I'm trying this Hal***BusDataByOffset right now.
Aye.
OK. The good news is that the HalGetBusDataByOffset function _is_ working on my test system. But I still have some bugs to "iron out" because some offsets seem to be read incorrectly. Perhaps I missed something in the PCI configuration data structure returned by this function. Anyway, HalSetBusDataByOffset is still obscure. It may work, but the way I invoke the function is not right :-(.
Regards, Darmawan
Peter Stuge wrote:
But PCI config accesses are not atomic operations. Is there a guarantee that the other CPUs are not in the middle of doing a PCI access already?
And even if they are actually doing something else, perhaps they (erroneously? but we don't want to break them anyway) rely on 0xcfc being what they set it to in the last PCI config access?
By making all the CPUs spin inside your DPC you avoid these problems. The Windoze kernel protects itself and does not execute a scheduled DPC when the CPU is in the middle of a PCI access or anything similar. For sure, when a CPU makes a PCI access its "IRQL" is raised to "HIGH_LEVEL", which means that a dedicated spin lock is acquired and that CPU's interrupts are disabled.
I did not take the above statement about IRQL from an official document, I made it based on my experience and common sense.
Regards,
Roman
Roman Kononov wrote:
Peter Stuge wrote:
But PCI config accesses are not atomic operations. Is there a guarantee that the other CPUs are not in the middle of doing a PCI access already?
And even if they are actually doing something else, perhaps they (erroneously? but we don't want to break them anyway) rely on 0xcfc being what they set it to in the last PCI config access?
By making all the CPUs spinning inside your DPC you avoid these problems. The Windoze kernel protects itself and does not execute scheduled DPC when the CPU is in the middle of a PCI access or anything similar. For sure, when a CPU makes a PCI access its "IRQL" is raised to "HIGH_LEVEL", which means that a dedicated spin lock is acquired and that CPU's interrupts are disabled.
I did not take the above statement about IRQL from an official document, I made it based on my experience and common sense.
Regards,
Roman
Apparently, this is the only solution right now because the Hal***BusDataByOffset() function is _not_ working as expected.
My latest test results on 2 different PCs with Windows XP SP2 show that HalGetBusDataByOffset() is not a stable function: it works on one of the test platforms and crashes on the other. HalSetBusDataByOffset(), meanwhile, is _not_ working at all. I think the symbol is defined in the kernel but may not be implemented in Win XP SP2.
Moreover, direct port I/O was working flawlessly with the older flashrom that I ported to Windows back then. I think the "DPC" trick will guarantee the atomic operation and will give us the level of confidence to do the direct port I/O.
I'll be reporting as soon as the DPC version of the direct port I/O driver routine has been tested.
Regards, Darmawan Salihun
On Tue, Aug 14, 2007 at 03:07:49PM +0700, Darmawan Salihun wrote:
Moreover, direct port I/O was working flawlessly with the older flashrom that I port to windows back then. I think the "DPC" trick will guarantee the atomic operation and will give us the level of confidence to do the direct port I/O.
I am not at all confident about that.
Roman did make a good point about IRQL but that does not eliminate the problem, we will still be changing hardware state underneath the OS and that is ALWAYS a bad idea.
Please look into:
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
for Windows 2000 and newer.
I do think we should try to support NT 4 as well, and there the Hal API is good, so please don't throw the code away yet.
//Peter
Peter Stuge wrote:
On Tue, Aug 14, 2007 at 03:07:49PM +0700, Darmawan Salihun wrote:
Moreover, direct port I/O was working flawlessly with the older flashrom that I port to windows back then. I think the "DPC" trick will guarantee the atomic operation and will give us the level of confidence to do the direct port I/O.
I am not at all confident about that.
Roman did make a good point about IRQL but that does not eliminate the problem, we will still be changing hardware state underneath the OS and that is ALWAYS a bad idea.
Please look into:
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
for Windows 2000 and newer.
I do think we should try to support NT 4 as well, and there the Hal API is good, so please don't throw the code away yet.
No, I'm not. I put the code into my own local repository for future experimentation. Anyway, I've read both of the articles you mentioned. I will look into the issue further tonight. I may have missed something ;-).
The problem is HalSetBusDataByOffset is not working as documented in MSDN :-(.
--Darmawan
On Tue, Aug 14, 2007 at 08:49:22PM +0700, Darmawan Salihun wrote:
Please look into:
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
for Windows 2000 and newer.
I do think we should try to support NT 4 as well, and there the Hal API is good, so please don't throw the code away yet.
No, I'm not. I put the code into my own local repository for future experimentation.
Excellent! :) Though not a high priority I think the typical NT 4 server should be treated to some LB fun as well. :)
Anyway, I've read both of the articles you mentioned. I will look into the issue further tonight. I may have missed something ;-).
The problem is HalSetBusDataByOffset is not working as documented in MSDN :-(.
Both those pages show pretty complete examples of how the PnP Manager is used to do PCI config reads and writes, exactly because the HAL is deprecated. Please check out the code there, it looks like it could solve your problem.
//Peter
On 08/14/2007 07:56 AM, Peter Stuge wrote:
Roman did make a good point about IRQL but that does not eliminate the problem, we will still be changing hardware state underneath the OS and that is ALWAYS a bad idea.
The CF8/CFC sequence can preserve CF8 port value. What other hardware state would be changed?
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
BTW, here is the "official" example: http://support.microsoft.com/kb/253232
Unfortunately, AFAIK, this approach does not work for cases like yours. It requires the "DeviceObject", which MUST be associated with a PARTICULAR PCI function.
Q: In the above links, among HtsReadWriteConfig() and WritePCIConfigSpace() argument lists, which arguments are bus number and device number? A: They are inside PDEVICE_OBJECT, which structure is "opaque".
Regarding how long the DPC method takes: a scheduled DPC is launched as soon as the CPU's IRQL drops below DISPATCH_LEVEL. The CPU can be at DISPATCH_LEVEL (and higher) only when running kernel code. This can last many time slices. This means that the DPC method might be quite expensive.
Regards,
Roman
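The remark above that the CF8/CFC sequence can preserve the CF8 port value can be illustrated with a save/restore around the access. The sketch below runs against simulated ports (sim_outl and sim_inl are hypothetical stand-ins for real port I/O, and the simulation models only one function's config space); it shows the ordering only and says nothing about atomicity with respect to other CPUs.

```c
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0x0CF8
#define PCI_CONFIG_DATA    0x0CFC

/* Simulated hardware, standing in for real port I/O (hypothetical). */
static uint32_t sim_cf8;         /* last value written to 0xCF8 */
static uint32_t sim_config[64];  /* one function's 256-byte config space */

static void sim_outl(uint16_t port, uint32_t value)
{
    if (port == PCI_CONFIG_ADDRESS)
        sim_cf8 = value;
    else if (port == PCI_CONFIG_DATA && (sim_cf8 & 0x80000000u))
        sim_config[(sim_cf8 & 0xFC) >> 2] = value;
}

static uint32_t sim_inl(uint16_t port)
{
    if (port == PCI_CONFIG_ADDRESS)
        return sim_cf8;
    if (port == PCI_CONFIG_DATA && (sim_cf8 & 0x80000000u))
        return sim_config[(sim_cf8 & 0xFC) >> 2];
    return 0xFFFFFFFFu;  /* nothing decoded */
}

/* Read one config dword and leave 0xCF8 exactly as it was found. */
uint32_t pci_read32_preserving_cf8(uint32_t cf8_address)
{
    uint32_t saved = sim_inl(PCI_CONFIG_ADDRESS);  /* remember the old address */
    sim_outl(PCI_CONFIG_ADDRESS, cf8_address);     /* select our register */
    uint32_t value = sim_inl(PCI_CONFIG_DATA);     /* perform the access */
    sim_outl(PCI_CONFIG_ADDRESS, saved);           /* restore the old address */
    return value;
}
```

Code that relies on 0xCF8 still holding its last value would then see no change, provided nothing else runs between the save and the restore.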
On Tue, Aug 14, 2007 at 12:14:22PM -0500, Roman Kononov wrote:
On 08/14/2007 07:56 AM, Peter Stuge wrote:
Roman did make a good point about IRQL but that does not eliminate the problem, we will still be changing hardware state underneath the OS and that is ALWAYS a bad idea.
The CF8/CFC sequence can preserve CF8 port value. What other hardware state would be changed?
The bits that are changed in the device config space.
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
BTW, here is the "official" example: http://support.microsoft.com/kb/253232
Unfortunately, AFAIK, this approach does not work for cases like yours. It requires the "DeviceObject", which MUST be associated with a PARTICULAR PCI function.
Right, I assumed that it would be possible to get that through some sequence of system calls that look at the PCI bus.
Q: In the above links, among HtsReadWriteConfig() and WritePCIConfigSpace() argument lists, which arguments are bus number and device number? A: They are inside PDEVICE_OBJECT, which structure is "opaque".
Yup. Are you saying it is simply not possible to access PCI config space of a device from a device driver unless the driver is in fact part of the driver stack associated with that device?
I suppose Microsoft considers that a feature?
It would certainly explain why there aren't already several applications for reprogramming flash on Windows.
But, on the other hand, there are a few applications that _do_. So how do they do it?
This means that the DPC method might be quite expensive.
If this is the only way to access PCI config space for unrelated devices then I guess it's the best we can do - unless we can make ourselves be part of the driver stack for the southbridge?
Or maybe there is in fact a userspace API for PCI config access?
I am by no means a Windows API or WDM expert; if I were, I'd have already written the code. :p That world is a pretty strange place.
//Peter
On 08/14/2007 02:47 PM, Peter Stuge wrote:
The CF8/CFC sequence can preserve CF8 port value. What other hardware state would be changed?
The bits that are changed in the device config space.
I cannot imagine why the OS would care about a couple of configuration bits in the SB.
Yup. Are you saying it is simply not possible to access PCI config space of a device from a device driver unless the driver is in fact part of the driver stack associated with that device?
Yes.
I suppose Microsoft considers that a feature?
Yes. They don't want one driver to mess with other devices.
But, on the other hand, there are a few applications that _do_. So how do they do it?
Using undocumented features is not uncommon in Windows.
we can make ourselves be part of the driver stack for the southbridge?
It sounds too painful.
Or maybe there is in fact a userspace API for PCI config access?
I doubt it. It would be a huge security hole.
I am by no means a Windows API or WDM expert; if I were, I'd have already written the code. :p That world is a pretty strange place.
Lucky you... It is not a pleasure to write Windows code.
Regards,
Roman
Roman Kononov wrote:
On 08/14/2007 02:47 PM, Peter Stuge wrote:
The CF8/CFC sequence can preserve CF8 port value. What other hardware state would be changed?
The bits that are changed in the device config space.
I cannot imagine why the OS would care about a couple of configuration bits in the SB.
Yup. Are you saying it is simply not possible to access PCI config space of a device from a device driver unless the driver is in fact part of the driver stack associated with that device?
Yes.
Yeah, this has been a known issue from the beginning ;-). Technically, we don't have the PDO of the southbridge chip, therefore we cannot access it in any way documented by Micro$oft.
Regards, Darmawan Salihun
On Wed, Aug 15, 2007 at 03:21:29PM +0700, Darmawan Salihun wrote:
Yup. Are you saying it is simply not possible to access PCI config space of a device from a device driver unless the driver is in fact part of the driver stack associated with that device?
Yes.
Yeah, this has been a known issue from the beginning ;-). Technically, we don't have the PDO of the southbridge chip, therefore we cannot access it in any way documented by Micro$oft.
What is required to get into the southbridge driver stack?
From the little reading I've done the last day or two it seems that a driver would not have to do very much once in the stack. Is it really not feasible? It would definitely be the cleanest way.
//Peter
Peter Stuge wrote:
On Wed, Aug 15, 2007 at 03:21:29PM +0700, Darmawan Salihun wrote:
Yup. Are you saying it is simply not possible to access PCI config space of a device from a device driver unless the driver is in fact part of the driver stack associated with that device?
Yes.
Yeah, this has been a known issue from the beginning ;-). Technically, we don't have the PDO of the southbridge chip, therefore we cannot access it in any way documented by Micro$oft.
What is required to get into the southbridge driver stack?
I'm not too sure. I think a "PCI bus filter driver", but that would be overkill at the moment. The other problem is that it's not well documented (maybe not documented at all :-/). I could hardly find any information about such a thing. Even experts in Windows driver development say so.
From the little reading I've done the last day or two it seems that a driver would not have to do very much once in the stack. Is it really not feasible? It would definitely be the cleanest way.
Yes, this is the cleanest way. However, we have to "attach" our driver entry point functions, including the "AddDevice" function, at the time the southbridge driver is first installed. It would be more like making a chipset driver for Intel, NVidia, AMD or Via. The PnP manager tries to find the driver for a device the first time the device is found after Windows installation, and it seems that once it has the driver we wouldn't be able to add our own "hook" into the driver stack, unless we make the thing called a "PCI bus filter driver" or another "bus filter driver" as needed. But the problem goes back to the beginning: it's not even documented. I think Microsoft has a reason for not documenting it.
Regards, Darmawan
On Wed, Aug 15, 2007 at 03:37:28PM +0700, Darmawan Salihun wrote:
What is required to get into the southbridge driver stack?
I'm not too sure. I think a "PCI bus filter driver", but that would be an overkill at the moment.
Why?
The other problem is it's not well documented (maybe not documented at all :-/). I hardly found information about such a thing. Even, experts in Windows driver development says so.
I found some:
MSDN > Win32 and COM Development > Windows Driver Kit > Kernel-Mode Driver Architecture > Design Guide (check out Reference too) http://msdn2.microsoft.com/en-us/library/ms796245.aspx
really not feasible? It would definitely be the cleanest way.
Yes, this is the cleanest way. However, we have to "attach" our driver entry point functions, including "AddDevice" function upon the first time the southbridge driver is installed.
Well we don't have to be the _only_ driver. I don't expect that to work well.
More like we make a driver for Intel, NVidia, AMD or Via chipset driver.
I think separate drivers is OK but only one is of course ideal.
The PnP manager will try to find the driver for the corresponding device the first time it's found after Windows installation and it seems once it has the driver we wouldn't be able to add our own "hook" into the driver stack. Unless we make the thing called "PCI bus filter driver" or other "Bus filter driver" as needed.
Right, this is what I thought seemed right.
But, the problem goes back to the beginning, it's not even documented.
There is talk about it. Google has a few bits of info too, as usual.
One link is Doron Holan's blog: (Technical lead for WDF) http://blogs.msdn.com/doronh/archive/2006/09/18/761325.aspx (here talk about class filter drivers)
And there is an example, look for "toaster", in the WDK/DDK.
It was introduced in this old newsletter it seems: http://www.microsoft.com/whdc/resources/news/newsletters/MHN_090803.htm
The WDK doesn't seem to be readily available without registration and possibly payment but there is a DDK for 2k<=SP4 XP<=SP1 and 2003<=SP1 immediately downloadable from:
http://www.microsoft.com/whdc/DevTools/ddk/default.mspx
..which probably works just fine also for later versions.
There's documentation to go with the DDK too:
http://www.microsoft.com/whdc/DevTools/WDK/WDKdocs.mspx
There are also some interesting docs on the toaster:
http://www.microsoft.com/whdc/driver/foundation/toastersamp.mspx
http://download.microsoft.com/download/3/5/a/35a609bf-872a-4eb8-a0d6-a3e026f... Google HTML: http://209.85.135.104/search?q=cache:wRLrFGamtKYJ:download.microsoft.com/dow...
http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295... Google HTML: http://209.85.135.104/search?q=cache:H-hy11oPjsIJ:download.microsoft.com/dow...
The latter one is slides about driver distribution and installation. They have tools for that too:
http://www.microsoft.com/whdc/driver/install/difxtools.mspx http://www.microsoft.com/whdc/driver/install/DIFxtls.mspx http://search.microsoft.com/results.aspx?mkt=en-US&setlang=en-US&q=d...
Another good resource is:
http://www.codeproject.com/system/driverdev4asp.asp
here part 4 in a driver writing tutorial series that goes into healthy depth about technicalities, and provides code.
Have a look.
//Peter
Peter Stuge wrote:
On Wed, Aug 15, 2007 at 03:37:28PM +0700, Darmawan Salihun wrote:
What is required to get into the southbridge driver stack?
I'm not too sure. I think a "PCI bus filter driver", but that would be an overkill at the moment.
Why?
First off, I need to make at least a "safe" working code this weekend, for GSoC compliance ;-). I have a "working" code at the moment with direct I/O but without much of a "protection mechanism" in the kernel. It's an obvious danger, so I would like to add the DPC mechanism for the time being. Of course, venturing through the bus filter driver has been planned for sometime now. So, don't worry ;-).
The other problem is it's not well documented (maybe not documented at all :-/). I hardly found information about such a thing. Even, experts in Windows driver development says so.
I found some:
MSDN > Win32 and COM Development > Windows Driver Kit > Kernel-Mode Driver Architecture > Design Guide (check out Reference too) http://msdn2.microsoft.com/en-us/library/ms796245.aspx
OK. I'll read that. I might have come across that ;-).
really not feasible? It would definitely be the cleanest way.
Yes, this is the cleanest way. However, we have to "attach" our driver entry point functions, including "AddDevice" function upon the first time the southbridge driver is installed.
Well we don't have to be the _only_ driver. I don't expect that to work well.
More like we make a driver for Intel, NVidia, AMD or Via chipset driver.
I think separate drivers is OK but only one is of course ideal.
The PnP manager will try to find the driver for the corresponding device the first time it's found after Windows installation and it seems once it has the driver we wouldn't be able to add our own "hook" into the driver stack. Unless we make the thing called "PCI bus filter driver" or other "Bus filter driver" as needed.
Right, this is what I thought seemed right.
But, the problem goes back to the beginning, it's not even documented.
There is talk about it. Google has a few bits of info too, as usual.
One link is Doron Holan's blog: (Technical lead for WDF) http://blogs.msdn.com/doronh/archive/2006/09/18/761325.aspx (here talk about class filter drivers)
OK. This one is on the list. I've been talking with Doron for a while on the OSR mailing list ;-).
And there is an example, look for "toaster", in the WDK/DDK.
It was introduced in this old newsletter it seems: http://www.microsoft.com/whdc/resources/news/newsletters/MHN_090803.htm
The WDK doesn't seem to be readily available without registration and possibly payment but there is a DDK for 2k<=SP4 XP<=SP1 and 2003<=SP1 immediately downloadable from:
http://www.microsoft.com/whdc/DevTools/ddk/default.mspx
..which probably works just fine also for later versions.
Have tested it for about a month now and seems to be just fine building Win XP device driver ;-).
There's documentation to go with the DDK too:
Have got it too.
There are also some interesting docs on the toaster:
http://www.microsoft.com/whdc/driver/foundation/toastersamp.mspx
http://download.microsoft.com/download/3/5/a/35a609bf-872a-4eb8-a0d6-a3e026f... Google HTML: http://209.85.135.104/search?q=cache:wRLrFGamtKYJ:download.microsoft.com/dow...
http://download.microsoft.com/download/1/8/f/18f8cee2-0b64-41f2-893d-a6f2295... Google HTML: http://209.85.135.104/search?q=cache:H-hy11oPjsIJ:download.microsoft.com/dow...
The latter one is slides about driver distribution and installation. They have tools for that too:
http://www.microsoft.com/whdc/driver/install/difxtools.mspx http://www.microsoft.com/whdc/driver/install/DIFxtls.mspx http://search.microsoft.com/results.aspx?mkt=en-US&setlang=en-US&q=d...
Thx for the links.
Another good resource is:
http://www.codeproject.com/system/driverdev4asp.asp
here part 4 in a driver writing tutorial series that goes into healthy depth about technicalities, and provides code.
Have a look.
Yeah, of course. Thanks for that. I've got two books in Windows driver development as well (Walter Oney's book and the other one about Win2K WDM driver development)
Regards, --Darmawan
Roman Kononov wrote:
On 08/14/2007 07:56 AM, Peter Stuge wrote:
Roman did make a good point about IRQL but that does not eliminate the problem, we will still be changing hardware state underneath the OS and that is ALWAYS a bad idea.
The CF8/CFC sequence can preserve CF8 port value. What other hardware state would be changed?
http://www.hollistech.com/Resources/Misc%20articles/getbusdata.htm http://www.freelists.org/archives/wdmaudiodev/03-2004/msg00010.html
BTW, here is the "official" example: http://support.microsoft.com/kb/253232
Unfortunately, AFAIK, this approach does not work for cases like yours. It requires the "DeviceObject", which MUST be associated with a PARTICULAR PCI function.
Q: In the above links, among HtsReadWriteConfig() and WritePCIConfigSpace() argument lists, which arguments are bus number and device number? A: They are inside PDEVICE_OBJECT, which structure is "opaque".
Regarding how long the DPC method takes: a scheduled DPC is launched as soon as the CPU's IRQL drops below DISPATCH_LEVEL. The CPU can be at DISPATCH_LEVEL (and higher) only when running kernel code. This can last many time slices. This means that the DPC method might be quite expensive.
According to the official documentation, a DPC runs at IRQL DISPATCH_LEVEL. Nonetheless, I think there is still a problem because only one DPC object of one "type" can exist in the system at any one time. This means we need to provide a different DPC object "type" for each processor in a multiprocessor environment to ensure "atomic" execution of the I/O code, which seems to be overkill and would load the system too much. I found this when trying to code the "DPC approach" in my latest device driver. Perhaps using a kernel-mode spinlock to ensure atomic execution of the I/O operation is a better approach. It seems that the DPC approach is not the right solution for this type of problem.
I'm still working on this problem right now.
Regards, Darmawan Salihun
On 08/20/2007 10:05 AM, Darmawan Salihun wrote:
According to the official documentation, a DPC runs at IRQL DISPATCH_LEVEL. Nonetheless, I think there is still a problem because only one DPC object of one "type" can exist in the system at any one time.
What do you mean by "type"?
If you have N CPUs:
Initialization:
- KeInitializeDpc() initializes N DPC objects; each object has the same "DeferredRoutine".
- KeSetTargetProcessorDpc() ties each DPC object to a CPU.
PCI Configuration Sequence:
- IRQL is raised to DISPATCH_LEVEL.
- KeGetCurrentProcessorNumber() tells which CPU it is.
- KeInsertQueueDpc() schedules the N-1 DPCs to run. All but this CPU are scheduled.
- Spin waiting for the DeferredRoutines to enter phase A.
- IRQL is raised to HIGH_LEVEL.
- The DeferredRoutines are flagged to enter phase B.
- Spin waiting for the DeferredRoutines to enter phase B.
- Do the cf8/cfc business.
- The DeferredRoutines are flagged to finish.
- Restore the IRQL.
DeferredRoutine:
- Indicate that the phase is phase A.
- Spin waiting for being flagged to enter phase B.
- Raise IRQL to HIGH_LEVEL.
- Indicate that the phase is phase B.
- Spin waiting for being flagged to finish.
- Restore the IRQL.
This means we need to provide a different DPC object "type" for each processor in a multiprocessor environment to ensure "atomic" execution of the I/O code, which seems to be overkill and would load the system too much.
What is "too much"?
I found this when trying to code the "DPC approach" in my latest device driver. Perhaps using a kernel-mode spinlock is a better approach to ensure atomic execution of the I/O operation.
A spin lock does not help. A CPU that has acquired a spin lock does not prevent another CPU from messing with the cf8/cfc registers. You need to lock all other CPUs.
It seems the DPC approach is not the right solution for this type of problem.
You are the man, you make the decision.
Regards,
Roman
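The two-phase rendezvous described in this message can be modelled outside the kernel. What follows is only a user-mode sketch in C, assuming POSIX threads stand in for the per-CPU DPCs and C11 atomics stand in for the shared flags. The names run_rendezvous and deferred_routine are made up for illustration, and the IRQL changes a real driver would make are reduced to comments.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Shared rendezvous state (user-mode stand-in for driver globals). */
static atomic_int in_phase_a;   /* workers that have reported phase A */
static atomic_int in_phase_b;   /* workers that have reported phase B */
static atomic_int go_phase_b;   /* set by the initiator: enter phase B */
static atomic_int finish;       /* set by the initiator: leave the routine */

/* Stand-in for the DeferredRoutine running on each of the other CPUs. */
static void *deferred_routine(void *unused)
{
    (void)unused;
    atomic_fetch_add(&in_phase_a, 1);          /* indicate phase A */
    while (!atomic_load(&go_phase_b))          /* wait for the phase B flag */
        sched_yield();                         /* kernel code would just spin */
    /* a real driver would raise IRQL to HIGH_LEVEL here */
    atomic_fetch_add(&in_phase_b, 1);          /* indicate phase B */
    while (!atomic_load(&finish))              /* wait for the finish flag */
        sched_yield();
    /* a real driver would restore the IRQL here */
    return NULL;
}

/*
 * Initiator side. Returns the number of workers that were parked in
 * phase B while the critical I/O would have run, or -1 on error.
 */
int run_rendezvous(int n_workers)
{
    pthread_t tid[MAX_WORKERS];
    int started, parked = -1, i;

    if (n_workers < 0 || n_workers > MAX_WORKERS)
        return -1;

    atomic_store(&in_phase_a, 0);
    atomic_store(&in_phase_b, 0);
    atomic_store(&go_phase_b, 0);
    atomic_store(&finish, 0);

    /* "Schedule the N-1 DPCs": one thread per other CPU. */
    for (started = 0; started < n_workers; started++)
        if (pthread_create(&tid[started], NULL, deferred_routine, NULL) != 0)
            break;

    if (started == n_workers) {
        while (atomic_load(&in_phase_a) < n_workers)   /* all in phase A? */
            sched_yield();
        atomic_store(&go_phase_b, 1);                  /* flag phase B */
        while (atomic_load(&in_phase_b) < n_workers)   /* all in phase B? */
            sched_yield();
        /* Critical section: the port I/O would go here. */
        parked = atomic_load(&in_phase_b);
    }

    /* Release every worker, including after a partial start. */
    atomic_store(&go_phase_b, 1);
    atomic_store(&finish, 1);
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return parked;
}
```

Calling run_rendezvous(3) models a four-CPU machine: the caller is the CPU doing the I/O and the three threads are the CPUs being held off. On some toolchains the program has to be linked with -pthread.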
Roman Kononov wrote:
On 08/20/2007 10:05 AM, Darmawan Salihun wrote:
According to the official documentation, a DPC runs at IRQL DISPATCH_LEVEL. Nonetheless, I think there is still a problem, because only one DPC object of a given "type" can exist in the system at any one time.
What do you mean by "type"?
DPC objects that have the same "Deferred Routine".
If you have N CPUs:
Initialization: KeInitializeDpc() initializes N DPC objects, each object has the same "DeferredRoutine". KeSetTargetProcessorDpc() ties each DPC object with a CPU.
I might have made a wrong assumption previously. I thought that a DPC object that has the same "Deferred Routine" won't be queued twice in the system's DPC queue. Nonetheless, this might not be the case on a multiprocessor machine, because every processor has its own DPC queue, which implies that DPC objects with the same "Deferred Routine" can be queued in different processors' DPC queues without any of them being rejected. On a single-processor machine, a DPC object with the same "Deferred Routine" cannot be queued twice, because the second request to queue the DPC will be rejected by the kernel.
PCI Configuration Sequence: IRQL is raised to DISPATCH_LEVEL. KeGetCurrentProcessorNumber() tells which CPU it is. KeInsertQueueDpc() schedules the N-1 DPCs to run. All but this CPU are scheduled. Spin waiting for the DeferredRoutines to enter phase A. IRQL is raised to HIGH_LEVEL. The DeferredRoutines are flagged to enter phase B. Spin waiting for the DeferredRoutines to enter phase B. Do the 3f8/3fc business.
OK, this is parallel port, right? Mine would be CF8/CFC
The DeferredRoutines are flagged to finish. Restore the IRQL.
DeferredRoutine: Indicate that the phase is phase A. Spin waiting for being flagged to enter phase B. Raise IRQL to HIGH_LEVEL. Indicate that the phase is phase B. Spin waiting for being flagged to finish. Restore the IRQL.
This means we would need to provide a different DPC object "type" for each processor in a multiprocessor environment to ensure "atomic" execution of the I/O code, which seems to be overkill and would load the system too much.
What is "too much"?
I meant that every DPC object is allocated from the kernel non-paged pool, which can degrade system performance if we overuse it. But this might not be the case.
I found this when trying to code the "DPC approach" in my latest device driver. Perhaps using a kernel-mode spinlock is a better approach to ensure atomic execution of the I/O operation.
A spin lock does not help. A CPU that has acquired a spin lock does not prevent another CPU from messing with the cf8/cfc registers. You need to lock all other CPUs.
Yes, locking all of the CPUs in a spinlock is what I'm thinking about. But there are possibly other alternatives.
It seems the DPC approach is not the right solution for this type of problem.
You are the man, you make the decision.
OK, thanks.
Regards,
Darmawan Salihun
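For context on why the CF8/CFC access has to be protected as a whole: with the legacy PCI configuration mechanism, a register is reached in two separate steps. A 32-bit address is first written to port 0xCF8, then the data is read or written through port 0xCFC. If another CPU writes a different address to 0xCF8 between those two steps, the data access lands on the wrong register. The sketch below only builds the address dword; the function name pci_config_address is hypothetical and the port writes themselves are left as comments.

```c
#include <stdint.h>

#define PCI_CONFIG_ADDRESS 0xCF8u   /* address port */
#define PCI_CONFIG_DATA    0xCFCu   /* data port */

/*
 * Build the dword written to PCI_CONFIG_ADDRESS:
 * bit 31 = enable, bits 23..16 = bus, 15..11 = device,
 * 10..8 = function, 7..2 = dword-aligned register offset.
 */
uint32_t pci_config_address(unsigned bus, unsigned dev,
                            unsigned func, unsigned reg)
{
    return 0x80000000u
         | ((uint32_t)(bus  & 0xFFu) << 16)
         | ((uint32_t)(dev  & 0x1Fu) << 11)
         | ((uint32_t)(func & 0x07u) << 8)
         |  (uint32_t)(reg  & 0xFCu);
}

/*
 * The access itself is two port operations that must not be interleaved
 * with another CPU's access:
 *   step 1: write pci_config_address(...) to PCI_CONFIG_ADDRESS
 *   step 2: read or write the value at PCI_CONFIG_DATA
 */
```

For example, bus 0, device 31, function 0, register 0x4E gives 0x8000F84C; the low two bits of the register offset are dropped because the address selects a whole dword.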
Darmawan Salihun wrote:
I might have made a wrong assumption previously. I thought that a DPC object that has the same "Deferred Routine" won't be queued twice in the system's DPC queue. Nonetheless, this might not be the case on a multiprocessor machine, because every processor has its own DPC queue, which implies that DPC objects with the same "Deferred Routine" can be queued in different processors' DPC queues without any of them being rejected. On a single-processor machine, a DPC object with the same "Deferred Routine" cannot be queued twice, because the second request to queue the DPC will be rejected by the kernel.
I think that you can initialize multiple DPC objects with the same routine and enqueue them all. It is a single DPC object which cannot be enqueued many times.
Regards, Roman
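The distinction Roman draws, between several DPC objects sharing one routine and one DPC object being queued twice, can be illustrated with a small user-mode model in C. This is not the kernel implementation; the struct and the model_* functions are invented for illustration. The only behaviour assumed from the real API is that KeInsertQueueDpc() returns FALSE when the same DPC object is already queued.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*deferred_routine_fn)(void *context);

/* Simplified stand-in for a DPC object. */
struct model_dpc {
    deferred_routine_fn routine;  /* may be shared by many objects */
    void *context;
    bool queued;                  /* true while the object sits in a queue */
};

void model_init_dpc(struct model_dpc *dpc, deferred_routine_fn routine,
                    void *context)
{
    dpc->routine = routine;
    dpc->context = context;
    dpc->queued = false;
}

/* Rejects the request only if this very object is already queued. */
bool model_insert_queue_dpc(struct model_dpc *dpc)
{
    if (dpc->queued)
        return false;
    dpc->queued = true;
    return true;
}

/* Dequeue the object and run its routine, if it was queued. */
void model_run_dpc(struct model_dpc *dpc)
{
    if (!dpc->queued)
        return;
    dpc->queued = false;
    dpc->routine(dpc->context);
}

/* Example routine: counts how many times it has been called. */
void count_calls(void *context)
{
    ++*(int *)context;
}
```

In this model, two objects initialized with count_calls can both be queued, while a second insert of either object fails until it has run.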
So, my previous testbed motherboard is dead because of a spike :-(
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable". The only solution is to reverse-engineer a working solution, i.e. Award Winflash to find out about it because it supports the platform. I need this because I need to test my further Winflashrom code in my testbed prior to releasing it. Unless someone would donate a motherboard with an already supported chipset ;-).
My question is, how can I provide you guys with a clean source code that would be legal?
Should I be producing a document and someone else here code it for me and others? ( I think this is what "clean room reverse engineering", right?) Or is there a better solution?
Regards,
Darmawan Salihun -------------------------------------------------------------------- -= Human knowledge belongs to the world =-
On 19.09.2007 07:34, Darmawan Salihun wrote:
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable".
Since AMD has released detailed datasheets for a few ATI graphics cards in the last few days, I expect detailed SB600 datasheets may be on the horizon as well if we ask nicely.
@AMD: Is there any information missing from the public SB600 data sheets?
However, it is quite possible that SB600 has no flash enable and this is entirely managed by GPIOs on the SuperIO.
The only solution is to reverse-engineer a working solution, i.e. Award Winflash to find out about it because it supports the platform. I need this because I need to test my further Winflashrom code in my testbed prior to releasing it. Unless someone would donate a motherboard with an already supported chipset ;-).
My question is, how can I provide you guys with a clean source code that would be legal?
I have quite some experience with clean room reverse engineering. Back then, it was the Nvidia network driver where we (Andrew de Quincey and me) wrote a hardware data sheet from the binary driver and someone else implemented forcedeth just by looking at the data sheet we had written.
Should I be producing a document and someone else here code it for me and others?
Generally, if you intend to work on the code later on or work on winflashrom at all, you should make sure somebody else does the reversing and writes the data sheet. That way, you are free to implement a clean solution from the data sheet.
( I think this is what "clean room reverse engineering", right?)
Yes, but the recommendations above apply.
Regards, Carl-Daniel
On 9/23/07, Carl-Daniel Hailfinger c-d.hailfinger.devel.2006@gmx.net wrote:
On 19.09.2007 07:34, Darmawan Salihun wrote:
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable".
Since AMD has released detailed datasheets for a few ATI graphics cards in the last few days, I expect detailed SB600 datasheets may be on the horizon as well if we ask nicely.
@AMD: Is there any information missing from the public SB600 data sheets?
However, it is quite possible that SB600 has no flash enable and this is entirely managed by GPIOs on the SuperIO.
I see.
The only solution is to reverse-engineer a working solution, i.e. Award Winflash to find out about it because it supports the platform. I need this because I need to test my further Winflashrom code in my testbed prior to releasing it. Unless someone would donate a motherboard with an already supported chipset ;-).
My question is, how can I provide you guys with a clean source code that would be legal?
I have quite some experience with clean room reverse engineering. Back then, it was the Nvidia network driver where we (Andrew de Quincey and me) wrote a hardware data sheet from the binary driver and someone else implemented forcedeth just by looking at the data sheet we had written.
Should I be producing a document and someone else here code it for me and others?
Generally, if you intend to work on the code later on or work on winflashrom at all, you should make sure somebody else does the reversing and writes the data sheet. That way, you are free to implement a clean solution from the data sheet.
I see. So, I should wait for someone else who would like to do that for the rest of us here, because I am actually more into writing code for winflashrom than reverse engineering. ( even if I'd love to reverse it myself :-) -> we should stick to the rules )
( I think this is what "clean room reverse engineering", right?)
Yes, but the recommendations above apply.
Thank you very much for the insight ;-)
Regards,
Darmawan Salihun
On 23/09/07 13:49 +0200, Carl-Daniel Hailfinger wrote:
On 19.09.2007 07:34, Darmawan Salihun wrote:
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable".
Since AMD has released detailed datasheets for a few ATI graphics cards in the last few days, I expect detailed SB600 datasheets may be on the horizon as well if we ask nicely.
I believe we've answered this before - but again, we're just engineers much like you all. The folks that make these decisions are far removed from us. In fact, we would like these specifications as much as you; since we are open source engineers, we try to stay away from information that may harm our ability to freely contribute to the public code.
I know they are aware of the demand for these datasheets, but there are other forces at work that preclude us just asking nicely, for now. I promise that once any information becomes available that will interest this group, you will be the first to know.
Jordan
-- Jordan Crouse Systems Software Development Engineer Advanced Micro Devices, Inc.
On 24.09.2007 16:46, Jordan Crouse wrote:
On 23/09/07 13:49 +0200, Carl-Daniel Hailfinger wrote:
On 19.09.2007 07:34, Darmawan Salihun wrote:
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable".
Since AMD has released detailed datasheets for a few ATI graphics cards in the last few days, I expect detailed SB600 datasheets may be on the horizon as well if we ask nicely.
I believe we've answered this before - but again, we're just engineers much like you all. The folks that make these decisions are far removed from us.
Unfortunately that's almost always the problem: Engineers aren't allowed to decide.
In fact, we would like these specifications as much as you; since we are open source engineers, we try to stay away from information that may harm our ability to freely contribute to the public code.
I know they are aware of the demand for these datasheets, but there are other forces at work that preclude us just asking nicely, for now. I promise that once any information becomes available that will interest this group, you will be the first to know.
Thank you so much for your efforts!
Regards, Carl-Daniel
On 9/24/07, Jordan Crouse jordan.crouse@amd.com wrote:
On 23/09/07 13:49 +0200, Carl-Daniel Hailfinger wrote:
On 19.09.2007 07:34, Darmawan Salihun wrote:
I've bought an AMD690G system for further development. Nonetheless, there's a problem with the datasheets. The SB600 datasheet from AMD documents informs nothing about the PCI registers pertaining to "flash enable".
Since AMD has released detailed datasheets for a few ATI graphics cards in the last few days, I expect detailed SB600 datasheets may be on the horizon as well if we ask nicely.
I believe we've answered this before - but again, we're just engineers much like you all. The folks that make these decisions are far removed from us. In fact, we would like these specifications as much as you; since we are open source engineers, we try to stay away from information that may harm our ability to freely contribute to the public code.
I know they are aware of the demand for these datasheets, but there are other forces at work that preclude us just asking nicely, for now. I promise that once any information becomes available that will interest this group, you will be the first to know.
That's great. Thank you very much.
Kind Regards,
Darmawan Salihun
Roman Kononov wrote:
Darmawan Salihun wrote:
As I couldn't find any better replacement to the PCI direct I/O function at the moment (which is needed in chipset_enable and board_enable), I will try to implement them using a quite old kernel mode API, i.e. HalGetBusData and HalSetBusData. I hope the kernel "emulates" these functions so that it works like in NT4 and Windows 2000.
A more "portable" way:
Make all CPUs spin inside a DPC function: KeQueryActiveProcessors() is used to get all processors. KeSetTargetProcessorDpc() is used to tie the DPC function to particular SMP processors. When all CPUs are spinning inside the function, one of them does IO (CF8/CFC) to the PCI configuration space. After the IO completes, let all DPC functions finish.
Anyway, in your experience, does this approach take up a lot of "time slice" in kernel space?
Regards, Darmawan
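KeQueryActiveProcessors() reports the active processors as a bitmask with one bit per CPU, so the "all but this CPU" step comes down to walking the set bits and skipping the current one. Below is a minimal user-mode sketch of that walk in C. The function name other_cpus and the fixed 64-bit mask width are assumptions made for illustration; in a driver each resulting number would be passed to KeSetTargetProcessorDpc().

```c
#include <stdint.h>

/*
 * Collect the numbers of all CPUs set in active_mask except `self`.
 * At most `cap` entries are stored in `out`; the return value is the
 * total number of other CPUs found, which may exceed `cap`.
 */
int other_cpus(uint64_t active_mask, int self, int *out, int cap)
{
    int cpu;
    int count = 0;

    for (cpu = 0; cpu < 64; cpu++) {
        if (((active_mask >> cpu) & 1u) == 0)
            continue;               /* CPU not active */
        if (cpu == self)
            continue;               /* the CPU that will do the I/O */
        if (count < cap)
            out[count] = cpu;
        count++;
    }
    return count;
}
```

With a mask of 0xB (CPUs 0, 1 and 3 active) and the I/O running on CPU 1, the function reports CPUs 0 and 3 as the ones that need a DPC.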
On 7/25/07, Stefan Reinauer stepan@coresystems.de wrote:
* Darmawan Salihun darmawan.salihun@gmail.com [070724 06:11]:
I'm currently completing the MMIO mapping routine in the device driver, but I have a problem as follows:
The current driver doesn't prevent the application from mapping a given MMIO physical address range even if it's already mapped by another routine in the same application. I actually plan to "lock" the mapped MMIO physical address range so that, once it's mapped, it must be unmapped before another routine can map that range. This is to avoid two routines manipulating the same area at the same time.
My question is, do you think we really need the "lock-ing" mechanism in the driver? I just think it is a potential problem if the MMIO access is not "lock-ed".
Since the places where mmap happens are only a few I'd say it is not really "needed". It's a good idea though as it is a safety measure against application bugs.
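The "lock" proposed above amounts to an overlap check against the ranges that are currently mapped. What follows is a minimal sketch in C of that bookkeeping only, assuming a small fixed table; the names mmio_lock and mmio_unlock and the table size are hypothetical, and the actual mapping calls in the driver are not shown.

```c
#include <stdint.h>

#define MAX_LOCKED_RANGES 8

struct mmio_range {
    uint64_t base;
    uint64_t size;
    int in_use;
};

static struct mmio_range locked[MAX_LOCKED_RANGES];

/* Returns 0 and records the range, or -1 if it overlaps a locked range,
 * is empty, wraps around, or the table is full. */
int mmio_lock(uint64_t base, uint64_t size)
{
    int i, free_slot = -1;

    if (size == 0 || base + size < base)
        return -1;

    for (i = 0; i < MAX_LOCKED_RANGES; i++) {
        if (!locked[i].in_use) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        /* Two half-open ranges overlap if each starts before the other ends. */
        if (base < locked[i].base + locked[i].size &&
            locked[i].base < base + size)
            return -1;
    }
    if (free_slot < 0)
        return -1;

    locked[free_slot].base = base;
    locked[free_slot].size = size;
    locked[free_slot].in_use = 1;
    return 0;
}

/* Returns 0 if exactly this range was locked and is now released, else -1. */
int mmio_unlock(uint64_t base, uint64_t size)
{
    int i;

    for (i = 0; i < MAX_LOCKED_RANGES; i++) {
        if (locked[i].in_use && locked[i].base == base &&
            locked[i].size == size) {
            locked[i].in_use = 0;
            return 0;
        }
    }
    return -1;
}
```

With this bookkeeping, a second request for any part of an already mapped window fails until the first mapping has been released, which is the safety measure against application bugs mentioned in the reply.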
The MMIO mapping routine is completed. It needs more testing, though.
Anyway, after evaluating all of the possible development paths last week, I came to the conclusion that the robustness of the winflashrom code and its evolution into the future will be guaranteed if the architecture is as follows:
1. The PCI detection routine in Windows is in "user mode" and parses the data from the Windows device "database".
2. The driver will contain the chipset_enable and board_enable routines, as they do direct IO transactions, which should be hidden from the user mode application as much as possible to mitigate possible exploits in the future. It will ease migration to the WDK (for Windows Vista support in the future) as well.
3. As before, the user mode application part of winflashrom will be compiled with MinGW, and its driver will be compiled with Microsoft's native device driver compiler that comes with their driver development kit.
I decided that this is the best approach because taking a different route would be problematic when we migrate to Windows Vista.
Anyway, the architecture above will make winflashrom differ significantly from flashrom. "Unification" with the flashrom code base is not impossible, but I think it's better to do it gradually. Peter's idea of using a "simple protocol" to exchange data between the application and the driver for the chipset_enable and board_enable routines sounds solid at the moment. I will implement it for the rest of the Google Summer of Code coding phase, and I think we will have working code in about two weeks from now. However, I'm still looking for other ideas in this matter.
Regards,
-- Darmawan Salihun a.k.a Pinczakko
On 7/13/07, Stefan Reinauer stepan@coresystems.de wrote:
...
0xE1 0xEB 0xFF
Wow, what are these?
I finally figured out that those ports are the "magic" board enable ports on one of the ASUS boards (IIRC Asus P5-something).
-- Darmawan Salihun a.k.a Pinczakko
Peter Stuge wrote:
There are still a few issues with this code.
- I've tested it on an ICH-5 board. However, the BIOS binary that it reads from it is 514KB. It should be only 512KB. I'm still trying to find the bug. Any info from the ICH-x code maintainer?
This is probably because Win32 does LF->CR+LF translation when the fopen() mode is "r" or "w" - Win32 introduces "rb" and "wb" for binary files.
Anyway, I have a question about the rom layout file (rom.layout). Is it being read as binary or text file? Looking at the code and the rom.layout itself (with hex editor) suggests that it's read as text file.
--Darmawan MS
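The text-mode translation Peter describes would also account for the size of the discrepancy: when a file is opened with "w" on Windows, every 0x0A byte is written as 0x0D 0x0A. If the ROM bytes were roughly uniformly distributed, about one byte in 256 would be 0x0A, so a 512KB image would grow by around 2KB. The following is a minimal sketch in C of dumping a buffer in binary mode and checking the resulting size; the function name dump_rom and the file name in the usage note are placeholders, not part of flashrom.

```c
#include <stdio.h>

/*
 * Write `len` bytes from `buf` to `path` in binary mode and return the
 * size of the resulting file in bytes, or -1 on any error.
 */
long dump_rom(const char *path, const unsigned char *buf, size_t len)
{
    FILE *f;
    long size;

    f = fopen(path, "wb");          /* "wb": no LF -> CR+LF translation */
    if (f == NULL)
        return -1;
    if (fwrite(buf, 1, len, f) != len) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "rb");          /* reopen in binary mode to measure */
    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    size = ftell(f);
    fclose(f);
    return size;
}
```

A call such as dump_rom("bios.bin", image, image_size) should return exactly image_size on every platform. On POSIX systems the "b" flag has no effect, so the same code works unchanged there.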
* Darmawan Salihun darmawan.salihun@gmail.com [070713 11:29]:
Anyway, I have a question about the rom layout file (rom.layout). Is it being read as binary or text file? Looking at the code and the rom.layout itself (with hex editor) suggests that it's read as text file.
It's a text file.