With the first two patches applied, I am able to install and boot FreeDOS on an Intel SSD DC P3700, an NVMe 1.0 device, in QEMU using PCI device assignment.
The rest of the patches fix minor issues found through inspection.
Rather than using the Identify command with CNS 01b (GET_NS_LIST), which was added in NVMe 1.1, we can just enumerate all of the possible namespace IDs.
The relevant part of the NVMe spec reads:
    Namespaces shall be allocated in order (starting with 1) and packed
    sequentially.
Since the previously-used GET_NS_LIST only returns active namespaces, we also need a check in nvme_probe_ns() to ensure that inactive namespaces are not reported as boot devices. This can be accomplished by checking for a non-zero block count: the spec indicates that Identify Namespace for an inactive namespace ID returns all zeroes.
This should have no impact on the QEMU NVMe device model, since it always reports exactly one namespace (NSID 1).
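In outline, the new probe loop is just this (a simplified sketch of the change below, not the verbatim code):

    /* Try every NSID from 1 up to the controller's namespace count (NN from
     * Identify Controller); nvme_probe_ns() now skips inactive namespaces. */
    int ns_idx;
    for (ns_idx = 0; ns_idx < ctrl->ns_count; ns_idx++)
        nvme_probe_ns(ctrl, &ctrl->ns[ns_idx], ns_idx + 1 /* NSIDs start at 1 */);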
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
---
 src/hw/nvme.c | 39 ++++++---------------------------------
 1 file changed, 6 insertions(+), 33 deletions(-)

diff --git a/src/hw/nvme.c b/src/hw/nvme.c
index 31edf29..c709a3a 100644
--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -218,13 +218,6 @@ nvme_admin_identify_ctrl(struct nvme_ctrl *ctrl)
     return &nvme_admin_identify(ctrl, NVME_ADMIN_IDENTIFY_CNS_ID_CTRL, 0)->ctrl;
 }

-static struct nvme_identify_ns_list *
-nvme_admin_identify_get_ns_list(struct nvme_ctrl *ctrl)
-{
-    return &nvme_admin_identify(ctrl, NVME_ADMIN_IDENTIFY_CNS_GET_NS_LIST,
-                                0)->ns_list;
-}
-
 static struct nvme_identify_ns *
 nvme_admin_identify_ns(struct nvme_ctrl *ctrl, u32 ns_id)
 {
@@ -253,6 +246,10 @@ nvme_probe_ns(struct nvme_ctrl *ctrl, struct nvme_namespace *ns, u32 ns_id)
     }

     ns->lba_count = id->nsze;
+    if (!ns->lba_count) {
+        dprintf(2, "NVMe NS %u is inactive.\n", ns_id);
+        goto free_buffer;
+    }

     struct nvme_lba_format *fmt = &id->lbaf[current_lba_format];

@@ -493,29 +490,10 @@ nvme_controller_enable(struct nvme_ctrl *ctrl)
     if (!ctrl->ns) goto out_of_memory;
     memset(ctrl->ns, 0, sizeof(*ctrl->ns) * ctrl->ns_count);

-    struct nvme_identify_ns_list *ns_list = nvme_admin_identify_get_ns_list(ctrl);
-    if (!ns_list) {
-        dprintf(2, "NVMe couldn't get namespace list.\n");
-        goto failed;
-    }
-
     /* Populate namespace IDs */
     int ns_idx;
-    for (ns_idx = 0;
-         ns_idx < ARRAY_SIZE(ns_list->ns_id)
-             && ns_idx < ctrl->ns_count
-             && ns_list->ns_id[ns_idx];
-         ns_idx++) {
-        nvme_probe_ns(ctrl, &ctrl->ns[ns_idx], ns_list->ns_id[ns_idx]);
-    }
-
-    free(ns_list);
-
-    /* If for some reason the namespace list gives us fewer namespaces, we just
-       go along. */
-    if (ns_idx != ctrl->ns_count) {
-        dprintf(2, "NVMe namespace list has only %u namespaces?\n", ns_idx);
-        ctrl->ns_count = ns_idx;
+    for (ns_idx = 0; ns_idx < ctrl->ns_count; ns_idx++) {
+        nvme_probe_ns(ctrl, &ctrl->ns[ns_idx], ns_idx + 1);
     }

     dprintf(3, "NVMe initialization complete!\n");
@@ -545,11 +523,6 @@ nvme_controller_setup(void *opaque)
             version >> 16, (version >> 8) & 0xFF, version & 0xFF);
     dprintf(3, " Capabilities %016llx\n", reg->cap);

-    if (version < 0x00010100U) {
-        dprintf(3, "Need at least 1.1.0! Skipping.\n");
-        return;
-    }
-
     if (~reg->cap & NVME_CAP_CSS_NVME) {
         dprintf(3, "Controller doesn't speak NVMe command set. Skipping.\n");
         return;
Hey Daniel,
thanks for the bugfixes! Much appreciated. They all look good to me.
On Thu, 2017-02-23 at 23:27 -0700, Daniel Verkamp wrote:
> Rather than using the Identify command with CNS 01b (GET_NS_LIST),
> which was added in NVMe 1.1, we can just enumerate all of the
> possible namespace IDs.

That should also fix the code to work correctly on older QEMU
versions (< 2.7). Nice.

> The relevant part of the NVMe spec reads:
>
>     Namespaces shall be allocated in order (starting with 1) and
>     packed sequentially.

I see that sentence in my copy of the NVMe 1.0e spec, but it is gone
from 1.2. I'd assume this is no problem for any sane NVMe device.
Julian

--
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschäftsführer: Dr. Ralf Herbrich, Christian Schläger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
500 ms is not sufficient for the admin commands used during initialization on some real hardware.
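For reference, the polling loop this timeout feeds (sketched from the code being patched; the exact error path on expiry is abbreviated here) looks roughly like:

    u32 to = timer_calc(nvme_timeout);  /* deadline in ms (SeaBIOS timer helper) */
    while (!nvme_poll_cq(sq->cq)) {
        yield();                        /* let other SeaBIOS threads run while polling */
        if (timer_check(to)) {
            warn_timeout();             /* assumed error path once the deadline passes */
            break;
        }
    }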
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
---
 src/hw/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hw/nvme.c b/src/hw/nvme.c
index c709a3a..c194f9f 100644
--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -123,7 +123,7 @@ nvme_consume_cqe(struct nvme_sq *sq)
 static struct nvme_cqe
 nvme_wait(struct nvme_sq *sq)
 {
-    static const unsigned nvme_timeout = 500 /* ms */;
+    static const unsigned nvme_timeout = 5000 /* ms */;
     u32 to = timer_calc(nvme_timeout);
     while (!nvme_poll_cq(sq->cq)) {
         yield();
It looks like the intent was to exit the loop if a command failed, but the current code would actually continue looping in that case.
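To make the failure mode concrete, suppose op->count is 4 and the very first command fails (illustrative values, not from the patch itself):

    /* After the first command sets res = DISK_RET_EBADTRACK:
     *   old: i < op->count || res != DISK_RET_SUCCESS  -> always true: loops forever
     *   new: i < op->count && res == DISK_RET_SUCCESS  -> false: exits immediately
     */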
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
---
 src/hw/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hw/nvme.c b/src/hw/nvme.c
index c194f9f..97b05cb 100644
--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -571,7 +571,7 @@ nvme_cmd_readwrite(struct nvme_namespace *ns, struct disk_op_s *op, int write)
     u16 const max_blocks = NVME_PAGE_SIZE / ns->block_size;
     u16 i;

-    for (i = 0; i < op->count || res != DISK_RET_SUCCESS;) {
+    for (i = 0; i < op->count && res == DISK_RET_SUCCESS;) {
         u16 blocks_remaining = op->count - i;
         u16 blocks = blocks_remaining < max_blocks ? blocks_remaining : max_blocks;
On 02/24/2017 03:27 AM, Daniel Verkamp wrote:
> It looks like the intent was to exit the loop if a command failed, but
> the current code would actually continue looping in that case.
>
> Signed-off-by: Daniel Verkamp <daniel@drv.nu>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  src/hw/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/hw/nvme.c b/src/hw/nvme.c
> index c194f9f..97b05cb 100644
> --- a/src/hw/nvme.c
> +++ b/src/hw/nvme.c
> @@ -571,7 +571,7 @@ nvme_cmd_readwrite(struct nvme_namespace *ns, struct disk_op_s *op, int write)
>      u16 const max_blocks = NVME_PAGE_SIZE / ns->block_size;
>      u16 i;
>
> -    for (i = 0; i < op->count || res != DISK_RET_SUCCESS;) {
> +    for (i = 0; i < op->count && res == DISK_RET_SUCCESS;) {
>          u16 blocks_remaining = op->count - i;
>          u16 blocks = blocks_remaining < max_blocks ? blocks_remaining : max_blocks;
The status code field is 8 bits wide starting at bit 1; the previous code would truncate the top bit.
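As a worked example, take the NVMe 1.0 Generic Command Status "LBA Out of Range" (SC 0x80) with the phase bit set, so cqe->status is 0x0101:

    /* old: (0x0101 & 0xFF) >> 1  ==  0x01 >> 1  ==  0x00  -> falsely reports success
     * new: (0x0101 >> 1) & 0xFF  ==  0x80               -> correctly reports failure */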
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
---
 src/hw/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hw/nvme.c b/src/hw/nvme.c
index 97b05cb..9fda011 100644
--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -74,7 +74,7 @@ nvme_poll_cq(struct nvme_cq *cq)
 static int
 nvme_is_cqe_success(struct nvme_cqe const *cqe)
 {
-    return (cqe->status & 0xFF) >> 1 == 0;
+    return ((cqe->status >> 1) & 0xFF) == 0;
 }
On 02/24/2017 03:27 AM, Daniel Verkamp wrote:
> The status code field is 8 bits wide starting at bit 1; the previous
> code would truncate the top bit.
>
> Signed-off-by: Daniel Verkamp <daniel@drv.nu>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  src/hw/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/hw/nvme.c b/src/hw/nvme.c
> index 97b05cb..9fda011 100644
> --- a/src/hw/nvme.c
> +++ b/src/hw/nvme.c
> @@ -74,7 +74,7 @@ nvme_poll_cq(struct nvme_cq *cq)
>  static int
>  nvme_is_cqe_success(struct nvme_cqe const *cqe)
>  {
> -    return (cqe->status & 0xFF) >> 1 == 0;
> +    return ((cqe->status >> 1) & 0xFF) == 0;
>  }
Signed-off-by: Daniel Verkamp <daniel@drv.nu>
---
 src/hw/nvme.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/hw/nvme.c b/src/hw/nvme.c
index 9fda011..60557ae 100644
--- a/src/hw/nvme.c
+++ b/src/hw/nvme.c
@@ -549,7 +549,7 @@ nvme_controller_setup(void *opaque)
 static void
 nvme_scan(void)
 {
-    // Scan PCI bus for ATA adapters
+    // Scan PCI bus for NVMe adapters
     struct pci_device *pci;

     foreachpci(pci) {
On 02/24/2017 03:27 AM, Daniel Verkamp wrote:
> Signed-off-by: Daniel Verkamp <daniel@drv.nu>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  src/hw/nvme.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/hw/nvme.c b/src/hw/nvme.c
> index 9fda011..60557ae 100644
> --- a/src/hw/nvme.c
> +++ b/src/hw/nvme.c
> @@ -549,7 +549,7 @@ nvme_controller_setup(void *opaque)
>  static void
>  nvme_scan(void)
>  {
> -    // Scan PCI bus for ATA adapters
> +    // Scan PCI bus for NVMe adapters
>      struct pci_device *pci;
>
>      foreachpci(pci) {
On Thu, Feb 23, 2017 at 11:27:52PM -0700, Daniel Verkamp wrote:
> With the first two patches applied, I am able to install and boot
> FreeDOS on an Intel SSD DC P3700, an NVMe 1.0 device, in QEMU using
> PCI device assignment.
>
> The rest of the patches fix minor issues found through inspection.
Thanks. I committed this series.
If this was tested with QEMU device passthrough, do you think we can also enable NVMe on real hardware?
-Kevin
On Thu, Mar 2, 2017 at 7:51 AM, Kevin O'Connor <kevin@koconnor.net> wrote:
> On Thu, Feb 23, 2017 at 11:27:52PM -0700, Daniel Verkamp wrote:
>> With the first two patches applied, I am able to install and boot
>> FreeDOS on an Intel SSD DC P3700, an NVMe 1.0 device, in QEMU using
>> PCI device assignment.
>>
>> The rest of the patches fix minor issues found through inspection.
>
> Thanks. I committed this series.
>
> If this was tested with QEMU device passthrough, do you think we can
> also enable NVMe on real hardware?
I am not equipped to test on a complete real-hardware system, but I don't see anything that would prevent the driver from working correctly there.
--
Daniel