On Sat, Oct 03, 2020 at 05:40:58AM +0000, Graf (AWS), Alexander wrote:
Hey Kevin,
Am 03.10.2020 um 03:47 schrieb Kevin O'Connor kevin@koconnor.net:
On Wed, Sep 30, 2020 at 11:10:56PM +0200, Alexander Graf wrote: Some NVMe controllers only support small maximum request sizes, such as the AWS EBS NVMe implementation which only supports NVMe requests of up to 32 pages (256kb) at once.
BIOS callers can exceed those request sizes by defining sector counts above this threshold. Currently we fall back to the bounce buffer implementation for those. This is slow.
This patch introduces splitting logic to the NVMe I/O request code so that every NVMe I/O request gets handled in a chunk size that is consumable by the NVMe adapter, while maintaining the fast path PRPL logic we just introduced.
Signed-off-by: Alexander Graf graf@amazon.com
src/hw/nvme.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/src/hw/nvme.c b/src/hw/nvme.c index b92ca52..cc37bca 100644 --- a/src/hw/nvme.c +++ b/src/hw/nvme.c @@ -727,6 +727,22 @@ nvme_cmd_readwrite(struct nvme_namespace *ns, struct disk_op_s *op, int write) u16 const max_blocks = NVME_PAGE_SIZE / ns->block_size; u16 i;
- /* Split up requests that are larger than the device can handle */
- if (op->count > ns->max_req_size) {
u16 count = op->count;
/* Handle the first max_req_size elements */
op->count = ns->max_req_size;
if (nvme_cmd_readwrite(ns, op, write))
return res;
/* Handle the remainder of the request */
op->count = count - ns->max_req_size;
op->lba += ns->max_req_size;
op->buf_fl += (ns->max_req_size * ns->block_size);
return nvme_cmd_readwrite(ns, op, write);
- }
Depending on the disk access, this code could run with a small stack. I would avoid recursion.
This is tail recursion, which any reasonable compiler converts into a jmp :). Take a look at the .o file - for me it did become a plain jump.
Okay, that's fine then.
Otherwise, the patch series looks okay to me. (I don't have enough knowledge of the nvme code to give a full review though).
Thanks :)
I'll leave a few more days to allow others to comment, but otherwise I'm fine with committing.
Cheers, -Kevin