ndfltr.sys: a 32-bit offset+length wrap into a kernel oob read
part of the glaurung windows driver findings catalog. method narrative: reading all of notepad.exe with an llm.
summary
| driver | ndfltr.sys — Microsoft NetworkDirect / NDKPI RDMA provider filter |
| class | CWE-190 (integer overflow) → CWE-125 (out-of-bounds read) |
| bug | 32-bit (offset + length) bounds check wraps; use site adds the unchecked offset to the buffer pointer with a 64-bit add |
| reach | unprivileged local, but only on hosts with an active RDMA / NetworkDirect provider |
| primitive | bounded kernel OOB read; reliable outcome is a wild-read bugcheck (DoS) |
| cvss 3.1 | AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H = 5.5 (medium) |
| proof | real-byte CPU emulation of the shipping driver; no live splat (no RDMA hardware) |
| disclosure | reported to MSRC (case VULN-192293); declined |
the bug
ndfltr.sys is the NetworkDirect filter driver — the kernel piece behind SMB Direct and RDMA on windows. it exposes a control device, \Device\Ndfltr (user path \\.\NetworkDirect), and two of its connection-management ioctl handlers share a bug.
both handlers take an attacker-supplied (offset, length) pair out of the input buffer, validate that the region fits inside the buffer, and then use buffer + offset as the source for a small block of “private data” that the driver transmits to the connection peer inside an InfiniBand connection-manager message. the validation and the use disagree about arithmetic width: the check adds in 32 bits, the use adds in 64.
here is the NdEpReject path. the check, at 0x140015ff9:

    mov ecx, [rdx+0x8]      ; length
    cmp ecx, 0x94           ; length is bounded (<= 0x94)
    add ecx, [rdx+0xc]      ; ecx = length + offset -- 32-bit add
    cmp rcx, [rsp+0x38]     ; compare against InputBufferLength
    jbe ...                 ; PASS if (length + offset) <= InputBufferLength

the add is 32-bit. the use, a few instructions later at 0x140016062:

    mov edx, [rcx+0xc]      ; re-read offset
    add rdx, rcx            ; src = buffer + offset -- 64-bit add
    call ...                ; src handed to the Reject private-data path

so the attacker sets offset = 0xFFFFFFF0, length = 0x10. the check computes 0xFFFFFFF0 + 0x10 = 0x100000000, which truncates to 0 in 32 bits and sails under InputBufferLength. the use computes buffer + 0xFFFFFFF0 in 64 bits: a source pointer roughly 4 GiB past the buffer. length stays capped at 0x94, so this is a bounded read, never a write. the sibling handler NdEpAccept (→ Accept → FormatIbCmRep) has the identical shape with a 0x148-byte cap.
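the disagreement can be modeled in a few lines. this is a minimal sketch, not the driver's code: python integers do not wrap, so the 32-bit add is emulated with an explicit mask. the function names, the buffer address, and the 0x40-byte InputBufferLength are assumptions made up for illustration; only the 0x94 cap and the offset/length values come from the disassembly.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
REJECT_CAP = 0x94  # length cap on the NdEpReject path, from the disassembly


def check_passes(offset, length, input_len, cap=REJECT_CAP):
    """Model of the validation: the sum is truncated to 32 bits before the compare."""
    if length > cap:
        return False
    end32 = (length + offset) & MASK32  # models: add ecx, [rdx+0xc]
    return end32 <= input_len           # models: cmp rcx, InputBufferLength / jbe


def use_source(buffer_va, offset):
    """Model of the use: the zero-extended offset is added to the pointer in 64 bits."""
    return (buffer_va + (offset & MASK32)) & MASK64  # models: add rdx, rcx


def check_passes_widened(offset, length, input_len, cap=REJECT_CAP):
    """Hypothetical corrected check: same test with the sum kept at full width."""
    return length <= cap and (offset + length) <= input_len


# Hypothetical values: a made-up kernel buffer address and a 0x40-byte input buffer.
BUFFER_VA = 0xFFFF800010000000
INPUT_LEN = 0x40

wrapped = check_passes(0xFFFFFFF0, 0x10, INPUT_LEN)
src = use_source(BUFFER_VA, 0xFFFFFFF0)
print(f"wrapping input passes check: {wrapped}")
print(f"source is {src - BUFFER_VA:#x} bytes past the buffer")
print(f"widened check accepts it: {check_passes_widened(0xFFFFFFF0, 0x10, INPUT_LEN)}")
```

the same model with offset = 0x20 passes both versions of the check and resolves to buffer + 0x20, which is the in-bounds control case.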
who can reach it
the control device is created with this SDDL:

    D:P(A;;GA;;;SY)(A;;GRGWGX;;;BA)(A;;GRGW;;;WD)(A;;GR;;;RC)

the (A;;GRGW;;;WD) ACE grants Everyone GENERIC_READ | GENERIC_WRITE. there is no later WdfDeviceInitAssignSDDLString override in the shipping binary. the ioctls are METHOD_BUFFERED, FILE_ANY_ACCESS: no privilege bit. so any local process can open the device and drive the handlers. no administrator required.
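to make the ACE-by-ACE reading concrete, here is a small sketch that splits the string into its ACEs and expands the two-letter mnemonics. it is an illustration, not a full SDDL parser: it assumes every rights field is a run of two-letter generic-rights codes, which holds for this string but not for SDDL in general, and the helper name and lookup tables are made up for this example.

```python
import re

SDDL = "D:P(A;;GA;;;SY)(A;;GRGWGX;;;BA)(A;;GRGW;;;WD)(A;;GR;;;RC)"

# Standard SDDL mnemonics, limited to the ones that appear in this string.
RIGHTS = {"GA": "GENERIC_ALL", "GR": "GENERIC_READ",
          "GW": "GENERIC_WRITE", "GX": "GENERIC_EXECUTE"}
TRUSTEES = {"SY": "Local System", "BA": "Built-in Administrators",
            "WD": "Everyone", "RC": "Restricted Code"}


def parse_aces(sddl):
    """Return a list of (ace_type, trustee, [rights]) for each parenthesised ACE."""
    aces = []
    for body in re.findall(r"\(([^)]*)\)", sddl):
        # ACE layout: type;flags;rights;object_guid;inherit_object_guid;account_sid
        ace_type, _flags, rights, _obj, _inherit, sid = body.split(";")
        # Assumption: rights is a run of two-letter codes (true for this string).
        codes = [rights[i:i + 2] for i in range(0, len(rights), 2)]
        aces.append((ace_type, TRUSTEES[sid], [RIGHTS[c] for c in codes]))
    return aces


ACES = parse_aces(SDDL)
for ace_type, trustee, rights in ACES:
    print(f"{ace_type}  {trustee:<24} {' | '.join(rights)}")
```

the third row is the one that matters: an allow ACE for Everyone carrying both read and write.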
the catch, and it is a real one, is that the device only exists on RDMA-configured hosts. testing on a no-RDMA box established this precisely: the driver loads fine without an RDMA NIC (sc start ndfltr runs it), but the everyone-writable control device never appears, because the code that creates it runs at filter-attach-to-provider time, not at DriverEntry. the device shows up only when there is an RDMA-capable NIC with an active NetworkDirect provider bound — storage-spaces-direct, SMB Direct, azure-local with RDMA NICs. on a default desktop or a server without RDMA, the bug is simply unreachable. “unprivileged” is true; “unprivileged on a meaningful population of machines” is the narrower honest claim.
what it actually buys you
the wrap forces the offset up near 2^32, so the source lands ~4 GiB away — not at a tunable small offset into an adjacent pool allocation. that distinction is what caps the impact. the reliable outcome is that the far address is unmapped and the read faults: a kernel bugcheck, i.e. denial of service. that is the floor, and it is what the CVSS reflects (A:H, C:N).
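the "not tunable" point follows from the arithmetic. for the 32-bit sum to wrap, offset must be at least 2^32 - length, and length is capped, so the smallest usable offset is 2^32 minus the cap. a short sketch, with made-up function names and an assumed 0x40-byte InputBufferLength, computes that window for the Reject cap:

```python
REJECT_CAP = 0x94   # length cap on the Reject path, from the writeup
TWO32 = 1 << 32


def wrapping_window(length, input_len):
    """Inclusive range of 32-bit offsets that wrap the sum and still pass the check.

    Returns None when no offset can wrap (length == 0).
    """
    if length == 0:
        return None
    lo = TWO32 - length                  # smallest offset with offset + length >= 2**32
    hi = min(TWO32 - 1, lo + input_len)  # wrapped sum must stay <= input_len
    return lo, hi


def nearest_reachable_displacement(cap):
    """Smallest buffer-relative displacement any wrapping input can produce."""
    return min(wrapping_window(n, 0)[0] for n in range(1, cap + 1))


NEAREST = nearest_reachable_displacement(REJECT_CAP)
print(f"window for length 0x10, InputBufferLength 0x40: "
      f"{[hex(v) for v in wrapping_window(0x10, 0x40)]}")
print(f"nearest reachable source: buffer + {NEAREST:#x} "
      f"({TWO32 - NEAREST} bytes short of 4 GiB)")
```

under these assumptions the closest an attacker can aim is one cap's worth of bytes short of 4 GiB past the buffer, which is why adjacent pool allocations are out of reach.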
there is an upside the report flagged but did not score: because both handlers feed that source into the CM private-data message the driver transmits to the peer, a far address that happens to be mapped would leak up to ~148 bytes of kernel memory onto the wire. that would be C:L. but it is not a controlled primitive — you do not get to pick the address, and the read is not demonstrated to land anywhere useful — so it stays an upside note, not a scored claim. no write, no code execution, no privilege escalation.
how glaurung found it
discovery followed the calibrated driver pipeline, not an open-ended prompt:
- permissive-ACL sweep. scanning the driver corpus for control devices with Everyone-writable SDDL surfaced \\.\NetworkDirect as an unprivileged-openable attack surface, the precondition that makes any handler bug an unprivileged one.
- ioctl surface + per-handler audit. glaurung lifted the dispatcher (0x1400024a8, an (ioctl>>2)&0x3f index into a .rdata table) and the candidate handlers to pseudo-c. an llm read flagged three sites (NdEpConnect, NdEpAccept, NdEpReject) as the integer-overflow-before-bounds-check pattern.
- ground-truth disassembly. every claimed validate/use pair was re-checked against capstone disassembly of the real bytes, not the lifted c. the 32-bit add versus 64-bit add distinction, which is the entire bug, is invisible in cleaned-up pseudo-c and only legible in the instruction stream.
- adversarial refutation, on one buffer. this is where the finding earned its keep. the three candidates went to independent refutation, and NdEpConnect → SendIbCmReq fell apart: the send worker re-fetches a different WDF request buffer (WdfIoQueueRetrieveNextRequest), so NdEpConnect’s validated offset never reaches that copy. the original “fault” had been an artifact of running the validate and use cores on separate emulator instances with the pointer hand-seeded. dropped as a false positive. an earlier “3 of 3 verified” claim was wrong and was corrected before anything was sent.
- real-byte emulation. the two survivors were confirmed by full-chain emulation starting at the handler entry, on a single crafted buffer, using the real ndfltr.sys bytes: the 32-bit check wraps and passes, the same buffer reaches the use site, the computed source is buffer + 0xFFFFFFF0, and the modeled provider read faults. a control run with a non-wrapping offset = 0x20 reads cleanly from buffer + 0x20, proving the sink is not always-faulting.
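the dispatcher index in the second step is ordinary CTL_CODE bit-slicing. as a sketch: a windows ioctl code packs device type, required access, function number, and transfer method into 32 bits, and (ioctl>>2)&0x3f keeps the low six bits of the function field. the ioctl value below is hypothetical, built only to exercise the decode; the real NetworkDirect codes are not reproduced here, and the helper names are made up.

```python
METHOD_BUFFERED = 0
FILE_ANY_ACCESS = 0


def ctl_code(device_type, function, method, access):
    """Compose an ioctl code the way the CTL_CODE macro does."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ioctl(code):
    """Split a 32-bit ioctl code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


def dispatch_index(code):
    """Table index as described for the dispatcher at 0x1400024a8."""
    return (code >> 2) & 0x3F


# Hypothetical ioctl: device type 0x12 and function 0x845 are placeholders.
EXAMPLE = ctl_code(0x12, 0x845, METHOD_BUFFERED, FILE_ANY_ACCESS)
FIELDS = decode_ioctl(EXAMPLE)
print(f"ioctl {EXAMPLE:#010x} -> {FIELDS}, table slot {dispatch_index(EXAMPLE)}")
```

an access field of zero is what FILE_ANY_ACCESS means in practice: the I/O manager does not require read or write access on the handle before the request reaches the driver.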
reproduction and its limits
the proof is a tier-1 result: the shipping driver’s actual bytes executed under a CPU emulator, with the WDF and provider externals stubbed at the boundary. that is strong evidence the defect is real and the dataflow connects — but it is not a live on-target crash. the lab had no RDMA NIC, so \\.\NetworkDirect could not be instantiated and no kernel dump was captured. an honest reproduction grade states that plainly. closing it to a live bugcheck needs an RDMA-capable host (the plan on file is an OCuLink + ConnectX-3 setup) to instantiate the device and fault it for real.
disclosure
reported to the MSRC researcher portal as case VULN-192293. microsoft declined to service it. that is a defensible call: the exposure is narrow (RDMA-configured hosts only), the demonstrated primitive is a denial-of-service rather than a controlled leak, and there was no live splat. the bug is real and the dataflow is proven; it sits below the bar where microsoft commits to a fix. no public CVE exists for an ndfltr.sys OOB read, and this finding is distinct from the NDKPing.sys NULL-deref — different driver, different class.