Architectural Limits of Kernel-Level EDR

Notes From Building One

When I began building a kernel-level EDR, I assumed visibility was the hard part and privilege was the solution. Running in ring 0 should mean total awareness.

What I learned instead is this:

Kernel privilege gives you authority — not observability.

The limitations are architectural, not a matter of technical skill.

Below are the real constraints I encountered while building memory-focused detection logic (allocation → RW→RX → execution correlation).

1. There Is No Memory Lifecycle Event Stream

Windows provides no documented kernel callback for any of the following:

  • A VAD being inserted.
  • A protection change (RW→RX).
  • A region becoming executable.

The memory manager updates its internal state silently.

If you want memory lifecycle awareness, your options are:

  • Hook low-level routines (PatchGuard risk)
  • Walk undocumented structures (fragile)
  • Poll using ZwQueryVirtualMemory
  • Infer execution via thread creation callbacks

There is no official “intent pipeline” in the kernel.

That design choice alone limits what a third-party driver can reliably build.
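
The polling option can be sketched in portable C as a diff between two snapshots of a process's regions. This is a simplified illustration, not driver code: region_t, the RGN_* flags, and find_rw_to_rx are hypothetical names, and a real driver would presumably fill the snapshots from ZwQueryVirtualMemory results.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified protection flags (hypothetical). A real driver would map
   these from the protection values reported by ZwQueryVirtualMemory. */
enum { RGN_R = 1, RGN_W = 2, RGN_X = 4 };

typedef struct {
    uintptr_t base;  /* region base address */
    size_t    size;  /* region size in bytes */
    unsigned  prot;  /* combination of RGN_* flags */
} region_t;

/* Compare two snapshots of the same process and report regions that were
   writable and non-executable in `prev` but are executable in `curr`.
   Regions are matched by base address. Returns the total number of hits;
   at most `max_out` base addresses are written to `out`. */
size_t find_rw_to_rx(const region_t *prev, size_t n_prev,
                     const region_t *curr, size_t n_curr,
                     uintptr_t *out, size_t max_out)
{
    size_t hits = 0;
    for (size_t i = 0; i < n_curr; i++) {
        if (!(curr[i].prot & RGN_X))
            continue;                      /* not executable now */
        for (size_t j = 0; j < n_prev; j++) {
            if (prev[j].base != curr[i].base)
                continue;
            /* Same region in both snapshots: was it RW without X before? */
            if ((prev[j].prot & RGN_W) && !(prev[j].prot & RGN_X)) {
                if (hits < max_out)
                    out[hits] = curr[i].base;
                hits++;
            }
            break;                         /* base matched, stop searching */
        }
    }
    return hits;
}
```

A region that turns executable and is freed or flipped back between two snapshots never appears in this diff, and neither does a region that is already executable the first time it is seen. Those are the blind spots that polling cannot close.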

2. Undocumented Structures Are a Stability Trap

Trying to walk:

  • VadRoot
  • _MMVAD
  • _MMVAD_SHORT
  • _RTL_AVL_TREE

quickly reveals that none of them are defined in the WDK headers.

Yes, you can reverse them.
Yes, you can parse them.

But:

  • Offsets change across builds.
  • Layout changes break assumptions.
  • Incorrect traversal crashes the OS.

For research, that’s acceptable.

For enterprise software, it’s unacceptable.

This is a large part of why commercial EDRs tend to avoid depending on deep undocumented internals. Stability beats depth.
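
One common way to contain the risk of shifting offsets is to key every reversed offset to an exact build and refuse to touch the structure on any build that has not been verified. The sketch below assumes that approach. The type, the function names, the build numbers, and the offset values are all placeholders invented for illustration, not real Windows data.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t build;            /* OS build number the offset was verified on */
    uint32_t vad_root_offset;  /* offset of VadRoot inside the process object */
} build_offsets_t;

/* PLACEHOLDER entries for illustration only. These are not real build
   numbers or real offsets. */
static const build_offsets_t k_known_builds[] = {
    { 10001u, 0x100u },
    { 10002u, 0x180u },
};

/* Return the entry for an exactly matching build, or NULL when the build
   is unknown. No "closest build" guessing is attempted on purpose. */
const build_offsets_t *lookup_offsets(uint32_t build)
{
    size_t n = sizeof k_known_builds / sizeof k_known_builds[0];
    for (size_t i = 0; i < n; i++) {
        if (k_known_builds[i].build == build)
            return &k_known_builds[i];
    }
    return NULL;
}

/* Return 1 if deep VAD inspection may run on this build, 0 otherwise.
   Unknown builds fall back to documented interfaces only. */
int deep_inspection_allowed(uint32_t build)
{
    return lookup_offsets(build) != NULL;
}
```

Failing closed trades coverage for stability: on an unverified build the driver loses depth but does not dereference a pointer computed from a stale offset.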

3. PatchGuard Defines the Play Area

You cannot freely:

  • Inline hook memory manager routines
  • Patch SSDT entries
  • Modify critical kernel structures
  • Alter MSRs

Even where these modifications are technically possible, PatchGuard can detect them and bugcheck the system.

So your detection model must operate inside officially allowed extension points:

  • Thread notifications
  • Process notifications
  • Image load callbacks
  • Object callbacks
  • ETW

Everything else risks system termination.

Kernel development is not raw freedom. It is constrained extension.

4. Intent Detection Is Fragmented

My original model:

Allocation → Protection Change → Execution → Alert.

In practice:

  • Allocation has no direct callback.
  • Protection change has no direct callback.
  • Execution might not correlate cleanly to a new thread.
  • Existing threads can pivot execution.

The only reliable execution signal available is the thread-creation callback registered through PsSetCreateThreadNotifyRoutine.

That gives you a fragment — not the full story.

Intent must be reconstructed from incomplete signals.

The kernel does not provide a unified behavioral graph.
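
Reconstructing intent from fragments can be sketched as a small correlation table: regions that a scan saw turning executable are remembered, and each new thread's start address is checked against them. This is a portable illustration with hypothetical names (tracker_t, tracker_note_rx, tracker_thread_start_suspicious). It also assumes the driver obtains the start address by some separate means, because the thread notify routine itself receives only a process ID, a thread ID, and a create flag.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 64

typedef struct {
    uint32_t  pid;   /* owning process */
    uintptr_t base;  /* base of the region seen turning executable */
    size_t    size;  /* region size in bytes */
} exec_region_t;

typedef struct {
    exec_region_t items[MAX_TRACKED];
    size_t        count;
} tracker_t;

void tracker_init(tracker_t *t)
{
    t->count = 0;
}

/* Remember a region that a scan observed going RW -> RX.
   Returns 0 on success, -1 if the fixed-size table is full. */
int tracker_note_rx(tracker_t *t, uint32_t pid, uintptr_t base, size_t size)
{
    if (t->count >= MAX_TRACKED)
        return -1;
    t->items[t->count].pid  = pid;
    t->items[t->count].base = base;
    t->items[t->count].size = size;
    t->count++;
    return 0;
}

/* Intended to be called from thread-creation handling. Returns 1 when the
   start address lies inside a tracked region of the same process. */
int tracker_thread_start_suspicious(const tracker_t *t, uint32_t pid,
                                    uintptr_t start)
{
    for (size_t i = 0; i < t->count; i++) {
        const exec_region_t *r = &t->items[i];
        if (r->pid == pid && start >= r->base && start - r->base < r->size)
            return 1;
    }
    return 0;
}
```

The sketch only fires when execution begins with a new thread. An existing thread that pivots into the same region produces no callback and therefore no match, which is exactly the fragmentation described above.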

5. Performance Is a Hard Boundary

Walking the entire VAD tree of a single process is feasible.

Doing it globally and frequently is not.

Short-interval scanning (e.g., 10ms):

  • Increases CPU overhead
  • Risks race conditions
  • Raises instability risk

Long-interval scanning (e.g., 700ms):

  • Misses short-lived RW→RX windows
  • Reduces detection precision

Deep inspection competes directly with system stability.

Commercial products must prioritize not crashing customer machines.

That constraint shapes architecture more than detection ambition.
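
The interval trade-off can be put into rough numbers. Under the simplifying assumption that scans run at a fixed interval with a uniformly random phase relative to the attacker's activity, the chance that at least one scan lands inside an executable window of a given length is min(1, window / interval). The function names below are hypothetical, and the model ignores scan duration and scheduling jitter.

```c
/* Probability that at least one periodic scan lands inside a window that
   stays observable for `window_ms`, when scans run every `interval_ms`
   with uniformly random phase. Non-positive inputs yield 0. */
double catch_probability(double window_ms, double interval_ms)
{
    if (window_ms <= 0.0 || interval_ms <= 0.0)
        return 0.0;
    double p = window_ms / interval_ms;
    return p > 1.0 ? 1.0 : p;
}

/* Number of scans per second for one process at the given interval.
   A non-positive interval yields 0. */
double scans_per_second(double interval_ms)
{
    return interval_ms > 0.0 ? 1000.0 / interval_ms : 0.0;
}
```

Under this model a 70 ms window is caught about one time in ten at a 700 ms interval, while a 10 ms interval catches any window of 10 ms or longer at the cost of 100 scans per second for every monitored process.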

6. The Visibility Gap

The kernel internally knows:

  • Exact memory state transitions
  • Page table updates
  • Scheduler-level execution flow
  • Internal MM bookkeeping

Your driver does not automatically gain access to all of that.

You see what:

  • APIs expose
  • Callbacks provide
  • Structures allow you to safely inspect

There is a gap between kernel authority and driver visibility.

That gap cannot be eliminated without redesigning Windows itself.

7. Why Usermode Still Exists in EDR

Kernel answers:

  • What state changed?
  • What executed?
  • What memory became executable?

Usermode answers:

  • Which API path was used?
  • Was it reflective loading?
  • Was it shellcode staging?
  • Which module initiated it?

Kernel gives state truth.
Usermode gives behavioral narrative.

Without combining both, intent inference remains incomplete.

8. The Real Constraint

The limitation is not technical capability.

It is operating system architecture.

Windows was not designed to expose a complete memory intent event stream to third-party drivers.

As a result:

  • Memory manager operations are mostly silent.
  • Undocumented internals are unstable.
  • PatchGuard enforces strict boundaries.
  • Deep inspection harms performance.

These constraints are structural.

Final Conclusion

Kernel EDR is powerful.

But it is bounded by:

  • What the OS exposes.
  • What PatchGuard allows.
  • What can be monitored without destabilizing the system.
  • What can be maintained across Windows builds.

Being in ring 0 does not mean replicating the kernel’s internal awareness.

It means working within architectural limits defined by the operating system.

That realization is what separates theoretical detection models from production-grade engineering.
