
Iterating Past EDR: Reading the Threshold (Part 2)

Introduction

This is the second part of a two-part series on using an AI coding assistant as a reasoning partner in detection bypass development. Part 1 introduced the iterative bypass loop and walked through three categories of detection (static machine learning, memory scanning, and call stack analysis) that were resolved against Elastic Defend 9.3.1 using that workflow. At the end of Part 1, all but one of the forty initial alerts had been eliminated. The one that remained was Suspicious System Module Image Hollowing.

This part examines that final alert in detail, walks through the approaches that did not work, and validates the result against the EDR’s prevent-mode enforcement layer. The validation produces a finding about how deeply the bypass operates within the product’s detection stack.

Reading the Threshold

The remaining alert presented a different kind of problem. The rule fires when a process modifies the .text section of a system DLL that has been mapped as SEC_IMAGE, which is precisely what module stomping does. There is no architectural way to perform module stomping that does not satisfy the conceptual condition the rule is checking. Resolving this alert required examining the rule’s actual implementation rather than reasoning about its description.

The rule lives in the defense_evasion_suspicious_system_module_image_hollowing.toml file in Elastic’s public protections-artifacts repository. Its EQL query is reproduced below.

Figure: The Elastic behavioral rule for Suspicious System Module Image Hollowing. The clause that matters is the size threshold on process.Ext.api.parameters.size.

The query combines several conditions, but the relevant one for this analysis is process.Ext.api.parameters.size >= 10000. The parameters.size field corresponds to the dwSize argument of VirtualProtect. The rule fires only when a single VirtualProtect call covers ten thousand bytes or more. This threshold is deliberate: it filters out the small protection changes that Windows performs during normal DLL loading, which would otherwise produce continuous false positives. The threshold catches the typical case of a payload making one large VirtualProtect call to flip an entire .text section to executable.

The bypass follows directly from the threshold. Rather than performing one VirtualProtect call covering the entire stomped section, the loader chunks the operation into page-aligned calls of at most 4 KB each. Each call covers at most one page, so the dwSize it passes never exceeds 4096, which is comfortably below the 10000-byte threshold. The protection still gets applied across the full section, but no individual API call satisfies the rule’s size condition.

for (DWORD off = 0; off < size; off += 0x1000) {
    DWORD chunk = size - off;              /* bytes remaining in the section */
    if (chunk > 0x1000) chunk = 0x1000;    /* cap each call at one 4 KB page */
    VirtualProtect(base + off, chunk, new_prot, &old);
}
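The arithmetic behind the size condition can be checked in isolation. The sketch below models only the size clause quoted above (parameters.size >= 10000) and the per-page split. It makes no Windows API calls and does not model any other clause of the rule. The helper names and constants are hypothetical, introduced here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define RULE_SIZE_THRESHOLD 10000u  /* from the rule's size clause */
#define PAGE_SIZE_BYTES     0x1000u /* 4 KB page, as in the loop above */

/* Models only the size clause: parameters.size >= 10000. */
static bool size_clause_matches(size_t call_size) {
    return call_size >= RULE_SIZE_THRESHOLD;
}

/* Number of per-page calls needed to cover a region (rounds up). */
static size_t chunk_count(size_t region_size) {
    return (region_size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/* Counts how many per-page calls would satisfy the size clause. */
static size_t matching_calls_when_chunked(size_t region_size) {
    size_t hits = 0;
    for (size_t off = 0; off < region_size; off += PAGE_SIZE_BYTES) {
        size_t chunk = region_size - off;            /* bytes remaining */
        if (chunk > PAGE_SIZE_BYTES) chunk = PAGE_SIZE_BYTES;
        if (size_clause_matches(chunk)) hits++;
    }
    return hits;
}
```

For a 0x20000-byte region, size_clause_matches(0x20000) is true for the single large call, while the chunked form issues chunk_count(0x20000) = 32 calls and matching_calls_when_chunked(0x20000) is 0. The per-call sizes still sum to the full region size, which a per-call comparison never sees.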

This is where the AI-assisted workflow produced its clearest result. The rule’s TOML file was fed directly into the assistant’s conversation alongside the loader’s source tree. The assistant parsed the EQL query, identified parameters.size >= 10000 as the exploitable gate condition, mapped the parameters.size field to the dwSize argument of VirtualProtect, and then searched the entire codebase for every location where VirtualProtect was called on memory belonging to a stomped module. It returned three call sites in three separate files: the outer loader’s section permission setup in execute.c, the inner position-independent code loader’s prepare_for_execution routine in loader.c, and the sleep masking routine that toggles protections at every sleep cycle in mask.c. Each call site was flagged with the specific function name and line range where the chunking pattern needed to be applied.

This matters because implementing the pattern in a single location was insufficient. Missing any one of these three call sites resulted in the rule firing. The sleep masking path was particularly easy to overlook, because it is not part of the initial execution flow and only runs during the implant’s sleep cycle. A manual search would likely have found the two obvious locations, missed the third, and produced a result that looked like the fix was wrong rather than incomplete. The assistant’s ability to hold both the rule logic and the full codebase in context simultaneously is what made the difference between a partial fix and a complete one.

After applying the chunking pattern across all three locations and rebuilding, execution produced zero alerts.

Figure: The Kibana alerts table for the execution window after applying the chunked VirtualProtect pattern.

Approaches That Did Not Work

Not every hypothesis led to a reduction in alerts. Several approaches were tested and abandoned or deprioritized after producing no improvement or, in one case, making the situation worse. These failures were as instructive as the successful fixes.

Changing the memory protection of executable regions from PAGE_EXECUTE_READWRITE to PAGE_EXECUTE_READ was tested on the assumption that the image hollowing rule was protection-based. It was not. The rule detects that the in-memory .text section of a SEC_IMAGE-backed DLL has been modified relative to the file on disk, regardless of the page protection applied to the region. The change to RX was retained as a defense-in-depth measure, since RX is the normal protection for .text sections and avoids drawing attention from rules that specifically look for RWX transitions, but it did not resolve the image hollowing alert.

Stomping a non-System32 DLL was tested on the assumption that the “System Module” qualifier in the rule name implied the rule only applied to DLLs in C:\Windows\System32\. A .NET Framework DLL (clrjit.dll) was used as the stomp target. A different rule variant fired instead. Two Elastic rules cover image hollowing with overlapping scope: one checks call stack patterns, the other checks target path and content divergence. Switching the stomp target changed which rule fired, but did not eliminate the detection.

The most informative failure was the phantom PE allocator. This technique constructs a minimal valid PE with the payload as its .text section, writes it to a temporary file, maps it as SEC_IMAGE, and deletes the backing file. Because the mapped memory is an exact copy of the file that produced it, there is no divergence between memory and disk and therefore no modification to detect. In theory, the approach should eliminate the image hollowing detection entirely, and it did eliminate the detection on the outer stomp. But removing the outer module stomp exposed an inner stomp that it had been shielding, and the alert count rose from one to two instead of falling to zero. The outer stomp had been absorbing the detection, while the PICO sleep masking layer encrypted the inner stomp’s memory before the next scan cycle completed. Removing the outer layer removed the shield.

This interaction was not predicted in advance. The assistant was able to explain it because it had maintained a running model of which technique was defeating which detection across all prior iterations. When the alert count went up instead of down, it connected the new alerts to the inner stomp, identified the timing dependency on the PICO masking cycle, and traced the shielding relationship back to the outer stomp. Reaching that diagnosis manually would have required re-testing each layer in isolation, a process that could easily consume an entire session on its own.

Validation Against Prevent Mode

A clean alerts table is necessary but not sufficient evidence of a successful bypass. EDR products commonly support prevent-mode policies that allow the agent to take silent enforcement action—killing processes, blocking memory operations, quarantining files—without surfacing the action as an alert in the security analytics workflow. A loader that produces no alerts in detect mode could still be terminated, blocked, or otherwise neutralized in prevent mode.

The assistant raised this gap after the initial zero-alert result and proposed a validation methodology: switch the EDR to prevent mode, run a known-bad control test first to confirm the policy was active and enforcing, then re-run the loader and compare. It also identified the specific Elasticsearch indices that would need to be queried to check for silent enforcement events (logs-endpoint.alerts-* for file and memory preventions, .alerts-security.alerts-default for behavioral rule actions) and constructed the API queries to check both.

To execute this, the EDR’s policy was reconfigured to set memory_protection, malware, and behavior_protection to prevent mode for the test host via the Fleet API. Before testing the loader, the control test was performed. A current build of mimikatz was downloaded to the host. Elastic Defend quarantined the file at write time, producing a malicious_file event in the logs-endpoint.alerts-* index. This confirmed that prevent mode was active on the agent and that the EDR was willing to take enforcement action against threats it recognized.

Forty-two seconds after the mimikatz prevention event, on the same host under the same policy, the production-configured loader was executed. The implant callback established successfully. In-process commands executed normally. Querying both .alerts-security.alerts-default and logs-endpoint.alerts-* for the test window returned only the mimikatz prevention events from the control test. No alerts from the loader appeared in either index.

The assistant then suggested querying one layer deeper. Elastic Defend’s user-mode API monitor writes telemetry events to the logs-endpoint.events.api-* index when it observes API calls that match its behavior signatures. These events include behavior tags such as hollow_image that the behavioral rules consume to make detection decisions. If the API monitor was tagging the loader’s VirtualProtect calls but the prevent-mode policy was choosing not to act on them, the tags would still appear in this index. Querying it for the loader’s process ID returned zero events. Querying it for hollow_image events from any process returned only legitimate WriteProcessMemory events from lsass.exe performing standard Windows operations. Across 822 API events recorded in the index over a 24-hour window, no VirtualProtect calls were tagged at all, and none originated from the loader.
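The original queries are not reproduced in this post. As a rough sketch of the kind of request body involved, the function below builds an Elasticsearch bool filter that restricts results to one process ID and one time window, suitable for a search against logs-endpoint.events.api-*. The function name is hypothetical, and process.pid and @timestamp are assumed to be the relevant ECS field names for these events.

```c
#include <stdio.h>

/*
 * Writes a JSON search body into buf that filters on a process ID and a
 * timestamp range. Returns the snprintf result: the number of characters
 * the full body requires, so a value >= len means the output was truncated.
 * The timestamps are inserted verbatim and must be trusted ISO 8601 strings.
 */
static int build_api_event_query(char *buf, size_t len, long pid,
                                 const char *from_ts, const char *to_ts) {
    return snprintf(buf, len,
        "{\"query\":{\"bool\":{\"filter\":["
        "{\"term\":{\"process.pid\":%ld}},"
        "{\"range\":{\"@timestamp\":{\"gte\":\"%s\",\"lte\":\"%s\"}}}"
        "]}}}",
        pid, from_ts, to_ts);
}
```

An empty hit list for this body is what the text above describes as zero events for the loader’s process ID. A second query without the process.pid term would be needed for the any-process check.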

This result indicates that the chunked VirtualProtect bypass takes effect below the rule layer, not merely under the rule’s size threshold. The behavior tagger that produces hollow_image events runs on each VirtualProtect call, examines the operation’s characteristics, and determines whether the call is consistent with image hollowing. With each call modifying only a single 4 KB page, the tagger does not apply the hollow_image signature. No telemetry event is produced, so no rule has any input to evaluate, and no prevent-mode action can be taken, because prevent-mode action requires the API monitor to first identify an operation as suspicious. The bypass operates below the threshold at which the product evaluates calls as worth tagging in the first place.

This is a stronger result than simply suppressing alerts. Detect-mode bypass means the product saw the activity and chose not to alert. Prevent-mode bypass at the tagging layer means the product never recognized the activity as worth examining in the first place. The two outcomes are not equivalent, and the distinction matters for any team relying on this class of bypass against current Elastic Defend deployments.

What This Approach Reveals About Layered EDR

The most useful artifact of this work is not the specific bypass. It is what the work revealed about how layered detection products actually behave when they are tested in detail.

Detection rules contain operational details that are almost never visible from the outside. The parameters.size >= 10000 threshold is the clearest example. The rule’s name and description give no indication that a size gate exists, much less that the gate sits at exactly ten thousand bytes. A bypass developer working from rule names and product documentation alone would have no path to the chunked VirtualProtect solution, because the relevant detail does not exist outside the rule’s source code. For products that publish their detection logic, time spent reading the actual queries is consistently recovered in faster, more targeted bypass development. The published rule is the most valuable piece of intelligence available about the product.

Layered evasion architectures produce interactions that are difficult to predict and easy to misattribute. As the phantom PE experiment described above demonstrated, removing one evasion layer can expose another that was being shielded by it. Reasoning about layered configurations requires tracking not only which technique defeats which detection, but which techniques depend on others for cover.

The depth at which a bypass operates within the product’s detection stack matters as much as whether it works. A bypass that suppresses an alert at the rule layer is a different kind of result from a bypass that prevents a behavior tag from being applied at the API monitor layer. The former relies on the rule’s logic remaining unchanged. The latter is unaffected by rule updates that occur above the tagging layer. Understanding which depth a given bypass operates at requires querying the product’s underlying telemetry indices, not just its alert workflows. A clean alerts dashboard answers the wrong question. The right question is whether the product’s instrumentation noticed the activity at all.

These results were validated against Elastic Defend 9.3.1 specifically. Thresholds, tagging logic, and behavioral rule conditions can change between product versions. The specific bypass described here should be expected to have a shelf life, even if the methodology that produced it does not.

The analytical work in this kind of development is the bottleneck, not the implementation. Writing the chunked VirtualProtect loop took less than a minute. Identifying the threshold that made it necessary, locating every call site that needed to be modified, and confirming the bypass operated below the tagging layer rather than above the alert layer took most of the iteration cycles. Tools that accelerate the analytical step provide proportionally more leverage than tools that accelerate code production, because the analytical step is where the actual cost lives. This is the reason an AI coding assistant with simultaneous visibility into rule source and loader source compressed the bypass development cycle so dramatically. The constraint was never the code.

The work described in this post was conducted entirely against Elastic Defend. The methodology (reading published detection rules, identifying exploitable conditions, and iterating with AI-assisted analysis) is not product-specific and should transfer to any environment where detection logic is accessible. Whether the specific bypasses transfer is a separate question. Products like CrowdStrike and SentinelOne do not publish their behavioral rule logic, which limits the applicability of rule-level analysis. Products that do publish, including Microsoft Sentinel and Splunk ESCU, would be amenable to the same approach. Validating the methodology against additional products is an area for future work.