x86 MASM32 Console IO on 64-bit Windows NT

Download source: hello.asm

Starting assembly on a modern 64-bit version of Windows has some nuances that aren't readily apparent, especially to those just picking it up. Most code that you'll find online makes use of software interrupts: int 21h for DOS and int 80h for Linux. On Windows, int 21h is not available on 64-bit versions of Windows NT because NTVDM is not included. NTVDM (NT Virtual DOS Machine) is what lets 32-bit versions of Windows run DOS applications. DPMI (DOS Protected Mode Interface), which lets a DOS program running in 32-bit protected mode send calls to 16-bit real-mode DOS, lives in that same environment, so it doesn't exist on 64-bit Windows either. Instead, you have to go through the Windows API, which acts as the liaison between user mode (ring 3) and kernel mode (ring 0); otherwise you would need to use a DOS emulator.

Surprisingly, a 32-bit "hello world" that runs on 64-bit Windows is hard to find. I wrote up a solution which I think is fairly clean, and I'll walk through some of the decisions and references used. For this I'll be using Visual Studio 2017, which already ships with MASM and all the necessary libraries.

Line 1 - 3: Some simple setup; we're using 32-bit so the memory model must be flat and the calling/naming type is stdcall. More info here.
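
A minimal sketch of what those three setup lines might look like. The downloadable hello.asm is the authoritative version; the .386 target and the 4096-byte stack size are my assumptions.

```asm
.386                    ; assumption: enable the 32-bit 80386 instruction set
.model flat, stdcall    ; flat memory model; stdcall calling and naming convention
.stack 4096             ; assumption: reserve 4096 bytes of stack
```

Because stdcall is given on the .model line, every PROTO and PROC that follows uses it by default unless it names another convention.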

Line 4 and 5: EQU keyword denotes a constant value; the -11 isn't arbitrary and comes from the Microsoft Docs. In those same docs is the definition of GetStdHandle, which I created an equivalent prototype function for on Line 5.
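
As a rough sketch, those two lines could look like this. The parameter name nStdHandle is borrowed from the docs and is only a placeholder; the type is what the assembler checks.

```asm
STD_OUTPUT_HANDLE EQU -11               ; named constant; the value comes from the GetStdHandle docs
GetStdHandle PROTO, nStdHandle:DWORD    ; C: HANDLE WINAPI GetStdHandle(DWORD nStdHandle);
```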

Line 6 - 10: This is the function prototype for WriteConsoleA. In the C headers, WriteConsole is a macro that resolves to WriteConsoleA (the A means ANSI) or WriteConsoleW (wide characters); without those headers, the assembly has to name the exported function directly. You'll notice that, unlike GetStdHandle, whose parameter types match my prototype, this one's do not. This is because not all Windows Data Types are available (at least on the MASM assembler I'm using through VS2017). So instead I use general types that I know are accepted and are equivalent.
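
Here is one way the type substitution might be written out, with the documented C type next to each stand-in. The parameter names (hConsoleOutput, pText and so on) are placeholders of mine, and the sizes assume a 32-bit build, where a HANDLE and every pointer are 4 bytes wide.

```asm
WriteConsoleA PROTO,
    hConsoleOutput:DWORD,       ; HANDLE       hConsoleOutput         -> DWORD
    pText:PTR BYTE,             ; const VOID*  lpBuffer               -> PTR BYTE
    nChars:DWORD,               ; DWORD        nNumberOfCharsToWrite  -> DWORD (same type)
    pCharsWritten:PTR DWORD,    ; LPDWORD      lpNumberOfCharsWritten -> PTR DWORD
    pReserved:DWORD             ; LPVOID       lpReserved             -> DWORD, always passed as 0
```

Each substitute has the same 4-byte size as the documented type, so the stack layout the function sees is unchanged.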

Line 11: Self explanatory at this point, here's the doc.

Line 13 - 16: The .data section setup. charsWritten is here because WriteConsole has an output (not the return) from the lpNumberOfCharsWritten parameter. The question mark (?) reserves the DD (Double Word) without giving it an initial value. lpBuffer is defined as "Hello World" followed by 13 and 10, which are the ASCII codes for carriage return and line feed ("\r\n"). Finally, nNumberOfCharsToWrite is 13, a static number obtained by counting the 11 characters in "Hello World" plus 2 for the 13 and 10. This extra field is not really necessary, because later we could just push the hardcoded number 13 onto the stack.
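
A sketch of a .data section matching that description. The names come from the text above; the exact layout in hello.asm may differ.

```asm
.data
charsWritten          DD ?                      ; reserved, no initial value; WriteConsoleA fills it in
lpBuffer              DB "Hello World", 13, 10  ; 11 characters + CR (13) + LF (10)
nNumberOfCharsToWrite DD 13                     ; 11 + 2 = 13 characters to write
```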

Line 20 - 21: This is where we start to actually call some functions. The parameter "-11" (STD_OUTPUT_HANDLE) is pushed onto the stack for our function GetStdHandle. The result of this function is put into EAX, because that's the return register.

Line 23 - 28: Remember that we're dealing with a stack, so we push the arguments in reverse order: the last parameter goes on first. You can think of the "offset" keyword as the address-of operator from C/C++ (the & sign). It gives WriteConsole a pointer back to charsWritten so it can be updated, which is exactly what the LPDWORD in the docs asks for. When EAX gets pushed, it's the sort-of-not-sneaky way of using the return value from earlier after I called GetStdHandle to pass the HANDLE.
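
Put together, the two calls might look like the sketch below. It assumes the constant, prototypes and data names described earlier in this post, and the closing ExitProcess call is my assumption about how the program ends. The line numbers will not necessarily match hello.asm.

```asm
.code
main PROC
    ; GetStdHandle(STD_OUTPUT_HANDLE)
    push STD_OUTPUT_HANDLE          ; the only argument, -11
    call GetStdHandle               ; the console handle comes back in EAX

    ; WriteConsoleA(handle, lpBuffer, 13, &charsWritten, NULL)
    push 0                          ; 5th parameter: lpReserved, pushed first
    push offset charsWritten        ; 4th parameter: address that receives the count
    push nNumberOfCharsToWrite      ; 3rd parameter: the value 13
    push offset lpBuffer            ; 2nd parameter: address of the text
    push eax                        ; 1st parameter: handle returned by GetStdHandle
    call WriteConsoleA

    ; assumption: leave the process with exit code 0
    push 0
    call ExitProcess
main ENDP
END main
```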

Line 30 - 31: Self explanatory

Now in this example I manually pushed the parameters onto the stack and then used a CALL. Alternatively I could have used INVOKE and simplified my main into 3 lines of code, doing the exact same thing. Note that INVOKE needs to know the procedure's parameters, so it only works with a prototype (PROTO) or with a PROC defined earlier in the file; otherwise you must use CALL.
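
For comparison, here is a sketch of the whole program written with INVOKE. It is a reconstruction from the description in this post rather than a copy of hello.asm: the .386 target, the stack size, the PROTO parameter names and the ExitProcess call as the third line are all my assumptions.

```asm
.386
.model flat, stdcall
.stack 4096

STD_OUTPUT_HANDLE EQU -11

GetStdHandle  PROTO, nStdHandle:DWORD
WriteConsoleA PROTO,
    hConsoleOutput:DWORD,
    pText:PTR BYTE,
    nChars:DWORD,
    pCharsWritten:PTR DWORD,
    pReserved:DWORD
ExitProcess   PROTO, dwExitCode:DWORD

.data
charsWritten          DD ?
lpBuffer              DB "Hello World", 13, 10
nNumberOfCharsToWrite DD 13

.code
main PROC
    ; INVOKE pushes the arguments right to left and emits the CALL for us
    INVOKE GetStdHandle, STD_OUTPUT_HANDLE
    ; EAX still holds the handle; ADDR plays the role OFFSET played in the PUSH version
    INVOKE WriteConsoleA, eax, ADDR lpBuffer, nNumberOfCharsToWrite, ADDR charsWritten, 0
    INVOKE ExitProcess, 0
main ENDP
END main
```

Outside the IDE, something like `ml /c /coff hello.asm` followed by `link /SUBSYSTEM:CONSOLE hello.obj kernel32.lib` from an x86 developer command prompt should build it.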

Something really important to note is that I pushed onto the stack, and I never popped anything back off of it. Normally if you push, you must also pop or else you will load up the stack until you get a stack overflow. In this specific case I was safe because the functions I was calling were WINAPI, and WINAPI is a macro that evaluates to __stdcall. With stdcall, the callee is responsible for cleaning the stack, so this was done automatically for us. Below is a demonstration of that.
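
To see the rule from the other side, here is a small sketch that contrasts stdcall with the C convention, where the caller has to clean up. AddStd and AddC are hypothetical helpers invented for this illustration; they are not part of hello.asm.

```asm
.386
.model flat, stdcall
.stack 4096

ExitProcess PROTO, dwExitCode:DWORD

.code

; Hypothetical helper using the default stdcall convention.
; MASM turns the RET below into "ret 8", which removes both arguments.
AddStd PROC, lhs:DWORD, rhs:DWORD
    mov eax, lhs
    add eax, rhs
    ret
AddStd ENDP

; The same helper declared with the C convention.
; Here RET stays a plain "ret", so the arguments are left on the stack.
AddC PROC C, lhs:DWORD, rhs:DWORD
    mov eax, lhs
    add eax, rhs
    ret
AddC ENDP

main PROC
    push 2
    push 1
    call AddStd         ; callee cleaned up: ESP is back where it started

    push 2
    push 1
    call AddC           ; arguments are still on the stack after the return
    add esp, 8          ; so the caller removes them: 2 arguments x 4 bytes

    push 0
    call ExitProcess
main ENDP
END main
```

Leaving out the `add esp, 8` after AddC would leak 8 bytes of stack per call, which is exactly the buildup described above.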

I started debugging my WriteConsole call at the push of the third parameter; note that I also added a nop after WriteConsole so we can inspect the registers once it returns. At the first step, the stack pointer (ESP) is at 0x4FFE54. Two arguments had already been pushed by then, which means the stack pointer was at 0x4FFE54 + 0x8 = 0x4FFE5C before the first argument was pushed. With each step you can see the stack pointer decreasing by 4 bytes (a DWORD). When we finally reach the nop, after WriteConsole was called, we can see that ESP is pointing back to 0x4FFE5C.
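
The ESP values in that session can be reconstructed by hand. In the fragment below, each comment shows ESP after the instruction on that line has executed, starting from the 0x4FFE5C quoted above. The instruction sequence is the WriteConsoleA call as described earlier in this post, and the intermediate addresses are derived by subtracting 4 per push rather than copied from the debugger.

```asm
                                ; ESP = 0x4FFE5C  before any argument is pushed
    push 0                      ; ESP = 0x4FFE58
    push offset charsWritten    ; ESP = 0x4FFE54  <- where the debugging session starts
    push nNumberOfCharsToWrite  ; ESP = 0x4FFE50
    push offset lpBuffer        ; ESP = 0x4FFE4C
    push eax                    ; ESP = 0x4FFE48  five arguments = 0x14 bytes
    call WriteConsoleA          ; CALL pushes the return address (ESP = 0x4FFE44 inside the callee);
                                ; the callee's "ret 14h" pops it and then drops the 0x14 bytes
    nop                         ; ESP = 0x4FFE5C  same value as before the first push
```

The five pushes take ESP down by 5 x 4 = 0x14 bytes, and the stdcall return gives the same 0x14 bytes back, which is why nothing needs to be popped in main.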