6.28.5 AMD64 (x86_64) Assembler

The AMD64 assembler is a slightly modified version of the 386 assembler and therefore shares most of its syntax. Two new prefixes, .q and .qa, select 64-bit operand and address sizes, respectively. Because 64-bit sizes are the default, you normally only need the other prefixes. There are also additional register operands R8-R15.

The register names lack the 'e' or 'r' prefix; even in 64-bit mode, rax is called ax. Additional register operands refer to the least significant byte of all registers: R8L-R15L, SPL, BPL, SIL, DIL.

The Linux-AMD64 calling convention passes the first 6 integer parameters in rdi, rsi, rdx, rcx, r8 and r9 and returns the result in rax and rdx; it passes the first 8 FP parameters in xmm0-xmm7 and returns FP results in xmm0-xmm1. So abi-code words get the data stack pointer in di and the address of the FP stack pointer in si, and return the data stack pointer in ax. The other caller-saved registers are r10, r11 and xmm8-xmm15. This calling convention is reportedly also used by other non-Microsoft OSs.

Windows x64 passes the first four integer parameters in rcx, rdx, r8 and r9 and returns the integer result in rax. The other caller-saved registers are r10 and r11.

On Linux, according to https://uclibc.org/docs/psABI-x86_64.pdf (page 21), the registers AX, CX, DX, SI, DI, R8, R9, R10 and R11 are available as scratch registers.
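
As an illustration of these conventions, here is a minimal sketch of an abi-code word for Linux-AMD64. It is not part of Gforth; the name aSWAP is made up for this example. It uses only the scratch registers CX and DX, takes the data stack pointer in di and returns it in ax:

     abi-code aSWAP  ( n1 n2 -- n2 n1 )
         di )        dx  mov     \ DX := n2 (top of stack)
         8 di d)     cx  mov     \ CX := n1 (second stack item)
         cx      di )    mov     \ top of stack := n1
         dx  8 di d)     mov     \ second stack item := n2
         di          ax  mov     \ SPout := SPin (stack depth unchanged)
         ret
     end-code

Assuming the word assembles as written, 3 4 aSWAP should leave 4 3 on the data stack.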

The addressing modes for the AMD64 are:

     \ running word A produces a memory error as the registers are not initialised ;-)
     ABI-CODE A  ( -- )
         500        #               AX  MOV     \ immediate
             DX              AX  MOV     \ register
             200             AX  MOV     \ direct addressing
             DX  )           AX  MOV     \ indirect addressing
         40  DX  D)          AX  MOV     \ base with displacement
             DX  CX      I)  AX  MOV     \ base with index
             DX  CX  *4  I)  AX  MOV     \ base with scaled index
         40  DX  CX  *4  DI) AX  MOV     \ scaled index with displacement
     
             DI              AX  MOV     \ SPout := SPin
                                 RET
     END-CODE
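
Unlike word A, the following sketch can actually be run, because it takes the address it dereferences from the data stack. It is an assumed example (the names a@ and w are hypothetical) that uses indirect addressing to behave like @:

     abi-code a@  ( addr -- x )
         di )        dx  mov     \ DX := addr (top of stack)
         dx )        dx  mov     \ DX := cell stored at addr (indirect addressing)
         dx      di )    mov     \ top of stack := x
         di          ax  mov     \ SPout := SPin
         ret
     end-code

     variable w  7 w !
     w a@ .      \ should print 7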

Here are a few examples of AMD64 abi-code words:

     abi-code my+  ( n1 n2 -- n3 )
     \ SP passed in di, returned in ax,  address of FP passed in si
     8 di d) ax lea        \ compute new sp in result reg
     di )    dx mov        \ get old tos
     dx    ax ) add        \ add to new tos
     ret
     end-code
     \ Do nothing
     ABI-CODE aNOP  ( -- )
            DI  )       AX      LEA          \ SP out := SP in
                                RET
     END-CODE
     \ Drop TOS
     ABI-CODE aDROP  ( n -- )
        8   DI  D)      AX      LEA          \ SPout := SPin - 1
                                RET
     END-CODE
     \ Push 5 on the data stack
     ABI-CODE aFIVE   ( -- 5 )
        -8  DI  D)      AX      LEA          \ SPout := SPin + 1
        5   #           AX  )   MOV          \ TOS := 5
                                RET
     END-CODE
     \ Push 10 and 20 onto the data stack
     ABI-CODE aTOS2  ( -- n n )
        -16 DI  D)      AX      LEA          \ SPout := SPin + 2
        10  #       8   AX  D)  MOV          \ TOS - 1 := 10
        20  #           AX  )   MOV          \ TOS := 20
                                RET
     END-CODE
     \ Get Time Stamp Counter as two 32-bit integers
     \ The TSC is incremented every CPU clock pulse
     ABI-CODE aRDTSC   ( -- TSCl TSCh )
                                RDTSC        \ DX:AX := TSC
        $FFFFFFFF #     AX      AND          \ Clear upper 32 bits of AX
        $FFFFFFFF #     DX      AND          \ Clear upper 32 bits of DX
            AX          R8      MOV          \ Temporarily save AX
        -16 DI  D)      AX      LEA          \ SPout := SPin + 2
            R8      8   AX  D)  MOV          \ TOS-1 := saved AX = TSC low
            DX          AX  )   MOV          \ TOS := DX = TSC high
                                RET
     END-CODE
     \ Get Time Stamp Counter as a 64-bit integer
     ABI-CODE RDTSC   ( -- TSC )
                                RDTSC        \ DX:AX := TSC
        $FFFFFFFF #     AX      AND          \ Clear upper 32 bits of AX
        32  #           DX      SHL          \ Move lower 32 bits of DX to upper 32 bits
            AX          DX      OR           \ Combine AX with DX in DX
        -8  DI  D)      AX      LEA          \ SPout := SPin + 1
            DX          AX  )   MOV          \ TOS := DX
                                RET
     END-CODE
     VARIABLE V
     
     \ Assign 4 to variable V
     ABI-CODE V=4 ( -- )
            BX                  PUSH         \ Save BX, used by gforth
        V   #           BX      MOV          \ BX := address of V
        4   #           BX )    MOV          \ Write 4 to V
            BX                  POP          \ Restore BX
            DI  )       AX      LEA          \ SPout := SPin
                                RET
     END-CODE
     VARIABLE V
     
     \ Assign 5 to variable V
     ABI-CODE V=5 ( -- )
        V   #           CX      MOV          \ CX := address of V
        5   #           CX )    MOV          \ Write 5 to V
        DI )            AX      LEA          \ SPout := SPin
                                RET
     END-CODE
     ABI-CODE TEST2  ( -- n n )
        -16 DI  D)  AX          LEA          \ SPout := SPin + 2
        5   #       CX          MOV          \ CX := 5
        5   #       CX          CMP
        0= IF
            1   #   8   AX  D)      MOV      \ If CX = 5 then TOS - 1 := 1  <--
        ELSE
            2   #   8   AX  D)      MOV      \ else TOS - 1 := 2
        THEN
        6   #       CX          CMP
        0= IF
            3   #       AX  )       MOV      \ If CX = 6 then TOS := 3
        ELSE
            4   #       AX  )       MOV      \ else TOS := 4  <--
        THEN
                                RET
     END-CODE
     \ Do four loops. Expect: ( -- 4 3 2 1 )
     ABI-CODE LOOP4  ( -- n n n n )
            DI          AX      MOV          \ SPout := SPin
        4   #           DX      MOV          \ DX := 4  loop counter
        BEGIN
            8   #           AX      SUB      \ SP := SP + 1
                DX          AX  )   MOV      \ TOS := DX
            1   #           DX      SUB      \ DX := DX - 1
        0= UNTIL
                                RET
     END-CODE
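
The structured control-flow words can also be combined with arguments taken from the data stack. The following is a sketch only (the name aSUM is made up here, and it assumes u is at least 1, because the counter is tested only after the first decrement); it adds the numbers from u down to 1:

     abi-code aSUM  ( u -- sum )
         di )        cx  mov     \ CX := u, the loop counter
         0 #         dx  mov     \ DX := 0, the running sum
         begin
             cx      dx  add     \ DX := DX + CX
             1 #     cx  sub     \ CX := CX - 1, sets the zero flag at 0
         0= until
         dx      di )    mov     \ replace u with the sum
         di          ax  mov     \ SPout := SPin
         ret
     end-code

With these assumptions, 4 aSUM should leave 10 on the data stack.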

Here's an AMD64 example that deals with FP values:

     abi-code my-f+  ( r1 r2 -- r )
     \ SP passed in di, returned in ax,  address of FP passed in si
     si )       dx mov         \ load fp
     8 dx d)  xmm0 movsd       \ load r1 (second FP stack item)
     dx )     xmm0 addsd       \ add r2 (FP top of stack): r1+r2
     xmm0  8 dx d) movsd       \ store r
     8 #      si ) add         \ update fp
     di         ax mov         \ sp into return reg
     ret
     end-code
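
A word that leaves the FP stack depth unchanged does not need to update the FP stack pointer. As a further sketch along the same lines (the name my-f2* is hypothetical), this word doubles the top FP stack item in place:

     abi-code my-f2*  ( r -- r*2 )
     \ SP passed in di, returned in ax,  address of FP passed in si
     si )       dx mov         \ load fp
     dx )     xmm0 movsd       \ xmm0 := r (FP top of stack)
     dx )     xmm0 addsd       \ xmm0 := r+r
     xmm0     dx ) movsd       \ store the result over r
     di         ax mov         \ sp into return reg
     ret
     end-code

Assuming it assembles as written, 1.5e my-f2* f. should print 3.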