Chapter 13
Inline assembler code

The built-in assembler allows you to write Intel assembler code within Object Pascal programs. It implements a large subset of the syntax supported by Turbo Assembler and Microsoft's Macro Assembler, including all 8086/8087 and 80386/80387 opcodes and all but a few of Turbo Assembler's expression operators. Moreover, the built-in assembler allows you to use Object Pascal identifiers in assembler statements.

Except for DB, DW, and DD (define byte, word, and double word), none of Turbo Assembler's directives (such as EQU, PROC, STRUC, SEGMENT, and MACRO) are supported by the built-in assembler. Operations implemented through Turbo Assembler directives, however, are largely matched by corresponding Object Pascal constructions. For example, most EQU directives correspond to constant, variable, and type declarations; the PROC directive corresponds to procedure and function declarations; and the STRUC directive corresponds to record types.

As an alternative to the built-in assembler, you can link to .OBJ files that contain external procedures and functions. See "Linking to .OBJ files" for more information.

The asm statement

The built-in assembler is accessed through asm statements, which have the form

asm statementList end

where statementList is a sequence of assembler statements separated by semicolons, end-of-line characters, or Object Pascal comments.

Comments in an asm statement must be in Object Pascal style. A semicolon does not indicate that the rest of the line is a comment.
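
As a minimal sketch of these rules, assuming a 32-bit x86 target, the hypothetical function below (AddThree is an illustrative name only) separates its statements with a semicolon, an end of line, and a comment in turn.

```pascal
{ Hypothetical example; AddThree is not part of any library. }
function AddThree(Value: Integer): Integer; stdcall;
begin
  asm
    MOV   EAX,Value; ADD EAX,1     { a semicolon separates two statements }
    ADD   EAX,1                    { the end of the line ends this statement }
    ADD   EAX,1 {a comment separates statements too} MOV @Result,EAX
  end;
end;
```

Note that the semicolon after the first MOV does not start a comment, as it would in Turbo Assembler; the ADD that follows it is assembled.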

The reserved word inline and the directive assembler are maintained for backward compatibility only. They have no effect on the compiler.

Register use

In general, the rules of register use in an asm statement are the same as those of an external procedure or function. An asm statement must preserve the EDI, ESI, ESP, EBP, and EBX registers, but can freely modify the EAX, ECX, and EDX registers. On entry to an asm statement, EBP points to the current stack frame, ESP points to the top of the stack, SS contains the segment address of the stack segment, and DS contains the segment address of the data segment. Except for EDI, ESI, ESP, EBP, and EBX, an asm statement can assume nothing about register contents on entry to the statement.
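
The following sketch shows one way to respect these rules when a scratch register beyond EAX, ECX, and EDX is needed. The function name and its parameters are hypothetical, and a 32-bit x86 target is assumed. EBX is saved on the stack before it is modified and restored before the asm statement ends.

```pascal
{ Hypothetical example: adds three integers, using EBX as scratch space. }
function AddViaEBX(A, B, C: Integer): Integer; stdcall;
begin
  asm
    PUSH  EBX            { EBX must be preserved, so save it first }
    MOV   EAX,A
    MOV   EBX,B
    ADD   EAX,EBX
    ADD   EAX,C
    POP   EBX            { restore EBX before leaving the asm statement }
    MOV   @Result,EAX    { EAX may be modified freely }
  end;
end;
```

Every PUSH is matched by a POP, so ESP has its original value when the statement ends.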

Assembler statement syntax

The syntax of an assembler statement is

Label: Prefix Opcode Operand1, Operand2

where Label is a label, Prefix is an assembler prefix opcode (operation code), Opcode is an assembler instruction opcode or directive, and Operand is an assembler expression. Label and Prefix are optional. Some opcodes take only one operand, and some take none.

Comments are allowed between assembler statements, but not within them. For example,

MOV AX,1 {Initial value}   { OK }
MOV CX,100 {Count}     { OK }

MOV {Initial value} AX,1;  { Error! }
MOV CX, {Count} 100    { Error! }

Labels

Labels are used in built-in assembler statements as they are in Object Pascal--by writing the label and a colon before a statement. There is no limit to a label's length, but only the first 32 characters are significant. As in Object Pascal, labels must be declared in a label declaration part in the block containing the asm statement. There is one exception to this rule: local labels.

Local labels are labels that start with an at-sign (@). They consist of an at-sign followed by one or more letters, digits, underscores, or at-signs. Use of local labels is restricted to asm statements, and the scope of a local label extends from the asm reserved word to the end of the asm statement that contains it. A local label doesn't have to be declared.
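
As an illustration, the sketch below uses two local labels to build a loop without any label declaration part. CountBits is a hypothetical name, and a 32-bit x86 target is assumed.

```pascal
{ Hypothetical example: counts the bits that are set in Value. }
function CountBits(Value: Cardinal): Integer; stdcall;
begin
  asm
    MOV   EDX,Value
    XOR   EAX,EAX        { running count }
  @Next:                 { local label: no declaration needed }
    TEST  EDX,EDX
    JZ    @Done          { no bits left to examine }
    MOV   ECX,EDX
    AND   ECX,1          { isolate the lowest bit }
    ADD   EAX,ECX
    SHR   EDX,1
    JMP   @Next
  @Done:
    MOV   @Result,EAX
  end;
end;
```

Because @Next and @Done are local, the same names can be reused in another asm statement without conflict.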

Instruction opcodes

The built-in assembler supports the following opcodes.

LOCK     REP      REPE     REPZ     REPNE    REPNZ    SEGES    SEGCS
SEGSS    SEGDS    SEGFS    SEGGS

ADC      ADD      AND      AAA      AAS      AAD      AAM      BOUND
BSF      BSR      BT       BTC      BTR      BTS      CALL     CMP
CBW      CWDE     CWD      CDQ      CLC      CLD      CLI      CMC
CMPSB    CMPSW    CMPSD    DAA      DAS      DEC      DIV      ENTER
HLT      IDIV     IMUL     IN       INC      INSB     INSW     INSD
INT      INTO     IRET     IRETD    JMP      JO       JNO      JC
JB       JNAE     JNC      JAE      JNB      JE       JZ       JNE
JNZ      JBE      JNA      JA       JNBE     JS       JNS      JP
JPE      JNP      JPO      JL       JNGE     JGE      JNL      JLE
JNG      JG       JNLE     JCXZ     JECXZ    LAHF     LEA      LEAVE
LDS      LES      LFS      LGS      LSS      LODSB    LODSW    LODSD
LOOP     LOOPE    LOOPZ    LOOPNE   LOOPNZ   LOOPD    LOOPDE   LOOPDZ
LOOPDNE  LOOPDNZ  MOV      MOVSX    MOVZX    MOVSB    MOVSW    MOVSD
MUL      NEG      NOP      NOT      OR       OUT      OUTSB    OUTSW
OUTSD    POP      POPF     POPA     POPAD    POPFD    PUSH     PUSHF
PUSHA    PUSHAD   PUSHFD   RET      RETN     RETF     SUB      SBB
RCL      RCR      ROL      ROR      SAL      SHL      SAR      SHR
SHLD     SHRD     SAHF     SCASB    SCASW    SCASD    STC      STD
STI      STOSB    STOSW    STOSD    TEST     WAIT     XCHG     XLAT
XOR      SETA     SETAE    SETB     SETBE    SETC     SETE     SETG
SETGE    SETL     SETLE    SETNA    SETNAE   SETNB    SETNBE   SETNC
SETNE    SETNG    SETNGE   SETNL    SETNLE   SETNO    SETNP    SETNS
SETNZ    SETO     SETP     SETPE    SETPO    SETS     SETZ     ARPL
LAR      CLTS     LGDT     SGDT     LIDT     SIDT     LLDT     SLDT
LMSW     SMSW     LSL      LTR      STR      VERR     VERW     BSWAP
XADD     CMPXCHG  INVD     WBINVD   INVLPG

FLD      FILD     FST      FSTP     FIST     FISTP    FADD     FADDP
FIADD    FSUB     FSUBP    FSUBR    FSUBRP   FISUB    FISUBR   FMUL
FMULP    FIMUL    FDIV     FDIVP    FDIVR    FDIVRP   FIDIV    FIDIVR
FCOM     FCOMP    FCOMPP   FICOM    FICOMP   F2XM1    FABS     FBLD
FBSTP    FCHS     FDECSTP  FFREE    FINCSTP  FLD1     FLDCW    FLDENV
FLDL2E   FLDL2T   FLDLG2   FLDLN2   FLDPI    FLDZ     FNOP     FPREM
FPATAN   FPTAN    FRNDINT  FRSTOR   FSCALE   FSETPM   FSQRT    FTST
FWAIT    FXAM     FXCH     FXTRACT  FYL2X    FYL2XP1  FCLEX    FNCLEX
FDISI    FNDISI   FENI     FNENI    FINIT    FNINIT   FSAVE    FNSAVE
FSTCW    FNSTCW   FSTENV   FNSTENV  FSTSW    FNSTSW   FUCOM    FUCOMP
FUCOMPP  FPREM1   FCOS     FSIN     FSINCOS

For a complete description of each instruction, refer to your microprocessor documentation.

RET instruction sizing

The RET instruction opcode always generates a near return.

Automatic jump sizing

Unless otherwise directed, the built-in assembler optimizes jump instructions by automatically selecting the shortest, and therefore most efficient, form of a jump instruction. This automatic jump sizing applies to the unconditional jump instruction (JMP), and to all conditional jump instructions when the target is a label (not a procedure or function).

For an unconditional jump instruction (JMP), the built-in assembler generates a short jump (one-byte opcode followed by a one-byte displacement) if the distance to the target label is -128 to 127 bytes. Otherwise it generates a near jump (one-byte opcode followed by a two-byte displacement).

For a conditional jump instruction, a short jump (one-byte opcode followed by a one-byte displacement) is generated if the distance to the target label is -128 to 127 bytes. Otherwise, the built-in assembler generates a short jump with the inverse condition, which jumps over a near jump to the target label (five bytes in total). For example, the assembler statement

JC  Stop

where Stop isn't within reach of a short jump, is converted to a machine code sequence that corresponds to this:

JNC  Skip
JMP  Stop
Skip:

Jumps to the entry points of procedures and functions are always near.

Assembler directives

The built-in assembler supports three assembler directives: DB (define byte), DW (define word), and DD (define double word). Each generates data corresponding to the comma-separated operands that follow the directive.

The DB directive generates a sequence of bytes. Each operand can be a constant expression with a value between -128 and 255, or a character string of any length. Constant expressions generate one byte of code, and strings generate a sequence of bytes with values corresponding to the ASCII code of each character.

The DW directive generates a sequence of words. Each operand can be a constant expression with a value between -32,768 and 65,535, or an address expression. For an address expression, the built-in assembler generates a near pointer--that is, a word that contains the offset part of the address.

The DD directive generates a sequence of double words. Each operand can be a constant expression with a value between -2,147,483,648 and 4,294,967,295, or an address expression. For an address expression, the built-in assembler generates a far pointer--that is, a word that contains the offset part of the address, followed by a word that contains the segment part of the address.

The data generated by the DB, DW, and DD directives is always stored in the code segment, just like the code generated by other built-in assembler statements. To generate uninitialized or initialized data in the data segment, you should use Object Pascal var or const declarations.

Some examples of DB, DW, and DD directives follow.

asm
  DB      0FFH                         { One byte }
  DB      0,99                         { Two bytes }
  DB      'A'                          { Ord('A') }
  DB      'Hello world...',0DH,0AH     { String followed by CR/LF }
  DB      6,"Delphi"                   { Object Pascal style string }
  DW      0FFFFH                       { One word }
  DW      0,9999                       { Two words }
  DW      'A'                          { Same as DB 'A',0 }
  DW      'BA'                         { Same as DB 'A','B' }
  DW      MyVar                        { Offset of MyVar }
  DW      MyProc                       { Offset of MyProc }
  DD      0FFFFFFFFH                   { One double-word }
  DD      0,999999999                  { Two double-words }
  DD      'A'                          { Same as DB 'A',0,0,0 }
  DD      'DCBA'                       { Same as DB 'A','B','C','D' }
  DD      MyVar                        { Pointer to MyVar }
  DD      MyProc                       { Pointer to MyProc }
end;

In Turbo Assembler, when an identifier precedes a DB, DW, or DD directive, it causes the declaration of a byte-, word-, or double-word-sized variable at the location of the directive. For example, Turbo Assembler allows the following:

ByteVar  DB  ?
WordVar  DW  ?
IntVar   DD  ?
...
     MOV   AL,ByteVar
     MOV   BX,WordVar
     MOV   ECX,IntVar

The built-in assembler doesn't support such variable declarations. The only kind of symbol that can be defined in an inline assembler statement is a label. All variables must be declared using Object Pascal syntax; the preceding construction can be replaced by

var
  ByteVar: Byte;
  WordVar: Word;
  IntVar: Integer;
...
asm
  MOV   AL,ByteVar
  MOV   BX,WordVar
  MOV   ECX,IntVar
end;

Operands

Built-in assembler operands are expressions that consist of constants, registers, symbols, and operators.

Within operands, the following reserved words have predefined meanings:

Table 13.1   Built-in assembler reserved words 

AH     BX     DI     EBX    ESP    OFFSET  SP
AL     BYTE   DL     ECX    FS     OR      SS
AND    CH     DS     EDI    GS     PTR     ST
AX     CL     DWORD  EDX    HIGH   QWORD   TBYTE
BH     CS     DX     EIP    LOW    SHL     TYPE
BL     CX     EAX    ES     MOD    SHR     WORD
BP     DH     EBP    ESI    NOT    SI      XOR

Reserved words always take precedence over user-defined identifiers. For example,

var
  Ch: Char;
...
asm
  MOV   CH, 1
end;

loads 1 into the CH register, not into the Ch variable. To access a user-defined symbol with the same name as a reserved word, you must use the ampersand (&) override operator:

MOV   &Ch, 1

It is best to avoid user-defined identifiers with the same names as built-in assembler reserved words.

Expressions

The built-in assembler evaluates all expressions as 32-bit integer values. It doesn't support floating-point and string values, except string constants.

Expressions are built from expression elements and operators, and each expression has an associated expression class and expression type.

Differences between Object Pascal and assembler expressions

The most important difference between Object Pascal expressions and built-in assembler expressions is that assembler expressions must resolve to a constant value--a value that can be computed at compile time. For example, given the declarations

const
  X = 10;
  Y = 20;
var
  Z: Integer;

the following is a valid statement.

asm
  MOV   Z,X+Y
end;

Because both X and Y are constants, the expression X + Y is a convenient way of writing the constant 30, and the resulting instruction simply moves the value 30 into the variable Z. But if X and Y are variables--

var
  X, Y: Integer;

--the built-in assembler cannot compute the value of X + Y at compile time. In this case, to move the sum of X and Y into Z you would use

asm
  MOV   EAX,X
  ADD   EAX,Y
  MOV   Z,EAX
end;

In an Object Pascal expression, a variable reference denotes the contents of the variable. But in an assembler expression, a variable reference denotes the address of the variable. In Object Pascal the expression X + 4 (where X is a variable) means the contents of X plus 4, while to the built-in assembler it means the contents of the word at the address four bytes higher than the address of X. So, even though you're allowed to write

asm
  MOV   EAX,X+4
end;

this code doesn't load the value of X plus 4 into EAX; instead, it loads the 32-bit value stored four bytes beyond X. The correct way to add 4 to the contents of X is

asm
  MOV   EAX,X
  ADD   EAX,4
end;

Expression elements

The elements of an expression are constants, registers, and symbols.

Constants

The built-in assembler supports two types of constant: numeric constants and string constants.

Numeric constants

Numeric constants must be integers, and their values must be between -2,147,483,648 and 4,294,967,295.

By default, numeric constants use decimal notation, but the built-in assembler also supports binary, octal, and hexadecimal. Binary notation is selected by writing a B after the number, octal notation by writing an O after the number, and hexadecimal notation by writing an H after the number or a $ before the number.

Numeric constants must start with one of the digits 0 through 9 or the $ character. When you write a hexadecimal constant using the H suffix, an extra zero is required in front of the number if the first significant digit is one of the digits A through F. For example, 0BAD4H and $BAD4 are hexadecimal constants, but BAD4H is an identifier because it starts with a letter.
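
The notations can be compared side by side. In the sketch below (the procedure name is hypothetical, and a 32-bit x86 target is assumed), every instruction loads the same value, 255, into EAX.

```pascal
{ Hypothetical example: five spellings of the constant 255. }
procedure LoadSameValue;
asm
  MOV   EAX,255          { decimal }
  MOV   EAX,11111111B    { binary, B suffix }
  MOV   EAX,377O         { octal, O suffix }
  MOV   EAX,0FFH         { hexadecimal, H suffix; leading 0 because F is a letter }
  MOV   EAX,$FF          { hexadecimal, $ prefix }
end;
```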

String constants

String constants must be enclosed in single or double quotation marks. Two consecutive quotation marks of the same type as the enclosing quotation marks count as only one character. Here are some examples of string constants:

'Z'
'Delphi'
"That's all folks"
'"That''s all folks," he said.'
'100'
'"'
"'"

String constants of any length are allowed in DB directives, and cause allocation of a sequence of bytes containing the ASCII values of the characters in the string. In all other cases, a string constant can be no longer than four characters and denotes a numeric value which can participate in an expression. The numeric value of a string constant is calculated as

Ord(Ch1) + Ord(Ch2) shl 8 + Ord(Ch3) shl 16 + Ord(Ch4) shl 24

where Ch1 is the rightmost (last) character and Ch4 is the leftmost (first) character. If the string is shorter than four characters, the leftmost characters are assumed to be zero. The following table shows string constants and their numeric values.

Table 13.2   String examples and their values 

String       Value
'a'          00000061H
'ba'         00006261H
'cba'        00636261H
'dcba'       64636261H
'a '         00006120H
'   a'       20202061H
'a' * 2      000000C2H
'a'-'A'      00000020H
not 'a'      FFFFFF9EH
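
The calculation can be reproduced in ordinary Object Pascal. The helper below is a hypothetical illustration and not part of the built-in assembler; it assumes a string of at most four characters whose character codes all fit in one byte.

```pascal
{ Hypothetical helper: numeric value of a string constant of up to four
  single-byte characters, following the formula given above. }
function StrConstValue(const S: string): Cardinal;
var
  I: Integer;
begin
  Result := 0;
  { Each later character pushes the earlier ones one byte to the left,
    so the last character ends up in the lowest byte. }
  for I := 1 to Length(S) do
    Result := (Result shl 8) or Ord(S[I]);
end;
```

For example, StrConstValue('ba') evaluates to $6261 and StrConstValue('a ') to $6120, which agrees with the corresponding rows of the table.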

Registers

The following reserved symbols denote CPU registers:

Table 13.3   CPU registers

32-bit general purpose        EAX  EBX  ECX  EDX
32-bit pointer or index       ESP  EBP  ESI  EDI
16-bit general purpose        AX  BX  CX  DX
16-bit pointer or index       SP  BP  SI  DI
8-bit low registers           AL  BL  CL  DL
8-bit high registers          AH  BH  CH  DH
16-bit segment registers      CS  DS  SS  ES  FS  GS
Coprocessor register stack    ST

When an operand consists solely of a register name, it is called a register operand. All registers can be used as register operands, and some registers can be used in other contexts.

The base registers (BX and BP) and the index registers (SI and DI) can be written within square brackets to indicate indexing. Valid base/index register combinations are [BX], [BP], [SI], [DI], [BX+SI], [BX+DI], [BP+SI], and [BP+DI]. You can also index with all the 32-bit registers--for example, [EAX+ECX], [ESP], and [ESP+EAX+5].
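
As a sketch of 32-bit indexing, the hypothetical function below reads one byte from a buffer by combining a base register and an index register inside square brackets. The name and parameters are illustrative only, and a 32-bit x86 target is assumed.

```pascal
{ Hypothetical example: returns the byte at offset Index from P. }
function ByteAt(P: Pointer; Index: Integer): Integer; stdcall;
begin
  asm
    MOV   EDX,P                      { base address }
    MOV   ECX,Index                  { byte offset }
    MOVZX EAX,BYTE PTR [EDX+ECX]     { base plus index, zero-extended }
    MOV   @Result,EAX
  end;
end;
```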

The segment registers (ES, CS, SS, DS, FS, and GS) are supported, but segments are normally not useful in 32-bit applications.

The symbol ST denotes the topmost register on the 8087 floating-point register stack. Each of the eight floating-point registers can be referred to using ST(X), where X is a constant between 0 and 7 indicating the distance from the top of the register stack.
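
The sketch below shows ST and ST(1) in use. The function name is hypothetical, and a 32-bit x86 target is assumed. Two values are pushed onto the register stack, added, and the sum is popped into the function result.

```pascal
{ Hypothetical example: adds two Double values on the FPU register stack. }
function AddDoubles(A, B: Double): Double; stdcall;
begin
  asm
    FLD   A              { ST = A }
    FLD   B              { ST = B, ST(1) = A }
    FADDP ST(1),ST       { ST(1) = A + B, then pop: ST = A + B }
    FSTP  @Result        { store the sum and pop, leaving the stack balanced }
  end;
end;
```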

Symbols

The built-in assembler allows you to access almost all Object Pascal identifiers in assembler expressions, including constants, types, variables, procedures, and functions. In addition, the built-in assembler implements the special symbol @Result, which corresponds to the Result variable within the body of a function. For example, the function

function Sum(X, Y: Integer): Integer;
begin
  Result := X + Y;
end;

could be written in assembler as

function Sum(X, Y: Integer): Integer; stdcall;
begin
  asm
    MOV   EAX,X
    ADD   EAX,Y
    MOV   @Result,EAX
  end;
end;

The following symbols cannot be used in asm statements:

The following table summarizes the kinds of symbol that can be used in asm statements.

Table 13.4   Symbols recognized by the built-in assembler 

Symbol       Value                     Class               Type
Label        Address of label          Memory reference    SHORT
Constant     Value of constant         Immediate value     0
Type         0                         Memory reference    Size of type
Field        Offset of field           Memory              Size of type
Variable     Address of variable       Memory reference    Size of type
Procedure    Address of procedure      Memory reference    NEAR
Function     Address of function       Memory reference    NEAR
Unit         0                         Immediate value     0
@Code        Code segment address      Memory reference    0FFF0H
@Data        Data segment address      Memory reference    0FFF0H
@Result      Result variable offset    Memory reference    Size of type

With optimizations disabled, local variables (variables declared in procedures and functions) are always allocated on the stack and accessed relative to EBP, and the value of a local variable symbol is its signed offset from EBP. The assembler automatically adds [EBP] in references to local variables. For example, given the declaration

var Count: Integer;

within a function or procedure, the instruction

MOV   EAX,Count

assembles into MOV EAX,[EBP-4].

The built-in assembler treats var parameters as 32-bit pointers, and the size of a var parameter is always 4. The syntax for accessing a var parameter is different from that for accessing a value parameter. To access the contents of a var parameter, you must first load the 32-bit pointer and then access the location it points to. For example,

function Sum(var X, Y: Integer): Integer; stdcall;
begin
  asm
    MOV   EAX,X
    MOV   EAX,[EAX]
    MOV   EDX,Y
    ADD   EAX,[EDX]
    MOV   @Result,EAX
  end;
end;

Identifiers can be qualified within asm statements. For example, given the declarations

type
  TPoint = record
    X, Y: Integer;
  end;
  TRect = record
    A, B: TPoint;
  end;
var
  P: TPoint;
  R: TRect;

the following constructions can be used in an asm statement to access fields.

MOV   EAX,P.X
MOV   EDX,P.Y
MOV   ECX,R.A.X
MOV   EBX,R.B.Y

A type identifier can be used to construct variables on the fly. Each of the following instructions generates the same machine code, which loads the B.X field of the TRect record addressed by EDX into EAX.

MOV   EAX,(TRect PTR [EDX]).B.X
MOV   EAX,TRect([EDX]).B.X
MOV   EAX,TRect[EDX].B.X
MOV   EAX,[EDX].TRect.B.X

Expression classes

The built-in assembler divides expressions into three classes: registers, memory references, and immediate values.

An expression that consists solely of a register name is a register expression. Examples of register expressions are AX, CL, DI, and ES. Used as operands, register expressions direct the assembler to generate instructions that operate on the CPU registers.

Expressions that denote memory locations are memory references. Object Pascal's labels, variables, typed constants, procedures, and functions belong to this category.

Expressions that aren't registers and aren't associated with memory locations are immediate values. This group includes Object Pascal's untyped constants and type identifiers.

Immediate values and memory references cause different code to be generated when used as operands. For example,

const
  Start = 10;
var
  Count: Integer;
...
asm
  MOV   EAX,Start            { MOV EAX,xxxx }
  MOV   EBX,Count            { MOV EBX,[xxxx] }
  MOV   ECX,[Start]          { MOV ECX,[xxxx] }
  MOV   EDX,OFFSET Count     { MOV EDX,xxxx }
end;

Because Start is an immediate value, the first MOV is assembled into a move immediate instruction. The second MOV, however, is translated into a move memory instruction, as Count is a memory reference. In the third MOV, the brackets convert Start into a memory reference (in this case, the word at offset 10 in the data segment). In the fourth MOV, the OFFSET operator converts Count into an immediate value (the offset of Count in the data segment).

The brackets and OFFSET operator complement each other. The following asm statement produces identical machine code to the first two lines of the previous asm statement.

asm
  MOV   EAX,OFFSET [Start]
  MOV   EBX,[OFFSET Count]
end;

Memory references and immediate values are further classified as either relocatable or absolute. Relocation is the process by which the linker assigns absolute addresses to symbols. A relocatable expression denotes a value that requires relocation at link time, while an absolute expression denotes a value that requires no such relocation. Typically, expressions that refer to labels, variables, procedures, or functions are relocatable, since the final address of these symbols is unknown at compile time. Expressions that operate solely on constants are absolute.

The built-in assembler allows you to carry out any operation on an absolute value, but it restricts operations on relocatable values to addition and subtraction of constants.
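
A short sketch of this restriction follows. The global array and the function name are hypothetical, and a 32-bit x86 target is assumed. Adding a constant to a relocatable address is accepted, while an operation such as multiplication is not.

```pascal
var
  Table: array[0..3] of Integer;     { hypothetical global array }

{ Hypothetical example: reads Table[1] by adding a constant to a
  relocatable address. }
function SecondEntry: Integer;
asm
  MOV   EAX,DWORD PTR Table+4        { address of Table plus 4 bytes }
  { An operand such as Table*2 would be rejected, because only addition
    and subtraction of constants are allowed on relocatable values. }
end;
```

The DWORD PTR typecast is included as a precaution so that the operand size matches EAX regardless of how the size of the array type is checked.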

Expression types

Every built-in assembler expression has a type--or, more correctly, a size, because the assembler regards the type of an expression simply as the size of its memory location. For example, the type of an Integer variable is four, because it occupies 4 bytes. The built-in assembler performs type checking whenever possible, so in the instructions

var
  QuitFlag: Boolean;
  OutBufPtr: Word;
...
asm
  MOV   AL,QuitFlag
  MOV   BX,OutBufPtr
end;

the assembler checks that the size of QuitFlag is one (a byte), and that the size of OutBufPtr is two (a word). The instruction

MOV   DL,OutBufPtr

produces an error because DL is a byte-sized register and OutBufPtr is a word. The type of a memory reference can be changed through a typecast; these are correct ways of writing the previous instruction:

MOV   DL,BYTE PTR OutBufPtr
MOV   DL,Byte(OutBufPtr)
MOV   DL,OutBufPtr.Byte

These MOV instructions all refer to the first (least significant) byte of the OutBufPtr variable.

In some cases, a memory reference is untyped. One example is an immediate value enclosed in square brackets:

MOV   AL,[100H]
MOV   BX,[100H]

The built-in assembler permits both of these instructions, because the expression [100H] has no type--it just means "the contents of address 100H in the data segment," and the type can be determined from the first operand (byte for AL, word for BX). In cases where the type can't be determined from another operand, the built-in assembler requires an explicit typecast:

INC   BYTE PTR [100H]
IMUL  WORD PTR [100H]

The following table summarizes the predefined type symbols that the built-in assembler provides in addition to any currently declared Object Pascal types.

Table 13.5   Predefined type symbols 

Symbol    Type
BYTE      1
WORD      2
DWORD     4
QWORD     8
TBYTE     10
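
As an illustration of the larger type symbols, the sketch below uses QWORD PTR and TBYTE PTR to give sizes to untyped register-indirect operands. The procedure name and parameters are hypothetical, and a 32-bit x86 target is assumed. Src is taken to point to an 8-byte Double and Dst to a 10-byte Extended.

```pascal
{ Hypothetical example: copies a Double at Src^ to an Extended at Dst^. }
procedure DoubleToExtended(Src, Dst: Pointer); stdcall;
begin
  asm
    MOV   EDX,Src
    MOV   ECX,Dst
    FLD   QWORD PTR [EDX]      { load an 8-byte operand }
    FSTP  TBYTE PTR [ECX]      { store a 10-byte operand and pop }
  end;
end;
```

Without the typecasts the assembler could not tell how many bytes [EDX] and [ECX] refer to, since neither instruction has a second operand that fixes the size.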

Expression operators

The built-in assembler provides a variety of operators. Precedence rules are different from Object Pascal; for example, in an asm statement, AND has lower precedence than the addition and subtraction operators. The following table lists the built-in assembler's expression operators in decreasing order of precedence.

Table 13.6   Precedence of built-in assembler expression operators 

Operators                                            Remarks             Precedence
&                                                                        highest
( ), [ ], ., HIGH, LOW
+, -                                                 (unary + and -)
:
OFFSET, SEG, TYPE, PTR, *, /, MOD, SHL, SHR, +, -    (binary + and -)
NOT, AND, OR, XOR                                                        lowest
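
The difference in precedence can be seen by writing the same expression in both languages. The two function names below are hypothetical, and a 32-bit x86 target is assumed. The grouping shown for the assembler version follows from the table above.

```pascal
{ Hypothetical example: in an asm statement, AND binds less tightly
  than +, so the operand is grouped as 2 AND (1 + 1), which is 2. }
function AsmPrecedence: Integer;
asm
  MOV   EAX,2 AND 1 + 1
end;

{ In Object Pascal, and binds more tightly than +, so the expression is
  grouped as (2 and 1) + 1, which is 1. }
function PascalPrecedence: Integer;
begin
  Result := 2 and 1 + 1;
end;
```

Parentheses remove the ambiguity in either language and are worth adding whenever the two kinds of operator are mixed.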

The following table defines the built-in assembler's expression operators.

Table 13.7   Definitions of built-in assembler expression operators 

Operator

Description

&

Identifier override. The identifier immediately following the ampersand is treated as a user-defined symbol, even if the spelling is the same as a built-in assembler reserved symbol.

(...)

Subexpression. Expressions within parentheses are evaluated completely prior to being treated as a single expression element. Another expression can precede the expression within the parentheses; the result in this case is the sum of the values of the two expressions, with the type of the first expression.

[...]

Memory reference. The expression within brackets is evaluated completely prior to being treated as a single expression element. The expression within brackets can be combined with the BX, BP, SI, or DI registers using the plus (+) operator, to indicate CPU register indexing. Another expression can precede the expression within the brackets; the result in this case is the sum of the values of the two expressions, with the type of the first expression. The result is always a memory reference.

.

Structure member selector. The result is the sum of the expression before the period and the expression after the period, with the type of the expression after the period. Symbols belonging to the scope identified by the expression before the period can be accessed in the expression after the period.

HIGH

Returns the high-order 8 bits of the word-sized expression following the operator. The expression must be an absolute immediate value.

LOW

Returns the low-order 8 bits of the word-sized expression following the operator. The expression must be an absolute immediate value.

+

Unary plus. Returns the expression following the plus with no changes. The expression must be an absolute immediate value.

-

Unary minus. Returns the negated value of the expression following the minus. The expression must be an absolute immediate value.

+

Addition. The expressions can be immediate values or memory references, but only one of the expressions can be a relocatable value. If one of the expressions is a relocatable value, the result is also a relocatable value. If either of the expressions is a memory reference, the result is also a memory reference.

-

Subtraction. The first expression can have any class, but the second expression must be an absolute immediate value. The result has the same class as the first expression.

:

Segment override. Instructs the assembler that the expression after the colon belongs to the segment given by the segment register name (CS, DS, SS, FS, GS, or ES) before the colon. The result is a memory reference with the value of the expression after the colon. When a segment override is used in an instruction operand, the instruction is prefixed with an appropriate segment-override prefix instruction to ensure that the indicated segment is selected.

OFFSET

Returns the offset part (double word) of the expression following the operator. The result is an immediate value.

TYPE

Returns the type (size in bytes) of the expression following the operator. The type of an immediate value is 0.

PTR

Typecast operator. The result is a memory reference with the value of the expression following the operator and the type of the expression in front of the operator.

*

Multiplication. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

/

Integer division. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

MOD

Remainder after integer division. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

SHL

Logical shift left. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

SHR

Logical shift right. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

NOT

Bitwise negation. The expression must be an absolute immediate value, and the result is an absolute immediate value.

AND

Bitwise AND. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

OR

Bitwise OR. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

XOR

Bitwise exclusive OR. Both expressions must be absolute immediate values, and the result is an absolute immediate value.

Assembler procedures and functions

You can write complete procedures and functions using inline assembler code, without including a begin...end statement. For example,

function LongMul(X, Y: Integer): Longint;
asm
  MOV   EAX,X
  IMUL  Y
end;

The compiler performs several optimizations on these routines:

Assembler functions return their results as follows.