Chapter 11
Memory management

This chapter explains how programs use memory and describes the internal formats of Object Pascal data types.

Delphi's memory manager

The memory manager manages all dynamic memory allocations and deallocations in a Delphi application. The New, Dispose, GetMem, ReallocMem, and FreeMem standard procedures use the memory manager, and all objects and long strings are allocated through the memory manager.

Delphi's memory manager is optimized for applications that allocate large numbers of small- to medium-sized blocks, as is typical for object-oriented applications and applications that process string data. Other memory managers, such as the implementations of GlobalAlloc, LocalAlloc, and private heap support in Windows, typically do not perform well in such situations, and would slow down an application if they were used directly.

To ensure the best performance, the memory manager interfaces directly with the Win32 virtual memory API (the VirtualAlloc and VirtualFree functions). The memory manager reserves memory from the operating system in 1-MB sections of address space, and commits memory as required in 16-KB increments. It decommits and releases unused memory in 16-KB and 1-MB sections. For smaller blocks, committed memory is further suballocated.

Memory manager blocks are always rounded upward to a 4-byte boundary, and always include a 4-byte header in which the size of the block and other status bits are stored. This means that memory manager blocks are always double-word-aligned, which guarantees optimal CPU performance when addressing the block.

The memory manager maintains two status variables, AllocMemCount and AllocMemSize, which contain the number of currently allocated memory blocks and the combined size of all currently allocated memory blocks. Applications can use these variables to display status information for debugging.
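As a minimal sketch of the debugging use described above, the following console program prints both status variables around a single allocation. It assumes the compiler version this chapter describes, where AllocMemCount and AllocMemSize are exposed by the System unit; the ShowStatus helper is a hypothetical name introduced for illustration.

```pascal
program AllocStatus;
{$APPTYPE CONSOLE}

// Hypothetical helper: prints the two status variables kept by the memory manager.
procedure ShowStatus(const Caption: string);
begin
  Writeln(Caption, ': ', AllocMemCount, ' blocks, ', AllocMemSize, ' bytes');
end;

var
  P: Pointer;
begin
  ShowStatus('Before GetMem');
  GetMem(P, 100);              // one more block is now allocated
  ShowStatus('After GetMem ');
  FreeMem(P);                  // the block is returned to the memory manager
  ShowStatus('After FreeMem');
end.
```

The absolute numbers depend on what the run-time library has already allocated, so only the difference between successive lines is meaningful.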

The System unit provides two procedures, GetMemoryManager and SetMemoryManager, that allow applications to intercept low-level memory manager calls. The System unit also provides a function called GetHeapStatus that returns a record containing detailed memory-manager status information. For further information about these routines, see the online Help.
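The interception mechanism can be sketched as follows. This is an illustration only: it assumes the TMemoryManager record of the compiler version described here, whose GetMem field has the signature function(Size: Integer): Pointer (other versions may declare the record differently). CountingGetMem and GetMemCount are hypothetical names.

```pascal
program CountAllocs;
{$APPTYPE CONSOLE}

var
  OldMemMgr: TMemoryManager;   // the manager that was installed before ours
  GetMemCount: Integer;        // number of intercepted GetMem calls

// Replacement GetMem: count the call, then delegate to the original manager.
// Assumption: the field signature is function(Size: Integer): Pointer.
function CountingGetMem(Size: Integer): Pointer;
begin
  Inc(GetMemCount);
  Result := OldMemMgr.GetMem(Size);
end;

var
  NewMemMgr: TMemoryManager;
  P: Pointer;
begin
  GetMemoryManager(OldMemMgr);
  NewMemMgr := OldMemMgr;               // keep FreeMem and ReallocMem unchanged
  NewMemMgr.GetMem := CountingGetMem;   // intercept allocations only
  SetMemoryManager(NewMemMgr);
  try
    GetMem(P, 32);
    FreeMem(P);
  finally
    SetMemoryManager(OldMemMgr);        // always restore the original manager
  end;
  Writeln('GetMem calls intercepted: ', GetMemCount);
end.
```

Copying the old record and overriding a single field keeps the other entry points intact, and restoring the original manager in the finally block ensures that blocks are freed by the same manager that allocated them.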

Variables

Global variables are allocated in the application data segment and persist for the duration of the program. Local variables (declared within procedures and functions) reside on an application's stack. Each time a procedure or function is called, it allocates a set of local variables; on exit, the local variables are disposed of. Compiler optimization may eliminate variables earlier.

An application's stack is defined by two values: the minimum stack size and the maximum stack size. The values are controlled through the $MINSTACKSIZE and $MAXSTACKSIZE compiler directives, and default to 16,384 (16K) and 1,048,576 (1M) respectively. An application is guaranteed to have the minimum stack size available, and an application's stack is never allowed to grow larger than the maximum stack size. If there is not enough memory available to satisfy an application's minimum stack requirement, Windows will report an error upon attempting to start the application.

If an application requires more stack space than specified by the minimum stack size, additional memory is automatically allocated in 4K increments. If allocation of additional stack space fails, either because more memory is not available or because the total size of the stack would exceed the maximum stack size, an EStackOverflow exception is raised. (Stack overflow checking is completely automatic. The $S compiler directive, which originally controlled overflow checking, is maintained for backward compatibility.)

Dynamic variables created with the GetMem or New procedure are heap-allocated and persist until they are deallocated with FreeMem or Dispose.

Long strings, wide strings, dynamic arrays, variants, and interfaces are heap-allocated, but their memory is managed automatically.
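The difference between explicitly and automatically managed heap memory might look like this in practice. The sketch below is illustrative; TPair, PPair, and the variable names are hypothetical.

```pascal
program HeapVars;
{$APPTYPE CONSOLE}

type
  PPair = ^TPair;
  TPair = record
    First, Second: Integer;
  end;

var
  Pair: PPair;
  Buf: PChar;
  S: string;
begin
  New(Pair);                  // typed allocation; the size comes from TPair
  try
    Pair^.First := 1;
    Pair^.Second := 2;
    Writeln(Pair^.First + Pair^.Second);
  finally
    Dispose(Pair);            // must be released explicitly
  end;

  GetMem(Buf, 16);            // untyped allocation of 16 bytes
  try
    Buf[0] := 'A';
    Buf[1] := #0;
    Writeln(Buf);
  finally
    FreeMem(Buf);             // must be released explicitly
  end;

  S := 'managed';             // long string: no explicit deallocation needed
  S := S + ' automatically';
  Writeln(S);
end.
```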

Internal data formats

The following sections describe the internal formats of Object Pascal data types.

Integer types

The format of an integer-type variable depends on its minimum and maximum bounds.

Character types

A Char, an AnsiChar, or a subrange of a Char type is stored as an unsigned byte. A WideChar is stored as an unsigned word.

Boolean types

A Boolean type is stored as a Byte, a ByteBool is stored as a Byte, a WordBool type is stored as a Word, and a LongBool is stored as a Longint.

A Boolean can assume the values 0 (False) and 1 (True). ByteBool, WordBool, and LongBool types can assume the values 0 (False) or nonzero (True).
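A short program can confirm these sizes and values on a given compiler. This is a sketch; the variable name is illustrative.

```pascal
program BoolSizes;
{$APPTYPE CONSOLE}

var
  L: LongBool;
begin
  Writeln('Boolean : ', SizeOf(Boolean));
  Writeln('ByteBool: ', SizeOf(ByteBool));
  Writeln('WordBool: ', SizeOf(WordBool));
  Writeln('LongBool: ', SizeOf(LongBool));

  Writeln(Ord(False), ' ', Ord(True));   // the two values a Boolean can assume

  L := LongBool(2);                      // any nonzero value counts as True
  if L then
    Writeln('LongBool(2) tests as True');
end.
```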

Enumerated types

An enumerated type is stored as an unsigned byte if the enumeration has no more than 256 values and the type was declared in the {$Z1} state (the default). If an enumerated type has more than 256 values, or if the type was declared in the {$Z2} state, it is stored as an unsigned word. If an enumerated type is declared in the {$Z4} state, it is stored as an unsigned double-word.
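The effect of the {$Z} state at the point of declaration can be checked with SizeOf. The three enumerations below are hypothetical and differ only in the directive that is active when each is declared.

```pascal
program EnumSizes;
{$APPTYPE CONSOLE}

{$Z1}
type
  TByteEnum = (beRed, beGreen, beBlue);   // declared in the {$Z1} state
{$Z2}
type
  TWordEnum = (weRed, weGreen, weBlue);   // declared in the {$Z2} state
{$Z4}
type
  TLongEnum = (leRed, leGreen, leBlue);   // declared in the {$Z4} state
{$Z1}

begin
  Writeln(SizeOf(TByteEnum));   // 1
  Writeln(SizeOf(TWordEnum));   // 2
  Writeln(SizeOf(TLongEnum));   // 4
end.
```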

Real types

The real types store the binary representation of a sign (+ or -), an exponent, and a significand. A real value has the form

+/- significand * 2^exponent

where the significand has a single bit to the left of the binary point. (That is, 0 <= significand < 2.)

In the figures that follow, the most significant bit is always on the left and the least significant bit on the right. The number in parentheses gives the width (in bits) of each field, with the leftmost items stored at the highest addresses. For example, for a Real48 value, e is stored in the first byte, f in the following five bytes, and s in the most significant bit of the last byte.

The Real48 type

A 6-byte (48-bit) Real48 number is divided into three fields:

[ s (1 bit) ] [ f (39 bits) ] [ e (8 bits) ]

If 0 < e <= 255, the value v of the number is given by

   v = (-1)^s * 2^(e-129) * (1.f)

If e = 0, then v = 0.

The Real48 type can't store denormals, NaNs, or infinities. Denormals become zero when stored in a Real48, while NaNs and infinities produce an overflow error if an attempt is made to store them in a Real48.

The Single type

A 4-byte (32-bit) Single number is divided into three fields:

[ s (1 bit) ] [ e (8 bits) ] [ f (23 bits) ]

The value v of the number is given by

   if 0 < e < 255, then v = (-1)^s * 2^(e-127) * (1.f)
   if e = 0 and f <> 0, then v = (-1)^s * 2^(-126) * (0.f)
   if e = 0 and f = 0, then v = (-1)^s * 0
   if e = 255 and f = 0, then v = (-1)^s * Inf
   if e = 255 and f <> 0, then v is a NaN
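To make the Single layout concrete, the sketch below extracts the three fields from the raw 32-bit image of a value. SplitSingle is a hypothetical helper, and the sample value -6.5 is chosen because it equals -1.625 * 2^2, so the normalized case of the formula above applies with e - 127 = 2.

```pascal
program SingleFields;
{$APPTYPE CONSOLE}

// Hypothetical helper: splits the 32-bit image of a Single into s, e and f.
procedure SplitSingle(X: Single; var S, E, F: Cardinal);
var
  Bits: Cardinal;
begin
  Move(X, Bits, SizeOf(Bits));   // copy the raw bytes; no numeric conversion
  S := Bits shr 31;              // 1-bit sign
  E := (Bits shr 23) and $FF;    // 8-bit exponent
  F := Bits and $7FFFFF;         // 23-bit fraction
end;

var
  S, E, F: Cardinal;
begin
  SplitSingle(-6.5, S, E, F);
  // -6.5 = (-1)^1 * 2^(129-127) * 1.625, and 0.625 * 2^23 = 5242880
  Writeln('s = ', S, ', e = ', E, ', f = ', F);   // s = 1, e = 129, f = 5242880
end.
```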

The Double type

An 8-byte (64-bit) Double number is divided into three fields:

[ s (1 bit) ] [ e (11 bits) ] [ f (52 bits) ]

The value v of the number is given by

   if 0 < e < 2047, then v = (-1)^s * 2^(e-1023) * (1.f)
   if e = 0 and f <> 0, then v = (-1)^s * 2^(-1022) * (0.f)
   if e = 0 and f = 0, then v = (-1)^s * 0
   if e = 2047 and f = 0, then v = (-1)^s * Inf
   if e = 2047 and f <> 0, then v is a NaN

The Extended type

A 10-byte (80-bit) Extended number is divided into four fields:

[ s (1 bit) ] [ e (15 bits) ] [ i (1 bit) ] [ f (63 bits) ]

The value v of the number is given by

   if 0 <= e < 32767, then v = (-1)^s * 2^(e-16383) * (i.f)
   if e = 32767 and f = 0, then v = (-1)^s * Inf
   if e = 32767 and f <> 0, then v is a NaN

The Comp type

An 8-byte (64-bit) Comp number is stored as a signed 64-bit integer.

The Currency type

An 8-byte (64-bit) Currency number is stored as a scaled and signed 64-bit integer with the four least-significant digits implicitly representing four decimal places.
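The scaling can be observed by copying the raw bytes of a Currency value into an Int64. This is a small sketch with illustrative variable names.

```pascal
program CurrencyBits;
{$APPTYPE CONSOLE}

var
  C: Currency;
  Raw: Int64;
begin
  C := 12.3456;
  Move(C, Raw, SizeOf(Raw));   // reinterpret the 8 bytes as an integer
  Writeln(Raw);                // 123456, that is, 12.3456 * 10000
end.
```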

Pointer types

A Pointer type is stored in 4 bytes as a 32-bit address. The pointer value nil is stored as zero.

Short string types

A short string variable occupies as many bytes as its maximum length plus one. The first byte contains the current dynamic length of the string, and the following bytes contain the characters of the string.

The length byte and the characters are considered unsigned values. Maximum string length is 255 characters plus a length byte (string[255]).
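As a brief illustration, the following sketch declares a short string with a maximum length of 10 and inspects its size and length byte. TName is a hypothetical type name.

```pascal
program ShortStr;
{$APPTYPE CONSOLE}

type
  TName = string[10];   // maximum length 10, so 11 bytes of storage

var
  N: TName;
begin
  N := 'Pascal';
  Writeln(SizeOf(N));     // 11: maximum length plus the length byte
  Writeln(Ord(N[0]));     // 6: the length byte holds the current length
  Writeln(Length(N));     // 6: the same value through the standard function
end.
```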

Long string types

A long string variable occupies four bytes of memory which contain a pointer to a dynamically allocated string. When a long string variable is empty (contains a zero-length string), the string pointer is nil and no dynamic memory is associated with the string variable. For a nonempty string value, the string pointer points to a dynamically allocated block of memory that contains the string value in addition to a 32-bit length indicator and a 32-bit reference count. The table below shows the layout of a long-string memory block.

Table 11.1   Long string dynamic memory layout

Offset           Contents
-8               32-bit reference-count
-4               32-bit length indicator
0..Length - 1    character string
Length           NULL character

The NULL character at the end of a long string memory block is automatically maintained by the compiler and the built-in string handling routines. This makes it possible to typecast a long string directly to a null-terminated string.

For string constants and literals, the compiler generates a memory block with the same layout as a dynamically allocated string, but with a reference count of -1. When a long string variable is assigned a string constant, the string pointer is assigned the address of the memory block generated for the string constant. The built-in string handling routines know not to attempt to modify blocks that have a reference count of -1.
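The header fields described in this section can be read directly, although production code should rely on Length and the built-in routines instead. The sketch below assumes the 32-bit layout given in this section (reference count at offset -8, length at offset -4); HeaderField and PHeaderField are hypothetical names.

```pascal
program LongStrHeader;
{$APPTYPE CONSOLE}

type
  PHeaderField = ^Longint;

// Hypothetical helper: reads a 32-bit header field at a negative byte offset
// from the string data. Returns 0 for an empty string (nil pointer).
function HeaderField(const S: AnsiString; Offset: Integer): Longint;
begin
  if Pointer(S) = nil then
    Result := 0
  else
    Result := PHeaderField(PAnsiChar(Pointer(S)) + Offset)^;
end;

var
  A, B: AnsiString;
begin
  A := StringOfChar('x', 5);   // built at run time, so it lives on the heap
  B := A;                      // B shares the block; only the count changes
  Writeln('Length field   : ', HeaderField(A, -4));
  Writeln('Reference count: ', HeaderField(A, -8));
  Writeln('As PChar       : ', PAnsiChar(A));   // relies on the trailing NULL
end.
```

Under the layout described above, the length field would be expected to read 5 and the reference count 2, since A and B refer to the same block.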

Wide string types

A wide string variable occupies four bytes of memory which contain a pointer to a dynamically allocated string. When a wide string variable is empty (contains a zero-length string), the string pointer is nil and no dynamic memory is associated with the string variable. For a nonempty string value, the string pointer points to a dynamically allocated block of memory that contains the string value in addition to a 32-bit length indicator. The table below shows the layout of a wide-string memory block.

Table 11.2   Wide string dynamic memory layout

Offset           Contents
-4               32-bit length indicator (in bytes)
0..Length - 1    character string
Length           NULL character

The string length is the number of bytes, so it is twice the number of wide characters contained in the string.

The NULL character at the end of a wide string memory block is automatically maintained by the compiler and the built-in string handling routines. This makes it possible to typecast a wide string directly to a null-terminated string.

Set types

A set is a bit array where each bit indicates whether an element is in the set or not. The maximum number of elements in a set is 256, so a set never occupies more than 32 bytes. The number of bytes occupied by a particular set is equal to

(Max div 8) - (Min div 8) + 1

where Max and Min are the upper and lower bounds of the base type of the set. The byte number of a specific element E is

(E div 8) - (Min div 8)

and the bit number within that byte is

E mod 8

where E denotes the ordinal value of the element.
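The three formulas above can be written as small functions and checked against a concrete set type. SetByteSize, ElementByte, and ElementBit are hypothetical helper names; the example uses a set over the subrange 10..20.

```pascal
program SetLayout;
{$APPTYPE CONSOLE}

// Number of bytes occupied by a set whose base type runs from Min to Max.
function SetByteSize(Min, Max: Integer): Integer;
begin
  Result := (Max div 8) - (Min div 8) + 1;
end;

// Index of the byte that holds element E.
function ElementByte(E, Min: Integer): Integer;
begin
  Result := (E div 8) - (Min div 8);
end;

// Bit number of element E within its byte.
function ElementBit(E: Integer): Integer;
begin
  Result := E mod 8;
end;

type
  TSmallSet = set of 10..20;

begin
  Writeln(SetByteSize(10, 20));                        // (20 div 8) - (10 div 8) + 1 = 2
  Writeln(SizeOf(TSmallSet));                          // size chosen by the compiler
  Writeln(ElementByte(13, 10), ' ', ElementBit(13));   // 0 5
  Writeln(SetByteSize(0, 255));                        // 32, the maximum
end.
```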

Static array types

A static array is stored as a contiguous sequence of variables of the component type of the array. The components with the lowest indexes are stored at the lowest memory addresses. A multidimensional array is stored with the rightmost dimension increasing first.
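Because the rightmost dimension increases first, the byte offset of an element in a two-dimensional array follows the usual row-major formula. The sketch below computes that offset and reads the element through a pointer; TGrid, PInt, and ElementOffset are hypothetical names, and the indexes are assumed to start at zero.

```pascal
program ArrayOrder;
{$APPTYPE CONSOLE}

const
  Rows = 2;
  Cols = 3;

type
  TGrid = array[0..Rows - 1, 0..Cols - 1] of Integer;
  PInt = ^Integer;

// Byte offset of Grid[I, J] from the start of the array (row-major order).
function ElementOffset(I, J: Integer): Integer;
begin
  Result := (I * Cols + J) * SizeOf(Integer);
end;

var
  Grid: TGrid;
  I, J: Integer;
  P: PInt;
begin
  for I := 0 to Rows - 1 do
    for J := 0 to Cols - 1 do
      Grid[I, J] := 10 * I + J;

  // Locate Grid[1, 2] by address arithmetic instead of indexing.
  P := PInt(PAnsiChar(@Grid) + ElementOffset(1, 2));
  Writeln(P^);   // 12, the value stored in Grid[1, 2]
end.
```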

Dynamic array types

A dynamic-array variable occupies four bytes of memory which contain a pointer to the dynamically allocated array. When the variable is empty (uninitialized) or holds a zero-length array, the pointer is nil and no dynamic memory is associated with the variable. For a nonempty array, the variable points to a dynamically allocated block of memory that contains the array in addition to a 32-bit length indicator and a 32-bit reference count. The table below shows the layout of a dynamic-array memory block.

Table 11.3   Dynamic array memory layout

Offset                               Contents
-8                                   32-bit reference-count
-4                                   32-bit length indicator (number of elements)
0..Length * (size of element) - 1    array elements

Record types

When a record type is declared in the {$A+} state (the default), and when the declaration does not include a packed modifier, the type is an unpacked record type, and the fields of the record are aligned for efficient access by the CPU. The alignment is controlled by the type of each field. Every data type has an inherent alignment, which is automatically computed by the compiler. The alignment can be 1, 2, 4, or 8, and represents the byte boundary that a value of the type must be stored on to provide the most efficient access. The table below lists the alignments for all data types.

Table 11.4   Type alignment masks

Type                  Alignment
Ordinal types         size of the type (1, 2, 4, or 8)
Real types            2 for Real48 and Extended, 4 for all other real types
Short string types    1
Array types           same as the element type of the array
Record types          the largest alignment of the fields in the record
Set types             size of the type if 1, 2, or 4, otherwise 1
All other types       4

To ensure proper alignment of the fields in an unpacked record type, the compiler inserts an unused byte before fields with an alignment of 2, and up to three unused bytes before fields with an alignment of 4, if required. Finally, the compiler rounds the total size of the record upward to the byte boundary specified by the largest alignment of any of the fields.

When a record type is declared in the {$A-} state, or when the declaration includes the packed modifier, the fields of the record are not aligned, but are instead assigned consecutive offsets. The total size of such a packed record is simply the size of all the fields.
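The padding rules can be seen by declaring the same fields in an unpacked and a packed record. TAligned and TPacked are hypothetical types; the sizes in the comments follow from the rules above, assuming the default {$A+} state.

```pascal
program RecordSizes;
{$APPTYPE CONSOLE}

type
  TAligned = record      // unpacked: fields are aligned
    A: Byte;             // offset 0, followed by 3 unused bytes
    B: Longint;          // offset 4 (alignment 4)
    C: Word;             // offset 8 (alignment 2)
  end;                   // 10 bytes, rounded up to a multiple of 4

  TPacked = packed record   // packed: consecutive offsets, no padding
    A: Byte;                // offset 0
    B: Longint;             // offset 1
    C: Word;                // offset 5
  end;

begin
  Writeln(SizeOf(TAligned));   // 12
  Writeln(SizeOf(TPacked));    // 7
end.
```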

File types

File types are represented as records. Typed files and untyped files occupy 332 bytes, which are laid out as follows:

type
  TFileRec = record
    Handle: Integer;
    Mode: Integer;
    RecSize: Cardinal;
    Private: array[1..28] of Byte;
    UserData: array[1..32] of Byte;
    Name: array[0..259] of Char;
  end;

Text files occupy 460 bytes, which are laid out as follows:

type
  TTextBuf = array[0..127] of Char;
  TTextRec = record
    Handle: Integer;
    Mode: Integer;
    BufSize: Cardinal;
    BufPos: Cardinal;
    BufEnd: Cardinal;
    BufPtr: PChar;
    OpenFunc: Pointer;
    InOutFunc: Pointer;
    FlushFunc: Pointer;
    CloseFunc: Pointer;
    UserData: array[1..32] of Byte;
    Name: array[0..259] of Char;
    Buffer: TTextBuf;
  end;

Handle contains the file's handle (when the file is open).

The Mode field can assume one of the values

const
  fmClosed = $D7B0;
  fmInput  = $D7B1;
  fmOutput = $D7B2;
  fmInOut  = $D7B3;

where fmClosed indicates that the file is closed, fmInput and fmOutput indicate that the file is a text file that has been reset (fmInput) or rewritten (fmOutput), and fmInOut indicates that the file variable is a typed or an untyped file that has been reset or rewritten. Any other value indicates that the file variable is not assigned (and hence not initialized).

The UserData field is available for user-written routines to store data in.

Name contains the file name, which is a sequence of characters terminated by a null character (#0).

For typed files and untyped files, RecSize contains the record length in bytes, and the Private field is unused but reserved.

For text files, BufPtr is a pointer to a buffer of BufSize bytes, BufPos is the index of the next character in the buffer to read or write, and BufEnd is a count of valid characters in the buffer. OpenFunc, InOutFunc, FlushFunc, and CloseFunc are pointers to the I/O routines that control the file; see "Device functions".
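A text file variable can be typecast to TTextRec to inspect the fields described above. The sketch assumes that TTextRec and the fm* constants are reachable through the SysUtils unit, as in the compiler version this chapter describes; the file name is illustrative, and the file is only assigned, never opened.

```pascal
program TextRecPeek;
{$APPTYPE CONSOLE}

uses
  SysUtils;   // assumption: TTextRec and fmClosed are declared here or in System

var
  F: TextFile;
begin
  AssignFile(F, 'example.txt');            // associates a name; does not open the file
  Writeln(TTextRec(F).Mode = fmClosed);    // an assigned but unopened file is closed
  Writeln(PChar(@TTextRec(F).Name));       // the null-terminated file name
  Writeln(TTextRec(F).BufSize);            // size of the buffer BufPtr points to
end.
```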

Procedural types

A procedure pointer is stored as a 32-bit pointer to the entry point of a procedure or function. A method pointer is stored as a 32-bit pointer to the entry point of a method, followed by a 32-bit pointer to an object.
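The two halves of a method pointer can be examined through the TMethod record, which has a Code and a Data field. The sketch assumes a 32-bit target and that TMethod is available through SysUtils or System; TGreeter and TGreetMethod are hypothetical names.

```pascal
program MethodPtr;
{$APPTYPE CONSOLE}

uses
  SysUtils;   // assumption: TMethod is declared here or in System

type
  TGreeter = class
    procedure Greet;
  end;
  TGreetMethod = procedure of object;

procedure TGreeter.Greet;
begin
  Writeln('hello');
end;

var
  G: TGreeter;
  M: TGreetMethod;
begin
  G := TGreeter.Create;
  try
    M := G.Greet;
    Writeln(SizeOf(TGreetMethod));                // 8 on a 32-bit target: two pointers
    Writeln(TMethod(M).Data = Pointer(G));        // second half: the object
    Writeln(TMethod(M).Code = @TGreeter.Greet);   // first half: the entry point
    M;                                            // calls G.Greet through the pointer
  finally
    G.Free;
  end;
end.
```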

Class types

A class-type value is stored as a 32-bit pointer to an instance of the class, which is called an object. The internal data format of an object resembles that of a record. The object's fields are stored in order of declaration as a sequence of contiguous variables. Fields are always aligned, corresponding to an unpacked record type. Any fields inherited from an ancestor class are stored before the new fields defined in the descendant class.

The first 4-byte field of every object is a pointer to the virtual method table (VMT) of the class. There is exactly one VMT per class (not one per object); distinct class types, no matter how similar, never share a VMT. VMTs are built automatically by the compiler, and are never directly manipulated by a program. Pointers to VMTs, which are automatically stored by constructor methods in the objects they create, are also never directly manipulated by a program.

The layout of a VMT is shown in the following table. At positive offsets, a VMT consists of a list of 32-bit method pointers, one per user-defined virtual method in the class type, in order of declaration. Each slot contains the address of the corresponding virtual method's entry point. This layout is compatible with a C++ v-table and with COM. At negative offsets, a VMT contains a number of fields that are internal to Object Pascal's implementation. Applications should use the methods defined in TObject to query this information, since the layout is likely to change in future implementations of Object Pascal.

Table 11.5   Virtual method table layout

Offset    Type        Description
-76       Pointer     pointer to virtual method table (or nil)
-72       Pointer     pointer to interface table (or nil)
-68       Pointer     pointer to Automation information table (or nil)
-64       Pointer     pointer to instance initialization table (or nil)
-60       Pointer     pointer to type information table (or nil)
-56       Pointer     pointer to field definition table (or nil)
-52       Pointer     pointer to method definition table (or nil)
-48       Pointer     pointer to dynamic method table (or nil)
-44       Pointer     pointer to short string containing class name
-40       Cardinal    instance size in bytes
-36       Pointer     pointer to a pointer to ancestor class (or nil)
-32       Pointer     pointer to entry point of SafecallException method (or nil)
-28       Pointer     entry point of AfterConstruction method
-24       Pointer     entry point of BeforeDestruction method
-20       Pointer     entry point of Dispatch method
-16       Pointer     entry point of DefaultHandler method
-12       Pointer     entry point of NewInstance method
-8        Pointer     entry point of FreeInstance method
-4        Pointer     entry point of Destroy destructor
0         Pointer     entry point of first user-defined virtual method
4         Pointer     entry point of second user-defined virtual method
...       ...         ...

Class reference types

A class-reference value is stored as a 32-bit pointer to the virtual method table (VMT) of a class.
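Since both an object's first field and a class reference point to the VMT, the two can be compared directly. The sketch below also queries class information through TObject methods rather than through VMT offsets, as recommended earlier. TBase, TDerived, and PPtr are hypothetical names.

```pascal
program VmtPeek;
{$APPTYPE CONSOLE}

type
  PPtr = ^Pointer;

  TBase = class
    procedure Speak; virtual;
  end;

  TDerived = class(TBase)
    procedure Speak; override;
  end;

procedure TBase.Speak;
begin
  Writeln('base');
end;

procedure TDerived.Speak;
begin
  Writeln('derived');
end;

var
  Obj: TBase;
begin
  Obj := TDerived.Create;
  try
    // The first field of the instance is the VMT pointer, which is also
    // the value of the class reference TDerived.
    Writeln(PPtr(Obj)^ = Pointer(TDerived));
    // Preferred: query class information through TObject methods.
    Writeln(Obj.ClassName, ' ', Obj.InstanceSize);
    Writeln(Obj.ClassParent.ClassName);
    Obj.Speak;   // dispatched through the VMT slot, so TDerived.Speak runs
  finally
    Obj.Free;
  end;
end.
```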

Variant types

A variant is stored as a 16-byte record that contains a type code and a value (or a reference to a value) of the type given by the code. The System unit defines constants and types for variants.

The TVarData type represents the internal structure of a Variant variable, which is identical to the Variant type used by COM and the Win32 API. The TVarData type can be used in typecasts of Variant variables to access the internal structure of a variable.

The VType field of a TVarData record contains the type code of the variant in the lower twelve bits (the bits defined by the varTypeMask constant). In addition, the varArray bit may be set to indicate that the variant is an array, and the varByRef bit may be set to indicate that the variant contains a reference as opposed to a value.

The Reserved1, Reserved2, and Reserved3 fields of a TVarData record are unused.

The contents of the remaining eight bytes of a TVarData record depend on the VType field. If neither the varArray bit nor the varByRef bit is set, the variant contains a value of the given type.

If the varArray bit is set, the variant contains a pointer to a TVarArray structure that defines an array. The type of each array element is given by the varTypeMask bits in the VType field.

If the varByRef bit is set, the variant contains a reference to a value of the type given by the varTypeMask and varArray bits in the VType field.
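The VType tests described above might be written as follows. This is a sketch that assumes TVarData and the var* constants come from the System unit, as stated in this section; later compiler versions may require an additional unit in the uses clause for variant support. The value is assigned from an Integer variable so that the type code is varInteger.

```pascal
program VariantPeek;
{$APPTYPE CONSOLE}

var
  I: Integer;
  V: Variant;
begin
  I := 42;
  V := I;   // stores an Integer value in the variant

  Writeln(SizeOf(V));                                           // 16 on a 32-bit target
  Writeln((TVarData(V).VType and varTypeMask) = varInteger);    // type code in the low bits
  Writeln((TVarData(V).VType and varArray) <> 0);               // set only for arrays
  Writeln((TVarData(V).VType and varByRef) <> 0);               // set only for references
  Writeln(TVarData(V).VInteger);                                // the stored value itself
end.
```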

The varString type code is private to Delphi. Variants containing a varString value should never be passed to a non-Delphi function. Delphi's Automation support automatically converts varString variants to varOleStr variants before passing them as parameters to external functions.