XPLPX.TXT       30-Jan-2010

                       OPTIMIZING 32-BIT COMPILER

XPLPX combines the features of XPLX (the optimized 16-bit version) with
those of XPLP (the non-optimized 32-bit protected-mode version). It provides:
32-bit integers that run as fast as the fastest 16-bit code, much faster
floating point, megabytes of array space, large .exe files up to 600K,
protected mode with or without Windows, and extended graphics.

VESA graphic modes are supported in a straightforward way. For instance,
SetVid($112) sets the display for 640x480x24 - that's 16.8 million
colors. The much faster Point intrinsic can completely fill this screen
20 times per second on a Duron 850. With an nVidia card using a linear
frame buffer, the screen can be filled 60 times per second.

32-bit integer operations are about twice as fast as with XPLP. Floating
point operations (reals) are up to ten times faster than with XPLX (even
when linking with NATIVE7X). Point plotting is a remarkable one hundred
times faster than with XPLP (or even XPLX prior to version 2.4.6).

Arrays can be reserved by their declarations, e.g.: int Array(20,10);
Arrays can also be reserved, as before, by using the Reserve intrinsic.

Listings show source and assembly code side-by-side when the /d (debug)
switch is used on the command line.

Characters can be displayed in VESA graphic modes by using device 6. They
can be positioned with either the Cursor intrinsic or the Move intrinsic.
When the Move intrinsic is used, text is not limited to being positioned
on character cell boundaries. An 8x16 font is used. On a 640x480 screen
this displays 30 lines of 80 characters. Device $106 displays an 8x8 font
(and on some computers device $206 will display an 8x14 font).
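
The line and column counts above follow directly from the font size. As
a rough check, here is a small Python sketch of the arithmetic (Python is
used only to model the numbers; text_grid is a made-up helper name, not
an XPL0 intrinsic):

```python
def text_grid(screen_w, screen_h, font_w, font_h):
    """Return (columns, rows) of whole character cells that fit on screen."""
    return screen_w // font_w, screen_h // font_h

# Device 6 uses the 8x16 font; device $106 uses the 8x8 font.
cols16, rows16 = text_grid(640, 480, 8, 16)
cols8, rows8 = text_grid(640, 480, 8, 8)
print(cols16, rows16)
print(cols8, rows8)
```

When the Move intrinsic is used instead of Cursor, text is not tied to
this grid at all.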

The Attrib intrinsic supports the increased number of colors available
with VESA. The location of the bits that set the background and
foreground colors depends on the number of colors provided by the video
mode. For modes with 16 or fewer colors the high nibble sets the
background color, and the low nibble sets the foreground color. For 8-bit
color modes (such as $13 and $101) the high byte sets the background
color, and the low byte sets the foreground color. For 15- and 16-bit
color modes the high word sets the background color, and the low word
sets the foreground color. For 24-bit color modes the background color is
black, and the foreground color is the value of the argument.
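
The layouts described above can be modeled with a few shifts. This is
only a Python sketch of the bit arithmetic, not XPLPX code. The function
name attrib_value and the use of bits-per-pixel to select the layout are
assumptions made for illustration. The result is the number that would be
passed to Attrib.

```python
def attrib_value(bits_per_pixel, foreground, background=0):
    """Model the Attrib argument for a given color depth (illustrative)."""
    if bits_per_pixel <= 4:
        # 16 or fewer colors: high nibble = background, low nibble = foreground
        return ((background & 0xF) << 4) | (foreground & 0xF)
    if bits_per_pixel == 8:
        # 8-bit modes such as $13 and $101: high byte / low byte
        return ((background & 0xFF) << 8) | (foreground & 0xFF)
    if bits_per_pixel in (15, 16):
        # 15- and 16-bit modes: high word / low word
        return ((background & 0xFFFF) << 16) | (foreground & 0xFFFF)
    if bits_per_pixel == 24:
        # 24-bit modes: background is always black, argument is the foreground
        return foreground & 0xFFFFFF
    raise ValueError("unsupported color depth")

print(hex(attrib_value(4, 0xE, 0x1)))        # yellow on blue in a 16-color mode
print(hex(attrib_value(16, 0xFFFF, 0x001F)))
```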


COMPATIBILITY

Some of the features in 16-bit XPL0 are not supported, such as segment
arrays and short reals. Also, not all of the intrinsics are supported.

A mouse pointer cannot be displayed in VESA graphics modes the way it is
for VGA modes (not even under Windows). VMouse.xpl provides an example of
how to display a mouse pointer in VESA modes.


SEGMENTS

Segment addressing was a kludge used by Intel to extend the memory
address space, and it was adopted by 16-bit XPL0. Since 32-bit XPL can
directly access up to 4GB, there is little incentive to continue
supporting segment addressing.

In protected mode the CPU does not do segment addressing. Instead it
fakes it by using "descriptor tables". XPLPX could have faked it too, for
compatibility's sake, by automatically converting segment addresses to
linear addresses. But since there are so few users of XPL0, the luxury
was taken to drop the burden of supporting this obsolete feature.

Fortunately it's not difficult to modify existing 16-bit XPL0 code to
eliminate segment arrays, and the resulting code is usually simpler. For
example:

   seg char Buffer(1);
   begin
   Buffer(0):= Malloc(1024);    \1024 paragraphs of 16 bytes each
   for I:= 0, 16383 do Buffer(0, I):= 0;
   . . .

can be simplified using a normal one-dimensional array:

   char Buffer(16384);
   begin
   for I:= 0, 16383 do Buffer(I):= 0;
   . . .

When using physical addresses, the situation is a little uglier. For
example:

   seg char Screen(1);
   begin
   Screen(0):= $A000;
   for I:= 0, 64000-1 do Screen(0, I):= 0;
   . . .

must be handled something like this:

   char Screen;
   begin
   CpuReg:= GetReg;
   DSeg:= CpuReg(12);
   Screen:= ($A000 - DSeg) << 4;
   for I:= 0, 64000-1 do Screen(I):= 0;
   . . .

XPL0 automatically adds the base address of its data (DSeg) to every
address it accesses. Thus to access a particular physical location, DSeg
must be subtracted to compensate. Since DSeg is a segment (or paragraph)
address, it must be multiplied by 16 to convert it to an actual (byte)
address. In the example above, $A000 is the segment address of the start
of the graphic memory for the video display.
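
The compensation described above can be checked with a little
arithmetic. The Python sketch below only models the address math; it is
not XPL0 code. The DSeg value $1234 is a made-up example, and wrapping to
32 bits is an assumption about how the CPU adds the base to the offset.

```python
MASK32 = 0xFFFFFFFF

def xpl_offset(segment, dseg):
    """Value to store in the pointer: (segment - DSeg) * 16, wrapped to 32 bits."""
    return ((segment - dseg) << 4) & MASK32

def physical(offset, dseg):
    """Address actually accessed: DSeg * 16 is added to every offset."""
    return ((dseg << 4) + offset) & MASK32

dseg = 0x1234                          # hypothetical data segment
screen = xpl_offset(0xA000, dseg)
print(hex(physical(screen, dseg)))     # start of graphic memory
```

The same math works when the target lies below DSeg: the offset is then
negative (a large unsigned value) and the sum wraps around to the
intended address.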

The Peek, Poke, and Blit intrinsics, which use segment addresses in their
arguments, still work the way they have always worked. Thus:

   char Memory;
   begin
   CpuReg:= GetReg;
   DSeg:= CpuReg(12);
   Memory:= ($40 - DSeg) << 4;
   Byte:= Memory(0);
   . . .

which reads the first BIOS variable at location 0040:0000, can simply be
replaced with:

   Byte:= Peek($40, 0);


Even though segment addressing has been dropped, the Malloc and Release
intrinsics are still used. They are needed in some instances to provide
chunks of memory when DOS or BIOS are called with the SoftInt intrinsic.
DOS and BIOS run in real mode and can only access memory in the first
megabyte. XPLPX's array space is normally above this, in extended memory.
(Exceptions are constant arrays and strings in quote marks, which reside
in the first megabyte - conventional memory.)


MISSING INTRINSICS

The following intrinsics are currently not implemented:

IntRet=66, ExtJmp=67, and ExtCal=68 are very rarely used, and currently
give runtime error 5 (bad intrinsic).

Equip=77: A '386 system generally has everything that this tests for and
a lot more. This currently gives runtime error 5 (bad intrinsic).


OTHER COMPATIBILITY ISSUES

The Read and Write intrinsics use the large I/O buffers, which can
interfere with device 3. The Read intrinsic uses the input buffer, and the
Write intrinsic uses the output buffer. These buffers are used because
the DOS routines used to do the actual disk reads and writes cannot
access memory above 1MB. A further consequence of this is that each
sector is buffered, which makes reading and writing multiple sectors
somewhat slower than with 16-bit XPL0. Windows XP allows read or write
access only to floppy drives. (It should go without saying that writing
directly to the hard drive using the Write intrinsic is a VERY dangerous
operation.)

Windows 3.1 doesn't display VESA modes properly (at least not on my Duron
850 system). It displays multiple images as though the horizontal sync
frequency is wrong. Who cares about Windows 3.1 these days?

Some computers don't display characters in VESA graphic modes when
calling DOS and BIOS routines. To provide this expected capability,
XPL0's device 6 does the low level display of characters when the video
mode is greater than $0FF. Devices 0 and 1 still call DOS and BIOS
routines to display a character like they have always done, however be
aware that this does not always work.

There are new command words such as 'fix', 'float', 'sqrt', 'abs', and
'port'. These perform their functions significantly faster than their
corresponding intrinsics by using in-lined code instead of calling a
subroutine. 'abs' and 'sqrt' work for both real and integer arguments.

External assembly language routines must not alter the EDI and ESI
registers. Under 16-bit XPL0 these registers (DI and SI) were
automatically saved and restored across any external assembly language
call. Now, for speed's sake (and for other reasons), it's the
programmer's responsibility to make sure that these registers are
preserved.

Note that segment registers cannot be used in protected mode like they
are used in real mode. In protected mode they're selectors into
descriptor tables. XPLPX sets up the FS register to select a descriptor
with a base address of 0 so it can be used to access absolute locations
such as BIOS variables and video RAM.

XPX.BAT defaults to using short-circuit boolean evaluation (i.e. the /b
switch is used). This is faster than the traditional evaluation, but it
can alter the way a program runs in some rare situations. (For a detailed
explanation of this see the 16-bit documentation ADDENDUM.TXT.)

The Reserve intrinsic now rounds the reserved space up to the next
double-word boundary (i.e. the next size evenly divisible by 4). This
guarantees that subsequent (local) variables will be aligned on
double-word boundaries, which can double the speed at which they are
accessed. However, this also means that sequential calls of the Reserve
intrinsic might not allocate contiguous memory locations. It's very
unlikely that this will affect any existing code, but it could.
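
The rounding can be pictured with this Python sketch. It models only the
size calculation. The helper name round_up_to_dword and the bit-mask
method are assumptions, since the actual NATIVEPX code is not shown here.

```python
def round_up_to_dword(nbytes):
    """Round a byte count up to the next multiple of 4."""
    return (nbytes + 3) & ~3

# Two back-to-back reservations of 10 bytes each (illustrative sizes).
first_start = 0
second_start = first_start + round_up_to_dword(10)
print(second_start)    # where the second block begins, relative to the first
```

Code that assumed the second block started exactly 10 bytes after the
first would now find a 2-byte gap between them.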

Since the heap is located in extended memory, normal arrays cannot be
accessed by BIOS or DOS routines. Use the Malloc intrinsic to set up an
array in conventional memory.

Array space under Windows 98 can end up in virtualized memory (i.e. on
the hard drive). This can slow a program tremendously the first time this
array space is accessed.

The 32-bit version of the HexIn intrinsic (26) works slightly differently
than the 16-bit version. The 32-bit version returns after 8 hex digits
whether or not the string of digits is terminated with a non-hex
character (such as a space). The 16-bit version only returns when the
digits are terminated. Thus the value returned for 16-bit XPL0 is the
last 4 digits if more than 4 digits are input, and the value returned for
32-bit XPL0 is the first 8 digits if more than 8 digits are input.
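
Here is a rough Python model of the two behaviors. It is a simplified
sketch, not the actual intrinsic code: it ignores any leading non-hex
characters, and the function names hexin16 and hexin32 are made up.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

def hexin16(text):
    """16-bit model: read until a non-hex character, keeping only 16 bits."""
    value = 0
    for ch in text:
        if ch not in HEX_DIGITS:
            break
        value = ((value << 4) | int(ch, 16)) & 0xFFFF
    return value

def hexin32(text):
    """32-bit model: stop at a non-hex character or after 8 digits."""
    value = 0
    count = 0
    for ch in text:
        if ch not in HEX_DIGITS or count == 8:
            break
        value = (value << 4) | int(ch, 16)
        count += 1
    return value

print(hex(hexin16("123456 ")))       # last 4 digits survive
print(hex(hexin32("123456789A ")))   # first 8 digits survive
```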

Be aware that when the GetReg intrinsic is used to return a value from
DOS or BIOS, the high 16 bits are not defined. They probably should
be masked off.
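
Masking is a single AND with $FFFF. The Python sketch below just shows
the effect on a made-up register value. The helper name low16 is not part
of XPL0.

```python
def low16(reg_value):
    """Keep only the low 16 bits of a 32-bit register value."""
    return reg_value & 0xFFFF

# Hypothetical case: AX holds $1234, the high half of EAX holds garbage.
eax = 0xDEAD1234
print(hex(low16(eax)))
```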

The Line intrinsic (42) is significantly slower in 32-bit XPL0 for planar
graphic modes (such as mode $12) than it is in 16-bit XPL0. The built-in
high-speed routines in 16-bit XPL0 required too much code to support the
myriad graphic modes, many of which are now obsolete. Non-planar modes
(such as $101) are not only faster, but they provide at least 256 colors.

Beware of thrashing the VESA window. A Mandelbrot program that used VESA
mode $101 was made many times faster by NOT mirroring the symmetric top
and bottom halves of the image. For animations, it's best to use an image
buffer and the new Paint intrinsic.


PAINT INTRINSIC

A new intrinsic has been added called Paint (81). It's used to quickly
copy an image to video memory, which is useful for animations. Paint
works for all video modes (especially VESA), except those modes with 16
colors or fewer (planar modes: $12, $102, $104, $106).

Paint passes six integer arguments in this format:
        Paint(X, Y, W, H, Image, W2);

X,Y are the coordinates where the upper-left corner of the Image data
will be displayed on the screen. W,H are the width and height (in pixels)
of the portion of the Image data that gets displayed. "Image" is the
address of the Image data array. W2 is the actual width (in pixels) of
the Image data array.

Normally the image data array is equal to the width and height defined by
W and H, and thus W2 and W are set to the same value. However, the
apparently redundant argument W2 provides a great deal of flexibility.
X, Y, W and H define a window on the screen. But a second window into the
image data array can also be defined.

For example, if an offset (X2,Y2) is added to Image (such as: Image + X2
+ Y2*W2) then the Image data will appear to move left and up (by X2,Y2)
inside the major window defined by X,Y,W,H. In this case W2 would be
greater than W. Also the Image data array would have additional data
extending its actual height beyond the height defined by H.
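
The windowing can be modeled in Python with a flat list standing in for
the Image array. This sketch only shows which pixels get selected. It is
not how Paint is implemented, and paint_window is a made-up name. It
returns the H rows of W pixels that would be copied to the screen.

```python
def paint_window(image, w, h, w2, x2=0, y2=0):
    """Select h rows of w pixels from a flat image whose row width is w2,
    starting at the offset x2 + y2*w2 that is added to Image."""
    start = x2 + y2 * w2
    return [image[start + row * w2 : start + row * w2 + w]
            for row in range(h)]

# A tiny 6-pixel-wide, 4-row image with pixel values 0..23.
image = list(range(6 * 4))
for row in paint_window(image, w=3, h=2, w2=6, x2=1, y2=1):
    print(row)
```

Increasing X2 or Y2 from one frame to the next scrolls the image left or
up inside the window defined by X,Y,W,H.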

For the 8-bit color modes ($13, $100, $101, $103, $105, $107) the Image
array is reserved as a byte array, for example: char Image(640*480).

For the 15- and 16-bit color modes ($10D, $10E, $110, $111, $113, $114,
$116, $117, $119, $11A) the size of the Image array must be doubled, for
example: char Image(640*480*2). You (the programmer) must combine a pair
of bytes for each pixel, and deal with the way the hardware packs the
colors into the resulting 16-bit word. For instance, mode $111 packs the
bits like this:

        bit:    F E D C B A 9 8  7 6 5 4 3 2 1 0
        color:  R R R R R G G G  G G G B B B B B

While the 15-bit color mode $110 packs the bits like this:

        bit:    F E D C B A 9 8  7 6 5 4 3 2 1 0
        color:  - R R R R R G G  G G G B B B B B

In 24-bit color modes ($10F, $112, $115, $118, $11B) the image is
reserved as an integer array, for example: int Image(640*480). The order
of the colors in the 4-byte integer is: $xxRRGGBB. The high byte is
currently unused.
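
The three layouts above can be expressed as shifts and masks. This
Python sketch only models the bit packing. The function names are made
up, and storing the low byte first in the char array is an assumption
based on the PC's little-endian byte order.

```python
def pack565(r, g, b):
    """16-bit mode such as $111: 5 bits red, 6 bits green, 5 bits blue."""
    return ((r & 0x1F) << 11) | ((g & 0x3F) << 5) | (b & 0x1F)

def pack555(r, g, b):
    """15-bit mode such as $110: bit F unused, 5 bits per color."""
    return ((r & 0x1F) << 10) | ((g & 0x1F) << 5) | (b & 0x1F)

def pack24(r, g, b):
    """24-bit mode: $xxRRGGBB with the high byte unused (left as 0)."""
    return ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF)

def pixel_bytes(pixel16):
    """Split a 16-bit pixel into the byte pair stored in the char array."""
    return pixel16.to_bytes(2, "little")

print(hex(pack565(31, 0, 0)))     # pure red in mode $111
print(hex(pack555(31, 0, 0)))     # pure red in mode $110
print(hex(pack24(0x12, 0x34, 0x56)))
```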


NULL-TERMINATED STRINGS

The common standard convention of terminating ASCII strings with a null
character (binary zero) can now be used.

XPL0 normally terminates strings by setting the most significant bit (MSB)
on the last character. This saves a byte, but it's awkward when calling
DOS and BIOS routines that use the null-character convention. It's also
awkward when using extended ASCII characters in strings (those characters
with values greater than $7F). Null termination also allows strings to be
empty (i.e. "").

The new command word 'string' is used to select the convention. If
"string 0;" (without the quote marks) appears in a program, from that
point on, strings are terminated with a null. If "string 1;" (or any
nonzero value) appears then strings are terminated with the MSB set.

The compiler defaults to MSB termination for compatibility with old code.
The Text intrinsic (12) changes with the 'string' directive. The RawText
intrinsic (71) does not change, and is obsolete but still supported. The
intrinsics Chain (28) and FOpen (29) have always supported null-
terminated strings.
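
The difference between the two conventions can be seen at the byte
level. This Python sketch only models the stored bytes. The function
names are made up and plain 7-bit ASCII text is assumed.

```python
def msb_terminated(text):
    """XPL0's traditional form: set bit 7 on the last character."""
    if not text:
        raise ValueError("an empty string cannot be MSB-terminated")
    data = bytearray(text.encode("ascii"))
    data[-1] |= 0x80
    return bytes(data)

def null_terminated(text):
    """The 'string 0' form: append a zero byte."""
    return text.encode("ascii") + b"\x00"

print(msb_terminated("Hi"))     # 2 bytes, last one has bit 7 set
print(null_terminated("Hi"))    # 3 bytes, last one is zero
print(null_terminated(""))      # an empty string is just the terminator
```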


IN-LINE ASSEMBLY CODE

XPL0 has the ability to handle assembly code that is inserted directly
into an XPL0 program. The command word "asm" designates that the
following characters on the line are assembly code, and they are to be
copied directly to the output (.ASM) file. For example:

        asm     cli
        asm     mov     eax, 102         ;comment
        asm     mov     ebx, Frog        ;Comment

Assembly code must be written in lowercase characters except when an XPL0
variable or constant name is used. These are written the usual way with
at least the first letter capitalized. This enables the compiler to
distinguish them from the rest of the assembly code and to substitute
them with their assembly code representation. For instance, in the above
example, "Frog" might be replaced with something like "[ESI+4]". Capital
letters may be used in a comment set off with a semicolon because
comments are ignored by the compiler.

If several lines of assembly code are used, they can be written this way:

        asm     {
                cli
                mov     eax, 102         ;comment
                mov     ebx, Frog        ;Comment
                }

The ability to insert assembly code into a high-level language program
is a two-edged sword. In general it should be avoided, but there are
instances when it is very useful.

The most obvious application is to replace compiled code with more
efficient assembly code. For instance "Irq(false)" can be replaced with
"asm cli", which is at least ten times faster (except under Windows XP,
which simulates the cli). Similarly, "POut(Time, $40, 1)" could be
replaced with:

        asm     mov eax, Time
        asm     out 40h, ax


Assembly language provides low-level control that a high-level language
can't. Consider this expression: Frog * 777777 / 1000000. If Frog is
above 2761, the calculation will overflow. However the following will
not overflow even when Frog is 1600000000:

        asm     {
                mov     eax, 777777
                imul    Frog            ;edx:eax := eax * Frog
                mov     ecx, 1000000
                idiv    ecx             ;eax := edx:eax / ecx
                }
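
The overflow limit can be verified with a Python model of the two
calculations. This is only a sketch of the arithmetic: wrap32 imitates a
signed 32-bit register, and the function names are made up.

```python
def wrap32(x):
    """Wrap an integer to a signed 32-bit value, as a 32-bit register would."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def trunc_div(a, b):
    """Integer divide that truncates toward zero, like IDIV."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def scale_32bit(frog):
    """Frog * 777777 / 1000000 with a 32-bit intermediate product."""
    return trunc_div(wrap32(frog * 777777), 1000000)

def scale_64bit(frog):
    """Same expression with the full 64-bit product, as IMUL/IDIV provide."""
    return trunc_div(frog * 777777, 1000000)

print(scale_32bit(2761), scale_64bit(2761))   # both agree
print(scale_32bit(2762), scale_64bit(2762))   # 32-bit product has wrapped
print(scale_64bit(1600000000))
```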

Here is an example of a double-precision add:

        TimeLo:= TimeLo + 143;
        asm     {jnc    tm10
                 inc    TimeHi
                tm10:};
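
What the carry test accomplishes can be modeled in Python. This is a
sketch of the arithmetic only. add64 is a made-up name, and TimeLo and
TimeHi are treated as unsigned 32-bit halves of one 64-bit count.

```python
MASK32 = 0xFFFFFFFF

def add64(time_lo, time_hi, increment):
    """Add increment to the low word and propagate the carry to the high word."""
    total = time_lo + increment
    carry = total >> 32            # 1 if the low word overflowed, else 0
    return total & MASK32, (time_hi + carry) & MASK32

print(add64(100, 5, 143))              # no carry: high word unchanged
print(add64(0xFFFFFFF0, 0, 143))       # carry: high word incremented
```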


RULES AND RESTRICTIONS

With the power of assembly language, it's easy to shoot yourself in the
foot. When using XPLPX, the ESI and EDI registers must not be altered,
and of course altering the descriptor selectors ES, DS, SS or CS is not
allowed.

The compiler generates the correct code for named constants such as: "def
Frog=123; asm mov eax, Frog". It also generates the correct code for
variables except if the variable is at an intermediate level (neither
local nor global). In that case the inserted assembly code must make sure
that the correct BASEn is loaded into the EBP register.


PMODE

PMODE is the DOS extender that makes XPLPX possible. It's a free piece of
software available on the Internet and written by Tran (aka: Thomas
Pytel). Thanks Tran! Wherever you are.

PMODE is basically a DOS Protected Mode Interface (DPMI) like the one
used in Windows 98. PMODE 3.07 uses a multi-level design that provides
whatever is lacking in a system. If an XPLPX program is running under
Windows then PMODE does very little: It switches into 32-bit protected
mode and hands control over to the DPMI in Windows. If however the system
only has HIMEM.SYS then PMODE uses it to allocate extended memory and
provides the other functions necessary to complete the DPMI.

These are the levels that PMODE supports. Each level includes the
functions of the levels below it:

        DPMI    WINDOWS
        VCPI    EMM386
        XMS     HIMEM
        Raw     DOS

This design makes the DOS extender compatible with any system.

A DPMI provides a set of routines that are called by software interrupt
31h, similar to the DOS interrupt 21h routines. These routines manage
memory, manipulate interrupt vectors, set up protected-mode descriptors,
and provide access to DOS and BIOS routines that pass segment registers.

Switching to protected mode provides 32-bit operations and up to 4GB of
address space, but it also cuts a program off from the outside world. I/O
is often done through DOS and BIOS, which call device handlers that run
in 16-bit real mode. The DPMI provides access to these routines by
switching back and forth between real and protected mode.

The DPMI also deals with a problem that occurs within 1/18th of a
second after switching into protected mode: the system timer hardware
interrupt. Since the real-mode interrupt vectors are no longer used,
protected-mode vectors are provided. These point to routines that switch
to real mode, call the appropriate interrupt service routine, switch back
to protected mode, and then return.


MEMORY MAP

0000:   Vectors, BIOS variables, DOS, TSRs, etc.

(Location that the .exe loader decides to load your XPL0 program)
????:   PSP (256 bytes) (Contains file name typed on command line, if used)
        PMODE code (PMODE_TEXT) DOS Extender
        NATIVEPX code (CSEG) Runtime support code for XPL0 (intrinsics etc.)
        Compiled .XPL code (up to about 600K maximum)
        NATIVEPX data (DSEG)            <- base address of all XPL0 data
        Compiled data (XPL0 strings and constant arrays)
        Miscellaneous buffer space
        Heap space for global variables (HEAPLO up to 1600 reals = 12800 bytes)
        Stack (stacklen = 16K bytes)

        Unused/available conventional memory, which can be Malloc'ed for use
         with DOS and BIOS routines that require buffer space

A0000:  Video graphic memory
B8000:  Video text memory
F0000:  BIOS
100000: Start of extended memory
        DOS (if it's loaded high)

(Location that DPMI returns when NATIVEPX allocates extended memory)
??????: Heap space for everything except global variables
        (Size is set to whatever the operating system will provide)
        (The pointer to a global array is in conventional memory while the
        actual array space is here in extended memory)


KNOWN BUGS

Hitting Ctrl+C does not release high memory when running under pure DOS
(i.e. no Windows). The next time an XPLPX program is run, an out-of-
memory error will occur. This problem does not occur under WinXP. Ctrl+C
does not work at all under Win98.

The Ctrl+C vector should be redirected to Nativepx's exit routine.
However it's in real mode, and I don't know how to switch to protected
mode. (There is a similar bug in 16-bit XPL0 that does not restore the
divide-by-zero vector when Ctrl+C is hit.)

The Ctrl+C vector is supposed to be "reflected" from real mode to
protected mode. However hooking interrupt 23h using function 502h
interrupt 31h does not appear to do anything.

The PMODE documentation says that it does not restore extended memory
upon exit (via function 4Ch interrupt 21h), which is a departure from the
standard. DPMI version 1.0 specifically says that it releases extended
memory upon exit, whereas version 0.9 did not.

Avoid using Ctrl+C with programs compiled with XPLPX. Ctrl+C does the
following bizarre things depending on the operating system.

Pure DOS with HIMEM.SYS in CONFIG.SYS: Ctrl+C bypasses the code that
releases extended memory. Thus the next time you run a program compiled
with XPLPX, you get an OUT OF MEMORY error. If HIMEM.SYS is disabled, the
next time you try to run a program compiled with XPLPX, it blows up!

Win 3.1 & 98: Ctrl+C echoes to the screen, but is ignored. (Ctrl+C works
normally for 16-bit XPL0 programs.)


The divide-by-zero handler does not work when errors are not trapped,
i.e: Trap(false). XPLPX does not simply generate IDIV EBX, but instead
uses a variety of addressing modes, so the current code is too
simple-minded. 16-bit XPL0's code, which picks apart the address format
so it knows how many bytes to skip, might work. XPLPX can generate a
variety of forms including IDIV ECX and IDIV [ESI+24].

However because of the DOS extender, picking apart the instruction
pointed to by the return address on the stack probably will not work. The
simple solution is to always generate an IDIV instruction that uses a
register as the source operand, and never use a memory operand (so the
instruction is always the same number of bytes long).


It has been reported that on some Win98 machines attempting to run
extended VESA graphics in windowed mode (not full screen) causes a lock
up that necessitates a reboot.

Device 6 in VESA graphic modes does not scroll up lines of text.

The SetWind intrinsic does not work for VESA graphic modes.


A maximum of 8 real arguments can be passed to a procedure. Should the
compiler give an error if there are more than 8 real arguments? Even
flagging this error will not guarantee that the FPU stack won't overflow.
(The other compilers [XPLP, XPLX, etc.] allow an unlimited number of real
arguments to be passed.)

There is an RPN evaluator that prevents overflowing the FPU stack (and
generates better code, like in Delphi) for pathological cases such as
this: J:= A+(B+(C+(D+(E+(F+(G+(H+I)))))));


Weird things happen with the Paint intrinsic when Width and Height are
within a few pixels of 0. Should the Paint intrinsic support planar
graphic modes?

Should the Paint intrinsic clip to the screen dimensions?


Some floating point trap error numbers are wrong as demonstrated by
FTEST.


REVISION HISTORY

XPLPX version 3.3.2 has the following new features and bug fixes:

The most significant feature is the ability to include assembly code
directly in an .XPL file by using the new command word 'asm'. This is
explained in the file XPLPX.TXT.

Fixed a bug in the 'fix' command word. It gave the wrong result when it
was used in complex expressions such as "CT(3):= fix(60.0-Am) + 5".

The literal characters "^Z" are now allowed inside strings. These are
used to display the right-arrow character. Previously, "^Z" was converted
to an actual control-Z, which is the end-of-file marker, which terminated
the compile with error 63.

Made small improvements to the code generated for arrays.

Fixed a bug in the compiler where DO_IMM_STK called OPSTRING without
passing all the proper arguments. This is mostly academic since it
probably could not be detected by a user's program.

Fixed a bug in the Point and Line intrinsics that would display a few bad
pixels when using 24-bit color on some computers.

The Line intrinsic now displays the correct color when using graphic
modes with more than 8 bits of color. For those modes, dashed lines are
defined by bits 31..24 instead of the usual bits 15..8.

Allow extended VESA modes $180..$1FF.

Fixed a bug in the SoftInt intrinsic where the wrong status flags were
returned.

Fixed a similar bug in the Chain intrinsic that returned the wrong status
for the carry flag.

                                  - - -

XPLPX version 3.3.3 fixes a bug introduced in (actually unmasked by)
version 3.3.2. An example of this bug is: S(I-1), where S is type
character and I is the control variable of a 'for' loop. Any value other
than "1" does not have this problem, for example: S(I-2). Still, this is
a rather severe bug, and I'm embarrassed that it slipped past my battery
of tests. (The compiler would not even correctly compile itself.)

STDLIB.XPL has been updated to work better with 32-bit XPL0. The changes
mask off the high 16 bits that are returned when calling some DOS and
BIOS interrupt routines. These bits are undefined.

A second collection of library routines, called LIB2.XPL, is included.
These are less official than STDLIB, but are often handy. Use them as
guidelines and modify them as needed.

                                  - - -

XPLPX version 3.3.4 has the following changes:

The command word 'sqrt' has been added. This does the same operation as
the Sqrt intrinsic, but is slightly faster and works for both real and
integer arguments (like 'abs' does).

The way text is written to VESA graphic screens (modes >= $100) has been
improved. Device 6 displays an 8x16 font, and device $106 displays an 8x8
font. Characters can be positioned to any pixel location using the Move
intrinsic - they're no longer limited to just character cell boundaries.
(The Cursor intrinsic still works like it always did.)

The Attrib intrinsic supports more colors for device 6. For 16-bit (and
15-bit) graphic modes the high word (16 bits) of the argument specifies
the background color, and low word specifies the foreground color. For
24-bit graphic modes the background is always black, and the argument
specifies the foreground color.

A bug in nVidia cards is bypassed: The Attrib intrinsic could not set the
background color for text when using graphic mode $13.

The Point intrinsic clips pixels that are drawn beyond the screen
boundaries when VESA graphic modes are used.

The ReadPix intrinsic works for ATI graphic cards in VESA modes.

A bug where a program had more than 64K of constant arrays or strings has
been fixed. File I/O or the SetVid intrinsic could overwrite the data.

A bug that could occur when dividing by a 'for' loop control variable has
been fixed.

An EOF error is no longer flagged for null strings ("") when conditional
compile is false.

                                  - - -

The most significant change to XPLPX in version 3.3.5 is that variables
can be declared after procedures. This makes it easier to break programs
up into separate files, which can make them more modular and easier to
understand and manipulate.

For example, often there is a group of procedures that share common
variables that are not used by the rest of the program. You can now put
these procedures and their variables into a single file and 'include' the
file in the main body of the program. Previously, these global variables
had to be declared at the beginning of the program.

Another advantage of this feature is that you can now declare a variable
immediately above the Main procedure if that's the only place it's used.
For example, if you use "I" as an index in Main, it's nice if "I" is not
global to the entire program where it might mistakenly be used by a
nested procedure.

Sometimes it's convenient to think in terms of binary instead of hex. The
percent sign can now be used to represent a binary number. For example,
%10011100 is the same value as $9C.

Because binary numbers can blur into a meaningless string of 1's and 0's,
underlines can be used to separate them (%1001_1100 = $9C).

For consistency underlines can be inserted into any number. For example,
$12_34, or 123_456.78. The underlines are simply ignored by the compiler.
Underlines may also appear in any number read in by the intrinsics IntIn,
HexIn and RlIn.


The command word "sq" has been added, which squares the value of its
argument. Like sqrt, it works on both reals and integers. For example,
sq(5) = 25.

The command words: abs, sq, sqrt, swap, fix, and float can be used in
defines (with constant values).


A new intrinsic called GetTime returns the current time in microseconds.
It uses the 1.19 MHz system clock (based on the 8253 chip) in the PC. A
microsecond delay routine can easily be made with it like this:

        proc    Delay(D);
        int     D;  \number of microseconds to delay
        int     T;
        begin
        T:= GetTime;
        repeat until GetTime-T >= D;
        end;

(Don't be tempted to write "GetTime >= T+D" which won't work with signed
arithmetic.)
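
The reason can be shown with a Python model of signed 32-bit
arithmetic. This is a sketch only. It assumes GetTime returns a 32-bit
count that wraps around, and the start value is made up to sit just below
the wrap point.

```python
def wrap32(x):
    """Wrap an integer to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x

def done_by_difference(now, start, delay):
    """Models: GetTime-T >= D"""
    return wrap32(now - start) >= delay

def done_by_sum(now, start, delay):
    """Models: GetTime >= T+D (the tempting but wrong form)"""
    return now >= wrap32(start + delay)

start = 0x7FFFFF00            # hypothetical reading just below the wrap
delay = 1000
now = wrap32(start + 10)      # only 10 microseconds later

print(done_by_difference(now, start, delay))   # still waiting, as it should
print(done_by_sum(now, start, delay))          # T+D wrapped negative: exits early
```

The difference form keeps working across the wrap because the
subtraction wraps in the same way the counter does.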


Another new intrinsic is Backup. This enables the last byte read in from
the input stream to be re-read. It's like C's ungetc function. Backup is
handy, for example, when you want to provide the opportunity for a user
to change a number or accept its current value. If a new number is typed
in, you change it; but if the Enter key is struck you don't change it.

        if ChIn(0) # CR then 
                begin
                Backup;
                Number:= IntIn(0);      \change Number
                end;


                                  - - -

There were no significant changes made in version 3.3.6.

                                  - - -

The main new feature in version 3.3.7 is the decrementing 'for' loop.
This uses the new command word 'downto'. What was previously kludged
using negative values can now be written in a straightforward way, for
example:

 for I:= -10, -1 do    [IntOut(0,-I); CrLf(0)];	\old method
 for I:= 10 downto 1 do [IntOut(0,I); CrLf(0)];	\new method

For symmetry, a normal incrementing for loop can use the new command word
'to' in place of the comma, for example:

 for I:= 1, 10 do   [IntOut(0,I); CrLf(0)];	\existing method
 for I:= 1 to 10 do [IntOut(0,I); CrLf(0)];	\generates exact same code


There is a new shift operator that does an arithmetic shift right instead
of a logical shift right. Its symbol is "->>". An arithmetic shift can be
used to quickly divide integers by powers of 2. This works in a simple
way for positive integers, but does not give the exact same value as a
divide for negative integers. The difference is that if there is a
remainder, an integer divide truncates the quotient toward zero, whereas
the arithmetic shift right truncates the quotient toward negative
infinity. For example:

   27 / 4 = 6
   27 >> 2 = 6
   27 ->> 2 = 6
  -27 / 4 = -6
  -27 >> 2 = 1073741817
  -27 ->> 2 = -7
  -24 ->> 2 = -6
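
The table can be reproduced with a Python model of 32-bit integers.
This is only a sketch. The function names are made up, and Python's own
>> operator (which is arithmetic for negative integers) stands in for
"->>".

```python
def trunc_div(a, b):
    """Integer divide that truncates toward zero, like XPL0's '/'."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def logical_shr(x, n):
    """Models '>>': shift the 32-bit pattern right, filling with zeros."""
    return (x & 0xFFFFFFFF) >> n

def arith_shr(x, n):
    """Models '->>': shift right, copying the sign bit."""
    return x >> n

for value in (27, -27):
    print(value, trunc_div(value, 4), logical_shr(value, 2), arith_shr(value, 2))
```

Note that -24 ->> 2 agrees with -24 / 4 because there is no remainder to
round.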


There are some new command-line switches for the XPLPX compiler.

/W displays warning messages. These are handled like error messages but
they don't discard the compiled output file (.asm).

/C makes identifier names (variables, procedure names, etc.) case-
sensitive. For example "Frog" would not be the same name as "FROG". This
switch is intended for making sure that names are capitalized
consistently, rather than for allowing separate names to be declared
with different capitalization. Of course it's your decision on how to use
it.

/I is used to output intermediate code (I2L), which is used for debugging
the compiler. These codes were previously output with the /C switch.


A couple of minor bugs have been fixed. The Clear intrinsic now sets the
text cursor to 0,0 when using VESA graphic modes. The Restart intrinsic
now works for programs running under Windows, as well as under DOS.

A few minor optimizations, such as better handling of zero constants and
streamlining byte indexing, have been added to the code generator.


-Loren Blaney
