	Octa-assembler 0.14

	Octasm is a multi-pass assembler for x86 processors.
It requires at least an 80386 CPU with an 80387 coprocessor, and DOS.
Octasm only outputs plain binary files.
This document explains only the special features of Octasm; the reader is
assumed to already know the x86 instruction set.
_______________________________________________________________________
	What's new in version 0.15
	Complete IA-32 instruction set.
	The character '.' is now interpreted as a letter and can be used in names.
	New version of 'atob': conversion from ASCII to binary is now
	faster and more accurate, and the character '_' can be used to make
	numbers more readable. Example: 0000_1001_0000_1000_b
_______________________________________________________________________
	What's new in version 0.14
	Some bug fixes.
	Multiline comments with ';;' at the beginning of the text line.
Example:

;;      comment begin here
	and
	finish
;;      here


	'define' can now have parameters; there are many examples
in the files 'keyw.asm' and 'r'.

_______________________________________________________________________
	What's new in version 0.13

	Some bug fixes. Filenames are now relative to the current source
directory, not to the working directory. Blocks '{ }' save and restore the
current section. In version 0.11 you had to write something like this:
	code { data dd 0 } code
now with version 0.13:
	code { data dd 0 }

	New directives
'define name text' assigns the text up to the end of the line to 'name'; later
in the source, 'name' will be replaced by 'text'. 'name' must
be defined before it is used and cannot be redefined in nested
directories. Example:

 {
 define uno 1
	{ #uno  ;error
	define dos 2
	}
	{
	#dos   ;correct, this is another directory
	}
 }

'enum inc,name1,name2...namen': this instruction can be used to
define variables; it is the same as:
   #name1 rb inc .......... #namen rb inc

Example that uses 'define' and 'enum' to easily declare stack
variables for subroutines:

use16
prueba(1,2,3) ret

define LOCALS   pusha enter ->1,0 virtual org >1->2 # enum 2,  ;2-byte variables, at least one is required
define PARAMS   # rb 20 enum 2,
define CODE     #RETURNS code
define RETURN   [bp+16]=ax leave popa retp RETURNS-20

#prueba{    ;'{' is required to avoid duplicate name definitions
	LOCALS x,y,z          ;local variables
	PARAMS p1,p2,p3       ;parameters passed by the calling procedure
	CODE
	w[bp+x]=10 w[bp+y]=20 w[bp+z]=30   ;initialization of local variables
	ax=[bp+p1] add ax,[bp+y]           ;do something
	RETURN                             ;return result in ax
					   ;other registers are preserved

       }

;subroutine 'prueba', using the old syntax:

	#prueba{
		virtual
		org -6 #x rb 2 #y rb 2 #z rb 2
		rb 20 #p1 rb 2 #p2 rb 2 #p3 rb 2
		code
		pusha enter 6,0
		w[bp+x]=10 w[bp+y]=20 w[bp+z]=30
		ax=[bp+p1] add ax,[bp+y]
		[bp+16]=ax leave popa retp 6
		}
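
The offsets that 'org', 'rb' and 'enum' produce in the old-syntax version of
'prueba' can be checked with a small Python model. This is only a sketch of
the arithmetic described above, not Octasm's own code; the function name
'enum_offsets' is hypothetical. The 20 skipped bytes are consistent with a
16-bit program: 2 bytes for bp saved by 'enter', 16 bytes for 'pusha' and
2 bytes for a near return address.

```python
def enum_offsets(origin, inc, names):
    """Model of 'enum inc,name1,...': each name gets the current offset,
    then the offset advances by inc (like '#name rb inc')."""
    offsets = {}
    for name in names:
        offsets[name] = origin
        origin += inc
    return offsets, origin

# 'org -6' followed by three 2-byte locals: x, y, z
local_vars, pos = enum_offsets(-6, 2, ["x", "y", "z"])
# 'rb 20' skips the area between the locals and the parameters
pos += 20
# three 2-byte parameters: p1, p2, p3
params, pos = enum_offsets(pos, 2, ["p1", "p2", "p3"])

print(local_vars)  # offsets relative to bp, all negative
print(params)      # offsets relative to bp, all positive
```

With these offsets '[bp+x]' addresses the first local and '[bp+p1]' the first
pushed parameter.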



	The 'R' program has been updated and now also prints
the floating point registers.
	Added a few example programs:
>octasm \off                    ;turn off the computer
calc.bat 1+2*4.43               ;calculates the numeric expression (without spaces)
_________________________________________________________________________

	First I will show some examples of using Octasm.
ex1:
>octasm dx="hello world$" ah=9 int 21h
ex2:
>octasm org 100h file_out \hello.com dx="hello world$" ah=9 int 21h ret

The first example is a program that is assembled
in RAM and then executed.
In the second example, the program 'hello.com' is written to disk.
The next example requires at least 150 KB of free memory, and the CPU must be
in real mode.

ex3:
>octasm \octasm.asm

This will include the source file 'octasm.asm'.
Source files must start with the string: 'LANG OCTASM,0.1'
Octasm is case sensitive and most instructions are lower case.
Intel-like syntax is used, with some exceptions.
The char ';' marks the end of an instruction line.
An instruction that always takes a single operand can be written with multiple
operands; Octasm repeats the same instruction for every operand.
ex: 'pop ax,5,[bx]'   is assembled as 'pop ax pop 5 pop [bx]'
The instruction 'push' is the exception, because it takes its operands
from right to left.
ex: 'push ax,5,[bx]'  is the same as 'push [bx] push 5 push ax'
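
The expansion rule for one-operand instructions can be modelled in a few
lines of Python. This is a sketch of the rule as stated above, not of how
Octasm implements it; the function name 'expand' is hypothetical and the
operand list is split naively on commas.

```python
def expand(mnemonic, operands):
    """Expand a one-operand instruction written with several operands."""
    ops = [op.strip() for op in operands.split(",")]
    if mnemonic == "push":
        ops.reverse()  # 'push' takes its operands from right to left
    return " ".join(f"{mnemonic} {op}" for op in ops)

print(expand("pop", "ax,5,[bx]"))
print(expand("push", "ax,5,[bx]"))
```

The reversed order for 'push' is what lets a following 'pop' with the same
operand list restore the values in matching order.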
A few instructions can be written using keywords or like this:
ex:             is the same as:
eax=0           mov eax,0
ax=[bx+2]       mov ax,[bx+2]
ax=bx+2         lea ax,[bx+2]
es=ss           push ss pop es
es=5            push 5  pop es
d[bx]=[esp+4]   push d[esp+4] pop d[bx]
eax=ah          movzx eax,ah
f1()            call f1    ;without spaces
f1( 1,2,3 )     push 1,2,3 call f1


Names:
	In Octasm names can be numbers, offsets and directories;
the same name can be used for a directory and a number/offset.
examples:
n1=10           ;now 'ax=n1' is equivalent to 'ax=10'
#l1             ;offset/label definition, 'l1:' in other assemblers
#               ;anonymous label
{               ;anonymous directory, names defined in this dir.
		;can be used only in this dir.
}               ;close dir.
d1{             ;definition of directory 'd1'
  #l1           ;complete pathname is d1\l1
  #l2{          ;definition of directory and offset
     #l2 { }    ;offset + anonymous directory
     l1()       ;if 'l1' is not defined in the current directory it will be
		;searched for in the parent directories
     d1=5       ;pathname 'd1\l2\d1'
     ax=d1      ;ax=5
     }
  ax=d1         ;ax=2
  ax=l2\d1      ;ax=5
  l2\l2()       ;outside of directory 'l2' the pathname is required
  jmp <1        ;error: '#' is not defined in this dir., only in the parent
  jmp >1        ;jmp forward
  #
  }
d1=2            ;'d1' is already defined as a dir. and now is also a number.
jmp <1          ;jmp back
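
The lookup rule illustrated above (search the current directory first, then
the parents, and keep directories and numbers in separate name spaces) can be
sketched in Python. This is a simplified model under assumptions, not
Octasm's symbol table: the class 'Directory' and its methods are
hypothetical, and anonymous labels and multi-pass resolution are left out.

```python
class Directory:
    """A scope holding numbers/offsets and named child directories."""

    def __init__(self, parent=None):
        self.parent = parent
        self.values = {}    # numbers and offsets defined in this directory
        self.subdirs = {}   # child directories, kept apart from values

    def subdir(self, name):
        return self.subdirs.setdefault(name, Directory(self))

    def lookup(self, path):
        """Resolve 'a\\b\\name' here, then in each parent directory."""
        *dirs, name = path.split("\\")
        scope = self
        while scope is not None:
            target = scope
            for d in dirs:
                target = target.subdirs.get(d)
                if target is None:
                    break
            if target is not None and name in target.values:
                return target.values[name]
            scope = scope.parent
        raise KeyError(path)

root = Directory()
d1 = root.subdir("d1")
l2 = d1.subdir("l2")
l2.values["d1"] = 5      # 'd1=5' inside d1\l2
root.values["d1"] = 2    # 'd1=2' at top level

print(l2.lookup("d1"))       # seen from inside d1\l2
print(d1.lookup("d1"))       # seen from inside d1
print(d1.lookup("l2\\d1"))   # explicit pathname from inside d1
```

The three lookups correspond to the lines 'ax=d1' inside 'l2', 'ax=d1' inside
'd1' and 'ax=l2\d1' in the example.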

Numbers:
Numbers are case insensitive.
Hexadecimal numbers end with 'h' and must start with '0' if the first char
is a letter.
ex:  43h 0123h
Binary numbers end with 'b'. ex: 1101b
Floating point numbers should contain '.'
ex:        1.0e34  -0.0034E+3
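
The integer formats above, together with the '_' separator mentioned for
version 0.15, can be modelled by a short Python function. This is a sketch of
the notation only; it is not the 'atob' routine, the name 'parse_int' is
hypothetical, and floating point numbers are not handled.

```python
def parse_int(literal):
    """Parse an Octasm-style integer: decimal, 'h' suffix or 'b' suffix."""
    text = literal.replace("_", "").lower()  # '_' is only a visual separator
    if text.endswith("h"):
        return int(text[:-1], 16)
    if text.endswith("b"):
        return int(text[:-1], 2)
    return int(text, 10)

for literal in ("43h", "0123H", "1101b", "0000_1001_0000_1000_b"):
    print(literal, "=", parse_int(literal))
```

The 'h' suffix is tested first because 'b' is also a valid hexadecimal digit.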

Files:
	Filenames should begin with '\' and should end with ' ' or a new line;
other chars are valid, including ';'.
examples:
file:                   is written:
test.asm                \test.asm
\test.asm               \\test.asm
c:test.asm              \c:test.asm
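
The table above amounts to prefixing the DOS name with one '\'. A minimal
Python sketch of that rule follows; the function name 'octasm_filename' is
hypothetical, and rejecting names that contain a space is an inference from
the rule that a space ends the filename.

```python
def octasm_filename(dos_name):
    """Return the form in which a DOS filename is written in Octasm source."""
    if " " in dos_name or "\n" in dos_name:
        # a space or new line would end the filename early
        raise ValueError("filename cannot contain a space or new line")
    return "\\" + dos_name

for name in ("test.asm", "\\test.asm", "c:test.asm"):
    print(name, "->", octasm_filename(name))
```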
Files can be used as operands; Octasm opens the file in mode r/w/create.
ex:
eax=\file.pcx          ;'eax=16' if 16 is the handle returned by the OS
eax=\file.pcx          ;'eax=17' the file is opened a second time

Strings:
	The characters ' and " are used to delimit text strings.
A single quote ' means that the string is placed in the instruction;
a double quote " stores the string in 'section text' and returns a pointer.
examples:
		eax='A'+1 ret
is equal to     eax=65+1 ret
		eax="hello world" ret
is equal to     eax=label ret #label db 'hello world'
eax='hello word'    ;erroneous because the string has more than 32 bits
Two commands can be used inside a string:
%nb       returns the number of bytes to the end of the string, 255 max;
	  must be followed by one space.
%0-255    used to represent any char; if followed by a space, the
	  space is suppressed.

example: db '%nb %13%10 %nb '     equals:  db 2,13,10,0
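
The two string commands can be modelled in Python as below. This is a sketch
under an assumption: each '%nb' is taken to count the bytes up to the next
'%nb' or the end of the string, which reproduces the example above but is
not spelled out in the text. The function name 'encode' is hypothetical.

```python
import re

# '%nb ' (with its mandatory space), '%<number>' with an optional space
# that is dropped, or any other single character
TOKEN = re.compile(r"%nb |%(\d{1,3}) ?|(.)", re.DOTALL)

def encode(text):
    """Return the list of byte values for a string using %nb and %0-255."""
    out = []  # byte values; None marks a count still to be filled in
    for m in TOKEN.finditer(text):
        if m.group(0) == "%nb ":
            out.append(None)
        elif m.group(1) is not None:
            value = int(m.group(1))
            if value > 255:
                raise ValueError("char code out of range")
            out.append(value)
        else:
            out.append(ord(m.group(2)))
    result = list(out)
    for i, byte in enumerate(out):
        if byte is None:
            count = 0
            for later in out[i + 1:]:
                if later is None:   # assumption: stop at the next %nb
                    break
                count += 1
            if count > 255:
                raise ValueError("%nb count exceeds 255")
            result[i] = count
    return result

print(encode("%nb %13%10 %nb "))
```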

Arithmetic:
	Octasm uses the math coprocessor to calculate expressions, so it
is important to know how numbers are rounded:
in Octasm 3/2=2, in other assemblers 3/2=1,
because 3/2=1.5 and 1.5 is rounded to 2.
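
The difference between the two behaviours can be reproduced in Python. This
is only an illustration: the function names are hypothetical, and Python's
'round' resolves exact ties to the even neighbour, while the text above only
documents the case 1.5 -> 2, so the tie rule used by Octasm is an assumption
here.

```python
def octasm_div(a, b):
    """Divide in floating point, then round to the nearest integer."""
    return round(a / b)

def truncating_div(a, b):
    """Integer division as done by assemblers that truncate the result."""
    return int(a / b)

for a, b in ((3, 2), (7, 4), (9, 3)):
    print(f"{a}/{b}: rounded={octasm_div(a, b)} truncated={truncating_div(a, b)}")
```

When a truncated result is needed, it is safer to write the expression so
that the division is exact.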

Anonymous labels '#':
	previous labels '<1' max 7
	next labels     '>1' max 8
examples:  jmp >2 jc >1 # jc <1 #
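
How the references in the example line pick their target can be shown with a
small Python model that works on a list of source items. It is a sketch under
assumptions: the function name 'resolve' is hypothetical, 'max 7' and 'max 8'
are read as the largest allowed counts, and directory scoping of '#' is
ignored.

```python
def resolve(items, index):
    """Return the position of the '#' that the reference at 'index' targets."""
    ref = items[index].split()[-1]          # e.g. '>2' or '<1'
    direction, n = ref[0], int(ref[1:])
    if direction == "<":
        limit = 7
        candidates = [i for i in range(index - 1, -1, -1) if items[i] == "#"]
    elif direction == ">":
        limit = 8
        candidates = [i for i in range(index + 1, len(items)) if items[i] == "#"]
    else:
        raise ValueError("not an anonymous label reference")
    if not 1 <= n <= limit:
        raise ValueError("label count out of range")
    if n > len(candidates):
        raise LookupError("no such anonymous label")
    return candidates[n - 1]

items = ["jmp >2", "jc >1", "#", "jc <1", "#"]
for i in (0, 1, 3):
    print(items[i], "-> item", resolve(items, i))
```

In the example 'jmp >2' targets the second '#', while 'jc >1' and 'jc <1'
both target the first one.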

Pointers:
	Pointers are memory locations.
	Brackets are used, ex: '[bx+5]' '[5*ebx+es+5]' 'd[cs+128]'
	The operand size must come before the brackets,
	ex: 'd[bx+5]'
	The operand size must be given only if it is not implied by the instruction.
	Octasm can use 32-bit offsets in 16-bit programs if required.
	ex: '[12]'  ; 8 bits
		 '[1234]' ;16 bits in 16-bit programs
			  ;or 32 bits in 32-bit programs
		'[123456]'; 32 bits


Instructions 'call' and 'jmp' :
	The call operator '()' can only be used if the operand is a name.
examples:  f1() f2( ax,bx )
Operands are pushed from right to left.
In other situations use 'call'.
ex:   call >1 ret  #
      call far [bx]
      call far d[bx]  ;32-bit offset + 16-bit segment
      jmp far w[56]   ;16-bit offset + 16-bit segment
      call ax
      jmp 1,2        ; offset,segment not segment:offset
      call d 1,8     ;32-bit offset + 16-bit segment


Directives:

b,w,d,q  operand size is 1,2,4,8 bytes
	 write 'push w(4+5)' instead of 'push w 4+5'

far      offset+segment

db,dw,dd,dq  data declarations for numbers of 1,2,4,8 bytes
	      'db' is also used for strings

float,double,dt   data declarations for floating point numbers 4,8,10 bytes

rb number         reserves 'number' bytes filled with zeros

section number(0-15)
		  a section is the memory space where the code
		  will be assembled;
		  by default four sections are initialized:
 code      section 0
		  when the program is assembled to RAM
		  this section uses the same code segment
		  as Octasm, and its maximum size is about 44 KB.
		  The instruction 'align' pads with 'nop' in this section
		  and with zeros in other sections.
 data      section 2
		  when the program is assembled to RAM this section
		  is allocated in the data segment,
		  which is limited to 4 GB.
 text      section 1
		  the same as the data section; this section
		  is also used to store double-quoted strings.
 virtual   section 3
		  this section is only used to calculate offsets;
		  no data is written in this section,
		  like the '?' operator in some assemblers.

  examples:
  code    section 5     ;defines section 5 as code section
  text    section 5     ;now strings will be stored in section 5
			;that is also a code section
  data                  ;predefined section 2
  text                  ;set section 5
  section 4             ;defines section 4 like section 5 (code,strings)
  data                  ;set section 2
  text                  ;set section 4
		   Sections are assembled in numerical order (0->15)
		   and are written to files in the same order.
		   When assembling to RAM, sections can be placed anywhere.


org number     sets the instruction pointer used in offset calculations;
	       the default value is zero.
	       This directive must be used only
	       when assembling to files.
	       ex: org 100h
		   #
		   inicio=5000
		   org inicio+45
		   org <1        ;org 100h


align number   fills with zeros or 'nop' until the instruction pointer
	       is a multiple of number
	       ex:
	       align 4                ok
	       align 7c00h+510        ok
	       align >1 #             error
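
The number of filler bytes that 'align' emits follows directly from its
definition. The Python sketch below shows the arithmetic only; the function
name 'align_padding' is hypothetical and this is not taken from Octasm's
source. It also shows why 'align 7c00h+510' is accepted: with the instruction
pointer below that value, the directive pads exactly up to offset 7c00h+510.

```python
def align_padding(ip, number):
    """Bytes needed to advance ip to the next multiple of number."""
    return (-ip) % number

print(align_padding(0x101, 4))                # ip 101h, 'align 4'
print(align_padding(0x100, 4))                # already aligned
print(align_padding(0x7C14, 0x7C00 + 510))    # 'align 7c00h+510'
```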

include file,position,number of bytes to copy
	       used to copy binary files inside the program
	       position and number of bytes are optional
     examples:
  >octasm file_out \ab include \a include \b
    same as the DOS command '>copy /b a + b ab'
    The next example copies a file into the program,
    starting at position 54:
    ex:  include \img.bmp,54

file_out file
	 sets the file where the assembled data must be written;
	 can be used in every section (except virtual)
	 examples:
	 section 0                      ;assembled to ram memory
	 section 1                      ;assembled to ram
	 section 2  file_out \prog.com  ;assembled to 'prog.com'
	 section 3                      ;virtual , no code output
	 section 4  file_out \prog.dat  ;assembled to  'prog.dat'
	 section 5                      ;assembled to  'prog.dat'
					;etc...

file_sym file
	sets the file in which to store the symbol table,
	a file that defines all the names used in a program.
	It can be used only once.
	Symbol tables are useful for
	incremental compilation, overlays and debugging.
	When assembling to RAM the program works like an Octasm
	overlay, and Octasm code can be used by including the
	symbol table.
	example:
	lib{ \octasm.sym } ;'octasm.sym' is generated from 'octasm.asm'
			   ;the use of the 'lib' directory prevents
			   ;name conflicts
	ecx=40 lib\malloc() ;allocate 40 bytes of memory see 'memlib.asm'

	The 'usexx_xx' instructions should be used only once in every
	section and before other instructions.
use16   16-bit program; 32-bit instructions will be coded with prefixes
use32   32-bit program
use16_32  unreal mode program; this is the default option.
	  It is like use16, but instructions with implied registers will
	  be coded in the 32-bit form:
	  lods  (lodsb lodsw lodsd)
	  movs stos scas cmps ins outs
	  loop

Assembling programs to RAM
	Octasm allows direct execution of source code without writing
	executable files to disk. Programs run as Octasm overlays:
	code sections are allocated in the code segment and the others in
	the data segment (ds=cs+1000h).
	When the computer is in real mode, Octasm changes the segment
	limits (es,ds,ss,fs,gs) to 4 GB, and 32-bit addresses are
	required for data.
	Programs must end with a 'ret' instruction, so that Octasm can return
	the used memory to DOS.
	If the computer is not in real mode, Octasm uses only
	64 KB of memory for the data segment; 16-bit pointers can then be
	used, but only small programs can be assembled.

Assembling programs to files
	By default Octasm initializes with: 'org 0 use16_32'
	To make a '.com' program these instructions are required:
	org 100h use16
	and instructions like 'eax=\file ' should not be used.

R       File 'R' is a program that can be used to learn assembly by
	executing a few instructions and printing the contents of the registers.
	Registers are saved in 'regs.dat'.
	usage: >octasm \r instructions
	esp must remain unchanged.
	examples:
>octasm \r eax=1 ecx=4
>octasm \r shl eax,cl
>octasm \r add eax,ecx
	now eax=20

________________________________________________________
	Bugs
	There are a few bugs that require important changes
in the source and will be corrected later.
	eax=dir dir\v1()  ;Octasm crashes if this is the first time 'v1' is used
	dir2(dir2\v1)     ;same problem
The solution is to put the call alone on another line:
	eax=dir
	dir\v1()
or to insert another keyword in between:
	push dir2\v1 call dir2
_______________________________________________________
for updates or more information visit my web page:
http://octavio.vega.fernandez.googlepages.com/octaos

