This is an UNREVIEWED and UNGUARANTEED document describing *some* of
the aspects of djasm.  We think it might describe djasm, but then it
might not.

DJ

Date: Fri, 31 Jul 1998 14:40:02 +1200
From: Bill Currie
Organization: Telecommunication Systems Support Centre
To: djgpp-workers@delorie.com
Subject: djasm documentation

Here is a rough beginning of some documentation for djasm.  It covers
most (if not all) of the directives (their usage being the hardest part
to glean from the sources) but skimps in a lot of other places, eg
instruction syntax, differences from standard Intel format, etc.  The
`.obj' format is still undocumented as its support is still incomplete,
but all other output formats are covered.

The doc is in plain text because I don't know how to write texinfo
(probably not too hard to learn, but I'm lazy:).

Comments, patches, flames etc are all welcome and desired.

Bill
--
Leave others their otherness


djasm infile [outfile] [mapfile]

`outfile' defaults to `infile.exe' after dropping `infile's extension.
The extension of `outfile' determines the output format.

Supported output extensions:

`.exe'
        Normal dos `MZ' executable.  This is the only format that
        supports the `.copyright' directive.  Does not support more
        than one segment.
`.com'
        Normal dos com file.  Program offsets start at 0x100.
`.bin', `.sys'
        Binary files starting at offset 0.  `.sys' is supported so that
        device drivers can be built directly, rather than renaming the
        output file.
`.h'
        Produces a C compilable array of bytes (without the declaration
        and `{}'s).
`.inc'
        Produces a `.db' array for other djasm `.asm' files.
`.ah'
        Produces a `.byte' array for gas `.s' or `.S' files.

For `.h', `.inc' and `.ah' output types, use the `.type' directive to
determine which binary type to emit.

Supported output images (set with `.type').  Rows are the extension of
`outfile'; columns are the `.type' argument:

        ext/typ exe     com     bin     sys     h       inc     ah
        exe     exe     exe*    exe     exe     exe     exe     exe
        com     0       com     0       0       0       0       0
        bin     bin     bin*    bin     bin     bin     bin     bin
        sys     bin     bin*    bin     bin     bin     bin     bin
        h       exe     com     bin     bin     exe     bin     bin
        inc     exe     com     bin     bin     exe     bin     bin
        ah      exe     com     bin     bin     exe     bin     bin

        * first 0x100 bytes of `program' are 0x90 (nop) then generated
          code.

NOTE: sys is just a synonym for bin.


Assembler Directives:

`.align'
        usage: .align BOUNDARY [, FILL]

        Causes djasm to emit bytes (if necessary) so that the `pc'
        address is a multiple of BOUNDARY (ie: pc % BOUNDARY == 0).
        FILL, which defaults to 0x90 (nop), specifies what bytes to
        emit.

        eg:
        .align 4        ; align on a four byte boundary, using 0x90 to
                        ; fill the spaces
        .align 16,0     ; align on a 16 byte boundary, filling the
                        ; space with 0's

`.bss'
        usage: .bss

        Signifies the end of generated bytes.  No more bytes will be
        emitted into the image, but space will be reserved, so labels
        can be declared after this point to create uninitialized
        variables.

`.copyright'
        usage: .copyright STRING

        Puts a copyright message (STRING) into the header of the
        executable (`.exe' output only).  STRING is a standard C
        string.  Multiple `.copyright' directives concatenate.

`.db'
`.dd'
`.dw'
        usage:
        .db dblist
        .dw dwlist
        .dd ddlist

        dblist  : dbitem
                | dblist ',' dbitem
                ;
        dbitem  : const
                | STRING
                | const .dup const
                ;
        dwlist  : dwitem
                | dwlist ',' dwitem
                ;
        dwitem  : const
                | UID offset
                | const .dup const
                ;
        ddlist  : dditem
                | ddlist ',' dditem
                ;
        dditem  : const
                | UID offset
                | const .dup const
                ;

`.dup'
        See above.
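
The padding rule given for `.align' (emit FILL bytes until
pc % BOUNDARY == 0, with FILL defaulting to 0x90) can be modelled in a
few lines.  This is only a sketch of the arithmetic in Python, not
djasm's own code; the function name `align_padding' is made up for the
illustration.

```python
def align_padding(pc: int, boundary: int, fill: int = 0x90) -> bytes:
    """Return the fill bytes needed to move pc up to a multiple of boundary.

    Hypothetical helper modelling the documented `.align' rule; djasm's
    real implementation may differ in details such as error handling.
    """
    if boundary <= 0:
        raise ValueError("boundary must be positive")
    # Distance from pc to the next multiple of boundary (0 if already there).
    count = (-pc) % boundary
    return bytes([fill]) * count


# `.align 4' at pc = 5 needs three nops; at pc = 8 it needs nothing.
print(align_padding(5, 4).hex())
print(len(align_padding(8, 4)))
# `.align 16,0' at pc = 0x101 pads with fifteen zero bytes.
print(len(align_padding(0x101, 16, 0)))
```

Note that an already aligned `pc' yields no padding at all, which
matches the "(if necessary)" in the description.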
`.id'
        usage: .id

        Emits an RCS id of the form:
        $Id: djasm.txt,v 1.2 2001/01/17 19:42:12 jtw dead $
        @(#) foo.asm built 10/04/97 14:03:30 by djasm

`.include'
        usage: .include "foo.inc"

        Include a text file for assembly.

`.linkcoff'
        usage: .linkcoff "cofffile.o"

        Link in a gas produced 32 bit coff file (eg the output of
        `gcc -c foo.c').  Supports only i386 coff files with `.text',
        `.data' and `.bss' sections.  These must be the first 3
        sections in the coff file (in that order).  Any other sections
        will be ignored.  These three sections will be placed into the
        output image in the same order as in the coff file.

        The `.bss' section of the coff file will be placed into the
        output image by emitting the appropriate number of zeros.  This
        means that the `.bss' section from the coff file will occupy
        space in the output image, rather than being implicit.

`.org'
`.stack'
`.start'
`.type'

`.struct'
`.union'
`.ends'
        .struct FARPTR
        offs    .dw
        seg     .dw
        .ends

        Defines the following symbols:
        FARPTR      = 4 ; size of FARPTR structure
        FARPTR.offs = 0 ; offset of `offs' in the `FARPTR' structure
        FARPTR.seg  = 2 ; offset of `seg' in the `FARPTR' structure

        If `.union' had been used instead of `.struct', the above
        symbols would have the following values:
        FARPTR      = 2 ; size of FARPTR union
        FARPTR.offs = 0 ; offset of `offs' in the `FARPTR' union
        FARPTR.seg  = 0 ; offset of `seg' in the `FARPTR' union

        Summarising, `.struct' is the same as `struct' in C, while
        `.union' is the same as `union' (as should be obvious:)

        Nested structures are supported.
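
The symbol values listed for `.struct' and `.union' follow a simple
rule: a struct member starts where the previous one ended, a union
member always starts at 0, and the structure's own symbol is its total
size.  The Python sketch below models that rule, including members
whose type is another structure.  It is an illustration only: `layout',
`SIZES' and the `types' table are hypothetical names, and nothing here
is taken from djasm's sources.

```python
# Byte sizes of the data directives usable as member types.
SIZES = {".db": 1, ".dw": 2, ".dd": 4}


def layout(name, members, types, union=False):
    """Return {symbol: value} for a `.struct' (or `.union' if union=True).

    members: list of (member_name, member_type); member_type is ".db",
             ".dw", ".dd", or the name of a structure already in `types`.
    types:   dict of structure name -> symbol table, updated in place.
    """
    symbols = {}
    offset = 0          # where the next struct member would start
    size = 0            # largest end offset seen so far
    for member, mtype in members:
        start = 0 if union else offset
        symbols[f"{name}.{member}"] = start
        if mtype in SIZES:
            msize = SIZES[mtype]
        else:
            inner = types[mtype]
            msize = inner[mtype]        # the structure symbol holds its size
            for sym, val in inner.items():
                if sym != mtype:
                    # e.g. FARPTR.offs becomes NAME.member.offs
                    suffix = sym[len(mtype):]
                    symbols[f"{name}.{member}{suffix}"] = start + val
        offset = start + msize
        size = max(size, offset)
    symbols[name] = size
    types[name] = symbols
    return symbols


types = {}
farptr = [("offs", ".dw"), ("seg", ".dw")]
print(layout("FARPTR", farptr, {}, union=True))   # every offset is 0, size 2
print(layout("FARPTR", farptr, types))            # offsets 0 and 2, size 4
```

A member declared with a structure type contributes both its own offset
and one symbol per inner member, each shifted by that offset.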
        Using the FARPTR structure (ie the `.struct' example), here is
        an example of nested structures:

        .struct XMS_MOVE
        length          .dd
        src_handle      .dw
        src_offset      .struct FARPTR  ; .union would be just as valid
        dst_handle      .dw
        dst_offset      .struct FARPTR
        .ends

        The above defines the following symbols:
        XMS_MOVE                 = 16   ; size of XMS_MOVE structure
        XMS_MOVE.length          = 0
        XMS_MOVE.src_handle      = 4
        XMS_MOVE.src_offset      = 6
        XMS_MOVE.src_offset.offs = 6
        XMS_MOVE.src_offset.seg  = 8
        XMS_MOVE.dst_handle      = 10
        XMS_MOVE.dst_offset      = 12
        XMS_MOVE.dst_offset.offs = 12
        XMS_MOVE.dst_offset.seg  = 14

        Note that `XMS_MOVE.src_offset' and `XMS_MOVE.src_offset.offs'
        share the same offset, as do `XMS_MOVE.dst_offset' and
        `XMS_MOVE.dst_offset.offs'.  This is so that the offset of the
        nested structure or union can be used directly.

        Also, when declaring structure or union members of a structure
        (or union), `.struct' and `.union' can be used interchangeably
        without changing the layout of the structure: `.struct' can be
        used when inserting unions and `.union' can be used when
        inserting structures.  For readability's sake it is recommended
        that the appropriate directive is used, but there is no
        enforcement of this (mostly because there's no way of detecting
        it :).

        Variables can be declared using the structures in the following
        manners (NOTE: the above note about `.struct' and `.union'
        applies here as well):

        xms_move_packet .struct XMS_MOVE

        This has the same effect as:

        xms_move_packet:
        xms_move_packet.length:                 .dd 0
        xms_move_packet.src_handle:             .dw 0
        xms_move_packet.src_offset:
        xms_move_packet.src_offset.offs:        .dw 0
        xms_move_packet.src_offset.seg:         .dw 0
        xms_move_packet.dst_handle:             .dw 0
        xms_move_packet.dst_offset:
        xms_move_packet.dst_offset.offs:        .dw 0
        xms_move_packet.dst_offset.seg:         .dw 0

        xms_move_packet .struct XMS_MOVE (.) ; does not emit any bytes!
        xms_move_packet .struct XMS_MOVE (. - sh_handle_cache)

        These have the same effect as (X=.
        or X=.-sh_handle_cache, respectively):

        xms_move_packet                 = X
        xms_move_packet.length          = X
        xms_move_packet.src_handle      = X + 4
        xms_move_packet.src_offset      = X + 6
        xms_move_packet.src_offset.offs = X + 6
        xms_move_packet.src_offset.seg  = X + 8
        xms_move_packet.dst_handle      = X + 10
        xms_move_packet.dst_offset      = X + 12
        xms_move_packet.dst_offset.offs = X + 12
        xms_move_packet.dst_offset.seg  = X + 14

        The `(.)' form is useful for initializing structures (the
        declaration line would be followed by anonymous .db/.dw/.dd
        lines).  Unfortunately, this is prone to error, but it is
        currently the only way of doing this.  The form with some other
        expression (any expression that djasm can handle as a
        relocation is permitted in the parentheses) is useful for
        declaring structures at some specific address (eg the PSP
        structure or some of the DOS system tables).

`.addrsize'
`.opsize'
`.segcs'
`.segds'
`.seges'
`.segss'
`.segfs'
`.seggs'
        usage:
        .addrsize
        mov ax,[foo]

        These directives all cause the appropriate instruction prefix
        byte to be emitted into the instruction stream.  The `.segXX'
        directives seem most appropriate when used with the string
        instructions (movs, stos, etc).
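
For reference, the standard x86 values of the prefix bytes named by
these directives are tabulated in the Python sketch below.  The byte
values are the usual IA-32 ones; that djasm emits exactly these for
each directive is an assumption, since the text above says only "the
appropriate instruction prefix byte".  `PREFIX_BYTES' and
`apply_prefix' are names invented for the illustration.

```python
# Standard IA-32 prefix bytes, keyed by the djasm directive assumed to
# emit them (the mapping to directives is an assumption, see lead-in).
PREFIX_BYTES = {
    ".addrsize": 0x67,  # address-size override
    ".opsize":   0x66,  # operand-size override
    ".seges":    0x26,  # ES segment override
    ".segcs":    0x2E,  # CS segment override
    ".segss":    0x36,  # SS segment override
    ".segds":    0x3E,  # DS segment override
    ".segfs":    0x64,  # FS segment override
    ".seggs":    0x65,  # GS segment override
}


def apply_prefix(directive: str, instruction: bytes) -> bytes:
    """Return the instruction bytes preceded by the directive's prefix byte."""
    return bytes([PREFIX_BYTES[directive]]) + instruction


# An ES override ahead of movsw (opcode 0xA5).
print(apply_prefix(".seges", b"\xa5").hex())
```

The prefix is simply one extra byte ahead of the instruction that
follows the directive; the sketch does not model how the prefix changes
the decoding of that instruction.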