BBC BASIC assembler

From ARMwiki
Revision as of 03:38, 2 January 2012 by Admin (Talk | contribs)


You can begin writing assembler code on any RISC OS machine straight away. You don't need a development suite or, indeed, anything other than somewhere to save your files. Everything else is built into the machine. You can use !Edit to write your programs, then make use of the assembler built into BASIC to assemble your code.

Introduction to the BASIC assembler

The BASIC assembler is powerful and flexible. You can define constants as BASIC variables, perform multi-pass assembly, offset assembly, and so forth. By intermingling BASIC and assembler, you can perform many of the functions of a macro-capable assembler.
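
As a rough sketch of the macro idea, a BASIC function can emit instructions when it is invoked from inside the assembly. The name FNprint_char is made up for this illustration, and the OPT FN... form is an idiom rather than anything described on this page: the function assembles its own fragment and returns the current pass value, so the OPT that called it is left unchanged. SWI &100 plus a character code is OS_WriteI, which prints that character.

  DIM code% 63
  FOR pass% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    pass%
      OPT    FNprint_char(ASC"O")
      OPT    FNprint_char(ASC"K")
      MOV    PC, R14
    ]
  NEXT
  CALL code%
  END

  REM Hypothetical macro, assembles one OS_WriteI call at P%
  DEF FNprint_char(c%)
  [ OPT    pass%
    SWI    &100 + c%
  ]
  = pass%

Each use of the function expands to one instruction, in much the same way that a macro would in a stand-alone assembler.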

Here is a simple program demonstrating various concepts. Take a look, then we'll dissect the program line by line.

  DIM code% 19
  OS_Write0 = &2
  
  FOR loop% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    loop%
    
      ADR    R0, message
      SWI    OS_Write0
      MOV    PC, R14
      
    .message
      EQUS   "Hello!" + CHR$(0)
      ALIGN
    ]
  NEXT
  
  CALL code%

It is simple: it prints "Hello!" on the screen.

Now let's look at it line by line.

  DIM code% 19

You must allocate memory to assemble into. Because DIM counts from zero, the 19 reserves 20 bytes.
Note that odd things can happen if you overrun this memory, though there is an OPT option to perform bounds checking.

  OS_Write0 = &2

This defines a variable OS_Write0 and sets its value to two.

  FOR loop% = 0 TO 2 STEP 2

In our code, we make use of a forward reference (to message). BASIC, however, cannot resolve a reference to a label that it has not yet seen. Therefore, it is normal to perform two passes of the assembly. The first pass ignores all errors, which allows BASIC to scan through the code and discover the labels. The second pass is the one that matters, and it runs with errors enabled.

    P% = code%

We want to assemble to the same place each time, so we set P% inside the loop. P% is a special variable that indicates to BASIC where the code is to be assembled.

    [ OPT    loop%

The left square bracket tells BASIC that we are switching to assembly mode. The OPT pseudo-instruction tells BASIC which assembly options you would like. The first time the value is zero (no errors), the second time it is two (errors enabled).

      ADR    R0, message

This pseudo-instruction sets R0 to point to message.

      SWI    OS_Write0

The SWI instruction makes an OS call, in this case one that outputs a null-terminated string. We pass the SWI number in a variable to show how BASIC variables can be used within an assembly. BASIC also understands SWI names, so we could have written SWI "OS_Write0" instead.
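
For comparison, here is a sketch of the same kind of program using SWI names in place of a variable. OS_NewLine is an extra call added for this illustration, and the buffer size and message text are arbitrary.

  DIM code% 31

  FOR pass% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    pass%

      ADR    R0, message
      SWI    "OS_Write0"
      SWI    "OS_NewLine"
      MOV    PC, R14

    .message
      EQUS   "Hello again!" + CHR$(0)
      ALIGN
    ]
  NEXT

  CALL code%

The names are looked up when the code is assembled, so the assembled instructions are the same as if the numbers had been written out.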

      MOV    PC, R14

On entry to the code the return address is placed into R14, as you would expect. So copying R14 back into PC returns to BASIC.

    .message

This defines a label. Any location that needs to be referenced in a program, whether a function or a block of data, needs a label. A label is declared to BASIC by prefixing its name with a full stop.

      EQUS   "Hello!" + CHR$(0)

The instruction EQUS is a pseudo-instruction that tells BASIC that you will be inserting a string. The CHR$(0) at the end (note, a BASIC function!) null-terminates the string as required by the OS_Write0 call.

      ALIGN

Our final pseudo-instruction simply tells BASIC to advance the assembly pointer to the next word boundary. Code must be word aligned, so this makes it easy to ensure everything is correctly aligned after data has been inserted.

    ]

The right square bracket indicates that we are now returning back to BASIC.

  NEXT

The end of the two-pass FOR loop.

  CALL code%

Now, having assembled the code, we run it.

Assembly format

The code to be assembled lies between square brackets.

Typically each instruction goes on a line by itself. However, you can use a colon to put multiple instructions on one line (just like BASIC).

A full stop indicates a label, and any text following the full stop is the label name.

Comments are introduced with a semicolon, a backslash ('\') or REM. Whatever follows, up to the end of the line or a colon, is taken as the comment.

Spaces between an instruction and its operands, or between the operands themselves, are ignored.

BASIC recognises R0 to R15, and PC. Later versions of BASIC understand SP and LR as well, but this should not be relied upon.
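
The following sketch pulls these formatting rules together. The variable link% is an arbitrary name chosen for this example; it relies on the assembler accepting an expression wherever a register is expected, which is one way to get a readable name for R14 without depending on LR being recognised. SWI &100 plus a character code is OS_WriteI.

  link% = 14
  DIM code% 31

  FOR pass% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    pass%
      SWI    &100 + ASC"O"       ; a comment introduced by a semicolon
      SWI    &100 + ASC"K"       \ a comment introduced by a backslash
      SWI    "OS_NewLine" : MOV PC, link%
    ]
  NEXT

  CALL code%

When run, this prints OK followed by a new line. The last assembler line holds two instructions separated by a colon.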

There is a module, ExtBasicAsm by Darren Salt, that adds a lot of useful functionality to the BASIC assembler. However, it is not currently 32 bit compatible.

Pseudo-instructions

ADR Rx, label

ADR assembles an instruction (usually something like ADD Rx, PC, #xx) to load the address of label into Rx. Because of the instruction used, it is only possible to provide offsets within the range that can be synthesised with a shift, which usually limits the range to +/- 4096 bytes.

ALIGN

This ensures that P% (or O%) is aligned to the next word boundary.
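
A small sketch that makes the effect visible by printing how far P% has advanced before and after ALIGN. It assumes that the block returned by DIM starts on a word boundary; the buffer name and size are arbitrary.

  DIM buf% 15
  P% = buf%
  [ OPT    2
    EQUS   "abc"
  ]
  PRINT P% - buf% : REM three bytes used so far
  [ OPT    2
    ALIGN
  ]
  PRINT P% - buf% : REM rounded up to the next multiple of four

No labels are involved here, so a single pass with errors enabled is enough.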

EQU[B|W|D|S] / DC[B|W|D|S] <data>

These instructions place data into the assembly and advance P% (or O%) accordingly.

EQUB, DCB, =
Defines one byte of memory. If the value provided is larger than a byte, the least significant byte will be used. The three examples are equivalent.

  EQUB  123
  DCB   123
  =     123

EQUW, DCW
Defines two bytes of memory.

  EQUW  1234

EQUD, DCD
Defines four bytes of memory.

  EQUD  &210665

EQUS, DCS
Defines up to 255 bytes of data as specified by the string expression.

  EQUS  "This is a string!" + CHR$(13) + CHR$(10) + CHR$(0)

Word aligning is not necessary for inserting data. The string example could have been written as:

  EQUS  "This is a string!"
  EQUB  13
  EQUB  10
  EQUB  0

The exact meanings of the type suffixes are Byte, Word, Doubleword, and String.
The use of Word to refer to 16 bit values and Doubleword for 32 bit values is somewhat nonsensical on the ARM, and is an anachronism from Acorn's 6502 heritage.
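
To see how these directives lay data out in memory, the sketch below assembles a few values and reads them back with BASIC's ? (byte) indirection operator. The values and the buffer name are arbitrary; the REM comments describe what I would expect on a little-endian ARM.

  DIM data% 15
  P% = data%
  [ OPT    2
    EQUB   &12
    EQUW   &3456
    EQUD   &210665
  ]
  PRINT P% - data%          : REM 1 + 2 + 4 = 7 bytes, no padding inserted
  PRINT ~data%?0            : REM the EQUB value, 12
  PRINT ~data%?1, ~data%?2  : REM the EQUW value, low byte 56 then high byte 34

Because no alignment is applied, the EQUD value ends up at an odd address here, which is harmless for data but is exactly what ALIGN is for if code follows.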

OPT

OPT controls the assembler. Up to five bits may be defined.

Bit 0 Produce an assembly listing
It is not normal these days to output a listing as it writes junk to the screen so isn't suitable for inline code in a game or a multitasking program. That said, it might be useful during testing and debugging.

Bit 1 Enable errors
For the first pass of assembly this should be unset to suppress errors, otherwise forward references will fail with an "Unknown or missing variable" error. For the second pass, this bit should be set so genuinely unknown references are faulted.

Bit 2 Offset assembly
Sometimes there is a need to perform offset assembly, in which the code is going to be executed at a different place to that where it is assembled. In this case, you must define both P% and O%. The difference is that the code is assembled to the memory now pointed to by O% while being set up to run from the address pointed to by P%.
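
A minimal sketch of offset assembly, assuming the code is destined to run at &8000 (where RISC OS runs applications). The names buffer% and runat% are arbitrary, and OS_Exit is used to finish because code run this way has no BASIC to return to.

  DIM buffer% 63
  runat% = &8000

  FOR pass% = 4 TO 6 STEP 2
    O% = buffer%
    P% = runat%
    [ OPT    pass%
      ADR    R0, text
      SWI    "OS_Write0"
      SWI    "OS_Exit"
    .text
      EQUS   "Offset!" + CHR$(0)
      ALIGN
    ]
  NEXT

  PRINT ~P%, ~O%
  OSCLI "Save MyFile "+STR$~(buffer%)+" "+STR$~(O%)

The pass values are 4 and 6, which are the usual 0 and 2 with bit 2 added. When saving, it is O% rather than P% that marks the end of the bytes actually stored in memory.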

Bit 3 Bounds checking
This checks that the code assembled does not exceed the memory allocated; namely that P% does not exceed L%. You use it as follows:

  codesize% = 1024
  DIM code% codesize%
  L% = code% + codesize%

and then set bit 3 in the OPT statement.
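
Put together with the usual two-pass loop, that might look like the sketch below. The pass values 8 and 10 are simply 0 and 2 with bit 3 added; the two instructions are placeholders.

  codesize% = 1024
  DIM code% codesize%

  FOR pass% = 8 TO 10 STEP 2
    P% = code%
    L% = code% + codesize%
    [ OPT    pass%
      MOV    R0, #0
      MOV    PC, R14
    ]
  NEXT

If the assembly would run past L%, BASIC stops with an error instead of silently overwriting whatever follows the buffer.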

Bit 4 Extended instructions
Apparently on RISC OS 4 (and later?) this bit allows extended instructions (i.e. ARMv4 and later) to be assembled. Reference: Steve Drain's BASIC StrongHelp manual, but I couldn't find any reference to this elsewhere.

Running assembled code

Assembled code can either be saved and loaded as a program, or run in-situ as a part of the current program.

Saving code

This is used when you don't need BASIC at run time, because it is simply the tool you are using to assemble with.
If we defined our block of memory to be code%, we could save it to MyFile as follows:

 OSCLI "Save MyFile "+STR$~(code%)+" "+STR$~(P%)

This would usually be followed by a SetType command to allow RISC OS to be able to run it correctly.
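
As a sketch, the two steps together might look like this. The filetype FFC (Utility) is my assumption for code that is position-independent and returns with MOV PC, R14, as the Hello example does; code assembled to run at a fixed address would want a different type.

  OSCLI "Save MyFile "+STR$~(code%)+" "+STR$~(P%)
  OSCLI "SetType MyFile FFC"

STR$~ converts the addresses to hexadecimal strings, which is the form *Save expects.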

CALL

The CALL command enters a block of assembled code from within BASIC. This allows code to be assembled in place, and is usually for speeding up certain parts of the code, or doing some things that BASIC is not capable of doing for itself (such as entering a privileged mode to interact with I/O).

  CALL code%

Before the code is entered, the registers are set up as follows:

R0 A%
R1 B%
R2 C%
R3 D%
R4 E%
R5 F%
R6 G%
R7 H%
R8 Pointer to BASIC workspace
R9 Pointer to list of parameters
R10 Number of parameters
R11 Pointer to BASIC's string accumulator
R12 BASIC's LINE pointer (points to the current statement)
R13 Pointer to BASIC's stack (a fully descending stack)
R14 Link back to BASIC

R8 and R9 provide some flexible and advanced features which are outside of the scope of this document. If you are interested, get a copy of Steve Drain's BASIC manual. Alternatively, download the BBC BASIC Guide as a PDF.
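
Here is a short sketch of passing values in through A% to C%. CALL does not hand a result back, so this example stores its answer in a small block of memory whose address is passed in C%. The names code% and result% are arbitrary.

  DIM code% 31, result% 3

  FOR pass% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    pass%
      ADD    R0, R0, R1          ; R0 = A% + B%
      STR    R0, [R2]            ; store the sum at the address held in C%
      MOV    PC, R14
    ]
  NEXT

  A% = 40 : B% = 2 : C% = result%
  CALL code%
  PRINT !result%

This prints 42. The ! operator reads back the word that the STR instruction wrote.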

USR

USR is the same as CALL, with the exceptions that it cannot accept any parameters, and that it passes back the value left in R0 as its result. This is useful if you need your assembler code to communicate directly with the BASIC program.

  result% = USR(code%)
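
A complete sketch, in which the code returns the larger of two signed integers. It relies on A% and B% being loaded into R0 and R1 on entry, as they are for CALL; the buffer size and the values are arbitrary.

  DIM code% 15

  FOR pass% = 0 TO 2 STEP 2
    P% = code%
    [ OPT    pass%
      CMP    R0, R1
      MOVLT  R0, R1              ; if A% < B% then return B% instead
      MOV    PC, R14
    ]
  NEXT

  A% = 3 : B% = 9
  result% = USR(code%)
  PRINT result%

This prints 9. Whatever is in R0 when the code returns becomes the value of USR.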

Support by RISC OS version

From the horse's mouth, so to speak.

RISC OS 3

  >HELP [
  Assembly language is contained in [] and assembled at P%. Labels follow '.'.
  Syntax:
  SWI[<cond>] <expr>
  ADC|ADD|AND|BIC|EOR|ORR|RSB|RSC|SBC|SUB[<cond>][S] <reg>,<reg>,<shift>
  MOV|MVN[<cond>][S] <reg>,<shift>
  CMN|CMP|TEQ|TST[<cond>][S|P] <reg>,<shift>
  MLA[<cond>][S] <reg>,<reg>,<reg>,<reg>
  MUL[<cond>][S] <reg>,<reg>,<reg>
  LDR|STR[<cond>][B] <reg>, '[ <reg>[,<shift>] '] [,<shift>][!]
  LDM|STM[<cond>]DA|DB|EA|ED|FA|FD|IA|IB <reg>[!],{<reg list>}[^]
  B[L][<cond>] <label>
  OPT|=|DCB|EQUB|DCW|EQUW|DCD|EQUD|EQUS <expr>
  ADR[<cond>] <reg>,<label>
  ALIGN
  where <shift>=<reg>|#<expr>|<reg>,ASL|LSL|LSR|ASR|ROR <reg>|#<expr>|RRX
  and <cond>=AL|CC|CS|EQ|GE|GT|HI|HS|LE|LS|LT|LO|MI|NE|NV|PL|VC|VS
  and <reg>=R0 to 15 or PC or <expr>

Note that RiscPC builds of BASIC do not support all of the instructions present in the ARMv3 family, such as MRS. The above was taken from BASIC in RISC OS 3.70, but it holds true for RISC OS 3.10 as well.

RISC OS 4

  >HELP [
  Assembly language is contained in [] and assembled at P%. Labels follow '.'.
  Syntax:
  SWI[<cond>] <expr>
  ADC|ADD|AND|BIC|EOR|ORR|RSB|RSC|SBC|SUB[<cond>][S] <reg>,<reg>,<shift>
  MOV|MVN[<cond>][S] <reg>,<shift>
  CMN|CMP|TEQ|TST[<cond>][S|P] <reg>,<shift>
  MUL[<cond>][S] <reg>,<reg>,<reg>
  MLA|UMULL|UMLAL|SMULL|SMLAL[<cond>][S] <reg>,<reg>,<reg>,<reg>
  LDR|STR[<cond>][B|SB|H|SH] <reg>, '[ <reg>[,<shift>] '] [,<shift>][!]
  LDM|STM[<cond>]DA|DB|EA|ED|FA|FD|IA|IB <reg>[!],{<reg list>}[^]
  B[L][<cond>] <label>
  BX[<cond>] <reg>
  MRC|MCR[<cond>] <copro>,<expr>,<reg>,<cpreg>,<cpreg> [,<expr>]
  CDP[<cond>] <copro>,<expr>,<cpreg>,<cpreg>,<cpreg> [,<expr>]
  LDC|STC[<cond>][L] <copro>, '[ <reg>[,#<expr>] '] [,#<expr>][!]
  SWP[<cond>][B] <reg>,<reg>, '[<reg> ']
  MRS[<cond>] <reg>,<psr>
  MSR[<cond>] <psr>_[c][x][s][f],<reg>|#<expr>
  ADF|MUF|SUF|RSF|DVF|RDF|POW
     |RPW|RMF|FML|FDV|FRD|POL[<cond>]<prec>[<round>] <fpreg>,<fpreg>,<fpop>
  MVF|MNF|ABS|RND|SQT|LOG|LGN|EXP
     |SIN|COS|TAN|ASN|ACS|ATN|URD|NRM[<cond>]<prec>[<round>] <fpreg>,<fpop>
  FLT[<cond>]<prec>[<round>] <fpreg>,<reg>
  FIX[<cond>][<round>] <reg>,<fpreg>
  WFS|RFS|WFC|RFC[<cond>] <reg>
  CMF|CNF[E][<cond>] <fpreg>,<fpop>
  LDF|STF[<cond>]<prec> <fpreg>, '[ <reg>[,#<expr>] '] [,#<expr>][!]
  LFM|SFM[<cond>] <fpreg>,<expr>, '[ <reg>[,#<expr>] '] [,#<expr>][!]
  LFM|SFM[<cond>]EA|FD <fpreg>,<expr>, '[ <reg> '] [!]
  OPT|=|DCB|EQUB|DCW|EQUW|DCD|EQUD|EQUS <expr>
  DCF|EQUF<prec> <expr>
  ADR[<cond>] <reg>,<label>
  ALIGN|NOP
  where <shift>=<reg>|#<expr>|<reg>,ASL|LSL|LSR|ASR|ROR <reg>|#<expr>|RRX
  and <cond>=AL|CC|CS|EQ|GE|GT|HI|HS|LE|LS|LT|LO|MI|NE|NV|PL|VC|VS
  and <reg>=R0 to 15 or LR or PC or <expr>
  and <copro>=CP0 to 15 or <expr>
  and <cpreg>=C0 to 15 or <expr>
  and <fpreg>=F0 to 7 or <expr>
  and <fpop>=F0 to 7 or #<expr>, where <expr>=0,0.5,1,2,3,4,5 or 10
  and <prec>=S|D|E|P
  and <round>=P|M|Z
  and <psr>=CPSR|SPSR

This contains co-processor instructions, FP instructions, and the additional instructions in the StrongARM (etc) processor, such as unsigned long multiply.

RISC OS 5

  >HELP [
  Assembly language is contained in [] and assembled at P%. Labels follow '.'.
  Syntax:
  SWI[<cond>] <expr>
  BKPT <expr>
  ADC|ADD|AND|BIC|EOR|ORR|RSB|RSC|SBC|SUB[<cond>][S] <reg>,<reg>,<shift>
  MOV|MVN[<cond>][S] <reg>,<shift>
  CMN|CMP|TEQ|TST[<cond>][S|P] <reg>,<shift>
  CLZ[<cond>] <reg>,<reg>
  QADD|QSUB|QDADD|QDSUB[<cond>] <reg>,<reg>,<reg>
  MUL[<cond>][S] <reg>,<reg>,<reg>
  MLA|UMULL|UMLAL|SMULL|SMLAL[<cond>][S] <reg>,<reg>,<reg>,<reg>
  SMUL<W|B|T><B|T>[<cond>] <reg>,<reg>,<reg>
  SMLA[L]<W|B|T><B|T>[<cond>] <reg>,<reg>,<reg>,<reg>
  LDR|STR[<cond>][B|T|BT|SB|H|SH|D] <reg>, '[ <reg>[,<shift>] '] [,<shift>][!]
  LDM|STM[<cond>]DA|DB|EA|ED|FA|FD|IA|IB <reg>[!],{<reg list>}[^]
  SWP[<cond>][B] <reg>,<reg>, '[<reg> ']
  PLD '[ <reg>[,<shift>] ']
  B[L][<cond>] <label>
  BLX <label>
  B[L]X[<cond>] <reg>
  MRC|MCR[<cond>|2] <copro>,<expr>,<reg>,<cpreg>,<cpreg> [,<expr>]
  MCRR|MRRC[<cond>] <copro>,<expr>,<reg>,<reg>,<cpreg>
  CDP[<cond>|2] <copro>,<expr>,<cpreg>,<cpreg>,<cpreg> [,<expr>]
  LDC|STC[<cond>|2][L] <copro>, '[ <reg>[,#<expr>] '] [,#<expr>|{expr}][!]
  MRS[<cond>] <reg>,<psr>
  MSR[<cond>] <psr>_[c][x][s][f],<reg>|#<expr>
  ADF|MUF|SUF|RSF|DVF|RDF|POW
     |RPW|RMF|FML|FDV|FRD|POL[<cond>]<prec>[<round>] <fpreg>,<fpreg>,<fpop>
  MVF|MNF|ABS|RND|SQT|LOG|LGN|EXP
     |SIN|COS|TAN|ASN|ACS|ATN|URD|NRM[<cond>]<prec>[<round>] <fpreg>,<fpop>
  FLT[<cond>]<prec>[<round>] <fpreg>,<reg>
  FIX[<cond>][<round>] <reg>,<fpreg>
  WFS|RFS|WFC|RFC[<cond>] <reg>
  CMF|CNF[E][<cond>] <fpreg>,<fpop>
  LDF|STF[<cond>]<prec> <fpreg>, '[ <reg>[,#<expr>] '] [,#<expr>][!]
  LFM|SFM[<cond>] <fpreg>,<expr>, '[ <reg>[,#<expr>] '] [,#<expr>][!]
  LFM|SFM[<cond>]EA|FD <fpreg>,<expr>, '[ <reg> '] [!]
  OPT|=|DCB|EQUB|DCW|EQUW|DCD|EQUD|EQUS <expr>
  DCF|EQUF<prec> <expr>
  ADR[<cond>] <reg>,<label>
  ALIGN|NOP
  where <shift>=<reg>|#<expr>|<reg>,ASL|LSL|LSR|ASR|ROR <reg>|#<expr>|RRX
  and <cond>=AL|CC|CS|EQ|GE|GT|HI|HS|LE|LS|LT|LO|MI|NE|NV|PL|VC|VS
  and <reg>=R0 to 15 or SP or LR or PC or <expr>
  and <copro>=CP0 to 15 or <expr>
  and <cpreg>=C0 to 15 or <expr>
  and <fpreg>=F0 to 7 or <expr>
  and <fpop>=F0 to 7 or #<expr>, where <expr>=0,0.5,1,2,3,4,5 or 10
  and <prec>=S|D|E|P
  and <round>=P|M|Z
  and <psr>=CPSR|SPSR

This contains co-processor instructions, FP instructions, and the additional instructions in the later processors (StrongARM through ARM9 etc) such as saturated addition.
