Tutorial 01 - Examining our first program
Our first program
In the first tutorial we wrote a short program to print a message on the screen. We shall now take this program apart and see how it works.
Here is the program:
DIM code% 64
FOR loop% = 0 TO 3 STEP 3
  P% = code%
  [ OPT loop%
    ADR     R0, message
    SWI     "OS_Write0"
    MOV     PC, LR
  .message
    EQUS    "Awesome! My first assembler program! :-)"
    EQUB    10
    EQUB    0
    ALIGN
  ]
NEXT
CALL code%
The "shell" of the program is a wrapper written in BBC BASIC. I shall discuss this first.
The BBC BASIC wrapper
The beginning of the program is:
DIM code% 64
FOR loop% = 0 TO 3 STEP 3
  P% = code%
  [ OPT loop%
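What we are doing here is allocating 64 bytes of memory, pointed to by the variable code%. Then, since BASIC can run as a two-pass assembler, we set up a loop to run the assembly with OPT being zero (do not stop on errors, no listing) and then again with OPT being three (report errors, print the listing to screen). Two passes are needed because the label message is used before it is defined: the first pass cannot know its address yet, so errors are ignored while the label positions are worked out, and the second pass assembles the code for real.
The third line sets P% to point to the reserved memory. P% is a special variable reserved for pointing to where BASIC should assemble to.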
The opening square bracket tells BASIC to enter the assembler. As of this point, BASIC keywords don't work, but assembly mnemonics do. The first one we encounter is OPT which is a pseudo-instruction (meaning it is aimed at the assembler, not the processor) telling it what OPTions to use during assembly.
Then follows ARM code. I'll describe this in the next section.
Afterwards, the tail end of the wrapper:
  ]
NEXT
CALL code%
The closing square bracket ends assembler mode and returns you to normal BASIC. The NEXT terminates the loop. The CALL instruction tells BASIC to run the assembled code, starting at its base address. CALL is optional - you could instead save the assembled code to your disc, to run later.
Here is a basic wrapper which will be useful for the following tutorials. As we progress in the tutorials, you will see the BASIC wrapper less and less as we will be concentrating on the assembler code, not the BASIC part. Simply paste the assembler into the wrapper.
ON ERROR PRINT REPORT$ + " at line " + STR$(ERL/10) : END
DIM code% 1024
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [ OPT loop%
    ; your code goes here (^_^)
  ]
NEXT
CALL code%
Notice that I use OPT 2. This will assemble without splatting rubbish all over the screen. I have also added an error report. The divide-by-ten is so you can press F5 in the editor to jump directly to the line concerned. Additionally, reserving a kilobyte means there's plenty of room to play.
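The OPT values seen so far (0, 2 and 3) differ only in their two lowest bits: bit 0 turns the listing on, and bit 1 turns error reporting on. As a rough illustration, this stand-alone BASIC fragment decodes those two bits for each value. The variable name opt% is my own, and the fragment does not touch the assembler at all.

```bbcbasic
REM Illustrative only: decode the two low bits of each OPT value
FOR opt% = 0 TO 3
  PRINT "OPT ";opt%;": listing ";
  IF opt% AND 1 THEN PRINT "on"; ELSE PRINT "off";
  PRINT ", errors ";
  IF opt% AND 2 THEN PRINT "reported" ELSE PRINT "ignored"
NEXT
```

This is why a loop of 0 TO 2 STEP 2 assembles quietly while 0 TO 3 STEP 3 prints a listing on the second pass.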
What the program does
Now to turn our attention to the actual assembler. Only three lines generate instructions (and one of those, ADR, is a pseudo-instruction). The rest are assembler directives!
Here's the code:
    ADR     R0, message
    SWI     "OS_Write0"
    MOV     PC, LR
  .message
    EQUS    "Awesome! My first assembler program! :-)"
    EQUB    10
    EQUB    0
    ALIGN
The ADR is a special pseudo-instruction. What it does is ask the assembler to generate an offset address by taking the value of the program counter and adding (or subtracting) to generate the desired address. It is, effectively, an ADD or SUB whose immediate operand must fit the ARM's usual form: an eight bit value rotated by an even number of places. Any offset of up to 255 bytes can therefore be encoded, or up to 1020 bytes if the offset is a multiple of four. Larger offsets only work if they happen to fit this form.
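To get a feel for which offsets a single ADD or SUB can carry, here is a small sketch in BASIC. ARM data-processing immediates are an eight bit value rotated by an even number of places. The function FNfits and the sample offsets are my own inventions for illustration. It is also a simplification: it only handles positive values and ignores rotations that wrap around the top of the word, which does not matter for small offsets like ours.

```bbcbasic
REM Illustrative only: can a positive offset be written as an
REM eight bit value shifted left by an even number of bits?
FOR I% = 0 TO 4
  READ offset%
  PRINT offset%;
  IF FNfits(offset%) THEN PRINT " fits" ELSE PRINT " does not fit"
NEXT
END
DATA 4, 255, 257, 1020, 4096

DEF FNfits(n%)
IF n% <= 0 THEN = FALSE
REM Strip pairs of trailing zero bits, then see if eight bits remain
WHILE (n% AND 3) = 0
  n% = n% DIV 4
ENDWHILE
= n% < 256
```

Of these samples, only 257 is rejected: it needs nine bits and has no trailing zeros to strip. Note that 4096 fits, but that is luck rather than range.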
In a disassembly of our program, the ADR appears as an ADD to PC. The addition only needs to add four to reach the label: message sits twelve bytes after the ADR, but because of how the ARM works, the value of PC is eight bytes (two instructions) on from the instruction actually being executed.
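If you would like to check the PC-relative arithmetic for yourself, the following sketch assembles a cut-down version of the program (with a shorter string, my own change) and prints the offset along with the first assembled word. It assumes a 32-bit ARM assembler as used throughout this tutorial. The code is only assembled, not called.

```bbcbasic
DIM code% 64
FOR loop% = 0 TO 2 STEP 2
  P% = code%
  [ OPT loop%
    ADR     R0, message
    SWI     "OS_Write0"
    MOV     PC, LR
  .message
    EQUS    "Hi"
    EQUB    0
    ALIGN
  ]
NEXT
REM The ADR is the first instruction, so it lives at code%
REM offset = label address - (instruction address + 8)
PRINT "Offset: ";message - (code% + 8)
REM ~ prints in hex, ! reads a 32-bit word from memory
PRINT "First word: &";~!code%
```

The offset should come out as 4, and the first word should be &E28F0004, which is the encoding of ADD R0, PC, #4.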
The next instruction is a SWI call. This is, essentially, a system call where BASIC helpfully converts the textual SWI name ("OS_Write0") into the correct SWI number (&2).
- It is worth noting that under RISC OS, hex numbers are prefixed with an ampersand - thus what you may know as $1234 is expressed as &1234. In this tutorial, as we are using RISC OS, we shall use the '&' prefix.
- Under RISC OS, the C compiler will understand the 0x prefix, BASIC and assemblers will understand the '&' prefix. Some assemblers will accept '0x' as well. Nothing understands the '$' prefix or the 'h' suffix.
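You can see the name-to-number conversion from BASIC itself. This short sketch asks RISC OS for the number behind a SWI name using OS_SWINumberFromString, and also shows the '&' prefix in action. The variable name swi% is my own choice.

```bbcbasic
REM Ask RISC OS for the number behind a SWI name (name pointer goes in R1)
SYS "OS_SWINumberFromString",,"OS_Write0" TO swi%
PRINT "OS_Write0 is SWI &";~swi%
REM & marks a hexadecimal constant; ~ prints a value in hexadecimal
PRINT &1234, ~4660
```

The first PRINT should report &2, matching the number given above. The second prints the same quantity both ways: &1234 in decimal, then 4660 in hex.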
The final instruction is a MOV to take the value of R14 (which is the "Link Register") and copy it into the Program Counter. When a system call, vector, or BL (Branch-with-Link) happens, the return address is written to R14. This is akin to the 6502's JSR instruction, only instead of being saved to the processor stack, the return address is saved to register 14.
- All vectors and branch-with-link instructions write to R14 (a plain branch does not). This isn't as freaky as it sounds, as each different processor mode has its own private copy of R14. We won't delve into this right now.
- However, you should be aware that if you include a BL instruction, you must preserve the current R14 around the branch, otherwise your return address will be stomped on and it'll all go bang. This will be demonstrated in a future tutorial, so don't worry about it right now.
What the act of placing R14 into PC did was, essentially, end the assembler program.
.message defines a label called "message". This is used earlier in the program (in the ADR), so knowing the current PC and the address of the label, the assembler can generate the correct instruction.
The EQU... instructions tell the assembler to insert literal data. EQUS inserts a string, the following EQUB inserts a character code 10 (RISC OS newline), and the final EQUB null-terminates the string.
Finally, there is ALIGN. All ARM instructions must be word aligned. This may not be the case following data (as it happens, we're two bytes short) so ALIGN pushes the assembly offset to the next word boundary.
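The "two bytes short" can be worked out by hand, or with a few lines of BASIC. This is only a sketch of the arithmetic, using variable names of my own (len% and pad%). ALIGN itself does the real work during assembly.

```bbcbasic
REM Bytes of data: the string, plus the newline and the terminator
len% = LEN("Awesome! My first assembler program! :-)") + 2
REM Bytes needed to reach the next multiple of four
pad% = (4 - (len% MOD 4)) MOD 4
PRINT "Data: ";len%;" bytes, padding: ";pad%;" bytes"
```

The string is 40 characters, so the data comes to 42 bytes and ALIGN adds 2 more. With the three instructions in front (12 bytes), the whole program is 56 bytes, comfortably inside the 64 we reserved.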
That's it. That's our program.
Next time, we'll try something a little more exciting.