Hardcore geeking-out!
It is pretty standard that most programmers, when testing a build environment, write a little program to spit "Hello World!" to the display. There is no real requirement for originality here as we don't much care what is written, only that it actually works, because it is utterly pointless to try any sort of coding when you don't have a way to turn the source into a program that runs.
The prerequisites are the OSD sources and something to run them on (I use Portable Ubuntu - details here):
cd neuros-bsp
./setup-rootfs
source neuros-env
That is so the system can find the "arm-linux-<blah>" tools and their libraries/resources.
Anyway, I wrote the following:
And compiled it with:
arm-linux-gcc -o helloworld helloworld.c
And I was rewarded with a 7,154 byte program.
You WHAT? A little shy of 8K just to write 13 characters to the screen? You're kidding, right? I know there will be some overheads for file headers and library code, but come on. I once made a complete GIF decoder in nearly a quarter of that size.
The first optimisation is to realise that a lot of junk is put into the file that isn't necessary. Luckily there's a tool to cleanse the file:
arm-linux-strip helloworld
Now we're looking at a file 3,072 bytes in size. To print a 13 character string.
So, being the sort of boring geek that I am, I decided to do something about this.
The first step was to brush up on the ELF file format (you'll find the spec at http://refspecs.freestandards.org/elf/elf.pdf), and how one calls system functions under Linux (you'll need to Google around and wade through lots of INT 0x80 trivia).
This is important because - sorry guys - we're pretty much going to have to drop C in the bin. Or /dev/null if you prefer.
So let me introduce, to you, the all new revised helloworld_tiny.s program.
The first part is the ELF header:
@ Write a basic ELF header [ALL words are written backwards!]
.word 0x464C457F @ ELF "magic" value
.word 0x61010101 @ Type = 32 bit, word order LSB, ver 1
.word 0 @ padding
.word 0 @ padding
.word 0x00280002 @ executable file, ARM CPU
.word 0x00000001 @ version = 1 (current)
.word 0x00008068 @ entry point, start of execution
.word 0x00000034 @ program header table offset
.word 0 @ section table offset (there is none)
.word 0x00000002 @ processor specific flags (2=???)
.word 0x00200034 @ ELF header size, size of ptab entry
.word 0x00000001 @ num of ptab ents, size of sectab ents
.word 0 @ num sectab ents, ptr to string table
This defines a fairly simple header. It says we're for a 32 bit ARM system, and that there's one program table entry to define what we'll be loading. Oh, and our entry point is &8068.
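Some of the "unknown" values (processor-specific flags, bytes that should be zero but aren't) have just been copied from the headers of other files. As far as I can tell, the 0x61 in the second word is the ARM OS ABI identifier (97), and the flag value 2 is EF_ARM_HASENTRY.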
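The "written backwards" remark and the packed half-word fields are easier to see when decoded. The following is a minimal sketch in C, not part of the original program; the helper names (put_word_le, low_half, high_half, check_header_words) are made up for illustration, and the field names in the comments come from the ELF specification.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit word least significant byte first, which is how a
   little-endian ARM assembler lays out a .word in the file. */
static void put_word_le(uint8_t *dst, uint32_t word)
{
    dst[0] = (uint8_t)(word & 0xFF);
    dst[1] = (uint8_t)((word >> 8) & 0xFF);
    dst[2] = (uint8_t)((word >> 16) & 0xFF);
    dst[3] = (uint8_t)((word >> 24) & 0xFF);
}

/* Two 16-bit header fields share one .word: the first field in the
   file is the low half, the second is the high half. */
static uint16_t low_half(uint32_t word)  { return (uint16_t)(word & 0xFFFF); }
static uint16_t high_half(uint32_t word) { return (uint16_t)(word >> 16); }

/* Check that the words from the listing decode as the comments say. */
static void check_header_words(void)
{
    uint8_t magic[4];

    put_word_le(magic, 0x464C457F);
    assert(magic[0] == 0x7F && magic[1] == 'E');
    assert(magic[2] == 'L' && magic[3] == 'F');

    assert(low_half(0x00280002) == 2);      /* e_type: executable file */
    assert(high_half(0x00280002) == 0x28);  /* e_machine: 40 = ARM */
    assert(low_half(0x00200034) == 0x34);   /* e_ehsize: 52 bytes */
    assert(high_half(0x00200034) == 0x20);  /* e_phentsize: 32 bytes */
}
```

Calling check_header_words() from a test program does nothing visible when the words decode as described, and aborts if one of them does not.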
So obviously, the next step is the program header table, which is:
@ Now for a basic program header table
.word 0x00000001 @ type = PT_LOAD (loadable)
.word 0 @ offset (0 = load from start)
.word 0x00008000 @ virtual address to load to
.word 0x00008000 @ physical address to load to
.word 0x0000007F @ number of bytes to load
.word 0x0000007F @ size of memory image
.word 0x00000005 @ flags = Executable (1) and Read (4)
.word 0x00008000 @ alignment
This is it for the headers. Note that the ELF specification (v1.2) says that a section table (that .text, .bss stuff) is optional. Likewise, the PT_PHDR definition in the program header table does not have to be present.
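The offsets quoted in the headers follow from simple arithmetic on the sizes of each part. Here is a small sketch in C of that sum, assuming the message and code sizes from the listing that follows (17 bytes of text plus 3 padding bytes, then six instructions); the constant and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    LOAD_ADDRESS = 0x8000,  /* virtual address from the program header */
    EHDR_SIZE    = 13 * 4,  /* thirteen .word entries = 0x34 bytes */
    PHDR_SIZE    = 8 * 4,   /* eight .word entries = 0x20 bytes */
    MESSAGE_SIZE = 17 + 3   /* the string plus three padding bytes */
};

/* File offset of the program header table (straight after the ELF header). */
static uint32_t phdr_offset(void)
{
    return EHDR_SIZE;
}

/* File offset of the message (straight after both headers). */
static uint32_t message_offset(void)
{
    return EHDR_SIZE + PHDR_SIZE;
}

/* Address of the first instruction once the file is loaded at LOAD_ADDRESS. */
static uint32_t entry_address(void)
{
    return LOAD_ADDRESS + EHDR_SIZE + PHDR_SIZE + MESSAGE_SIZE;
}

/* Compare the computed values with the ones written in the headers. */
static void check_layout(void)
{
    assert(phdr_offset() == 0x34);
    assert(entry_address() == 0x8068);
}
```

The three padding bytes after the message are what keep the code word-aligned, which is why the entry point lands on &8068 rather than &8065.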
So the final part of the equation is some actual code:
@ Now for some really simple code to print the message to the terminal.
message:
.ascii "Hello World! :-)\n"
.byte 0
.byte 0
.byte 0
entry:
mov r0, #1 @ 1 = stdout
adr r1, message @ pointer to message
mov r2, #17 @ message length
swi 0x900004 @ swi call for Sys_Write
mov r0, #0 @ set return code
swi 0x900001 @ swi call for Sys_Exit
@ That's it! Done.
This sets up, basically, a call to the system "write" function to write the 17 bytes of the message to the console (the stdout pseudo-file). Then we call the system "exit" function with a return code of zero.
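For comparison, the same call can be sketched in C using the write() wrapper from unistd.h, which ends up issuing the system call that the assembly invokes directly. This is only an illustration: write_hello is a made-up name, and the final exit call is left out so the function can be used inside a larger program.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

static const char message[] = "Hello World! :-)\n";

/* Write the message to the given file descriptor with a single write()
   call and return the number of bytes written. Passing 1 selects
   stdout, matching "mov r0, #1" in the assembly. */
static ssize_t write_hello(int fd)
{
    size_t length = strlen(message);

    assert(length == 17);   /* the value loaded into r2 in the assembly */
    return write(fd, message, length);
}
```

A caller would use write_hello(1) followed by _exit(0) to mirror the two SWI calls.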
Yes. That is correct. Sadly the GNU assembler does not appear to have an option to spit out a flat binary file. It is a marked-up ELF for passing to the linker. I'm not actually sure if it is physically possible to get a pure binary file out of the GNU tools (the objcopy tool with "-O binary" looks like the usual route, but that is not what I did).
At this point I cheat. A lot. I copy the file back to Windows, and use Hexplorer to simply rip out the bits that are not wanted. As we are working in pure assembly here, we know exactly what our file should look like. And it should look like this:
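As a cross-check on what the finished file should contain, the bytes can also be rebuilt from the listings. The sketch below does that in C. The header words and the message are copied from the listings; the six instruction encodings are hand-assembled for this example and are an assumption, in particular that "adr r1, message" becomes "sub r1, pc, #0x20". The names build_tiny_elf and TINY_ELF_SIZE are invented here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TINY_ELF_SIZE 128

/* ELF header (13 words) followed by the program header (8 words),
   copied from the listings. The load size is kept as 0x7F as written
   there, although the finished file is 0x80 bytes long. */
static const uint32_t header_words[21] = {
    0x464C457F, 0x61010101, 0, 0, 0x00280002, 0x00000001, 0x00008068,
    0x00000034, 0, 0x00000002, 0x00200034, 0x00000001, 0,
    0x00000001, 0, 0x00008000, 0x00008000, 0x0000007F, 0x0000007F,
    0x00000005, 0x00008000
};

static const char message[] = "Hello World! :-)\n";

/* Hand-assembled ARM encodings (assumed, not taken from the original). */
static const uint32_t code_words[6] = {
    0xE3A00001,  /* mov r0, #1 */
    0xE24F1020,  /* adr r1, message  (sub r1, pc, #0x20) */
    0xE3A02011,  /* mov r2, #17 */
    0xEF900004,  /* swi 0x900004 */
    0xE3A00000,  /* mov r0, #0 */
    0xEF900001   /* swi 0x900001 */
};

/* Append one word in little-endian order; returns the new position. */
static size_t put_word_le(uint8_t *out, size_t pos, uint32_t word)
{
    out[pos]     = (uint8_t)(word & 0xFF);
    out[pos + 1] = (uint8_t)((word >> 8) & 0xFF);
    out[pos + 2] = (uint8_t)((word >> 16) & 0xFF);
    out[pos + 3] = (uint8_t)((word >> 24) & 0xFF);
    return pos + 4;
}

/* Fill 'out' with the complete image and return its length. */
static size_t build_tiny_elf(uint8_t out[TINY_ELF_SIZE])
{
    size_t pos = 0;
    size_t i;

    memset(out, 0, TINY_ELF_SIZE);
    for (i = 0; i < 21; i++)
        pos = put_word_le(out, pos, header_words[i]);
    memcpy(out + pos, message, 17);
    pos += 20;   /* 17 bytes of text plus the three zero padding bytes */
    for (i = 0; i < 6; i++)
        pos = put_word_le(out, pos, code_words[i]);
    assert(pos == TINY_ELF_SIZE);
    return pos;
}
```

Writing the returned buffer to a file with fwrite() should give an image that can be compared byte for byte against the hand-edited one.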
And there, ladies and gentlemen, you have it. The smallest legal program to display a Hello World message on ARM Linux, a mere 128 bytes. The stripped C version is 24 times larger.
There are more optimisations possible: I could drop the smiley from the message to save a word, and I could start abusing the ELF headers. But I'm looking for the smallest valid program, not the smallest nightmare. ☺
There is a change I should mention: the later EABI kernels tidy up the messy SWI dispatch mechanism by placing the call number in R7 and calling SWI 0. But the OSD doesn't run such a kernel (unless you are called Gerry Boland!), and the later kernels are backwards compatible.
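The difference between the two conventions can be sketched as follows. This is an illustration in C rather than kernel code: the 0x900000 base is inferred from the two SWI numbers used in the listing (write is call 4, exit is call 1), and the names oabi_swi_number, eabi_call and eabi_call_for are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

enum { OABI_SYSCALL_BASE = 0x900000 };
enum { NR_EXIT = 1, NR_WRITE = 4 };

/* Old convention: the call number is folded into the SWI instruction's
   own 24-bit number field. */
static uint32_t oabi_swi_number(uint32_t nr)
{
    assert(nr <= 0xFFFFF);   /* must fit below the 0x900000 base */
    return OABI_SYSCALL_BASE + nr;
}

/* EABI convention: the call number travels in R7 and the SWI number
   is always zero. */
struct eabi_call {
    uint32_t r7;
    uint32_t swi_number;
};

static struct eabi_call eabi_call_for(uint32_t nr)
{
    struct eabi_call call;

    call.r7 = nr;
    call.swi_number = 0;
    return call;
}
```

So under EABI the write call in this program would become "mov r7, #4" followed by "swi 0", at the cost of one extra instruction per call.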
Right now I have a massive smile on my face. Not just because I got this working, but because I have been able to write some ARM code (all <cough>six</cough> lines of it!). There is just something karmically pleasing about ARM code.
Your comments:
joe, 4th April 2011, 15:07
Hi Rick, I cannot comment on this problem, because I don't have enough ARM assembler knowledge, I've just started on this journey, I would appreciate some pointers, how to know, which bits you can delete from the binary file using hexeditor and where to find the opcodes and reverse engineering tips.