Initial version: 2025-02-19
Last update: 2025-02-19
section .data
text: db "Hello World!", 10
len: equ $-text
section .text
global _start
_start:
mov edx, len
mov ecx, text
mov ebx, 1
mov eax, 4
int 0x80
; Exit
mov ebx, 0
mov eax, 1
int 0x80
; End of the code
Verify correctness of the code by assembling it with:
nasm -f elf hello.asm
linking:
ld hello.o -o hello
and finally running
./hello
If no errors are reported, the result is as follows:
fulmanp@fulmanp:~/assembler$ ./hello
Hello World!
On a 64-bit system, assemble the code as before:
nasm -f elf hello.asm
but link it as:
ld -m elf_i386 hello.o -o hello
The result is a 32-bit program, which can be verified with the readelf Unix command:
fulmanp@fulmanp-k2:~/assembler$ readelf -h hello
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048080
Start of program headers: 52 (bytes into file)
Start of section headers: 216 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 6
Section header string table index: 3
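If you instead assemble the same source with -f elf64 and link it with plain ld, readelf reports a 64-bit file:
fulmanp@fulmanp-k2:~/assembler$ nasm -f elf64 hello.asm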
fulmanp@fulmanp-k2:~/assembler$ ld hello.o -o hello
fulmanp@fulmanp-k2:~/assembler$ readelf -h hello
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x4000b0
Start of program headers: 64 (bytes into file)
Start of section headers: 264 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 2
Size of section headers: 64 (bytes)
Number of section headers: 6
Section header string table index: 3
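The Class line printed by readelf is derived from the fifth byte of the Magic field: 01 in the 32-bit file and 02 in the 64-bit one. The following is a minimal Python sketch of that lookup, using the two Magic lines shown above as input. The helper name describe_ident is hypothetical, and only the class and data-encoding bytes are decoded.

```python
# Decode the ELF identification bytes (the "Magic" line printed by readelf -h).
# describe_ident is a hypothetical helper; it only looks at bytes 0-5.

ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}


def describe_ident(magic_hex: str) -> tuple:
    ident = bytes.fromhex(magic_hex)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 4 is the class (32/64-bit), byte 5 the data encoding.
    return ELF_CLASS[ident[4]], ELF_DATA[ident[5]]


magic32 = "7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00"
magic64 = "7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00"
print(describe_ident(magic32))
print(describe_ident(magic64))
```

The two files differ only in the class byte here; both are little endian.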
Use the xxd Unix command to dump these files in a "readable" format:
fulmanp@fulmanp-k2:~/assembler$ xxd hello.o
0000000: 7f45 4c46 0101 0100 0000 0000 0000 0000 .ELF............
0000010: 0100 0300 0100 0000 0000 0000 0000 0000 ................
0000020: 4000 0000 0000 0000 3400 0000 0000 2800 @.......4.....(.
0000030: 0700 0300 0000 0000 0000 0000 0000 0000 ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0000060: 0000 0000 0000 0000 0100 0000 0100 0000 ................
0000070: 0300 0000 0000 0000 6001 0000 0d00 0000 ........`.......
0000080: 0000 0000 0000 0000 0400 0000 0000 0000 ................
0000090: 0700 0000 0100 0000 0600 0000 0000 0000 ................
00000a0: 7001 0000 2200 0000 0000 0000 0000 0000 p..."...........
00000b0: 1000 0000 0000 0000 0d00 0000 0300 0000 ................
00000c0: 0000 0000 0000 0000 a001 0000 3100 0000 ............1...
00000d0: 0000 0000 0000 0000 0100 0000 0000 0000 ................
00000e0: 1700 0000 0200 0000 0000 0000 0000 0000 ................
00000f0: e001 0000 7000 0000 0500 0000 0600 0000 ....p...........
0000100: 0400 0000 1000 0000 1f00 0000 0300 0000 ................
0000110: 0000 0000 0000 0000 5002 0000 1b00 0000 ........P.......
0000120: 0000 0000 0000 0000 0100 0000 0000 0000 ................
0000130: 2700 0000 0900 0000 0000 0000 0000 0000 '...............
0000140: 7002 0000 0800 0000 0400 0000 0200 0000 p...............
0000150: 0400 0000 0800 0000 0000 0000 0000 0000 ................
0000160: 4865 6c6c 6f20 576f 726c 6421 0a00 0000 Hello World!....
0000170: ba0d 0000 00b9 0000 0000 bb01 0000 00b8 ................
0000180: 0400 0000 cd80 bb00 0000 00b8 0100 0000 ................
0000190: cd80 0000 0000 0000 0000 0000 0000 0000 ................
00001a0: 002e 6461 7461 002e 7465 7874 002e 7368 ..data..text..sh
00001b0: 7374 7274 6162 002e 7379 6d74 6162 002e strtab..symtab..
00001c0: 7374 7274 6162 002e 7265 6c2e 7465 7874 strtab..rel.text
00001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00001f0: 0100 0000 0000 0000 0000 0000 0400 f1ff ................
0000200: 0000 0000 0000 0000 0000 0000 0300 0100 ................
0000210: 0000 0000 0000 0000 0000 0000 0300 0200 ................
0000220: 0b00 0000 0000 0000 0000 0000 0000 0100 ................
0000230: 1000 0000 0d00 0000 0000 0000 0000 f1ff ................
0000240: 1400 0000 0000 0000 0000 0000 1000 0200 ................
0000250: 0068 656c 6c6f 2e61 736d 0074 6578 7400 .hello.asm.text.
0000260: 6c65 6e00 5f73 7461 7274 0000 0000 0000 len._start......
0000270: 0600 0000 0102 0000 0000 0000 0000 0000 ................
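The instructions of hello.asm can be recognized in the dump at offsets 0x170-0x191. The following Python sketch decodes just those bytes. It is not a general disassembler: it handles only the two instruction forms this program uses (mov r32, imm32 and int imm8), and the names decode and MOV_IMM32 are hypothetical.

```python
import struct

# Bytes of the .text section, copied from offsets 0x170-0x191 of the dump above.
code = bytes.fromhex(
    "ba0d000000" "b900000000" "bb01000000" "b804000000" "cd80"
    "bb00000000" "b801000000" "cd80"
)

# Opcodes 0xB8-0xBB encode "mov r32, imm32" for these four registers.
MOV_IMM32 = {0xB8: "eax", 0xB9: "ecx", 0xBA: "edx", 0xBB: "ebx"}


def decode(machine_code: bytes) -> list:
    """Decode only the two instruction forms used by hello.asm."""
    listing = []
    pos = 0
    while pos < len(machine_code):
        opcode = machine_code[pos]
        if opcode in MOV_IMM32:
            # One opcode byte followed by a 32-bit little-endian immediate.
            (imm,) = struct.unpack_from("<I", machine_code, pos + 1)
            listing.append(f"mov {MOV_IMM32[opcode]}, {imm}")
            pos += 5
        elif opcode == 0xCD:
            # int imm8: one opcode byte followed by the interrupt number.
            listing.append(f"int {machine_code[pos + 1]:#x}")
            pos += 2
        else:
            raise ValueError(f"unhandled opcode {opcode:#x}")
    return listing


for line in decode(code):
    print(line)
```

The first instruction decodes to mov edx, 13, which is the length of "Hello World!" plus the newline. The operand of mov ecx is 0 in the object file; presumably it is a placeholder that the linker fills in, since the dump also contains a .rel.text section.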
section .data
text: db "Hello World!", 10
len: equ $-text
section .text
global _start
_start:
mov edx, len
mov ecx, text
mov ebx, 1
mov eax, 4
int 0x80
; Exit
mov ebx, 0
mov eax, 1
int 0x80
; End of the code
; starts a comment, which extends to the end of the line.
section .data
Starts the data section.
text: db "Hello World!", 10
The string to print, followed by the byte 10. Unix-like systems use LF (decimal code: 10) as the newline marker. By the way, MS-DOS chose CR+LF (decimal: 13 and 10), and Windows inherited this.
len: equ $ - text
The current address ($) minus the address of the first element of the variable text; this equals the length of the text you are going to print. Notice that len is a value (an assembly-time constant), not an address. If you prefer variables, replace this line with len: dd $-text.
section .text
Starts the code section.
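global _start
Makes the label available to the linker. The entry point must be exported to the ELF linker or loader, which conventionally recognize _start as their entry point. Use ld -e foo to override the default.
_start:
Standard ld entry point.
mov edx, len (or mov edx, [len] if you prefer variables to constants)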
Puts into the EDX register (EDX is a 32-bit register, RDX is its 64-bit equivalent) the length of the text to print; this is the third argument of the function you are going to call. In the first case the length is a constant; in the second you take it from a variable. By the way, copying data with MOV directly from one memory cell to another is not allowed:
mov [dest], [src] ; this is not allowed
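General-purpose register names follow this pattern:
<long register> ::= <prefix><letter>X
<short register> ::= <letter><suffix>
<letter> ::= A | B | C | D
<prefix> ::= R | E
<suffix> ::= X | H | L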
where, for example, the correct register names for letter A are:
RAX, EAX, AX, AH, AL
In this case you refer to register A and its different parts and sizes:
63                32 31            16 15      8 7       0
|                   |                |         |        |
|                   |                |...AH....|...AL...|  AH and AL: 8 bits
|                   |                |........AX........|  AX: 16 bits
|                   |................EAX................|  EAX: 32 bits
|..........................RAX..........................|  RAX: 64 bits
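The relationship between RAX and its parts can also be expressed with masks and shifts. The sketch below models the register as a plain 64-bit integer; the function name sub_registers and the sample value are hypothetical and serve only as illustration.

```python
# Model RAX as a 64-bit integer and derive its named sub-registers.
# sub_registers is a hypothetical helper, not part of any assembler.


def sub_registers(rax: int) -> dict:
    rax &= 0xFFFFFFFFFFFFFFFF         # keep 64 bits
    return {
        "RAX": rax,                   # bits 63..0
        "EAX": rax & 0xFFFFFFFF,      # bits 31..0
        "AX": rax & 0xFFFF,           # bits 15..0
        "AH": (rax >> 8) & 0xFF,      # bits 15..8
        "AL": rax & 0xFF,             # bits 7..0
    }


for name, value in sub_registers(0x1122334455667788).items():
    print(f"{name}: {value:#x}")
```

All five names refer to the same storage, so AX is always the low half of EAX, and AH and AL are the two halves of AX.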
mov ecx, text
Puts into the ECX register (the RSI register in the 64-bit equivalent code) the address of the first element of the text; this is the second argument of the function you are going to call.
mov ebx, 1
Puts into the EBX register (RDI) the value 1; this is the first argument of the function you are going to call, the so-called file descriptor or file handle, indicating where to write (in this case stdout, the standard output, i.e. the screen). Other file descriptors are: 0, standard input (stdin), and 2, standard error (stderr).
mov eax, 4
Puts into the EAX register (RAX) the value 4 (1 in the 64-bit equivalent code). This is the number of the Linux function sys_write you are going to call. Notice that these numbers differ between architectures and operating systems.
int 0x80 (syscall)
Calls the system function whose number is in the EAX register (RAX). In this case it is the sys_write function, which takes three arguments in the registers EBX, ECX and EDX (RDI, RSI and RDX).
In 32-bit code (int 0x80), system call arguments are passed in EBX, ECX, EDX, ESI, EDI and EBP. EAX is used to specify the number of the system function you are going to call.
In 64-bit code (syscall), system call arguments are passed in RDI, RSI, RDX, R10, R8, R9. RAX is used to specify the number of the system function. The values in registers RCX and R11 are destroyed.
INT means interrupt, and the number 0x80 is the interrupt number. An interrupt "transfers" the program flow to whoever is handling that interrupt. In Linux, the handler of interrupt 0x80 is the kernel, and programs use it to make system calls.
The kernel learns which system call is requested from the value in EAX. Each system call has different requirements for the use of the other registers. For example, a value of 1 in EAX means the exit() system call; in this case EBX holds the status code for exit().
mov ebx, 0
Puts into the EBX register (RDI) the value 0; this is the first argument of the function you are going to call, the so-called errorlevel, indicating whether the program terminated correctly or not (0 means that everything was all right and the program terminated normally).
mov eax, 1
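Puts into the EAX register (RAX) the value 1 (60 in the 64-bit equivalent code). This is the number of the Linux function sys_exit you are going to call to terminate the program.
int 0x80 (syscall)
Calls the system function whose number is in the EAX register (RAX).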
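The register conventions and call numbers described above can be collected into a small lookup table. The following Python sketch only restates the values given in the text (it makes no system calls); the names CONVENTIONS and register_setup are hypothetical.

```python
# Register conventions and call numbers as described in the text above.
CONVENTIONS = {
    "int 0x80": {
        "number": "EAX",
        "args": ["EBX", "ECX", "EDX", "ESI", "EDI", "EBP"],
        "calls": {"sys_write": 4, "sys_exit": 1},
    },
    "syscall": {
        "number": "RAX",
        "args": ["RDI", "RSI", "RDX", "R10", "R8", "R9"],
        "calls": {"sys_write": 1, "sys_exit": 60},
    },
}


def register_setup(instruction: str, call: str, *args) -> dict:
    """Return which register must hold which value before the call."""
    conv = CONVENTIONS[instruction]
    if len(args) > len(conv["args"]):
        raise ValueError("too many arguments")
    setup = {conv["number"]: conv["calls"][call]}
    # Arguments are assigned to the argument registers in order.
    setup.update(zip(conv["args"], args))
    return setup


print(register_setup("int 0x80", "sys_write", 1, "text", "len"))
print(register_setup("syscall", "sys_exit", 0))
```

The first call reproduces the four mov instructions that precede the first int 0x80 in hello.asm; the second shows the 64-bit form of the exit call.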
.data # Data section
text: .ascii "Hello World!\n"
len = . - text
.text
.global _start
_start:
movl $len, %edx
movl $text, %ecx
movl $1, %ebx
movl $4, %eax
int $0x80
# Exit
movl $0, %ebx
movl $1, %eax
int $0x80
# End of the code
The code looks a little bit strange but is equivalent to the previously presented NASM version, which you can verify by assembling it:
as hello.s -o hello.o
linking:
ld hello.o -o hello
and finally running:
fulmanp@fulmanp-k2:~/assembler$ ./hello
Hello World!
On a 64-bit system, assemble and link the 32-bit code as follows:
fulmanp@fulmanp-k2:~/assembler$ as --32 hello.s -o hello.o
fulmanp@fulmanp-k2:~/assembler$ ld -m elf_i386 hello.o -o hello
As before, you can verify that this is 32-bit code:
fulmanp@fulmanp-k2:~/assembler$ readelf -h hello
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048074
Start of program headers: 52 (bytes into file)
Start of section headers: 204 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 6
Section header string table index: 3
As you may notice, the NASM and GNU AS code are quite similar, but they differ in details. This is because NASM uses Intel syntax while GNU AS uses AT&T syntax. The next section describes the most important differences between them.
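Comments can be written in C style:
/*
comment
*/
or as single-line comments starting with a hash (#).
Register names are prefixed with %. To reference EAX:
AT&T: %eax
Intel: eax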
The order of operands is reversed: AT&T puts the source first and the destination second. To load EBX with the value in EAX:
AT&T: movl %eax, %ebx
Intel: mov ebx, eax
Immediate operands are prefixed with $. To load EAX with the address of the variable foo:
AT&T: movl $foo, %eax
Intel: mov eax, foo
To load EBX with 1:
AT&T: movl $1, %ebx
Intel: mov ebx, 1
Instruction mnemonics are suffixed with b, w, or l to specify the width of the destination register as a byte, word or longword (double word). If you omit the suffix, GNU AS will attempt to guess it, but it may guess incorrectly. Such a mistake shows up only when the code is executed and may be very difficult to diagnose, so it is better to use these specifiers.
AT&T: movw %ax, %bx
Intel: mov bx, ax
AT&T: immed32(basepointer,indexpointer,indexscale)
Intel: [basepointer + indexpointer*indexscale + immed32]
The formula to calculate the address is:
immed32 + basepointer + indexpointer * indexscale
You don't have to use all those fields, but you have to use at least one of immed32 or basepointer. For example:
AT&T: foo
Intel: [foo]
AT&T: (%eax)
Intel: [eax]
AT&T: variable(%eax)
Intel: [eax + variable]
AT&T: array(,%eax,4)
Intel: [eax*4 + array]
AT&T: 1(%eax)
Intel: [eax + 1]
Addressing a particular char in an array of 8-byte records (EAX holds the number of the record desired, EBX has the wanted char's offset within the record):
AT&T: array(%ebx,%eax,8)
Intel: [ebx + eax*8 + array]
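The address formula above is the same in both syntaxes; only the notation differs. A minimal Python sketch of the calculation follows. The parameter names are taken from the formula, while the function name effective_address and the array address 0x1000 are assumptions made for the example. The sketch restricts the scale to 1, 2, 4 or 8, which are the only values x86 can encode.

```python
# Compute immed32 + basepointer + indexpointer * indexscale for a 32-bit address.
# effective_address is a hypothetical helper used only for illustration.


def effective_address(immed32: int = 0, basepointer: int = 0,
                      indexpointer: int = 0, indexscale: int = 1) -> int:
    if indexscale not in (1, 2, 4, 8):
        raise ValueError("indexscale must be 1, 2, 4 or 8")
    # Wrap around at 32 bits, as a 32-bit address calculation would.
    return (immed32 + basepointer + indexpointer * indexscale) & 0xFFFFFFFF


array = 0x1000  # assumed address of the array, chosen for the example
# array(%ebx,%eax,8) with EBX = 3 (char offset) and EAX = 2 (record number)
print(hex(effective_address(immed32=array, basepointer=3,
                            indexpointer=2, indexscale=8)))
```

With these sample values the result is 0x1013: the start of the array, plus two records of 8 bytes, plus an offset of 3 within the record.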
+------------------------------+------------------------------------+
| Intel Code                   | AT&T Code                          |
+------------------------------+------------------------------------+
| mov eax,1                    | movl $1,%eax                       |
| mov ebx,0ffh                 | movl $0xff,%ebx                    |
| int 80h                      | int $0x80                          |
| mov ebx, eax                 | movl %eax, %ebx                    |
| mov eax,[ecx]                | movl (%ecx),%eax                   |
| mov eax,[ebx+3]              | movl 3(%ebx),%eax                  |
| mov eax,[ebx+20h]            | movl 0x20(%ebx),%eax               |
| add eax,[ebx+ecx*2h]         | addl (%ebx,%ecx,0x2),%eax          |
| lea eax,[ebx+ecx]            | leal (%ebx,%ecx),%eax              |
| sub eax,[ebx+ecx*4h-20h]     | subl -0x20(%ebx,%ecx,0x4),%eax     |
+------------------------------+------------------------------------+
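The rules for registers, immediates, operand order and the size suffix can be combined into a tiny converter. The Python sketch below is deliberately limited: it handles only two-operand instructions with a 32-bit register destination and a register, number or symbol as the source, and it rejects memory operands. The function name intel_to_att is hypothetical.

```python
# Convert a simple Intel-syntax instruction to AT&T syntax.
# Handles only "op reg32, reg32|immediate|symbol"; nothing else.

REGS32 = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def intel_to_att(line: str) -> str:
    mnemonic, operands = line.strip().split(None, 1)
    if "[" in operands:
        raise ValueError("memory operands are not handled in this sketch")
    dst, src = (op.strip() for op in operands.split(","))
    if dst not in REGS32:
        raise ValueError("destination must be a 32-bit register")
    if src in REGS32:
        att_src = "%" + src                        # registers get a % prefix
    elif src[0].isdigit() and src.endswith("h"):
        att_src = "$" + hex(int(src[:-1], 16))     # 0ffh becomes $0xff
    else:
        att_src = "$" + src                        # decimal number or symbol
    # AT&T order is source first; the l suffix marks a 32-bit operation.
    return f"{mnemonic}l {att_src}, %{dst}"


for example in ("mov eax,1", "mov ebx,0ffh", "mov ebx, eax", "mov eax, foo"):
    print(f"{example:<16} ->  {intel_to_att(example)}")
```

The converted lines match the corresponding rows of the table above apart from the spacing after the comma.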
; This program demonstrates basic text output to a screen.
; No "C" library functions are used.
; Calls are made to the operating system directly.
;
; assemble: nasm -f elf64 hello_64.asm -o hello_64.o
; link: ld hello_64.o -o hello_64
; run: ./hello_64
; output is: Hello World!
section .data ; Data section
text: db "Hello World!", 10 ; The string to print, 10=LF
len: equ $-text ; "$" means "here"
; len is a value, not an address
section .text ; Code section
global _start ; Make label available to linker
; We must export the entry point to the ELF linker or
; loader. They conventionally recognize _start as their
; entry point. Use ld -e foo to override the default.
_start: ; Standard ld entry point
mov rdx, len ; arg3: length of string to print
mov rsi, text ; arg2: pointer to string
mov rdi, 1 ; arg1: where to write, so called file descriptor
; in this case stdout (screen)
mov rax, 1 ; System call number (sys_write)
syscall ; Call a system function
; Exit
mov rdi, 0 ; Exit code, 0=normal
mov rax, 60 ; System call number (sys_exit)
syscall ; Call a system function
; End of the code
Verify correctness of the code by assembling it:
nasm -f elf64 hello_64.asm -o hello_64.o
linking:
ld hello_64.o -o hello_64
and finally running:
fulmanp@fulmanp-k2:~/assembler$ ./hello_64
Hello World!
For an explanation of the code, see the preceding section Explain the code.
Note that it is not enough to take the 32-bit code and rename the registers (for example, replacing EAX with RAX); even if you assemble it as a 64-bit program, the result is not a real 64-bit program, as mentioned in the section Making (pseudo) 64-bit code on 64-bit system with NASM.
File 1: routines.asm
os_return:
;some code to return to os
do_something:
;some code to do something
File 2: useRoutines.asm
main:
call do_something ; call function from separate file to do something
... maybe do something else here ...
call os_return ; call function from separate file to finish program
You can do this quite naturally:
routines.asm:
section .data
strHello db "Hello", 10
strLen equ $ - strHello
sys_exit equ 1
sys_write equ 4
stdout equ 1
section .text
global do_something
global exit
do_something:
mov edx, strLen
mov ecx, strHello
mov eax, sys_write
mov ebx, stdout
int 0x80
ret
exit:
mov eax, sys_exit
xor ebx, ebx
int 0x80
ret
useRoutines.asm:
section .text
extern do_something
extern exit
global _start
_start:
call do_something
call exit
With the code in separate files, you can assemble, link and run them almost as you do for a single file:
fulmanp@fulmanp-k2:~/assembler$ nasm -f elf -o routines.o routines.asm
fulmanp@fulmanp-k2:~/assembler$ nasm -f elf -o useRoutines.o useRoutines.asm
fulmanp@fulmanp-k2:~/assembler$ ld -m elf_i386 -o testSeparateRoutines routines.o useRoutines.o
fulmanp@fulmanp-k2:~/assembler$ ./testSeparateRoutines
Hello
If you want to use GCC to link your code, you have to change useRoutines.asm a little bit, because GCC's startup code expects the entry label main instead of _start:
useRoutines_for_gcc.asm:
section .text
extern do_something
extern exit
global main
main:
call do_something
call exit
fulmanp@fulmanp-k2:~/assembler$ nasm -f elf -o routines.o routines.asm
fulmanp@fulmanp-k2:~/assembler$ nasm -f elf -o useRoutines_for_gcc.o useRoutines_for_gcc.asm
fulmanp@fulmanp-k2:~/assembler$ gcc -m32 -o testSeparateRoutine routines.o useRoutines_for_gcc.o
fulmanp@fulmanp-k2:~/assembler$ ./testSeparateRoutine
Hello