Strings in Assembly

CS 301 Lecture, Dr. Lawlor

Constant Strings

The bottom line is that a C string is just a region of memory holding ASCII characters.  Each ASCII character is one byte, and a zero byte marks the end of the string.  So any way you can create bytes with known values, you can create strings too.  Here we use the handy C function puts to print the string to the screen; the three bytes below spell "Yup".
extern puts

mov rdi,the_secret_message
call puts ; write our string to the screen

ret

the_secret_message:
db 0x59
db 0x75
db 0x70
db 0 ;<- zero byte marks end of string

(Try this in NetRun now!)
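For comparison, the same idea can be sketched in C.  This is only an illustration, not part of the NetRun example: the names secret_bytes, secret_message, and print_secret are made up here.  The array holds the same four bytes as the db lines above, and puts treats it like any other string.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical name: the same four bytes as the db lines above. */
static const char secret_bytes[] = { 0x59, 0x75, 0x70, 0 };

/* Return a pointer to the bytes; a C string is nothing more than this. */
const char *secret_message(void) {
    return secret_bytes;
}

/* Print the bytes with puts, as the assembly does.
   Returns the string length, not counting the zero byte. */
size_t print_secret(void) {
    puts(secret_message());
    return strlen(secret_message());
}
```

Calling print_secret() from main prints Yup and returns 3, because strlen stops counting at the zero byte.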

That's a pretty atrocious way to write strings, so the assembler supports a bunch of other syntaxes.  These are all equivalent:
	db 0x4D, 0x6F, 0x6F, 0x73, 0x65, 0  (Try this in NetRun now!)
	db 'M', 'o', 'o', 's', 'e', 0  (Try this in NetRun now!)
	db 'Moose', 0  (Try this in NetRun now!)
In the assembler, single and double quotes are interchangeable.  That is unlike C++, where a single-quoted 'M' is the integer value 0x4D, but a double-quoted "M" is a pointer to a zero-terminated string {0x4D,0}.  Also unlike C++, "\n" doesn't give a newline: the assembler stores a backslash and an 'n', so it prints "\n"!  To get an actual newline, use the byte 10 or 0xA (this is ASCII "LF", line feed).  Don't forget the zero byte to end the string.
	db 'Moose',0xA
db '... and squirrel.',0

(Try this in NetRun now!)
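The C side of that comparison can be checked with a short sketch.  The function and array names below are invented for illustration.  The byte array mirrors the two db lines above, with 0xA written out as a number the way the assembler needs it.

```c
#include <string.h>

/* In C, 'M' is an integer constant with the value 0x4D (in ASCII)... */
int single_quote_value(void) { return 'M'; }

/* ...while "M" is an array of two bytes: 0x4D and the terminating zero. */
size_t double_quote_size(void) { return sizeof("M"); }

/* The assembler's  db 'Moose',0xA  then  db '... and squirrel.',0
   written out as explicit bytes. */
static const char moose[] = {
    'M','o','o','s','e', 0xA,
    '.','.','.',' ','a','n','d',' ','s','q','u','i','r','r','e','l','.', 0
};

/* Returns 1 if those bytes equal the C literal that uses "\n". */
int matches_c_literal(void) {
    return strcmp(moose, "Moose\n... and squirrel.") == 0;
}
```

Here the C compiler turns "\n" into the single byte 0xA for you; in the assembler's single or double quotes you have to write that byte yourself.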

Keep in mind that puts adds a newline at the end of the string.  Call printf to avoid the newline.

Variable Strings

The handy C functions gets and puts read a string from the keyboard or write one to the screen.  Each takes just one argument, a pointer to the string data to read or write. (Beware!  Neither function takes the buffer size, so gets can write past the end of your buffer, and puts can read past it if the zero byte is missing, causing crashes or security problems!  Demonstration use only!)

For example, I can store a modifiable string statically, in "section .data":
extern gets
extern puts
mov rdi,mystring
call gets
mov rdi,mystring
call puts
ret

section .data
mystring:
times 100 db 'v'

(Try this in NetRun now!)
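Since gets has no idea how big the buffer is, one common alternative outside of demos is the standard C function fgets, which does take a size.  This is a different technique from the lecture code above, sketched here in C; the helper name read_line_bounded is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf, never writing more than
   size bytes (including the terminating zero).  Returns buf, or NULL on
   end-of-file or error.  Unlike gets, fgets keeps the newline, so we
   strip it to match what gets would have stored. */
char *read_line_bounded(char *buf, size_t size, FILE *in) {
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = 0;   /* cut the string at the newline, if any */
    return buf;
}
```

With a 100-byte buffer like mystring, a call such as read_line_bounded(mystring, 100, stdin) stores at most 99 characters plus the zero byte.  A longer input line is split across calls instead of overrunning the buffer.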

*Or* I can allocate space on the stack to store the string (and hello, stack-smashing attacks!)
extern gets
extern puts

sub rsp,104 ; 100 bytes for the string, +4 so rsp stays 16-byte aligned at each call

mov rdi,rsp
call gets ; read into our string
mov rdi,rsp
call puts ; write our string to the screen

add rsp,104 ; give back stack space
ret

(Try this in NetRun now!)

Or I can call "malloc" to allocate space for the string.  The calls to gets and puts are free to overwrite scratch registers like rax, so I need a preserved register to hang onto the allocated pointer; here I'm using r12.
extern gets
extern puts
extern malloc, free

push r12 ; preserve main's copy on the stack

mov rdi,100
call malloc
mov r12,rax ; <- malloc returns the pointer in rax

mov rdi,r12
call gets ; read into our string
mov rdi,r12
call puts ; write our string to the screen

mov rdi,r12
call free ; dispose of our copy of the string

pop r12 ; restore main's copy of this register
ret

(Try this in NetRun now!)

The bottom line: a string can live anywhere you have a pointer to enough writable memory, whether that's static data, the stack, or the heap.
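The three storage choices above (static data, stack, and malloc) can be put side by side in a short C sketch.  The names static_buf, store_greeting, and try_all_three are invented for illustration; the point is only that the same code works through any of the three pointers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char static_buf[100];           /* like "section .data" */

/* Hypothetical helper: store a string through whatever pointer it is given,
   writing at most size bytes including the zero byte. */
char *store_greeting(char *dest, size_t size) {
    snprintf(dest, size, "%s", "Moose");
    return dest;
}

/* Use the same helper on static, stack, and heap memory.
   Returns how many buffers ended up holding "Moose", or -1 if malloc fails. */
int try_all_three(void) {
    char stack_buf[100];               /* like "sub rsp,..." */
    char *heap_buf = malloc(100);      /* like "call malloc" */
    int ok = 0;
    if (heap_buf == NULL) return -1;
    ok += strcmp(store_greeting(static_buf, sizeof static_buf), "Moose") == 0;
    ok += strcmp(store_greeting(stack_buf, sizeof stack_buf), "Moose") == 0;
    ok += strcmp(store_greeting(heap_buf, 100), "Moose") == 0;
    free(heap_buf);                    /* like "call free" */
    return ok;
}
```

store_greeting never needs to know where its buffer came from; only the caller decides how the memory is allocated and released.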