- Say you want to write a subroutine "fact" that
returns the factorial of the input integer n. Below is a C version,
which is dangerous because the integer multiplication overflows for
n>=13. Use assembly to detect this overflow,
and return -1 if overflow occurs. Note that the "mul" instruction will set CF
(checked by jae/jb) if the product overflows *or* if either input is
negative.
You could leave "fact" and the loop in C, and use gcc inline assembly
to write an overflow-detecting multiplication subroutine. If
you've got lots of time and patience, you may also rewrite "fact" in
assembly,
or just write everything in assembly.
(NetRun example)
int fact(int n) {
if (n<=1) return 1; /* fact(1)==1 */
else return n*fact(n-1); /* fact(n)==n*fact(n-1) */
}
int foo(void) {
int i;
for (i=1;i<=20;i++)
print_int(fact(i));
return 0;
}
Once your program is working correctly, it should print out:
Printing integer 1 (0x1)
Printing integer 2 (0x2)
Printing integer 6 (0x6)
Printing integer 24 (0x18)
Printing integer 120 (0x78)
Printing integer 720 (0x2D0)
Printing integer 5040 (0x13B0)
Printing integer 40320 (0x9D80)
Printing integer 362880 (0x58980)
Printing integer 3628800 (0x375F00)
Printing integer 39916800 (0x2611500)
Printing integer 479001600 (0x1C8CFC00)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Printing integer -1 (0xFFFFFFFF)
Program complete. Return 0 (0x0)
- Write an x86 gcc C++ inline assembly subroutine "sort_ints" that
takes an array of integers (as a pointer and an integer length), and
sorts the array. I highly recommend using the standard C library routine "qsort",
which takes four arguments: a pointer to the array, the length of the
array in integers, the size of each array element in bytes, and a pointer
to a comparison routine--here, you can use the C routine "compare".
Test and turn in your subroutine with
this C++ driver code, which (aside from the __asm__ block) you are not
allowed to change. Your program should work for any input n.
(NetRun example)
extern "C" void sort_ints(int *arr,int n);
/* qsort-compatible comparison routine */
extern "C" int compare(int *a,int *b)
{ return *a-*b; }
__asm__( /* Assembly function body */
"sort_ints:\n"
" # Your code here...\n"
" ret\n"
);
int foo(void) {
int i,n=read_input(); /* Array contains n ints */
int *arr=new int[n];
for (i=0;i<n;i++) /* Create the ints */
arr[i]=rand()&0xf;
sort_ints(arr,n);
for (i=0;i<n;i++) /* Print them */
std::cout<<i<<" = "<<arr[i]<<std::endl;
delete[] arr;
return 0;
}
- Write an x86 gcc C++ inline assembly subroutine "add_ints" that
takes an array of integers (as a pointer and an integer length), and
adds 100 (decimal) to each integer in the array. Your code should work for any
array length, including zero. Test and turn in your subroutine with
this C++ driver code, which (aside from the __asm__ block) you are not allowed to change:
(NetRun example)
extern "C" void add_ints(int *arr,int n); /* C++ function prototype */
__asm__( /* Assembly function body */
"add_ints:\n"
" # Your code here...\n"
" ret\n"
);
int foo(void) {
int i,n=read_input(); /* Array contains n ints */
int *arr=new int[n];
for (i=0;i<n;i++) /* Create the ints */
arr[i]=rand()&0xf;
add_ints(arr,n); /* Add to them */
for (i=0;i<n;i++) /* Print them */
std::cout<<i<<" = "<<arr[i]<<std::endl;
delete[] arr;
return 0;
}
- The "rdtsc" instruction on x86 machines reads the "Time
Stamp Counter", an integer counter that increments every clock
cycle. This instruction takes no parameters, and returns the low
32 bits of this counter in %eax. Write a little routine named
"call_rdtsc" using inline assembly to make this instruction easy to
call from C++. Once call_rdtsc is written, submit it with this little routine that returns the number
of clock cycles it takes to call rdtsc (should be about 100, but it
varies...).
int foo(void) {
int t1=call_rdtsc();
int t2=call_rdtsc();
return t2-t1;
}
Like HW3, you'll turn these problem in by just naming them HW4_1, HW4_2, etc. in