CS 301 - Homework 4

  1. Say you want to write a subroutine "fact" that returns the factorial of the input integer n.  Below is a C version, which is dangerous because the integer multiplication overflows for n>=13.  Use assembly to detect this overflow, and return -1 if overflow occurs.  Note that the "mul" instruction will set CF (checked by jae/jb) if the product overflows *or* if either input is negative.  

  2. You could leave "fact" and the loop in C, and use gcc inline assembly to write an overflow-detecting multiplication subroutine.  If you've got lots of time and patience, you may also rewrite "fact" in assembly, or just write everything in assembly. 
    (NetRun example)
    int fact(int n) { 
    if (n<=1) return 1; /* fact(1)==1 */
    else return n*fact(n-1); /* fact(n)==n*fact(n-1) */
    }
    int foo(void) {
    int i;
    for (i=1;i<=20;i++)
    print_int(fact(i));
    return 0;
    }

    Once your program is working correctly, it should print out:

    Printing integer 1 (0x1)
    Printing integer 2 (0x2)
    Printing integer 6 (0x6)
    Printing integer 24 (0x18)
    Printing integer 120 (0x78)
    Printing integer 720 (0x2D0)
    Printing integer 5040 (0x13B0)
    Printing integer 40320 (0x9D80)
    Printing integer 362880 (0x58980)
    Printing integer 3628800 (0x375F00)
    Printing integer 39916800 (0x2611500)
    Printing integer 479001600 (0x1C8CFC00)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Printing integer -1 (0xFFFFFFFF)
    Program complete. Return 0 (0x0)
  3. Write an x86 gcc C++ inline assembly subroutine "sort_ints" that takes an array of integers (as a pointer and an integer length), and sorts the array. I highly recommend using the standard C library routine "qsort", which takes four arguments: a pointer to the array, the length of the array in integers, the size of each array element in bytes, and a pointer to a comparison routine--here, you can use the C routine "compare".  Test and turn in your subroutine with this C++ driver code, which (aside from the __asm__ block) you are not allowed to change.  Your program should work for any input n.  (NetRun example)
    extern "C" void sort_ints(int *arr,int n);

    /* qsort-compatible comparison routine */
    extern "C" int compare(int *a,int *b)
    { return *a-*b; }

    __asm__( /* Assembly function body */
    "sort_ints:\n"
    " # Your code here...\n"
    " ret\n"
    );

    int foo(void) {
    int i,n=read_input(); /* Array contains n ints */
    int *arr=new int[n];
    for (i=0;i<n;i++) /* Create the ints */
    arr[i]=rand()&0xf;
    sort_ints(arr,n);
    for (i=0;i<n;i++) /* Print them */
    std::cout<<i<<" = "<<arr[i]<<std::endl;
    delete[] arr;
    return 0;
    }
  4. Write an x86 gcc C++ inline assembly subroutine "add_ints" that takes an array of integers (as a pointer and an integer length), and adds 100 (decimal) to each integer in the array. Your code should work for any array length, including zero. Test and turn in your subroutine with this C++ driver code, which (aside from the __asm__ block) you are not allowed to change:
    (NetRun example)
    extern "C" void add_ints(int *arr,int n); /* C++ function prototype */
    __asm__( /* Assembly function body */
    "add_ints:\n"
    " # Your code here...\n"
    " ret\n"
    );

    int foo(void) {
    int i,n=read_input(); /* Array contains n ints */
    int *arr=new int[n];
    for (i=0;i<n;i++) /* Create the ints */
    arr[i]=rand()&0xf;
    add_ints(arr,n); /* Add to them */
    for (i=0;i<n;i++) /* Print them */
    std::cout<<i<<" = "<<arr[i]<<std::endl;
    delete[] arr;
    return 0;
    }
  5. The "rdtsc" instruction on x86 machines reads the "Time Stamp Counter", an integer counter that increments every clock cycle.  This instruction takes no parameters, and returns the low 32 bits of this counter in %eax.  Write a little routine named "call_rdtsc" using inline assembly to make this instruction easy to call from C++.  Once call_rdtsc is written, submit it with this little routine that returns the number of clock cycles it takes to call rdtsc (should be about 100, but it varies...).
    int foo(void) {
    int t1=call_rdtsc();
    int t2=call_rdtsc();
    return t2-t1;
    }
Like HW3, you'll turn these problem in by just naming them HW4_1, HW4_2, etc. in NetRun.  These should be run in the "Whole Subroutine (file)" mode of NetRun, not the "Subroutine Fragment" mode as we've been running before.  Your code MUST preserve all the registers x86 code is supposed to preserve (%esp, %ebp, %esi, %edi), and your code must not crash after running!

Problems are due Monday, October 10, at Midnight. 

O. Lawlor, ffosl@uaf.edu
Up to: Class Site, CS, UAF