
C compiler optimisation question

  • 31-07-2003 12:27pm
    #1
    Closed Accounts Posts: 5,564 ✭✭✭


    Question

    short c=strlen(variable);

    for(a=0;a<c;a++)

    for(a=0;a<strlen(variable);a++)

    Which is better optimised?
    That is, does strlen(variable) get evaluated on each iteration? Or, put another way, does using the variable c give any performance gain, assuming gcc 2.96 on Linux 2.2.1x?


Comments

  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    #include <stdio.h>

    /* Prints a message on every call, so the output shows how often the loop condition runs. */
    short tester(short value){
            printf("I got called\n");
            return value;
    }

    int main(int argc,char**argv){
            int a;

            for(a=0;a<tester(10);a++){
                    printf("\nIterating\n");
            }
            return 0;
    }
    

    [php]
    <?php
    function tester($value){
    printf("I got called\n");
    return $value;
    };


    $a;

    for($a=0;$a<tester(10);$a++){
    printf("\nIterating\n");
    }
    ?>
    [/php]


  • Closed Accounts Posts: 282 ✭✭glimmerman


    I would hazard a guess and say: yep, the c var should make it more optimal? At least you know *for sure* that c contains just a number that is not re-evaluated for each iteration.

    I would also hazard a guess and say the optimisation also depends on the -O switches you pass the compiler.

    <later>....
    hmmmm, just compiled the above to examine the assembler. It seems the compiler has by default already realised that the strlen is a fixed value and compares it to a... so the answer is <drumroll pleeeasee...> they are both the same, at least for my version of gcc, which is 3.1.
    That was without specifying any -O switch. I can't remember the default offhand, but as far as I know gcc defaults to -O0 (no optimisation) when none is given.

    And yes, changing the O switches substantially messes with the assembler produced.

    I'd say this behaviour will vary from compiler to compiler, so to keep the optimisation portable I'd still go with setting c to the result of strlen.
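    The two forms from the original question can be put side by side as a minimal sketch. The function names and the 'x'-counting loop body are hypothetical, added only so the loops do something checkable, and size_t is used instead of the question's short because that is the type strlen returns. Both functions return the same result; the only difference is whether strlen is guaranteed to run once or left to the optimiser to hoist.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts 'x' characters, calling strlen() exactly once. */
size_t count_x_cached(const char *variable)
{
    size_t c = strlen(variable);   /* length computed once, before the loop */
    size_t a, hits = 0;

    for (a = 0; a < c; a++) {
        if (variable[a] == 'x')
            hits++;
    }
    return hits;
}

/* Same loop with strlen() in the condition. Whether the call is hoisted
   out of the loop depends on the compiler and the options in use. */
size_t count_x_recomputed(const char *variable)
{
    size_t a, hits = 0;

    for (a = 0; a < strlen(variable); a++) {
        if (variable[a] == 'x')
            hits++;
    }
    return hits;
}
```

    Calling either function on "xaxbx" gives 3. If the compiler does not hoist the call, the second form rescans the string on every pass, which makes the loop quadratic in the string length.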


    Also, I think your second set of C code might change the way the compiler optimises this, since it now HAS to call the printf... so it may choose to call the test function. The comments above refer to the original question.


  • Registered Users Posts: 491 ✭✭Silent Bob


    A good compiler (and gcc is a good compiler) will optimise well beyond what the programmer can think of.

    There will be no performance gain from using the variable c. As far as optimisations go, for a compiler that's an easy one.

    My compiler design lecturer told the class last year that there is little point trying to over-optimise your code yourself: the compiler can and will do a better job.

    There's a complete list of the different optimisation options for gcc here


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    Yeah, but the code is being compiled into PHP, which is being statically linked into Apache.

    I'm not sure what the effect would be of changing the default optimisation settings that came with PHP and were used to build the other (n) PHP modules. I'd feel slightly uneasy about 'tweaking' optimisation settings on a production build for our webservers/PHP modules/in-house PHP extension modules.


  • Registered Users Posts: 491 ✭✭Silent Bob


    Well then it's a matter of how good do you think the c-to-php compiler is... ;)


  • Closed Accounts Posts: 9,314 ✭✭✭Talliesin


    In a specific case the only way to know for sure is to try it and see what is produced.

    Could you try your test compilation again with tester explicitly marked as inline and tell us what happens?

    If that's an improvement then of course the question becomes, is strlen treated as inline on GCC? (or perhaps, is it treated as inline on GCC with the options being used?).

    With standard functions that are both heavily used and potentially expensive, it's common for some extremely heavy optimisations to occur. As such, the optimiser is likely to "know" more about strlen than about tester and act appropriately.
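    One caveat worth sketching here: however much the optimiser knows about strlen, it can only move the call out of the loop condition when it can show the string is not modified inside the loop. In the hypothetical pair below (the function names and the cut-at-first-space body are mine, not from the thread), the loop shortens the string, so the two forms visit a different number of characters and are not interchangeable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: terminates the string at the first space,
   re-reading strlen() on every pass. Returns the number of passes. */
size_t visits_recomputed(char *s)
{
    size_t a, visits = 0;

    for (a = 0; a < strlen(s); a++) {
        visits++;
        if (s[a] == ' ')
            s[a] = '\0';   /* shortens the string, so the next strlen() is smaller */
    }
    return visits;
}

/* Same body, but the length is captured once before the loop,
   so the loop still walks the whole original buffer. */
size_t visits_cached(char *s)
{
    size_t c = strlen(s);
    size_t a, visits = 0;

    for (a = 0; a < c; a++) {
        visits++;
        if (s[a] == ' ')
            s[a] = '\0';
    }
    return visits;
}
```

    With a writable buffer holding "ab cd", the recomputed form stops after 3 passes while the cached form makes 5. A compiler has to preserve that difference, so here it cannot treat the strlen call as a loop constant.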

    In general it's best not to worry too much about it unless you are squeezing the last drop out of the processor, in which case you should go off and learn a lot about how your optimiser works and about the comparative costs of the instructions it produces. Of course, compiling to something other than machine code or easily processed byte-code makes these concerns greater, but even then the efficiency of the overall algorithms and program structure tends to be of much greater importance than reasonable differences between a couple of lines of code.

    I would generally have gone for the former example, not because it is likely to be more efficient, but because I prefer to have simple for clauses.
    That said, I prefer to use the logic of a half-open sequence in for clauses wherever applicable, and that often means a call to an end() member function. I generally leave those in the for clause, partly because they read well as "from beginning to end", partly because I'm so used to their being there, and partly because end() member functions are typically trivial to inline, hence there should be no cost over alternative forms.

    You might like to look at:
    while (*variable++){
    }
    
    or
    while (*(variable + a++)){
    }
    

    The latter preserves variable, and lets you know the current index if that value was being used inside the loop. Like the examples of the original question they assume that there is no potential of a very large string causing an error.
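    The two while forms can be wrapped in small functions to see what they leave behind, as a rough sketch. The function names and the counters are hypothetical additions. One detail to watch in the indexed form: because of the post-increment, a is already one past the character just tested while the body runs, and two past the last character once the loop exits.

```c
#include <stddef.h>

/* Pointer form: advances the pointer itself. The caller's pointer is
   unaffected only because C passes it by value. */
size_t length_by_pointer(const char *variable)
{
    size_t n = 0;

    while (*variable++) {
        n++;               /* one pass per non-terminator character */
    }
    return n;
}

/* Indexed form: variable is preserved and a tracks the position.
   Inside the body the character just tested is variable[a - 1]. */
size_t length_by_index(const char *variable)
{
    size_t a = 0;

    while (*(variable + a++)) {
        /* loop body would go here */
    }
    return a - 1;          /* a was incremented once more when the terminator was tested */
}
```

    Both functions return 3 for "abc" and 0 for an empty string, matching strlen. Using size_t for the index sidesteps the large-string problem mentioned above, which a short index would not.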

    In cases where the function was a template I would be more careful to use the former version if strlen() might be a different function depending on the template arguments. In such a case I can't know in advance the relative complexity of the function called and even experimentation won't tell me how well the optimiser will deal with it for all cases, so it's better to be that little bit more fussy about efficiency at that level of coding.


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    Neither specifying that the function is inline nor using -O3 (with and without inline) stops the function being called.

    Bummer..


  • Closed Accounts Posts: 9,314 ✭✭✭Talliesin


    It would appear, therefore, that the PHP build is overriding the normal optimisation. Best to assume there is none, especially if that's the only way that code is going to be used.


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    Nah.

    I've made a file called a.c

    a.c contains the function outlined above.

    If I specify that function as inline (in the c source) and compile it like this

    gcc -O3 a.c -o a

    The function still prints its message on every iteration.
    Unless I've missed something?

    [edit]
    bodonoghue@Vader:~$ more a.c
    #include <stdio.h>

    static short inline tester(short value){
            printf("I got called\n");
            return value;
    };

    int main(int argc,char**argv){

            int a;

            for(a=0;a<tester(10);a++){
                    printf("\nIterating\n");
            }
    return 0;
    };
    bodonoghue@Vader:~$
    
    
    bodonoghue@Vader:~$ gcc -O3 -o a a.c
    bodonoghue@Vader:~$ ./a
    I got called

    Iterating
    I got called

    Iterating
    I got called
    <snip>
    bodonoghue@Vader:~$


  • Registered Users Posts: 1,931 ✭✭✭Zab


    If the function didn't get called in your example, it would be a totally incorrect optimisation. The way your code is written, the function has to get called, because it is printing something out. The only time a compiler could optimise out a function call is if it knows the return value already and knows that there are no side-effects to the call ( such as printing something to the screen ). I don't even know if any compilers do this, except perhaps for known library calls (strlen in above example).

    Zab.
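    The point about side effects can be checked without reading console output. The sketch below swaps the printf for a hypothetical call counter (calls, run_loop and tester_calls are names invented here). Whatever the optimisation level, the program must behave as if tester ran every time the loop condition was evaluated: ten times returning true and once more to end the loop.

```c
/* Hypothetical counter standing in for the printf side effect. */
static int calls = 0;

static short tester(short value)
{
    calls++;               /* side effect: the final count must reflect every evaluation */
    return value;
}

/* Runs the loop from the thread and returns how many times the body executed. */
int run_loop(void)
{
    int a, iterations = 0;

    calls = 0;
    for (a = 0; a < tester(10); a++) {
        iterations++;
    }
    return iterations;
}

/* Reports how many times tester() was evaluated by the last run_loop(). */
int tester_calls(void)
{
    return calls;
}
```

    After run_loop() returns 10, tester_calls() reports 11. A compiler is free to inline tester and fold the arithmetic, but it may not produce a different count.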


  • Closed Accounts Posts: 5,564 ✭✭✭Typedef


    Without -O3
    Dump of assembler code for function main:
    0x8048328 <main>: push %ebp
    0x8048329 <main+1>: mov %esp,%ebp
    0x804832b <main+3>: sub $0x8,%esp
    0x804832e <main+6>: and $0xfffffff0,%esp
    0x8048331 <main+9>: mov $0x0,%eax
    0x8048336 <main+14>: sub %eax,%esp
    0x8048338 <main+16>: movl $0x0,0xfffffffc(%ebp)
    0x804833f <main+23>: sub $0xc,%esp
    0x8048342 <main+26>: push $0xa
    0x8048344 <main+28>: call 0x8048372 <tester>
    0x8048349 <main+33>: add $0x10,%esp
    0x804834c <main+36>: cwtl
    0x804834d <main+37>: cmp %eax,0xfffffffc(%ebp)
    0x8048350 <main+40>: jl 0x8048354 <main+44>
    0x8048352 <main+42>: jmp 0x804836b <main+67>
    0x8048354 <main+44>: sub $0xc,%esp
    0x8048357 <main+47>: push $0x80483d0
    0x804835c <main+52>: call 0x8048268 <printf>
    0x8048361 <main+57>: add $0x10,%esp
    0x8048364 <main+60>: lea 0xfffffffc(%ebp),%eax
    0x8048367 <main+63>: incl (%eax)
    0x8048369 <main+65>: jmp 0x804833f <main+23>
    0x804836b <main+67>: mov $0x0,%eax
    0x8048370 <main+72>: leave
    0x8048371 <main+73>: ret

    With -O3
    0x8048324 <main>: push %ebp
    0x8048325 <main+1>: mov %esp,%ebp
    0x8048327 <main+3>: push %ebx
    0x8048328 <main+4>: push %eax
    0x8048329 <main+5>: and $0xfffffff0,%esp
    0x804832c <main+8>: mov $0x9,%ebx
    0x8048331 <main+13>: lea 0x0(%esi),%esi
    0x8048334 <main+16>: sub $0xc,%esp
    0x8048337 <main+19>: push $0x8048398
    0x804833c <main+24>: call 0x8048254 <puts>
    0x8048341 <main+29>: add $0x10,%esp
    0x8048344 <main+32>: dec %ebx
    0x8048345 <main+33>: jns 0x8048334 <main+16>
    0x8048347 <main+35>: xor %eax,%eax
    0x8048349 <main+37>: mov 0xfffffffc(%ebp),%ebx
    0x804834c <main+40>: leave
    0x804834d <main+41>: ret
    0x804834e <main+42>: nop
    0x804834f <main+43>: nop
    End of assembler dump.

    With -O2
    Dump of assembler code for function main:
    0x8048324 <main>: push %ebp
    0x8048325 <main+1>: mov %esp,%ebp
    0x8048327 <main+3>: push %ebx
    0x8048328 <main+4>: push %eax
    0x8048329 <main+5>: and $0xfffffff0,%esp
    0x804832c <main+8>: mov $0x9,%ebx
    0x8048331 <main+13>: lea 0x0(%esi),%esi
    0x8048334 <main+16>: sub $0xc,%esp
    0x8048337 <main+19>: push $0x8048398
    0x804833c <main+24>: call 0x8048254 <puts>
    0x8048341 <main+29>: add $0x10,%esp
    0x8048344 <main+32>: dec %ebx
    0x8048345 <main+33>: jns 0x8048334 <main+16>
    0x8048347 <main+35>: xor %eax,%eax
    0x8048349 <main+37>: mov 0xfffffffc(%ebp),%ebx
    0x804834c <main+40>: leave
    0x804834d <main+41>: ret
    0x804834e <main+42>: nop
    0x804834f <main+43>: nop
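    Read back into C, the -O2 and -O3 listings look roughly like the sketch below. This is my reading of the dump, not something stated in the post: the call to tester is gone, the bound has become the constant 9 loaded into %ebx, the loop counts down until the value goes negative, and printf("\nIterating\n") has been turned into puts. The listing contains only one puts call, which suggests tester had no printf of its own in that build. The function name and the iterations counter are hypothetical, added so the result can be checked.

```c
#include <stdio.h>

/* Approximate C equivalent of the optimised main loop in the dump. */
int run_countdown(void)
{
    int remaining = 9;     /* mov $0x9,%ebx */
    int iterations = 0;    /* not in the dump; added only for checking */

    do {
        puts("\nIterating");        /* call puts */
        iterations++;
    } while (--remaining >= 0);     /* dec %ebx ; jns back to the loop top */

    return iterations;
}
```

    The body runs for remaining values 9 down to 0, so ten times in all, the same as the source loop with a bound of 10.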

