Optimisation tips / tricks

Khannie · 08-02-2007 03:09PM #1

I'll try to keep this language agnostic. I've been looking around for tips / tricks today to help speed up c/c++ code, but a lot of the stuff I've come across / already know is generic. Maybe we can build up a list of useful tips / tricks if people add to this thread.....

Let's keep this specific to algorithmic / code based speed optimisations, not style based stuff (like the "if (NULL==value)" which is a useful tip, but wont help speed).

for(i = 0; i < get_int_value(); i++)
{
    do_stuff....
}

where the return value of get_int_value() is not a const int (but wont change during the for loop's lifetime) should be replaced with:

int n = get_int_value();
for (i = 0; i < n; i++)
{
    do_stuff....
}

Why?

in the first instance, get_int_value() will be called n times, causing n stack pushes (no matter what language you use) and possibly n expensive calculations. In the 2nd instance, there is only 1 stack push. Stack pushes are expensive. Especially with modern hardware and pipelining.

I have tested this one out myself for performance, and it is /substantially/ faster to use the 2nd instance, even with little calculation in get_int_value().

for (i = 0; i < n; i++)
{
    if (a)
    {
        do_something();
    }
    else if (b)
    {
        do_something_else();
    }
    else
    {
        do_default();
    }
}

This seems perfectly natural....however.....If a / b will not change during the lifetime of the for loop, this can be optimised to:

if (a)
{
    for (i = 0; i < n; i++)
    {
        do_something();
    }
}
else if (b)
{
    for (i = 0; i < n; i++)
    {
        do_something_else();
    }
}
else
{
    for (i = 0; i < n; i++)
    {
        do_default();
    }
}

why?

You save n-1 if / else if / else's.

It seems unusual to have the 3 for loops, but the optimisation is clear here.

Anyone else got some gems?

edit: I will add more to this....just a bit busy right now, but wanted to get the ball rolling....

satchmo · 08-02-2007 06:40PM

Good idea Khannie. I have a couple of C++-specific ones off the top of my head:

1) Pass in references instead of objects:

BigObject bar;
foo(bar);

void foo(BigObject arg){
... [COLOR="Red"]bad[/COLOR] ...
}

void foo(BigObject [b]&[/b]arg){
... [COLOR="Green"]good[/COLOR] ...
}

In the first case, the whole of BigObject (which might be hundreds of bytes or more) is being passed in on the stack, and has to be copied in order to be used in the function. On the other hand, the second case only uses a 32-bit reference, which can be passed in a register and is therefore very fast. You just need to be aware that since this is a reference now, any changes to the object inside the function will affect the original object, as opposed to the first case where it's simply a copy. You can use a pointer too instead of a reference, although this will need the calling code to be changed (foo(&bar)) as well as the function argument (void foo(BigObject *arg)).

2) Disable exception handling
Obviously not something to do if you actually use exceptions, but for the vast majority of C++ code that doesn't, turning off exceptions in the compiler (Project properties->C/C++->Code Generation in Visual Studio 8) is a 10-second job and can be a small to moderate speed win, as well as reducing code size. In any case it can't hurt.

3) Avoid memory allocations
Memory allocation is slow. If you need to allocate memory, try to do it intelligently in an constructor or somewhere else that isn't inside a speed-critical loop. This is also applicable for STL's vector and other similar dynamic containers that silently allocate memory; if you know in advance the approximate size that you need the container to be, initialise it that way in the first place (Vector::reserve() or similar). This will avoid unwanted memory allocations caused by the container running out of space that might happen in an inner loop, right when you want the most speed.

4) Inline
Although it's only a hint to the compiler, inline small functions that will be called frequently by using the inline keyword or by including the function body in the class definition. If the overhead of the function call is close to the amount of time spent inside the actual function (eg small utility functions like max() or vector operations), it's a good candidate for inlining.

That's all I can think of at the moment...

Martyr · 08-02-2007 07:03PM

Agner Fog has written a good paper on optimising c++

Optimizing software in C++

amen · 08-02-2007 10:56PM

[PHP]if (a)
{
for (i = 0; i < n; i++)
{
do_something();
}
}
else if (b)
{
for (i = 0; i < n; i++)
{
do_something_else();
}
}
else
{
for (i = 0; i < n; i++)
{
do_default();
}
}[/PHP]

I don't like the multiple for loops in the above code
not sure what to replace them with though...

Civilian_Target · 08-02-2007 11:53PM

To be honest, the best optimisation tips (as someone who has worked in performance analysis for a living) are generally to avoid performance antipatterns.
Yes, you can save a few ms here and there by what we're describing above but certain bad developer behaviour butchers code. Here's a few common anti-paterns I've observed.

- Check your path ordering! People just shove stuff into libraries that are then placed on the path but most machines always read of the path in the order that the OS gives them. Therefore, the most used ones should come first, and the less frequent ones should come last.

- Cache it if you can. Don't synchronise it if you don't have to.

- Never SELECT *. Only ever select what you need, and remove from the selects what you don't need. I strongly advocate the use of aspect-oriented weaving to update things like this. Other big DB tips are, avoiding joins unless you really need them, particularly outer joins, and INDEX the clauses you use in where statements, don't index the ones you don't use!

- Buffers are a fantastic saver of menial work. Particularly file write buffers, display buffers and string buffers.

- If you're using network comms, use asynchronous sends and receives wherever possible, keep working for as long as you can and use a wait for the data to arrive.

- Watch your threads. 1 thread can be disadvantageous, but its still better than spawning 100 for every little thing!

- Know your complexity theory, and know the complexity of code you use for large things frequently. You will be amazed how many stupid methods are called NP times when they could be called polynomially, or often just once!

satchmo · 09-02-2007 01:50AM

amen wrote:

I don't like the multiple for loops in the above code
not sure what to replace them with though...

What is it you don't like about them, just the aesthetic of the code? There aren't really many alternatives in this example...

Khannie · 09-02-2007 10:22AM

amen wrote:

I don't like the multiple for loops in the above code
not sure what to replace them with though...

It definitely does look weird initially, but the computational savings are clear.

I was looking through the xvid code yesterday (a powerful open source video codec) and saw it littered with examples like the above. Video compression is one of the few tasks that still takes hours for the average joe to run, so optimisation is important there.

Mutant_Fruit · 09-02-2007 01:32PM

Pass by reference (pointers) rather than by value. Pushing a pointer on the stack costs a lot less than pushing a 160byte object. The benefits are clear. It can increase speed by up to 3x. I'd only really recommend it for functions which will be called a lot of times.

Pull as many things outside of loops as you can. For example if you need to use the current time as part of your calculations there is no need to find out the current time at the start of each iteration of a tight loop.

Good:
DateTime current = DateTime.Now;
for(int i=0; i < 10000; i++)
dostuff();

Bad:
for(int i=0; i < 100000; i++)
{
DateTime current = DateTime.Now;
dostuff();
}

Reuse, reduce, recycle. Suppose you're doing a lot of transferring of data with sockets and you are using some byte arrays as temporary buffers. You should reclaim the buffer when you are finished using it and store it in a "BufferManager" and then when you need a new buffer you take one from the buffer. This reduces the amount of allocations you need to make and so speeds up your program.

When you are multithreading, you *will* need to use locks. Try to keep each lock activated for as little time as possible to reduce contention.

Talon.ie · 12-02-2007 03:18PM

satchmo wrote:
1) Pass in references instead of objects:
BigObject bar;
foo(bar);

void foo(BigObject arg){
... [COLOR="Red"]bad[/COLOR] ...
}

void foo(BigObject [b]&[/b]arg){
... [COLOR="Green"]good[/COLOR] ...
}
snip...

You just need to be aware that since this is a reference now, any changes to the object inside the function will affect the original object, as opposed to the first case where it's simply a copy.

And if don't want your reference to be modified you can just use const

BigObject bar;
foo(bar);

void foo(BigObject arg){
... [COLOR="Red"]bad[/COLOR] ...
}

void foo([b]const[/b] BigObject [b]&[/b]arg){
... [COLOR="Green"]good[/COLOR] ...
}

Talon.ie · 12-02-2007 03:56PM

While the tips above are all worth paying attention to the best way to speed up your programs is to follow the cardinal rule of optimisation: profile profile profile.

Profile your code, fix the area the profiler highlights, repeat.

And just to stay totally on topic here's a few tips.

Prefer object initialisation over assignment.

// Initialization
SomeType x = t; //Invokes copy constructor

// Assignment
SomeType x;   // default ctor
x = t; // assignment operator

Catch exceptions by reference

try{
...
}
catch(SomeException ex){
  ...
[COLOR="Red"]// bad[/COLOR] 
}

try{
...
}
catch(SomeException & ex){
  ...
  [COLOR="Green"]// good[/COLOR]
}

Khannie · 13-02-2007 10:48AM

Use a reference argument instead of a return value:

Use

void myMethod(Object& object)
{
    //do stuff to object in here
}

instead of

Object myMethod(void)
{
    //do stuff....
    return object;
}

You can see examples of this littered throughout MFC (though with pointers instead of references).

Mutant_Fruit · 14-02-2007 03:07PM

Now before everyone decides that passing pointers is better than passing by value, it's not!

It's only important when you want *every* bit of speed you can get. It can also be language dependant as to the benefits. For example in C# if i passed an object as a return type, that's the same performance penalty as passing a pointer back (or assigning to the "output" variable). Thats because all objects in C# are pretty much just pointers to the managed heap, so you are actually passing a pointer around all the time.

So the performance would be the same in these two snippets:

public void MyMethod(ref MyHugeClass  result)
{
    // DO STUFF
    // asssume i have a variable called "Answer" which is a huge class

    result = answer;
}


public void MyMethod()
{
    // DO STUFF
    // asssume i have a variable called "Answer" which is a MyHugeClass

    return answer;
}

However if i was passing back a struct, that would incur a bigger penalty as structs are passed by value by default, meaning i could be copying 100's of bytes in the stack.

If i were using structs, the first method above could be considerably faster. The actual gain would be highly dependant on the code being executed in the method.

Optimisation tips / tricks

Comments