C++ STL and memory management

Call Me Jimmy · 07-07-2009 12:18PM #1

After finishing my first C++ book and putting a lot of work into writing container classes for educational purposes I have moved on to the STL.

Obviously, I realise the value of being able to write your own classes for specific memory management cases and to better understand things like the STL, but what I don't understand is, why is the STL not used for almost all memory management?

If it truly takes care of memory management for any set of objects in a container, why (for example) do I not see it in any game source code? All game source I've seen seems to ignore the library and I just can't figure out why, I must be misunderstanding something.

Can anyone shed any light on this for me?

Martyr · 09-07-2009 12:36PM

perhaps its something to do with efficiency of the code in terms of both size and speed - some games might be limited to the amount of system memory or disk space it can use up.

more experienced programmers might prefer to manually allocate/deallocate memory.

there may be less bugs this way too.

Call Me Jimmy · 09-07-2009 01:23PM

Yea fair enough but from what I'm reading the STL is pretty much as efficient as it possibly can be and was developed over many years. Surely the time saved using it and optimizing it for a certain purpose (like game development) is more beneficial than 'reinventing the wheel'?

I'm beginning to think it could be to do with encapsulation and readability of code. The STL requires a lot of different parts to work, i.e. to use a container one has to use an iterator etc. and maybe it is more desirable on big projects to encapsulate all of this and hide it from the client. But that begs the question, why not use the STL classes as a base and simply wrap them in more straightforward classes?

Martyr · 09-07-2009 01:41PM

why not use the STL classes as a base and simply wrap them in more straightforward classes?

don't know, maybe its not a good idea?
when i learned some c++ at college, they never taught us how to use STL so i wouldn't be as familiar with it as yourself.

i have started playing with it more recently though because it makes writing code easier, although i've realised that final executables can grow very large in size.

for example, i wrote 2 apps not long ago using mingw (g++) 1 using C functions, the other using C++ containers.

the C++ app was 298 KB in size, even with symbols stripped/size optimized, whereas the C app which did the same thing was only 7KB.

some say the size of c++ executables using STL isn't an issue on linux machines, these were windows specific apps so i haven't bothered investigating.

iostream seems to inflate exe files too, which is why i'd prefer to use printf for formatted output instead of std::cout.

fasty · 09-07-2009 02:41PM

When it comes to game development, it's often because vendor specific versions or STL are bugged or incomplete. Since most code it written these days to be multiplatform, you just can't use STL.

Also, if you're just talking about storing mesh, colour and texture data in containers... Often you need pretty quick access to this in order to manipulate it in a small time frame before sending updated date to the videocard, or perhaps it might be stored in a more complex tree like data structure or developers can have their own in-house containers as part of whatever framework they use to make games.

Call Me Jimmy · 09-07-2009 04:55PM

Interesting. So for someone who is aiming to work on graphics intensive programs in C++, is the STL worth studying in-depth? Or would time be better spent learning more about data structures and algorithms? Does anyone know how much the STL is generally used in C++ programming in the industry as a whole? (all types of programs).

Anima · 09-07-2009 05:21PM

I've recently started studing the Source Engine from Valve to make my own mod. They recreated their own libraries and dont use the STL at all. I've seen that in other engines as well.

The Source Engine runs on the Xbox and Windows so maybe it has something to do with that.

Call Me Jimmy · 09-07-2009 06:35PM

With the source engine could it possibly be that it's ultimately based on the quake engine which was written in C?

I gave up on using the source engine as the documentation is terrible compared to other engines, albeit non-commercial.

fasty · 09-07-2009 09:01PM

Call Me Jimmy wrote: »

Interesting. So for someone who is aiming to work on graphics intensive programs in C++, is the STL worth studying in-depth? Or would time be better spent learning more about data structures and algorithms? Does anyone know how much the STL is generally used in C++ programming in the industry as a whole? (all types of programs).

STL and other containers are used a good bit in the industry, just less so for games. You would be much better off knowing data structures and algorithms inside out though. After that, using STL when you need it isn't that hard.

Call Me Jimmy wrote: »

With the source engine could it possibly be that it's ultimately based on the quake engine which was written in C?

I gave up on using the source engine as the documentation is terrible compared to other engines, albeit non-commercial.

There's not much, if any of the Quake engine left in Source by now. The API is pretty much a mix of C++ classes and macros. So much like Microsoft's MFC.

satchmo · 10-07-2009 12:55AM

It's a good question. We use our own homegrown STL-like container library at Radical. I wasn't around when the initial decision was made, but there are a few reasons that I imagine were a factor...

Memory allocation patterns
It's funny you should mention using STL for memory management - that's one of the main reasons I can think of for not using STL in game code. STL in general is wasteful when it comes to memory allocations, performs too many dynamic allocations, and is prone to fragmentation which can be a killer in memory-limited platforms like the consoles. Game code is very heavy on container usage, so this is a critical point - we're constantly bumping into memory limits, and every kb needs to be accounted for. Some of these problems can be overcome with custom allocators, but they come with their own set of gotchas, and are generally a pain in the ass to work with from what I understand.

Alignment
The STL also does not make it easy to allocate with specific alignment, so doing things like DMA'ing data to an SPU directly from an STL container is not possible without workarounds. Specifically aligned allocations are often preferable for good performance on PPC processors, and required for data to be fed to the GPUs.

Debugging
The STL is far from easy to debug - lack of code documentation, poorly named variables, the use of void pointers, overly cryptic compile errors, etc. And the STL code itself is not at all easy to read, something that is underrated... until you're trying to figure out the callstack of a random crash that ends deep in the container codebase.

Compile time
This has probably changed since I tried it last, but I seem to remember STL having a fairly significant impact on compile time. As our builds already takes minutes to link, I'd hate to see how much that time would increase if we were using STL everywhere. And that's just link time, never mind compile time itself - although like many other studios we use distributed build systems so that's not such a big deal.

I'm sure there are plenty of other reasons too - like it's nice to have a consistent container implementation across all the platforms that we work on. Additionally, having our own container library allows us to tie them in nicely with our Memory monitoring, assert and debugging tools - something that wouldn't be possible with a generic STL implementation.

I have never really used STL to the extent that I now use our container library, so there are probably counter-points to many of these arguments - most of which are likely "you're not using STL right", an argument which I consider means that it's too unwieldy to use correctly without having read "Effective STL" cover to cover, which we don't all have time to do.

If you have some time, you should read the design doc for EASTL, EA's STL replacement library - there's some good stuff in there. Even within the game development community, opinions vary wildly as to whether replacing the tried-and-tested STL with a custom solution (along with all the associated bug-fixing, documentation and maintenance) is worth it. In our case I think it definitely was, but in general the accepted position seems to be "unless you have a good reason to do all the work to replace it (eg a proven performance or memory problem), just go ahead and use STL".

Call Me Jimmy · 10-07-2009 01:24AM

Very interesting. I'll have a look at that link. Following on from this I'm wondering if people have a recommendation for a well laid out book on data structures and algorithms/ memory management, preferably one that uses C++?

I've been looking online and I find it very hard to decide what books to use so any ideas would be great!

Call Me Jimmy · 10-07-2009 02:34AM

Having made an attempt at reading that EASTL link, I can't help but feel daunted by programming. I am a pretty competent c++ programmer and I feel I have good OOP awareness, but it just seems that I never know enough.

When it gets down to custom allocators and contiguous memory etc. I just start sinking. There's always too much stuff I don't know. It's very disheartening!

Anima · 10-07-2009 04:58AM

I'm pretty sure everyone thinks that they dont know enough when it comes to programming. There is just so much to learn that it is imposssible to know everything and to do everything in the most efficient way possible.

You have to realise that the likes of EA and other games designers are looking for the most efficient code as possible, so its always going to look like ass for most programmers.

At the end of the day, its just code and nothing extraordinary so just read up on what you don't understand and this is what will make you a better programmer.

satchmo · 10-07-2009 08:15PM

Yeah Anima's right - everybody feels like that at some point, don't let it get you down. You don't really start learning this stuff properly until you have to, ie when you're working in the industry and learning directly from people with much more experience than you. Until then, just keep writing code and learn what you can. And of course post up here any questions or discussions you have - I'd love to see more advanced C++ and low-level programming discussion on this board, it seems to be mostly web apps and beginner/homework questions at the moment.

As for books, I thought C++ for Game Programmers was very well written and easy to understand, and has good chapters on memory management and performance as well as other things. Very easy to read from cover to cover. It's written by Noel Llopis (was a senior guy at High Moon before starting up his own iPhone game company), so it comes from experience. And of course if you haven't read Effective C++ then that should be top of your list.

Call Me Jimmy · 10-07-2009 08:28PM

Yea I have read some of effective C++ but I find it a harder read than the previous book I finished, but I will finish it. I'll also have a look at the other one you suggested.

I don't feel so bad about it now, I realized that the EASTL would have simply used calloc instead of malloc to return a pointer to contiguous memory so it wasn't as low-level or complicated as I first thought.

I'm interested though in some specific books that detail ADTs, Data Structures and Algorithms and has anyone here ever used one? if so could they recommend it?

Incidentally, I would recommend C++ in 21 Days to anyone new to c++, I found it very clear, concise and manageable. As introductory books come, it seemed to be the best.

Webmonkey · 10-07-2009 08:28PM

satchmo wrote: »

I'd love to see more advanced C++ and low-level programming discussion on this board, it seems to be mostly web apps and beginner/homework questions at the moment.

Same here, but usually people that work in these areas are good enough to work out their own problems.
I've plenty questions at the moment but not too sure if anyone would have any experience in the area: Using the Radio Interface Layer of windows mobile. I might give it a go, never know

Call Me Jimmy · 11-07-2009 12:21AM

I've been looking at some data structures and algorithm stuff (well I knew the basics before) and something came up.

Say you have a custom array class that does it's own memory management and then you want to use that array class. Say you want to make the Array object itself use the heap also,
e.g. Array<int>* a = new Array<int>(5);

The problem I have is that when using a subscript operator on this, I have to dereference the array pointer every time and so it looks like this: (*a)[0] to access the array element 0.

There must be a way to simplify this sort of operation as I'd imagine it's very common. I tried overloading the operator* for the array and doing other things but I can't figure out how to do it. A friend function maybe?

Any help would be greatly appreciated

Talon.ie · 13-07-2009 03:22PM

you could add an "at" method to your "Array" class.

char& at(int index) {return buff[index]; }

and use it like this :

a->at(0);

or like this

a.at(0)

It does slightly break your idea of making your Array class behave like a normal array, but then again, the whole point of using a class is to make working with it easier.

JC 2K3 · 13-07-2009 06:01PM

AFAIK, for pointers in C++, the [] operator returns the contents of memory at the address of the pointer plus the offset times sizeof (<the type the pointer points to>).

I don't think you can overload operators for pointer types (open to correction).

You could use a reference though:

Array<int> & a = *(new Array<int>(5))

Call Me Jimmy · 13-07-2009 11:24PM

That's a neat trick with the reference, never thought of it before. Are you sure it's safe? It seems error prone? Or at the very least, problematic from a design point of view?

fasty · 14-07-2009 12:07AM

JC 2K3 wrote: »
AFAIK, for pointers in C++, the [] operator returns the contents of memory at the address of the pointer plus the offset times sizeof (<the type the pointer points to>).

I don't think you can overload operators for pointer types (open to correction).

You could use a reference though:
Array<int> & a = *(new Array<int>(5))

Just don't forget to delete that later, and you can't delete a reference!

As an aside, new and delete are great, but don't feel like you HAVE TO use them all the time!

Call Me Jimmy · 14-07-2009 01:02AM

I'm aware you don't have to use new and delete all the time, but when yer talkin' about such a fundamental datastructure, surely every part of it needs to be allocated on the heap. I mean, it may only be a pointer or two and a variable or two (like arraySize), but wouldn't they all add up?

If the Array class is used often, and it's data is dynamically-allocated but it's other member data is not, will that not cause a problem down the line? In that case, would it not be correct to dynamically allocate each array object?

JC 2K3 · 14-07-2009 01:38AM

If arraySize is a pointer to an int rather than an int, you'll have two pointers on the stack (one for the array inside your array class and one pointing to arraySize). That's two pointers on the stack versus one if you were to create a pointer to your array object. I wouldn't be too concerned about this unless you have severely low resources or your array class needs a lot of memory, in which case, I'd use at ().

satchmo · 14-07-2009 02:09AM

Generally this problem wouldn't be too much of a concern. Unless you're working with a small example app, containers like that usually belong to a class that is itself allocated on the heap. So you'd never have to explicitly new the Array, but rather you'd new the object that contains the array - the array still gets allocated on the heap, just not by you directly. That's just the array class itself though which is small enough that it wouldn't really matter where it gets allocated (it's probably just a few pointers). That's different however to the array's actual data - the data should always be allocated on the heap, assuming the array can reach any decent size.

I wouldn't be concerned about the syntax of (*a)[0], I see code like that all the time. If you are, then using a reference will work too.

Call Me Jimmy · 14-07-2009 02:28AM

Yea that's what I suspected, that other classes would use the Array and hence the array itself wouldn't need to be explicitly 'newed' as ye said. Thanks for clearing up about the syntax, that was my main concern because it seemed quite unwieldy but after writing the class I was using it in example apps while testing before writing classes that use the array, so the syntax issue came up. Thanks a lot, I understand completely now.

fasty · 14-07-2009 09:39AM

Call Me Jimmy wrote: »

I'm aware you don't have to use new and delete all the time, but when yer talkin' about such a fundamental datastructure, surely every part of it needs to be allocated on the heap. I mean, it may only be a pointer or two and a variable or two (like arraySize), but wouldn't they all add up?

If the Array class is used often, and it's data is dynamically-allocated but it's other member data is not, will that not cause a problem down the line? In that case, would it not be correct to dynamically allocate each array object?

That depends on what you want to put in the data structure and how you allocate space IN the data structure.

If I make a FastyMap or a FastyArray and pass it a size of 10 or 100 or 1000, the data structure allocates memory for the items in the structure. The heap is bigger than you think! I deal with some large data structures all the time without any stack overflow issues!

Edit: But I see people have already answered your query on that one!

Call Me Jimmy · 14-07-2009 03:53PM

Thanks anyway fasty. I'm interested in the discussion in general to see what more experienced programmers have to say on the issue.

On a slightly different topic, do people find themselves using pointers to types rather than types in arrays most of the time? There seems to be such huge benefits when it comes to copying/swapping items that what reason would you have not to? Unless it was ints or floats or something.

fasty · 15-07-2009 09:34AM

It really depends... A lot of this stuff usually doesn't come up until you're actually coding something and often, the first thing you try isn't going to be the best solution.

I usually use pass objects by const reference to the container where they're copies to its internal data structure, but for bigger data I use pointers. Although I never use raw pointers if I can help it. Smart pointers are the way to go!

As as example, in a game I could have a model which consists of several materials, each of which has several meshes. I would new the meshes and add them to a container member of my model class.

Call Me Jimmy · 15-07-2009 11:21AM

Yea there seems to be so many different approaches, I haven't used smart pointers yet as I want to make sure I understand everything (through trial and error if needs be) before I use higher level classes/libraries.

I'm using a wrapped std::sort in my Array class and in order to get it to sort pointers I had to create a functor object PtrCmp and pass it as the compare function in std::sort, all it does is dereference the objects before comparing them.

It's probably bad design but i have 2 functions in the Array class, Sort (for arrays of objects) and SortPtrs(for arrays of pointers, as explained above). At the moment the client must decide which Array sort function to call based on what type of data it holds.

I'm wondering is there some other way around this bad design, without using smart pointers?

fasty · 15-07-2009 11:36AM

It would be better if YourArray::Sort just called a compare functor relevant to the type you're keeping in the array but whatever works for you...

Smart pointers wouldn't really get around that issue, they're just a way of wrapping pointers using templates to they're deleted when the smart pointer goes out of scope.

Why do you need two sort functions? Can you post code?

Call Me Jimmy · 15-07-2009 11:41AM

Thanks for the quick response!

Well here is the complete Array<T> code, Sort and SortPtrs down at the very bottom are the relevant functions, when you see their implementation you may see why I need to seperate them, unless there's a way to determine if a T is a pointer or just an object.

#include "shared.h"
#include <algorithm>
#include <cstdlib>
#include <functional>


// pointer comparison functor object
template <class T>
struct PtrCmp: public std::binary_function<T,T,bool>
{
	inline bool operator()(T ptr1, T ptr2) const
	{
		if(ptr1 && ptr2)
			return *ptr1 < *ptr2;
		else
			return false;
	}
};

template <class T>
class Array
{
public:
	Array();
	Array(unsigned int);
	Array(const Array<T>&);
	~Array();

	unsigned int			Size() const { return m_nSize; }
	T const*				Data() const { return m_pData; }

	const T&				operator[](unsigned int) const;
	T&						operator[](unsigned int);

	void					Resize(unsigned int);
	void					Sort();
	void					SortPtrs();
	bool					IsEmpty() const { return (m_pData == 0); }

	template <class T>
	friend std::ostream&	operator<<(std::ostream&, const Array<T>&);

	
	

protected:
	T* m_pData;
	unsigned int m_nSize;

};

	
template <class T>
Array<T>::Array():
m_pData(0),
m_nSize(0)
{
}

template <class T>
Array<T>::Array(unsigned int l):
m_pData(new T [l]),
m_nSize(l)
{
	for(int i=0; i<m_nSize; i++)
		m_pData[i] = 0;
}

template <class T>
Array<T>::Array(const Array<T>& rhs):
m_pData(0),
m_nSize(0)
{
	try
	{
		m_pData = new T[rhs.Size()];
	}
	catch(bad_alloc)
	{
		cout<< "Array couldn't allocate new memory";
		return;
	}
	m_nSize = rhs.Size();

	for(int i=0; i< m_nSize; i++)
		m_pData[i] = rhs[i];
}
template <class T>
Array<T>::~Array()
{
	delete [] m_pData;
}


template <class T>
const T& Array<T>::operator[](unsigned int offset) const
{
	if(offset >= m_nSize)
		throw out_of_range("Invalid position");

	return m_pData[offset];
}

template <class T>
T& Array<T>::operator[](unsigned int offset)
{
	if(offset >= m_nSize)
		throw out_of_range("Invalid position");

	return m_pData[offset];
}

template <class T>
void Array<T>::Resize(unsigned int size)
{
	unsigned int min = std::_cpp_min(size, m_nSize);
	unsigned int max = std::_cpp_max(size,m_nSize);
	int i;

	T* newData = 0;
	try
	{
		newData = new T[size];
	}
	catch(bad_alloc)
	{
		cout<< "Array couldn't allocate new memory";
		return;
	}
	
	//copy to new array
	for(i=0; i<min; i++)
		newData[i] = m_pData[i];

	// if the new array is larger than the old one
	// then initialize the surplus cells to 0
	if(size > m_nSize)
	{
		for(i=min; i<max; i++)
			newData[i] = 0;
	}

	if(m_pData)
		delete [] m_pData;

	m_pData = newData;
	m_nSize = size;
	
}


template <class T>
void Array<T>::Sort()
{
	std::sort(m_pData, (m_pData+ m_nSize));
	
}

template <class T>
void Array<T>::SortPtrs()
{
	std::sort(m_pData, (m_pData+ m_nSize), PtrCmp<T>());
}



template <class T>
std::ostream& operator<<(std::ostream& out, const Array<T>& a)
{
	if(a.IsEmpty())
		return out;

	for(int i=0; i< a.Size(); i++)
		cout << "[" << i << "] " << a[i] << endl;
	
	return out;
}

C++ STL and memory management

Comments