仿照实现SGI-STL空间配置器

STL里面的Allocator是用来管理空间分配的，这里的空间基本上指的就是内存，管理内存是影响到很多STL的容器性能的主要因素之一，所以对Allocator的了解需要非常深入而不是仅仅知道如何调用．笔者前段时间大致的看了侯捷老师的那本«STL源码剖析»，对SGI版本的STL算是有点简略的认识，可是这种走马观花的认识也只能帮助我理解其大致机理，当遇到真正让你实现一个简单的Allocator时候就会显得束手无策，所以有必要通过编码，简单临摹一下SGI的Allocator.

简单粗暴Naive方式实现Allocator

我们可以编写一个最简单的符合STL规范的Naive版本的Allocator,它起码需要allocate/deallocate函数用来申请/释放空间，construct/destroy函数用来在内存空间上构造和析构对象, 还有rebind成员模板以及一些类型别名等.

这个Allocator只是简单的申请/释放空间，没有任何特别之处.

template<typename T>
class allocator{

public:
//常用类型别名,符合STL规范
	typedef T value_type;
	typedef T* pointer;
	typedef const T* const_pointer;
	typedef T& reference;
	typedef const T& const_reference;
	typedef unsigned int size_type;
	typedef int difference_type;

	template<typename U>
	struct rebind{
		typedef allocator<U> other;
	};

	allocator() = default;
	allocator(const allocator &rhs) = default;
	~allocator() = default;

//常用函数, 符合STL规范
	pointer allocate(size_type n, const void* hint= 0);
	void deallocate(pointer p, size_type n);

	size_type max_size() const;

	void construct(pointer p, const T& x);
	void destroy(pointer p);

	pointer address(const_reference x) const;

};

template<typename T>
typename allocator<T>::pointer allocator<T>::allocate(size_type n, const void* hint){
	return _allocate(n, (pointer)NULL);
}

template<typename T>
void allocator<T>::deallocate(pointer p, size_type n){
	_deallocate(p);
}

template<typename T>
typename allocator<T>::size_type allocator<T>::max_size() const{
	return UINT_MAX/sizeof(T);
}

template<typename T>
void allocator<T>::construct(pointer ptr, const T& value){
	_construct(ptr, value);
}

template<typename T>
void allocator<T>::destroy(pointer ptr){
	_destroy(ptr);
}

template<typename T>
typename allocator<T>::pointer allocator<T>::address(const_reference x) const{
	return (pointer)&x;
}

上面用到的_allocate, _deallocate, _construct和_destroy函数定义如下:

template<typename T>
inline T* _allocate(ptrdiff_t size, T*){
	T* tmp = (T*)(::operator new((size_t)(size*sizeof(T))));
	// ::operator new 最底层的内存分配，类似malloc
	if(tmp == 0){
		qError("out of memory, allocate failed.");
	}
	qDebug("_allocate");
	return tmp;
}

template<typename T>
inline void _deallocate(T* p){
	::operator delete(p);
	qDebug("_deallocate");
}

template<typename T1, typename T2>
inline void _construct(T1* p, const T2& value){
	new(p) T1(value); //new 指定内存位置的用法
	qDebug("_construct");
}

template<typename T>
inline void _destroy(T* p){
	p->~T();
	qDebug("_destroy");
}

SGI版本的两级Allocator实现

SGI的Allocator有两个版本，一个是类似上面提到的Naive版本，它是为了兼容通用的STL标准，但是这个版本SGI从来不用，而是使用两级的Allocator，这个Allocator与STL标准不兼容，只能被SGI-STL的容器使用,这里就简单介绍一下SGI两级Allocator的设计思路.

两级Allocator > SGI的Allocator分为一级和二级Allocator,其中一级Allocator负责管理大于128字节的内存请求，而二级Allocator负责小于等于128字节的内存请求．

一级Allocator

这里我实现的一级Allocator名字叫做superior_allocator,它的成员函数解释如下:

allocate函数负责分配指定大小内存
deallocate函数负责释放指定指针的内存
reallocate函数负责再次申请扩充内存
set_allocate_failed_handler函数用来接受使用者定义的内存不足处理函数，在遇到内存不足时候可以调用此函数期望释放部分内存
oom_allocate函数用来处理allocate函数分配内存时候由于内存不足而失败这种情况
oom_reallocate函数用来处理reallocate函数分配内存时候由于内存不足而失败这种情况

template<int inst>
class superior_allocator{
private:
	//allocate when out of memory
	static void* oom_allocate(size_t n);
	static void* oom_reallocate(void* ptr, size_t old_sz, size_t new_sz);
	// use defined handler when out of memory
	static void (*allocate_failed_handler)();

public:
	static void* allocate(size_t n);
	static void deallocate(void* ptr, size_t);
	static void* reallocate(void *ptr, size_t old_sz, size_t new_sz);

	static void set_allocate_failed_handler(void (*f)());
};

其中具体函数实现如下:

//初始化内存不足处理函数指针
template<int inst>
void (* superior_allocator<inst>::allocate_failed_handler)() = 0;

template<int inst>
void* superior_allocator<inst>::allocate(size_t n){
	void* result = malloc(n);
	if(result == NULL){
		return oom_allocate(n);
	}
	return result;
}

template<int inst>
void* superior_allocator<inst>::oom_allocate(size_t n){
	if(allocate_failed_handler == 0)
		throw hu::THROW_BAD_ALLOC;
	void* result = 0;
	//一直在不断的循环尝试
	while(result == 0){
		//调用处理例程，试图释放一些内存
		allocate_failed_handler(); 
		result = malloc(n);
	}
	return result;
}

template<int inst>
void superior_allocator<inst>::deallocate(void* ptr, size_t){
	free(ptr);
}

template<int inst>
void* superior_allocator<inst>::reallocate(void *ptr,
								size_t old_sz, size_t new_sz){
	void* result = realloc(ptr, new_sz);
	if(result == 0){
		return oom_reallocate(ptr, old_sz, new_sz);
	}
	return result;
}

template<int inst>
void* superior_allocator<inst>::oom_reallocate(void* ptr,
								size_t old_sz, size_t new_sz){
	if(allocate_failed_handler == 0)
		throw THROW_BAD_ALLOC;
	void *result = 0;
	while(result == 0){
		allocate_failed_handler();
		result = realloc(ptr, new_sz);
	}
	return result;
}

从实现代码可以看出来，一级Allocator其实也很简单，和上面提到的Naive方式几乎差不多，就是malloc/free/realloc, 遇到malloc/realloc失败的话不断的调用用户处理函数以释放部分内存.之所以这么简单，是因为一级Allocator一般只处理较大内存块的分配管理，所以可以不用像二级Allocator那么斤斤计较，管理严密.

二级Allocator

这里我编写的二级Allocator名字叫做junior_allocator,它是完全按照SGI的设计思路实现的，负责较小内存块的管理.

free_list指针数组

要了解SGI二级Allocator的原理就必须先了解free_list指针数组，free_list指针数组的原型如下:

static obj* free_list[FREE_COUNT];

这里势必要介绍一下obj的原型:

union obj{
	obj* next;
	char data[];
};

这个obj的设计非常巧妙，是一个union,这样设计有以下几点好处:

一个obj的地址也就是其data指针值，这样方便通过obj的data值读出其内存地址. > 例如, 我们有obj o那么o.data其实就是&o, 也就执行o的具体位置
一个obj的next字段用来存储指向下个节点的位置,这样能形成类似链表一样的结构.
通过union结合就可以避免单独开辟空间存储指针的空间浪费

结合obj的原型我们再来看free_list指针数组，我们这里free_list指针数组的大小FREE_COUNT = 128 / 8 = 16, 于是有16个obj指针，这16个obj指针分别指向第一个空闲的8字节,16字节,…,128字节内存块的位置

我们可以初始化时候为每个free_list[i]申请连续20个相应块空间，作为备用，初始化之后的内存使用情形如下图:

free_list

上图的free_lst指针数组中的指针指向其对应第一个空闲块，然后后续空闲块通过神奇的obj*连接而成，这里就可以体现到obj设计的精妙了，它一点也不浪费空间，而且还是非常精简的.

另外，对于上图需要解释的就是start_free到end_free这块空间就是junior_allocator分配的还未被使用的空闲内存池.

allocate和deallocate的实现

reallocate基本上就是调用allocate和deallocate,这里就不作表述. ##### allocate过程我们试图从junior_allocator申请大小为n字节的内存,如果n > 128，那么交给superior_allocator完成,否则先根据n的大小算出相应内存块大小，比如n = 7那么内存块大小就是8.接着从free_list中找出相应的内存块第一个空闲内存地址free_list[i]，如果此时有空闲块，那么返回此块地址，并且free_list[i] = free_list[i]->next即指向下一个空闲块;如果不存在空闲块，那么试图从空闲的内存池中申请一块连续的内存，然后切分成free_list[i]对应内存块大小的小块,然后更新free_list[i].

这里我们可以看到，每次某个内存块不够用时候就从内存池取出新的加入，总有内存池用完的时候，当内存池用完了该怎么办呢？这里有两个解决办法:

通过malloc再申请一快大内存池
如果系统所有内存都不够了，那只能从其他free_list中扣除一些内存过来了，因为每个free_list[i]都分配了一定数量的内存块，而有些free_list[i]可能用的很少，所以尚存空闲块，那就把它们的抢占过来.

deallocate

deallocate实现非常简单，如果释放的空间大于128字节，那么就交给一级allocator处理；如果不是,则把内存块加到对应free_list[i]中，通常做法就是直接插入大free_list[i]对应链表的表头.

代码实现

junior_allocator类定义

const size_t LARGE_SIZE = 128; //超过128B就由superior_allocator分配
const size_t SLICE_SIZE = 8; //递增片大小
const size_t FREE_COUNT = LARGE_SIZE/SLICE_SIZE; //free list个数，8,16,...128
const size_t CHUNK_NUM = 20; //默认每个尺寸的chunk一次分配的个数

template<int inst>
class junior_allocator{
private:
	union obj{
		obj* next;
		char data[];
	};
	//free_list指针数组
	static obj* free_list[FREE_COUNT];
	static char* start; //当前可分配内存池的开始位置
	static char* finish; //内存池结束位置
	static size_t heap_size; //当前堆大小

	//计算size应该分配的内存块的编号
	//8字节内存块对应0号,16字节内存块对应1号...
	static size_t march_node(size_t size);
	//计算第node_id号内存块的大小
	static size_t node_size(size_t node_id);

	//从内存池申请n个node_id号内存块大小的连续内存
	static char* chunk_alloc(size_t node_id, size_t& n);
	//通过从内存池取新的内存将free_list[node_id]重新填充
	static void* refill(size_t node_id);
public:

	static void* allocate(size_t n);
	static void* reallocte(void* ptr, size_t old_zs, size_t new_sz);
	static void deallocate(void* ptr, size_t);

};

superior_allocator成员函数具体实现

template<int inst>
char* junior_allocator<inst>::start = 0;

template<int inst>
char* junior_allocator<inst>::finish = 0;

template<int inst>
size_t junior_allocator<inst>::heap_size = 0;

template<int inst>
typename junior_allocator<inst>::obj* junior_allocator<inst>::free_list[FREE_COUNT]
					= {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

template<int inst>
size_t junior_allocator<inst>::march_node(size_t size){
	return (size-1) / SLICE_SIZE;
}

template<int inst>
size_t junior_allocator<inst>::node_size(size_t node_id){
	return (node_id + 1) * SLICE_SIZE;
}

template<int inst>
char* junior_allocator<inst>::chunk_alloc(size_t node_id, size_t &n){
	size_t requre = n*node_size(node_id);
	size_t size_left = finish - start;
	char* result;
	if(size_left < requre){//剩余空间不够
		if(node_size(node_id) < size_left){
			//还够分配一个node
			n = 1;
			result = start;
			start = start + node_size(node_id);
			return result;
		}else{//一个都不够分配了
			if(size_left > 0){
				//把剩余空间分配给相应free_list
				/*
				 * 由于每次分配堆大小都是８的整数倍
				 * 所以剩余的肯定属于8,16,24,...必然
				 * 有一个与它完全适配,不会发生内存碎片
				 */
				size_t node_march = march_node(size_left);
				assert(size_left == node_size(node_march));
				obj* cur = (obj*)start;
				cur->next = free_list[node_march];
				free_list[node_march] = cur;
				start = finish;
			}
			size_t add_size = requre + (heap_size >> 4);
			start = (char*)(::operator new(add_size));
			if(start == 0){
				//分配内存失败
				//应该从合适的free_list中回收一定的内存
				for(size_t i = node_id + 1; i < FREE_COUNT; i++){
					if(free_list[i] != 0){
						result = (char*)(free_list[i]);
						free_list[i]= free_list[i]->next;
						start = result;
						finish = start + node_size(i);
						return chunk_alloc(node_id, n);//再次尝试分配
					}
				}

				//实在不行调用一级空间适配器以调用用户定义的处理历程
				start = (char*)(superior_allocator<inst>::allocate(add_size));
			}
			finish = start + add_size;
		}
	}
	result = start;
	start = start + requre;
	return result;
}

template<int inst>
void* junior_allocator<inst>::refill(size_t node_id){
	 size_t chunk_num = CHUNK_NUM;
	 void* result;
	 result = chunk_alloc(node_id, chunk_num);
	 //chunk_num的值为实际分配到的块数目
	 obj* pos = (obj*)result;
	 //把chunk_num个内存块建立链表索引
	 for(auto i = 1; i < chunk_num; i++){
		 obj* next = (obj*)((char*)pos + node_size(node_id));
		 pos->next = next;
		 pos = pos->next;
	 }
	 pos->next = 0;
	 return result;
}

template<int inst>
void* junior_allocator<inst>::allocate(size_t n){
	if(n > LARGE_SIZE){
		//超过LARGE_SIZE交给superior_allocator分配
		return superior_allocator<inst>::allocate(n);
	}else{
		size_t node_id = march_node(n);
		if(free_list[node_id] == 0){
			//已经没有资源了
			free_list[node_id] = (obj*)refill(node_id);
		}
		void* result = free_list[node_id]->data;
		free_list[node_id] = free_list[node_id]->next;
		return result;
	}
}

template<int inst>
void* junior_allocator<inst>::reallocte(void* ptr, size_t old_sz, size_t new_sz){
	void* new_ptr = allocate(new_sz);
	size_t copy_sz = old_sz < new_sz? old_sz : new_sz;
	memcpy(new_ptr, ptr, copy_sz);
	deallocate(ptr, old_sz);
	return new_ptr;
}

template<int inst>
void junior_allocator<inst>::deallocate(void* ptr, size_t sz){
	if(sz > LARGE_SIZE){
		superior_allocator<inst>::deallocate(ptr, sz);
		return;
	}
	size_t node_march = march_node(sz);
	obj* pos = (obj*)ptr;
	pos->next = free_list[node_march];
	free_list[node_march] = pos;
}

后记

SGI的两级Allocator实现非常优美，特别是junior_allocator的实现，通过严密的管理策略使得几乎不会产生内存碎片.

另外，上面的代码只是笔者简单的模仿实现，SGI的源码则是更加的完整和健壮，并且对多线程情况也做出处理.

- EOF -

声明:本文采用BY-NC-SA协议进行授权.转载请注明: 仿照实现SGI-STL空间配置器

仿照实现SGI-STL空间配置器 --15 April 2014