仿照实现SGI-STL空间配置器

STL里面的Allocator是用来管理空间分配的，这里的空间基本上指的就是内存，管理内存是影响到很多STL的容器性能的主要因素之一，所以对Allocator的了解需要非常深入而不是仅仅知道如何调用．笔者前段时间大致的看了侯捷老师的那本<<STL源码剖析>>，对SGI版本的STL算是有点简略的认识，可是这种走马观花的认识也只能帮助我理解其大致机理，当遇到真正让你实现一个简单的Allocator时候就会显得束手无策，所以有必要通过编码，简单临摹一下SGI的Allocator.

简单粗暴Naive方式实现Allocator

我们可以编写一个最简单的符合STL规范的Naive版本的Allocator,它起码需要allocate/deallocate函数用来申请/释放空间，construct/destroy函数用来在内存空间上构造和析构对象, 还有rebind成员模板以及一些类型别名等.

这个Allocator只是简单的申请/释放空间，没有任何特别之处.

template<typename T>
class allocator{

public:
//常用类型别名,符合STL规范
    typedef T value_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef T& reference;
    typedef const T& const_reference;
    typedef unsigned int size_type;
    typedef int difference_type;

    template<typename U>
    struct rebind{
        typedef allocator<U> other;
    };

    allocator() = default;
    allocator(const allocator &rhs) = default;
    ~allocator() = default;

//常用函数, 符合STL规范
    pointer allocate(size_type n, const void* hint= 0);
    void deallocate(pointer p, size_type n);

    size_type max_size() const;

    void construct(pointer p, const T& x);
    void destroy(pointer p);

    pointer address(const_reference x) const;

};

template<typename T>
typename allocator<T>::pointer allocator<T>::allocate(size_type n, const void* hint){
    return _allocate(n, (pointer)NULL);
}

template<typename T>
void allocator<T>::deallocate(pointer p, size_type n){
    _deallocate(p);
}

template<typename T>
typename allocator<T>::size_type allocator<T>::max_size() const{
    return UINT_MAX/sizeof(T);
}

template<typename T>
void allocator<T>::construct(pointer ptr, const T& value){
    _construct(ptr, value);
}

template<typename T>
void allocator<T>::destroy(pointer ptr){
    _destroy(ptr);
}

template<typename T>
typename allocator<T>::pointer allocator<T>::address(const_reference x) const{
    return (pointer)&x;
}

上面用到的allocate, deallocate, construct和destroy函数定义如下:

template<typename T>
inline T* _allocate(ptrdiff_t size, T*){
    T* tmp = (T*)(::operator new((size_t)(size*sizeof(T))));
    // ::operator new 最底层的内存分配，类似malloc
    if(tmp == 0){
        qError("out of memory, allocate failed.");
    }
    qDebug("_allocate");
    return tmp;
}

template<typename T>
inline void _deallocate(T* p){
    ::operator delete(p);
    qDebug("_deallocate");
}

template<typename T1, typename T2>
inline void _construct(T1* p, const T2& value){
    new(p) T1(value); //new 指定内存位置的用法
    qDebug("_construct");
}

template<typename T>
inline void _destroy(T* p){
    p->~T();
    qDebug("_destroy");
}

SGI版本的两级Allocator实现

SGI的Allocator有两个版本，一个是类似上面提到的Naive版本，它是为了兼容通用的STL标准，但是这个版本SGI从来不用，而是使用两级的Allocator，这个Allocator与STL标准不兼容，只能被SGI-STL的容器使用,这里就简单介绍一下SGI两级Allocator的设计思路.

两级Allocator

SGI的Allocator分为一级和二级Allocator,其中一级Allocator负责管理大于128字节的内存请求，而二级Allocator负责小于等于128字节的内存请求．

一级Allocator

这里我实现的一级Allocator名字叫做superior_allocator,它的成员函数解释如下:

allocate函数负责分配指定大小内存
deallocate函数负责释放指定指针的内存
reallocate函数负责再次申请扩充内存
set_allocate_failed_handler函数用来接受使用者定义的内存不足处理函数，在遇到内存不足时候可以调用此函数期望释放部分内存
oom_allocate函数用来处理allocate函数分配内存时候由于内存不足而失败这种情况
oom_reallocate函数用来处理reallocate函数分配内存时候由于内存不足而失败这种情况

template<int inst>
class superior_allocator{
private:
    //allocate when out of memory
    static void* oom_allocate(size_t n);
    static void* oom_reallocate(void* ptr, size_t old_sz, size_t new_sz);
    // use defined handler when out of memory
    static void (*allocate_failed_handler)();

public:
    static void* allocate(size_t n);
    static void deallocate(void* ptr, size_t);
    static void* reallocate(void *ptr, size_t old_sz, size_t new_sz);

    static void set_allocate_failed_handler(void (*f)());
};

其中具体函数实现如下:

//初始化内存不足处理函数指针
template<int inst>
void (* superior_allocator<inst>::allocate_failed_handler)() = 0;

template<int inst>
void* superior_allocator<inst>::allocate(size_t n){
    void* result = malloc(n);
    if(result == NULL){
        return oom_allocate(n);
    }
    return result;
}

template<int inst>
void* superior_allocator<inst>::oom_allocate(size_t n){
    if(allocate_failed_handler == 0)
        throw hu::THROW_BAD_ALLOC;
    void* result = 0;
    //一直在不断的循环尝试
    while(result == 0){
        //调用处理例程，试图释放一些内存
        allocate_failed_handler(); 
        result = malloc(n);
    }
    return result;
}

template<int inst>
void superior_allocator<inst>::deallocate(void* ptr, size_t){
    free(ptr);
}

template<int inst>
void* superior_allocator<inst>::reallocate(void *ptr,
                                size_t old_sz, size_t new_sz){
    void* result = realloc(ptr, new_sz);
    if(result == 0){
        return oom_reallocate(ptr, old_sz, new_sz);
    }
    return result;
}

template<int inst>
void* superior_allocator<inst>::oom_reallocate(void* ptr,
                                size_t old_sz, size_t new_sz){
    if(allocate_failed_handler == 0)
        throw THROW_BAD_ALLOC;
    void *result = 0;
    while(result == 0){
        allocate_failed_handler();
        result = realloc(ptr, new_sz);
    }
    return result;
}

从实现代码可以看出来，一级Allocator其实也很简单，和上面提到的Naive方式几乎差不多，就是malloc/free/realloc, 遇到malloc/realloc失败的话不断的调用用户处理函数以释放部分内存.之所以这么简单，是因为一级Allocator一般只处理较大内存块的分配管理，所以可以不用像二级Allocator那么斤斤计较，管理严密.

二级Allocator

这里我编写的二级Allocator名字叫做junior_allocator,它是完全按照SGI的设计思路实现的，负责较小内存块的管理.

free_list指针数组

要了解SGI二级Allocator的原理就必须先了解free_list指针数组，free_list指针数组的原型如下:

static obj* free_list[FREE_COUNT];

这里势必要介绍一下obj的原型:

union obj{
    obj* next;
    char data[];
};

这个obj的设计非常巧妙，是一个union,这样设计有以下几点好处:

一个obj的地址也就是其data指针值，这样方便通过obj的data值读出其内存地址.
例如, 我们有obj o那么o.data其实就是&o, 也就执行o的具体位置
一个obj的next字段用来存储指向下个节点的位置,这样能形成类似链表一样的结构.
通过union结合就可以避免单独开辟空间存储指针的空间浪费

结合obj的原型我们再来看free_list指针数组，我们这里free_list指针数组的大小FREE_COUNT = 128 / 8 = 16, 于是有16个obj指针，这16个obj指针分别指向第一个空闲的8字节,16字节,...,128字节内存块的位置

我们可以初始化时候为每个free_list[i]申请连续20个相应块空间，作为备用，初始化之后的内存使用情形如下图:

free_list

上图的free_lst指针数组中的指针指向其对应第一个空闲块，然后后续空闲块通过神奇的obj*连接而成，这里就可以体现到obj设计的精妙了，它一点也不浪费空间，而且还是非常精简的.

另外，对于上图需要解释的就是start_free到end_free这块空间就是junior_allocator分配的还未被使用的空闲内存池.

allocate和deallocate的实现

reallocate基本上就是调用allocate和deallocate,这里就不作表述.

allocate过程

我们试图从junior_allocator申请大小为n字节的内存,如果n > 128，那么交给superior_allocator完成,否则先根据n的大小算出相应内存块大小，比如n = 7那么内存块大小就是8.接着从free_list中找出相应的内存块第一个空闲内存地址free_list[i]，如果此时有空闲块，那么返回此块地址，并且free_list[i] = free_list[i]->next即指向下一个空闲块;如果不存在空闲块，那么试图从空闲的内存池中申请一块连续的内存，然后切分成free_list[i]对应内存块大小的小块,然后更新free_list[i].

这里我们可以看到，每次某个内存块不够用时候就从内存池取出新的加入，总有内存池用完的时候，当内存池用完了该怎么办呢？这里有两个解决办法:

通过malloc再申请一快大内存池
如果系统所有内存都不够了，那只能从其他free_list中扣除一些内存过来了，因为每个free_list[i]都分配了一定数量的内存块，而有些free_list[i]可能用的很少，所以尚存空闲块，那就把它们的抢占过来.

deallocate

deallocate实现非常简单，如果释放的空间大于128字节，那么就交给一级allocator处理；如果不是,则把内存块加到对应free_list[i]中，通常做法就是直接插入大free_list[i]对应链表的表头.

代码实现

junior_allocator类定义

const size_t LARGE_SIZE = 128; //超过128B就由superior_allocator分配
const size_t SLICE_SIZE = 8; //递增片大小
const size_t FREE_COUNT = LARGE_SIZE/SLICE_SIZE; //free list个数，8,16,...128
const size_t CHUNK_NUM = 20; //默认每个尺寸的chunk一次分配的个数

template<int inst>
class junior_allocator{
private:
    union obj{
        obj* next;
        char data[];
    };
    //free_list指针数组
    static obj* free_list[FREE_COUNT];
    static char* start; //当前可分配内存池的开始位置
    static char* finish; //内存池结束位置
    static size_t heap_size; //当前堆大小

    //计算size应该分配的内存块的编号
    //8字节内存块对应0号,16字节内存块对应1号...
    static size_t march_node(size_t size);
    //计算第node_id号内存块的大小
    static size_t node_size(size_t node_id);

    //从内存池申请n个node_id号内存块大小的连续内存
    static char* chunk_alloc(size_t node_id, size_t& n);
    //通过从内存池取新的内存将free_list[node_id]重新填充
    static void* refill(size_t node_id);
public:

    static void* allocate(size_t n);
    static void* reallocte(void* ptr, size_t old_zs, size_t new_sz);
    static void deallocate(void* ptr, size_t);

};

superior_allocator成员函数具体实现

template<int inst>
char* junior_allocator<inst>::start = 0;

template<int inst>
char* junior_allocator<inst>::finish = 0;

template<int inst>
size_t junior_allocator<inst>::heap_size = 0;

template<int inst>
typename junior_allocator<inst>::obj* junior_allocator<inst>::free_list[FREE_COUNT]
                    = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};

template<int inst>
size_t junior_allocator<inst>::march_node(size_t size){
    return (size-1) / SLICE_SIZE;
}

template<int inst>
size_t junior_allocator<inst>::node_size(size_t node_id){
    return (node_id + 1) * SLICE_SIZE;
}

template<int inst>
char* junior_allocator<inst>::chunk_alloc(size_t node_id, size_t &n){
    size_t requre = n*node_size(node_id);
    size_t size_left = finish - start;
    char* result;
    if(size_left < requre){//剩余空间不够
        if(node_size(node_id) < size_left){
            //还够分配一个node
            n = 1;
            result = start;
            start = start + node_size(node_id);
            return result;
        }else{//一个都不够分配了
            if(size_left > 0){
                //把剩余空间分配给相应free_list
                /*
                * 由于每次分配堆大小都是８的整数倍
                * 所以剩余的肯定属于8,16,24,...必然
                * 有一个与它完全适配,不会发生内存碎片
                */
                size_t node_march = march_node(size_left);
                assert(size_left == node_size(node_march));
                obj* cur = (obj*)start;
                cur->next = free_list[node_march];
                free_list[node_march] = cur;
                start = finish;
            }
            size_t add_size = requre + (heap_size >> 4);
            start = (char*)(::operator new(add_size));
            if(start == 0){
                //分配内存失败
                //应该从合适的free_list中回收一定的内存
                for(size_t i = node_id + 1; i < FREE_COUNT; i++){
                    if(free_list[i] != 0){
                        result = (char*)(free_list[i]);
                        free_list[i]= free_list[i]->next;
                        start = result;
                        finish = start + node_size(i);
                        return chunk_alloc(node_id, n);//再次尝试分配
                    }
                }

                //实在不行调用一级空间适配器以调用用户定义的处理历程
                start = (char*)(superior_allocator<inst>::allocate(add_size));
            }
            finish = start + add_size;
        }
    }
    result = start;
    start = start + requre;
    return result;
}

template<int inst>
void* junior_allocator<inst>::refill(size_t node_id){
     size_t chunk_num = CHUNK_NUM;
     void* result;
     result = chunk_alloc(node_id, chunk_num);
     //chunk_num的值为实际分配到的块数目
     obj* pos = (obj*)result;
     //把chunk_num个内存块建立链表索引
     for(auto i = 1; i < chunk_num; i++){
         obj* next = (obj*)((char*)pos + node_size(node_id));
         pos->next = next;
         pos = pos->next;
     }
     pos->next = 0;
     return result;
}

template<int inst>
void* junior_allocator<inst>::allocate(size_t n){
    if(n > LARGE_SIZE){
        //超过LARGE_SIZE交给superior_allocator分配
        return superior_allocator<inst>::allocate(n);
    }else{
        size_t node_id = march_node(n);
        if(free_list[node_id] == 0){
            //已经没有资源了
            free_list[node_id] = (obj*)refill(node_id);
        }
        void* result = free_list[node_id]->data;
        free_list[node_id] = free_list[node_id]->next;
        return result;
    }
}

template<int inst>
void* junior_allocator<inst>::reallocte(void* ptr, size_t old_sz, size_t new_sz){
    void* new_ptr = allocate(new_sz);
    size_t copy_sz = old_sz < new_sz? old_sz : new_sz;
    memcpy(new_ptr, ptr, copy_sz);
    deallocate(ptr, old_sz);
    return new_ptr;
}

template<int inst>
void junior_allocator<inst>::deallocate(void* ptr, size_t sz){
    if(sz > LARGE_SIZE){
        superior_allocator<inst>::deallocate(ptr, sz);
        return;
    }
    size_t node_march = march_node(sz);
    obj* pos = (obj*)ptr;
    pos->next = free_list[node_march];
    free_list[node_march] = pos;
}

后记

SGI的两级Allocator实现非常优美，特别是junior_allocator的实现，通过严密的管理策略使得几乎不会产生内存碎片.

另外，上面的代码只是笔者简单的模仿实现，SGI的源码则是更加的完整和健壮，并且对多线程情况也做出处理.

- EOF -

声明:本文采用BY-NC-SA协议进行授权.转载请注明: 仿照实现SGI-STL空间配置器

仿照实现SGI-STL空间配置器 --15 April 2015