Heap Memory
- What is Heap?
- Using dynamic memory
Diving into glibc heap
Refer

本文是参考 heap-exploitation 的阅读记录。

This book is for understanding the structure of heap memory as well as the different kinds of exploitation techniques related to it. The material provided covers in detail the implementation of glibc’s heap and related memory management functions. Next, different types of attacks are discussed.

@software{dhaval_kapil_2022_6450612,
  author       = {Dhaval Kapil},
  title        = {DhavalKapil/heap-exploitation},
  month        = apr,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {v1.0.0},
  doi          = {10.5281/zenodo.6450612},
  url          = {https://doi.org/10.5281/zenodo.6450612}
}

Heap Memory

What is Heap?

Heap is a memory region allotted to every program. Unlike stack, heap memory can be dynamically allocated. This means that the program can ‘request’ and ‘release’ memory from the heap segment whenever it requires. Also, this memory is global, i.e. it can be accessed and modified from anywhere within a program and is not localized to the function where it is allocated. This is accomplished using ‘pointers’ to reference dynamically allocated memory which in turn leads to a small degradation in performance as compared to using local variables (on the stack).

Using dynamic memory

stdlib.h provides with standard library functions to access, modify and manage dynamic memory. Commonly used functions include malloc and free:

// Dynamically allocate 10 bytes
char *buffer = (char *)malloc(10);

strcpy(buffer, "hello");
printf("%s\n", buffer); // prints "hello"

// Frees/unallocates the dynamic memory allocated earlier
free(buffer);

The documentation about ‘malloc’ and ‘free’ says:

malloc

/*
  malloc(size_t n)
  Returns a pointer to a newly allocated chunk of at least n
  bytes, or null if no space is available. Additionally, on
  failure, errno is set to ENOMEM on ANSI C systems.

  If n is zero, malloc returns a minimum-sized chunk. (The
  minimum size is 16 bytes on most 32bit systems, and 24 or 32
  bytes on 64bit systems.)  On most systems, size_t is an unsigned
  type, so calls with negative arguments are interpreted as
  requests for huge amounts of space, which will often fail. The
  maximum supported value of n differs across systems, but is in
  all cases less than the maximum representable value of a
  size_t.
*/

free

/*
  free(void* p)
  Releases the chunk of memory pointed to by p, that had been
  previously allocated using malloc or a related routine such as
  realloc. It has no effect if p is null. It can have arbitrary
  (i.e., bad!) effects if p has already been freed.

  Unless disabled (using mallopt), freeing very large spaces will
  when possible, automatically trigger operations that give
  back unused memory to the system, thus reducing program
  footprint.
*/

It is important to note that these memory allocation functions are provided by the standard library. These functions provide a layer between the developer and the operating system that efficiently manages heap memory. It is the responsibility of the developer to ‘free’ any allocated memory after using it exactly once. Internally, these functions use two system calls sbrk and mmap to request and release heap memory from the operating system. This post discusses these system calls in detail.

Diving into glibc heap

In this section, implementation of glibc’s heap management functions will be discussed in depth. The analysis was done on glibc’s source code dated 27th March 2017. The source is very well documented.

Apart from the source code, the matter presented is influenced by:

Before moving into the implementation, it is important to keep the following notes in mind:

Instead of size_t, INTERNAL_SIZE_T is used internally (which by default is equal to size_t).
Alignment is defined as 2 * (sizeof(size_t)).
MORECORE is defined as the routine to call to obtain more memory. By default it is defined as sbrk.

Next, we shall study the different data types used internally, bins, chunks, and internals of the different functions used.

malloc_chunk

This structure represents a particular chunk of memory. The various fields have different meaning for allocated and unallocated chunks.

struct malloc_chunk {
  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk, if it is free. */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */
  struct malloc_chunk* fd;                /* double links -- used only if this chunk is free. */
  struct malloc_chunk* bk;
  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if this chunk is free. */
  struct malloc_chunk* bk_nextsize;
};

typedef struct malloc_chunk* mchunkptr;

该结构体包含以下字段：

mchunk_prev_size：前一个内存块的大小，如果前一个内存块是空闲的。
mchunk_size：内存块的大小，包括内存块头部的开销。
fd 和 bk：双向链表指针，用于将空闲的内存块组织成链表。
fd_nextsize 和 bk_nextsize：双向链表指针，用于将空闲的大内存块组织成链表。

这些字段的含义如下：

mchunk_prev_size 和 mchunk_size 字段用于记录内存块的大小和状态等信息。在内存分配时，程序会在内存块的前面和后面分别添加一个控制结构，用于记录该内存块的大小和状态等信息。 fd 和 bk 字段用于将空闲的内存块组织成双向链表。当程序需要分配内存时，会从链表中找到一个大小合适的空闲内存块，并将其从链表中移除。当程序释放内存时，会将该内存块重新插入到链表中。 fd_nextsize 和 bk_nextsize 字段用于将空闲的大内存块组织成双向链表。当程序需要分配大块内存时，会从该链表中找到一个大小合适的空闲大内存块，并将其从链表中移除。

Allocated chunk

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                     |A|M|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_size() bytes)                      .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             (size of chunk, but used for application data)    |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|1|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Notice how the data of an allocated chunk uses the first attribute (mchunk_prev_size) of the next chunk. mem is the pointer which is returned to the user.

Free chunk

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if unallocated (P clear)  |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                     |A|0|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of next chunk, in bytes                |A|0|0|
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Free chunks maintain themselves in a circular doubly linked list.

P (PREV_INUSE): 0 when previous chunk (not the previous chunk in the linked list, but the one directly before it in memory) is free (and hence the size of previous chunk is stored in the first field). The very first chunk allocated has this bit set. If it is 1, then we cannot determine the size of the previous chunk.
M (IS_MMAPPED): The chunk is obtained through mmap. The other two bits are ignored. mmapped chunks are neither in an arena, not adjacent to a free chunk.
A (NON_MAIN_ARENA): 0 for chunks in the main arena. Each thread spawned receives its own arena and for those chunks, this bit is set.

Note: Chunks in fastbins are treated as allocated chunks in the sense that they are not consolidated with neighboring free chunks.

malloc_state

This structure represents the header details of an Arena. The main thread’s arena is a global variable and not part of the heap segment. Arena headers (malloc_state structures) for other threads are themselves stored in the heap segment. Non main arenas can have multiple heaps (‘heap’ here refers to the internal structure used instead of the heap segment) associated with them.

struct malloc_state
{
  /* Serialize access.  */
  __libc_lock_define (, mutex);
  /* Flags (formerly in max_fast).  */
  int flags;

  /* Fastbins */
  mfastbinptr fastbinsY[NFASTBINS];
  /* Base of the topmost chunk -- not otherwise kept in a bin */
  mchunkptr top;
  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;
  /* Normal bins packed as described above */
  mchunkptr bins[NBINS * 2 - 2];

  /* Bitmap of bins */
  unsigned int binmap[BINMAPSIZE];

  /* Linked list */
  struct malloc_state *next;
  /* Linked list for free arenas.  Access to this field is serialized
     by free_list_lock in arena.c.  */
  struct malloc_state *next_free;
  /* Number of threads attached to this arena.  0 if the arena is on
     the free list.  Access to this field is serialized by
     free_list_lock in arena.c.  */

  INTERNAL_SIZE_T attached_threads;
  /* Memory allocated from the system in this arena.  */
  INTERNAL_SIZE_T system_mem;
  INTERNAL_SIZE_T max_system_mem;
};

typedef struct malloc_state *mstate;

Bins and Chunks

A bin is a list (doubly or singly linked list) of free (non-allocated) chunks. Bins are differentiated based on the size of chunks they contain:

Fast bin
Unsorted bin
Small bin
Large bin

Fast bins are maintained using:

typedef struct malloc_chunk *mfastbinptr;

mfastbinptr fastbinsY[]; // Array of pointers to chunks

Unsorted, small and large bins are maintained using a single array:

typedef struct malloc_chunk* mchunkptr;

mchunkptr bins[]; // Array of pointers to chunks

Initially, during the initialization process, small and large bins are empty.

Each bin is represented by two values in the bins array. The first one is a pointer to the ‘HEAD’ and the second one is a pointer to the ‘TAIL’ of the bin list. In the case of fast bins (singly linked list), the second value is NULL.

Fast bins

There are 10 fast bins. Each of these bins maintains a single linked list. Addition and deletion happen from the front of this list (LIFO manner).

Each bin has chunks of the same size. The 10 bins each have chunks of sizes: 16, 24, 32, 40, 48, 56, 64, 72, 80 and 88. Sizes mentioned here include metadata as well. To store chunks, 4 fewer bytes will be available (on a platform where pointers use 4 bytes). Only the prev_size and size field of this chunk will hold meta data for allocated chunks. prev_size of next contiguous chunk will hold user data.

No two contiguous free fast chunks coalesce together.

相邻的两个空闲快（fast chunk）不会合并在一起。在 glibc 中，fast chunk 是指小于 512 字节的内存块，它们通常用于分配小块内存，以提高内存分配的效率。

fast chunk 的分配和释放是通过双向链表实现的。当程序需要分配内存时，会从链表中找到一个大小合适的空闲快，并将其从链表中移除。当程序释放内存时，会将该内存块重新插入到链表中。

由于 fast chunk 的大小较小，因此它们通常是紧密排列的。如果相邻的两个 fast chunk 都是空闲的，那么它们可能会被合并在一起，以减少内存碎片。但是，根据这句话的说法，glibc 中的 fast chunk 不会进行这种合并操作，即使相邻的两个 fast chunk 都是空闲的，它们也不会合并在一起。

这种设计可能是为了避免频繁的内存合并操作，从而提高内存分配的效率。但是，它也可能会导致内存碎片的增加，从而降低内存使用效率。因此，在实际使用中，需要根据具体情况进行权衡和选择。

Unsorted bin

There is only 1 unsorted bin. Small and large chunks, when freed, end up in this bin. The primary purpose of this bin is to act as a cache layer (kind of) to speed up allocation and deallocation requests.

glibc 中只有一个未排序的 bin。当小块和大块内存被释放时，它们都会被放入这个 bin 中。这个 bin 的主要目的是作为缓存层，以加速内存分配和释放请求。

在 glibc 中，内存块的分配和释放是通过多个 bin 来管理的。每个 bin 包含一组大小相近的内存块，用于满足不同大小的内存分配请求。这些 bin 可以分为多个类别，如 fast bin、small bin、large bin 等。

未排序的 bin 是一个特殊的 bin，它用于存放大小不确定的内存块。当程序释放内存时，如果无法确定该内存块的大小，那么它就会被放入未排序的 bin 中。这个 bin 的主要作用是作为一个缓存层，以加速内存分配和释放请求。当程序需要分配内存时，会先从未排序的 bin 中查找是否有合适的内存块可用，如果没有，则会从其他 bin 中查找。

由于未排序的 bin 中的内存块大小不确定，因此它们可能会被频繁地分配和释放。为了避免频繁地在未排序的 bin 中进行内存分配和释放操作，glibc 中的未排序的 bin 通常会被限制为只有一个。这样可以减少内存分配和释放时的开销，提高内存分配和释放的效率。

Small bins

There are 62 small bins. Small bins are faster than large bins but slower than fast bins. Each bin maintains a doubly-linked list. Insertions happen at the ‘HEAD’ while removals happen at the ‘TAIL’ (in a FIFO manner).

Like fast bins, each bin has chunks of the same size. The 62 bins have sizes: 16, 24, … , 504 bytes.

While freeing, small chunks may be coalesced together before ending up in unsorted bins.

glibc 中有 62 个 small bin。Small bin 的速度比 large bin 快，但比 fast bin 慢。每个 bin 维护一个双向链表。插入操作发生在“HEAD”，而删除操作发生在“TAIL”（以 FIFO 方式）。

与 fast bin 类似，每个 bin 都有相同大小的内存块。62 个 bin 的大小分别为：16、24、…、504 字节。

在释放内存时，小块内存可能会在最终进入未排序的 bin 之前合并在一起。

Large bins

There are 63 large bins. Each bin maintains a doubly-linked list. A particular large bin has chunks of different sizes, sorted in decreasing order (i.e. largest chunk at the ‘HEAD’ and smallest chunk at the ‘TAIL’). Insertions and removals happen at any position within the list.

The first 32 bins contain chunks which are 64 bytes apart:

1st bin: 512 - 568 bytes
2nd bin: 576 - 632 bytes
.
.

To summarize

No. of Bins       Spacing between bins

bins of size       8  [ Small bins]
bins of size      64  [ Large bins]
bins of size     512  [ Large bins]
bins of size     4096  [ ..        ]
bins of size    32768
bins of size   262144
bin  of size what's left

Like small chunks, while freeing, large chunks may be coalesced together before ending up in unsorted bins.

There are two special types of chunks which are not part of any bin.

Top chunk

It is the chunk which borders the top of an arena. While servicing ‘malloc’ requests, it is used as the last resort. If still more size is required, it can grow using the sbrk system call. The PREV_INUSE flag is always set for the top chunk.

在 glibc 中，arena 是一块连续的内存区域，用于管理内存的分配和释放。arena 中包含多个 chunk，每个 chunk 用于表示一块内存。顶部 chunk 是 arena 中最后一个 chunk，它与 arena 顶部相邻。

在处理 malloc 请求时，glibc 会先从 fast bin、small bin、large bin 等 bin 中查找是否有合适的内存块可用。如果没有，则会从未排序的 bin 中查找。如果仍然没有找到合适的内存块，则会使用顶部 chunk 来分配内存。

顶部 chunk 的大小通常比较大，因此它可以用于分配较大的内存块。如果需要更多的内存，顶部 chunk 可以使用 sbrk 系统调用来增长。sbrk 系统调用可以将进程的数据段（data segment）增长指定的字节数，从而为进程分配更多的内存。

顶部 chunk 的 PREV_INUSE 标志始终被设置。这个标志用于表示前一个 chunk 是否正在使用中。由于顶部 chunk 是 arena 中最后一个 chunk，因此它的前一个 chunk 通常是正在使用中的。因此，PREV_INUSE 标志被设置为 1。

Last remainder chunk

It is the chunk obtained from the last split. Sometimes, when exact size chunks are not available, bigger chunks are split into two. One part is returned to the user whereas the other becomes the last remainder chunk.

在 glibc 中，内存块的分配和释放是通过多个 bin 来管理的。每个 bin 包含一组大小相近的内存块，用于满足不同大小的内存分配请求。当程序需要分配内存时，会从相应的 bin 中查找是否有合适的内存块可用。如果找到了一个精确大小的内存块，则将其返回给用户。如果没有找到精确大小的内存块，则会找到一个较大的内存块，并将其分成两部分。其中一部分返回给用户，而另一部分成为最后的剩余 chunk。

最后的剩余 chunk 通常是较小的，因为它是从较大的内存块中分割出来的。它可能会被放入 fast bin、small bin 或 large bin 中，以供后续的内存分配请求使用。如果最后的剩余 chunk 大小足够小，它可能会被放入 fast bin 中，以提高内存分配的效率。如果最后的剩余 chunk 大小较大，则可能会被放入 small bin 或 large bin 中，以供后续的内存分配请求使用。

Internal Functions

https://heap-exploitation.dhavalkapil.com/diving_into_glibc_heap/internal_functions

Core Functions

https://heap-exploitation.dhavalkapil.com/diving_into_glibc_heap/core_functions

Security Checks

This presents a summary of the security checks introduced in glibc’s implementation to detect and prevent heap related attacks.

https://heap-exploitation.dhavalkapil.com/diving_into_glibc_heap/security_checks

Refer

https://heap-exploitation.dhavalkapil.com/
https://github.com/DhavalKapil/heap-exploitation
Buffer Overflow Exploit
Syscalls used by malloc
r2Con2016 Glibc Heap Analysis with radare2 video