项目作者: yvt

项目描述 :
Constant-time dynamic memory allocator in Rust
高级语言: Rust
项目地址: git://github.com/yvt/rlsf.git
创建时间: 2021-04-04T12:54:40Z
项目社区:https://github.com/yvt/rlsf

开源协议:Other

下载


rlsf


docs.rs

This crate implements the TLSF (Two-Level Segregated Fit) dynamic memory
allocation algorithm¹. Requires Rust 1.61.0 or later.

  • Allocation and deallocation operations are guaranteed to complete in
    constant time.
    TLSF is suitable for real-time applications.

  • Fast and small. You can have both. It was found to be smaller and
    faster² than most no_std-compatible allocator crates.

  • Accepts any kinds of memory pools. The low-level type
    Tlsf just divides any memory pools you provide
    (e.g., a static array) to serve allocation requests.
    The high-level type GlobalTlsf
    automatically acquires memory pages using standard methods on supported
    systems.

  • This crate supports #![no_std]. It can be used in bare-metal and
    RTOS-based applications.

¹ M. Masmano, I. Ripoll, A. Crespo and J. Real, “TLSF: a new dynamic
memory allocator for real-time systems,” Proceedings. 16th Euromicro
Conference on Real-Time Systems
, 2004. ECRTS 2004., Catania, Italy, 2004,
pp. 79-88, doi: 10.1109/EMRTS.2004.1311009.

² Compiled for and measured on a STM32F401 microcontroller using
FarCri.rs.

Measured Performance

The result of latency measurement on STM32F401 is shown here. rlsf:
260–320 cycles. buddy-alloc: 340–440 cycles. umm_malloc: 300–700 cycles.
dlmalloc: 450–750 cycles.

The result of code size measurement on WebAssembly is shown here. rlsf:
1267 bytes, rlsf + pool coalescing: 1584 bytes, wee_alloc: 1910 bytes,
dlmalloc: 9613 bytes.

Drawbacks

  • It does not support concurrent access. A whole pool must be locked
    for allocation and deallocation. If you use a FIFO lock to protect the
    pool, the worst-case execution time will be O(num_contending_threads).
    You should consider using a thread-caching memory allocator (e.g.,
    TCMalloc, jemalloc) if achieving a maximal throughput in a highly
    concurrent environment is desired.

  • Segregated freelists with constant-time lookup cause internal
    fragmentation proportional to free block sizes.
    The SLLEN paramter
    allows for adjusting the trade-off between fewer freelists and lower
    fragmentation.

  • No special handling for small allocations (one algorithm for all
    sizes).
    This may lead to inefficiencies in allocation-heavy
    applications compared to modern scalable memory allocators, such as
    glibc and jemalloc.

Examples

Tlsf: Core API

  1. use rlsf::Tlsf;
  2. use std::{mem::MaybeUninit, alloc::Layout};
  3. let mut pool = [MaybeUninit::uninit(); 65536];
  4. // On 32-bit systems, the maximum block size is 16 << FLLEN = 65536 bytes.
  5. // The worst-case internal fragmentation is (16 << FLLEN) / SLLEN - 2 = 4094 bytes.
  6. // `'pool` represents the memory pool's lifetime (`pool` in this case).
  7. let mut tlsf: Tlsf<'_, u16, u16, 12, 16> = Tlsf::new();
  8. // ^^ ^^ ^^
  9. // | | |
  10. // 'pool | SLLEN
  11. // FLLEN
  12. tlsf.insert_free_block(&mut pool);
  13. unsafe {
  14. let mut ptr1 = tlsf.allocate(Layout::new::<u64>()).unwrap().cast::<u64>();
  15. let mut ptr2 = tlsf.allocate(Layout::new::<u64>()).unwrap().cast::<u64>();
  16. *ptr1.as_mut() = 42;
  17. *ptr2.as_mut() = 56;
  18. assert_eq!(*ptr1.as_ref(), 42);
  19. assert_eq!(*ptr2.as_ref(), 56);
  20. tlsf.deallocate(ptr1.cast(), Layout::new::<u64>().align());
  21. tlsf.deallocate(ptr2.cast(), Layout::new::<u64>().align());
  22. }

GlobalTlsf: Global Allocator

GlobalTlsf automatically acquires memory pages through platform-specific
mechanisms. It doesn’t support returning memory pages to the system even if
the system supports it.

  1. #[cfg(all(target_arch = "wasm32", not(target_feature = "atomics")))]
  2. #[global_allocator]
  3. static A: rlsf::SmallGlobalTlsf = rlsf::SmallGlobalTlsf::new();
  4. let mut m = std::collections::HashMap::new();
  5. m.insert(1, 2);
  6. m.insert(5, 3);
  7. drop(m);

Details

Changes from the Original Algorithm

  • The end of each memory pool is capped by a sentinel block
    (a permanently occupied block) instead of a normal block with a
    last-block-in-pool flag. This simplifies the code a bit and improves
    its worst-case performance and code size.

Cargo Features

  • unstable: Enables experimental features that are exempt from the API
    stability guarantees.

License

MIT/Apache-2.0