Internal change
PiperOrigin-RevId: 412115858
Change-Id: Iefeacb58d4cb2c909e0fa835161cdaac2ee61417
TCMalloc Team authored and ckennelly committed Nov 24, 2021
1 parent 5079678 commit bcf4746
Showing 15 changed files with 246 additions and 247 deletions.
17 changes: 9 additions & 8 deletions docs/README.md
@@ -17,12 +17,13 @@ support for TCMalloc.

All users of TCMalloc should consult the following documentation resources:

* The [TCMalloc Quickstart](quickstart.md) covers downloading, installing,
building, and testing TCMalloc, including incorporating within your codebase.
* The [TCMalloc Overview](overview.md) covers the basic architecture of
TCMalloc, and how that may affect configuration choices.
* The [TCMalloc Reference](reference.md) covers the C and C++ TCMalloc API
endpoints.
* The [TCMalloc Quickstart](quickstart.md) covers downloading, installing,
building, and testing TCMalloc, including incorporating within your
codebase.
* The [TCMalloc Overview](overview.md) covers the basic architecture of
TCMalloc, and how that may affect configuration choices.
* The [TCMalloc Reference](reference.md) covers the C and C++ TCMalloc API
endpoints.

More advanced usages of TCMalloc may find the following documentation useful:

@@ -51,7 +52,7 @@ We've published several papers relating to TCMalloc optimizations:

## License

The TCMalloc library is licensed under the terms of the Apache
license. See LICENSE for more information.
The TCMalloc library is licensed under the terms of the Apache license. See
LICENSE for more information.

Disclaimer: This is not an officially supported Google product.
68 changes: 34 additions & 34 deletions docs/compatibility.md
@@ -4,41 +4,41 @@ This document details what we expect from well-behaved users. Any usage of
TCMalloc libraries outside of these technical boundaries may result in breakage
when upgrading to newer versions of TCMalloc.

Put another way: don't do things that make TCMalloc API maintenance
tasks harder. If you misuse TCMalloc APIs, you're on your own.
Put another way: don't do things that make TCMalloc API maintenance tasks
harder. If you misuse TCMalloc APIs, you're on your own.

Additionally, because TCMalloc depends on Abseil, Abseil's [compatibility
guidelines](https://abseil.io/about/compatibility) also apply.
Additionally, because TCMalloc depends on Abseil, Abseil's
[compatibility guidelines](https://abseil.io/about/compatibility) also apply.

## What Users Must (And Must Not) Do

* **Do not depend on a compiled representation of TCMalloc.** We do not
promise any ABI compatibility — we intend for TCMalloc to be built from
source, hopefully from head. The internal layout of our types may change at
any point, without notice. Building TCMalloc in the presence of different C++
standard library types may change Abseil types, especially for pre-adopted
types (`string_view`, `variant`, etc) — these will become typedefs and
their ABI will change accordingly.
* **Do not rely on dynamic loading/unloading.** TCMalloc does not support
dynamic loading and unloading.
* **You may not open namespace `tcmalloc`.** You are not allowed to define
additional names in namespace `tcmalloc`, nor are you allowed to specialize
anything we provide.
* **You may not depend on the signatures of TCMalloc APIs.** You cannot take the
address of APIs in TCMalloc (that would prevent us from adding overloads
without breaking you). You cannot use metaprogramming tricks to depend on
those signatures either. (This is also similar to the restrictions in the C++
standard.)
* **You may not forward declare TCMalloc APIs.** This is actually a sub-point of
"do not depend on the signatures of TCMalloc APIs" as well as "do not open
namespace `tcmalloc`", but can be surprising. Any refactoring that changes
template parameters, default parameters, or namespaces will be a breaking
change in the face of forward-declarations.
* **Do not depend upon internal details.** This should go without saying: if
something is in a namespace or filename/path that includes the word
"internal", you are not allowed to depend upon it. It's an implementation
detail. You cannot friend it, you cannot include it, you cannot mention it or
refer to it in any way.
* **Include What You Use.** We may make changes to the internal `#include` graph
for TCMalloc headers - if you use an API, please include the relevant header
file directly.
* **Do not depend on a compiled representation of TCMalloc.** We do not
promise any ABI compatibility — we intend for TCMalloc to be built
from source, hopefully from head. The internal layout of our types may
change at any point, without notice. Building TCMalloc in the presence of
different C++ standard library types may change Abseil types, especially for
pre-adopted types (`string_view`, `variant`, etc) — these will become
typedefs and their ABI will change accordingly.
* **Do not rely on dynamic loading/unloading.** TCMalloc does not support
dynamic loading and unloading.
* **You may not open namespace `tcmalloc`.** You are not allowed to define
additional names in namespace `tcmalloc`, nor are you allowed to specialize
anything we provide.
* **You may not depend on the signatures of TCMalloc APIs.** You cannot take
the address of APIs in TCMalloc (that would prevent us from adding overloads
without breaking you). You cannot use metaprogramming tricks to depend on
those signatures either. (This is also similar to the restrictions in the
C++ standard.)
* **You may not forward declare TCMalloc APIs.** This is actually a sub-point
of "do not depend on the signatures of TCMalloc APIs" as well as "do not
open namespace `tcmalloc`", but can be surprising. Any refactoring that
changes template parameters, default parameters, or namespaces will be a
breaking change in the face of forward-declarations.
* **Do not depend upon internal details.** This should go without saying: if
something is in a namespace or filename/path that includes the word
"internal", you are not allowed to depend upon it. It's an implementation
detail. You cannot friend it, you cannot include it, you cannot mention it
or refer to it in any way.
* **Include What You Use.** We may make changes to the internal `#include`
graph for TCMalloc headers - if you use an API, please include the relevant
header file directly.
21 changes: 10 additions & 11 deletions docs/design.md
@@ -18,8 +18,7 @@ allocator that has the following characteristics:

## Usage

You use TCMalloc by specifying it as the `malloc` attribute on your binary rules
in Bazel.
You use TCMalloc by specifying it as the `malloc` attribute on your binary rules in Bazel.

## Overview

@@ -80,15 +79,15 @@ size-class. The size-classes are designed to minimize the amount of memory that
is wasted when rounding to the next largest size-class.

When compiled with `__STDCPP_DEFAULT_NEW_ALIGNMENT__ <= 8`, we use a set of
sizes aligned to 8 bytes for raw storage allocated with `::operator new`. This
sizes aligned to 8 bytes for raw storage allocated with `::operator new`. This
smaller alignment minimizes wasted memory for many common allocation sizes (24,
40, etc.) which are otherwise rounded up to a multiple of 16 bytes. On many
compilers, this behavior is controlled by the `-fnew-alignment=...` flag.
When `__STDCPP_DEFAULT_NEW_ALIGNMENT__` is not
specified (or is larger than 8 bytes), we use standard 16 byte alignments for
`::operator new`. However, for allocations under 16 bytes, we may return an
object with a lower alignment, as no object with a larger alignment requirement
can be allocated in the space.
When
`__STDCPP_DEFAULT_NEW_ALIGNMENT__` is not specified (or is larger than 8 bytes),
we use standard 16 byte alignments for `::operator new`. However, for
allocations under 16 bytes, we may return an object with a lower alignment, as
no object with a larger alignment requirement can be allocated in the space.

When an object of a given size is requested, that request is mapped to a request
of a particular size-class using the
@@ -297,9 +296,9 @@ available objects in the spans, more spans are requested from the back-end.
When objects are
[returned to the central free list](https://github.com/google/tcmalloc/blob/master/tcmalloc/central_freelist.cc),
each object is mapped to the span to which it belongs (using the
[pagemap](#pagemap-and-spans)) and then released into that span. If all the objects that
reside in a particular span are returned to it, the entire span gets returned to
the back-end.
[pagemap](#pagemap-and-spans)) and then released into that span. If all the
objects that reside in a particular span are returned to it, the entire span
gets returned to the back-end.

### Pagemap and Spans

13 changes: 6 additions & 7 deletions docs/gperftools.md
@@ -12,12 +12,12 @@ implementation itself.
## History

Google open-sourced its memory allocator as part of "Google Performance Tools"
in 2005. At the time, it became easy to externalize code, but more difficult to
keep it in-sync with our internal usage, as discussed by Titus Winters in [his
2017 CppCon Talk](https://www.youtube.com/watch?v=tISy7EJQPzI) and the "Software
Engineering at Google" book. Subsequently, our internal implementation diverged
from the code externally. This project eventually was adopted by the community
as "gperftools."
in 2005. At the time, it became easy to externalize code, but more difficult to
keep it in-sync with our internal usage, as discussed by Titus Winters in
[his 2017 CppCon Talk](https://www.youtube.com/watch?v=tISy7EJQPzI) and the
"Software Engineering at Google" book. Subsequently, our internal implementation
diverged from the code externally. This project eventually was adopted by the
community as "gperftools."

## Differences

@@ -68,4 +68,3 @@ exceptions:
Over time, we have found that configurability carries a maintenance burden.
While a knob can provide immediate flexibility, the increased complexity can
cause subtle problems for more rarely used combinations.

23 changes: 10 additions & 13 deletions docs/gwp-asan.md
@@ -7,8 +7,7 @@ GWP-ASan is a recursive acronym: "**G**WP-ASan **W**ill **P**rovide

## Why not just use ASan?

For many cases you **should** use
[ASan](https://clang.llvm.org/docs/AddressSanitizer.html)
For many cases you **should** use [ASan](https://clang.llvm.org/docs/AddressSanitizer.html)
(e.g., on your tests). However, ASan comes with average execution slowdown of 2x
(compared to `-O2`), binary size increase of 2x, and significant memory
overhead. For these reasons, ASan is generally impractical for use in production
@@ -17,23 +16,21 @@ designed for widespread use in production.

## How to use GWP-ASan

You can enable GWP-ASan by calling
`tcmalloc::MallocExtension::ActivateGuardedSampling()`.
You can enable GWP-ASan by calling `tcmalloc::MallocExtension::ActivateGuardedSampling()`.
To adjust GWP-ASan's sampling rate, see
[below](#what-should-i-set-the-sampling-rate-to).

When GWP-ASan detects a heap memory error, it prints stack traces for the point
of the memory error, as well as the points where the memory was allocated and
(if applicable) freed. These stack traces can then be
symbolized
offline to get file names and line numbers.
symbolized offline to get file names and line
numbers.

GWP-ASan will crash after printing stack traces.

## CPU and RAM Overhead

For guarded sampling rates above 100M (the default), CPU overhead is negligible.
For sampling rates as low as 8M, CPU overhead is under 0.5%.
For guarded sampling rates above 100M (the default), CPU overhead is negligible. For sampling rates as low as 8M, CPU overhead is under 0.5%.

RAM overhead is up to 512 KB on x86\_64, or 4 MB on PowerPC.

@@ -56,10 +53,10 @@ CPU overhead, we recommend a sampling rate of 8MB.

- GWP-ASan has limited diagnostic information for buffer overflows within
alignment padding, since overflows of this type will not touch a guard
page.
For write-overflows, GWP-ASan will still be able to detect the overflow
during deallocation by checking whether magic bytes have been overwritten,
but the stack trace of the overflow itself will not be available.
page. For write-overflows,
GWP-ASan will still be able to detect the overflow during deallocation by
checking whether magic bytes have been overwritten, but the stack trace of
the overflow itself will not be available.

## FAQs

@@ -71,7 +68,7 @@ always a true bug, or a sign of hardware failure (see below).
### How do I know a GWP-ASan report isn't caused by hardware failure?

The vast majority of GWP-ASan reports we see are true bugs, but occasionally
faulty hardware will be the actual cause of the crash. In general, if you see
faulty hardware will be the actual cause of the crash. In general, if you see
the same GWP-ASan crash on multiple machines, it is very likely there's a true
software bug.

55 changes: 28 additions & 27 deletions docs/overview.md
@@ -8,24 +8,25 @@ TCMalloc is designed to be more efficient at scale than other implementations.

Specifically, TCMalloc provides the following benefits:

* Performance scales with highly parallel applications.
* Optimizations brought about with recent C++14 and C++17 standard enhancements,
and by diverging slightly from the standard where performance benefits
warrant. (These are noted within the [TCMalloc Reference](reference.md).)
* Extensions to allow performance improvements under certain architectures, and
additional behavior such as metric gathering.
* Performance scales with highly parallel applications.
* Optimizations brought about with recent C++14 and C++17 standard
enhancements, and by diverging slightly from the standard where performance
benefits warrant. (These are noted within the
[TCMalloc Reference](reference.md).)
* Extensions to allow performance improvements under certain architectures,
and additional behavior such as metric gathering.

## TCMalloc Cache Operation Mode

TCMalloc may operate in one of two fashions:

* (default) per-CPU caching, where TCMalloc maintains memory caches local to
individual logical cores. Per-CPU caching is enabled when running TCMalloc on
any Linux kernel that utilizes restartable sequences (RSEQ). Support for RSEQ
was merged in Linux 4.18.
* per-thread caching, where TCMalloc maintains memory caches local to
each application thread. If RSEQ is unavailable, TCMalloc reverts to using
this legacy behavior.
* (default) per-CPU caching, where TCMalloc maintains memory caches local to
individual logical cores. Per-CPU caching is enabled when running TCMalloc
on any Linux kernel that utilizes restartable sequences (RSEQ). Support for
RSEQ was merged in Linux 4.18.
* per-thread caching, where TCMalloc maintains memory caches local to each
application thread. If RSEQ is unavailable, TCMalloc reverts to using this
legacy behavior.

NOTE: the "TC" in TCMalloc refers to Thread Caching, which was originally a
distinguishing feature of TCMalloc; the name remains as a legacy.
@@ -35,21 +36,21 @@ locks for most memory allocations and deallocations.

## TCMalloc Features

TCMalloc provides APIs for dynamic memory allocation: `malloc()` using the C
TCMalloc provides APIs for dynamic memory allocation: `malloc()` using the C
API, and `::operator new` using the C++ API. TCMalloc, like most allocation
frameworks, manages this memory better than raw memory requests (such as through
`mmap()`) by providing several optimizations:

* Performs allocations from the operating system by managing
specifically-sized chunks of memory (called "pages"). Having all of these
chunks of memory the same size allows TCMalloc to simplify bookkeeping.
* Devoting separate pages (or runs of pages called "Spans" in TCMalloc) to
specific object sizes. For example, all 16-byte objects are placed within
a "Span" specifically allocated for objects of that size. Operations to get or
release memory in such cases are much simpler.
* Holding memory in *caches* to speed up access of commonly-used objects.
Holding such caches even after deallocation also helps avoid costly system
calls if such memory is later re-allocated.
* Performs allocations from the operating system by managing
specifically-sized chunks of memory (called "pages"). Having all of these
chunks of memory the same size allows TCMalloc to simplify bookkeeping.
* Devoting separate pages (or runs of pages called "Spans" in TCMalloc) to
specific object sizes. For example, all 16-byte objects are placed within a
"Span" specifically allocated for objects of that size. Operations to get or
release memory in such cases are much simpler.
* Holding memory in *caches* to speed up access of commonly-used objects.
Holding such caches even after deallocation also helps avoid costly system
calls if such memory is later re-allocated.

The cache size can also affect performance. The larger the cache, the less any
given cache will overflow or get exhausted, and therefore require a lock to get
@@ -58,7 +59,7 @@ default behavior should be preferred in most cases. For more information,
consult the [TCMalloc Tuning Guide](tuning.md).

Additionally, TCMalloc exposes telemetry about the state of the application's
heap via `MallocExtension`. This can be used for gathering profiles of the live
heap via `MallocExtension`. This can be used for gathering profiles of the live
heap, as well as a snapshot taken near the heap's highwater mark size (a peak
heap profile).

@@ -87,8 +88,8 @@ The TCMalloc API obeys the behavior of C90 DR075 and
[DR445](http://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_445)
which states:

The alignment requirement still applies even if the size is too small for
any object requiring the given alignment.
> The alignment requirement still applies even if the size is too small for any
> object requiring the given alignment.
In other words, `malloc(1)` returns an `alignof(std::max_align_t)`-aligned pointer.
Based on the progress of
6 changes: 3 additions & 3 deletions docs/platforms.md
@@ -1,7 +1,7 @@
# TCMalloc Platforms

The TCMalloc code is supported on the following platforms. By "platforms",
we mean the union of operating system, architecture (e.g. little-endian vs.
The TCMalloc code is supported on the following platforms. By "platforms", we
mean the union of operating system, architecture (e.g. little-endian vs.
big-endian), compiler, and standard library.

## Language Requirements
@@ -13,7 +13,7 @@ We guarantee that our code will compile under the following compilation flags:

Linux:

* gcc 9.2+, clang 9.0+: `-std=c++17`
* gcc 9.2+, clang 9.0+: `-std=c++17`

(TL;DR; All code at this time must be built under C++17. We will update this
list if circumstances change.)
9 changes: 5 additions & 4 deletions docs/quickstart.md
@@ -13,8 +13,8 @@ starting development using TCMalloc at least run through this quick tutorial.

Running the code within this tutorial requires:

* A compatible platform (E.g. Linux). Consult the [Platforms Guide](platforms.md)
for more information.
* A compatible platform (E.g. Linux). Consult the
[Platforms Guide](platforms.md) for more information.
* A compatible C++ compiler *supporting at least C++17*. Most major compilers
are supported.
* [Git](https://git-scm.com/) for interacting with the Abseil source code
@@ -45,8 +45,8 @@ Resolving deltas: 100% (1083/1083), done.
$
```

Git will create the repository within a directory named `tcmalloc`.
Navigate into this directory and run all tests:
Git will create the repository within a directory named `tcmalloc`. Navigate
into this directory and run all tests:

```
$ cd tcmalloc
@@ -136,6 +136,7 @@ local_repository(
path = "/PATH_TO_SOURCE/Source/tcmalloc",
)
```

The "name" in the `WORKSPACE` file identifies the name you will use in Bazel
`BUILD` files to refer to the linked repository (in this case
"com_google_tcmalloc").