On which platforms is thread local storage limited and how much is available?

I was recently made aware that thread local storage is limited on some platforms. For example, the docs for the C++ library boost::thread read:

"Note: There is an implementation specific limit to the number of thread specific storage objects that can be created, and this limit may be small."

I've been searching for the limits on different platforms, but I haven't been able to find an authoritative table. This is an important question if you're writing a cross-platform app that uses TLS. Linux was the only platform I found information for, in the form of a patch Ingo Molnar sent to the kernel list in 2002 adding TLS support, where he mentions, "The number of TLS areas is unlimited, and there is no additional allocation overhead associated with TLS support." If that is still true in 2009 (is it?), it's pretty nifty.

But what about Linux today? OS X? Windows? Solaris? Embedded OSes? For OSes that run on multiple architectures, does it vary across architectures?

Edit: If you're curious why there might be a limit, consider that the space for thread local storage will be preallocated, so you'll be paying a cost for it on every single thread. Even a small amount in the face of lots of threads can be a problem.

-------------Problems Reply------------

On Linux, if you are using __thread TLS data, the only limit is set by your available address space, as this data is simply allocated as regular memory addressed through the gs (on x86) or fs (on x86-64) segment register. Note that, in some cases, allocation of TLS data used by dynamically loaded libraries can be elided in threads that do not use that TLS data.
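
To make the per-thread behaviour concrete, here is a minimal sketch of compiler-supported TLS. It uses the standard C++11 thread_local keyword rather than GCC's __thread, and the names counter and bump_in_new_thread are made up for illustration.

```cpp
#include <thread>

// Each thread gets its own copy of this variable. C++11 thread_local is
// used here; GCC's __thread works similarly for plain types like int.
thread_local int counter = 0;

// Hypothetical helper: increments the new thread's copy `times` times
// and returns the value that thread ended up with.
int bump_in_new_thread(int times) {
    int result = -1;
    std::thread worker([&result, times] {
        for (int i = 0; i < times; ++i) {
            ++counter;          // touches only the worker's copy
        }
        result = counter;
    });
    worker.join();
    return result;
}
```

Calling bump_in_new_thread(5) from a thread whose own counter is, say, 100 should return 5 and leave the caller's counter at 100, because the worker starts from its own zero-initialised copy.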

TLS allocated by pthread_key_create and friends, however, is limited to PTHREAD_KEYS_MAX slots (this applies to all conforming pthreads implementations).
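
If you want to see what that limit is on a particular system, a rough sketch along these lines may help. It asks sysconf for _SC_THREAD_KEYS_MAX and, as a sanity check, creates keys until pthread_key_create refuses. POSIX only guarantees a minimum of _POSIX_THREAD_KEYS_MAX (128) keys per process. The helper names max_pthread_keys and count_available_keys are assumptions for this example, not part of any library.

```cpp
#include <pthread.h>
#include <unistd.h>
#include <vector>

// Returns the run-time limit on pthread keys per process, or -1 if the
// system reports the limit as indeterminate.
long max_pthread_keys() {
    return sysconf(_SC_THREAD_KEYS_MAX);
}

// Hypothetical helper: creates up to `upper_bound` keys, stopping at the
// first failure, then deletes them again. Returns how many were created.
int count_available_keys(int upper_bound) {
    std::vector<pthread_key_t> keys;
    for (int i = 0; i < upper_bound; ++i) {
        pthread_key_t key;
        if (pthread_key_create(&key, nullptr) != 0) {
            break;              // typically EAGAIN once the limit is hit
        }
        keys.push_back(key);
    }
    for (pthread_key_t key : keys) {
        pthread_key_delete(key);
    }
    return static_cast<int>(keys.size());
}
```

Keys already taken by the runtime or by other libraries count against the limit, so count_available_keys may report fewer than max_pthread_keys.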

For more information on the TLS implementation on Linux, see "ELF Handling For Thread-Local Storage" and "The Native POSIX Thread Library for Linux".

That said, if you need portability, your best bet is to minimize TLS use - put a single pointer in TLS, and put everything you need in a data structure hung off that pointer.
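
A minimal sketch of that single-pointer pattern, assuming a pthreads platform: one key is created once, and each thread lazily allocates a struct that holds everything it needs. PerThreadData, its fields, and the helper names are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <thread>

// Everything a thread needs lives in one struct, so only one TLS slot
// is consumed no matter how many fields are added later.
struct PerThreadData {          // hypothetical example type
    int request_count = 0;
    std::string last_error;
};

static pthread_key_t g_key;
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

// Runs at thread exit for every thread that stored a non-null pointer.
static void destroy_per_thread_data(void *p) {
    delete static_cast<PerThreadData *>(p);
}

static void make_key() {
    int rc = pthread_key_create(&g_key, destroy_per_thread_data);
    assert(rc == 0);
    (void)rc;
}

// Returns the calling thread's data, allocating it on first use.
PerThreadData &per_thread() {
    pthread_once(&g_key_once, make_key);
    void *p = pthread_getspecific(g_key);
    if (p == nullptr) {
        p = new PerThreadData();
        pthread_setspecific(g_key, p);
    }
    return *static_cast<PerThreadData *>(p);
}

// Hypothetical check: a brand-new thread starts with its own fresh struct.
int request_count_in_new_thread() {
    int seen = -1;
    std::thread worker([&seen] { seen = per_thread().request_count; });
    worker.join();
    return seen;
}
```

Adding another per-thread field is then just a new member of PerThreadData, and the number of TLS slots in use stays at one.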

I have only used TLS on Windows, and there are slight differences between versions in how much can be used: http://msdn.microsoft.com/en-us/library/ms686749%28VS.85%29.aspx

I assume that your code is only targeting operating systems that support threads - in the past I have worked with embedded and desktop OSes that do not support threading, so do not support TLS.

On the Mac, I know of Task-Specific Storage in the Multiprocessing Services API. It looks very similar to Windows thread local storage.

I'm not sure if this API is currently recommended for thread local storage on the Mac. Perhaps there is something newer.

It may be that the boost documentation is simply talking about a general configurable limit, not necessarily some hard limit of the platform. On Linux, the ulimit command limits resources processes can have (number of threads, stack size, memory, and a bunch of other stuff). This will indirectly impact your thread local storage. On my system, there doesn't seem to be an entry in ulimit specific to thread local storage. Other platforms may have a way to specify that on its own. Also, I think in many multiprocessor systems, the thread local storage will be in memory dedicated to that CPU, so you may run into limits of physical memory long before the system as a whole has its memory exhausted. I would assume there is some kind of fallback behavior to locate the data in main memory in that situation, but I don't know. As you can tell, I'm conjecturing a lot. Hopefully it still leads you in the right direction...
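
For what it's worth, the limits that ulimit reports can also be read from inside a program with getrlimit. The sketch below is only a rough illustration: the SoftLimit struct and query_soft_limit are made-up names, and note that getrlimit reports RLIMIT_STACK in bytes, while bash's ulimit -s prints the same limit in 1024-byte units.

```cpp
#include <sys/resource.h>

// Result of one limit query (hypothetical helper type).
struct SoftLimit {
    bool ok;          // false if getrlimit failed
    bool unlimited;   // true if the soft limit is RLIM_INFINITY
    rlim_t value;     // meaningful only when ok && !unlimited
};

// Reads the soft limit for one resource, e.g. RLIMIT_STACK.
SoftLimit query_soft_limit(int resource) {
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0) {
        return {false, false, 0};
    }
    return {true, lim.rlim_cur == RLIM_INFINITY, lim.rlim_cur};
}
```

For example, query_soft_limit(RLIMIT_STACK) gives the stack size limit for the process's main thread; pthreads implementations commonly use that value as the default stack size for new threads as well.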

The thread-local storage declspec on Windows limits you to using it only for static variables, which means you are out of luck if you want to use it in more creative ways.

There is a low-level API on Windows, but it has broken semantics that make it very awkward to initialise: you can't tell whether or not the variable has already been seen by your thread, so you need to explicitly initialise it when you create the thread.

On the other hand, the pthread API for thread-local storage is well thought-out and flexible.
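
As a rough illustration of that flexibility: a slot reads back as null on any thread that has not set it yet, so lazy initialisation is straightforward, and the destructor registered with pthread_key_create runs automatically when a thread exits. The names free_slot and destructor_runs_for in this sketch are assumptions for the example.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <thread>

static std::atomic<int> g_destroyed{0};   // counts destructor runs

// Called by the pthreads runtime at thread exit, once per thread that
// left a non-null value in the slot.
static void free_slot(void *p) {
    delete static_cast<int *>(p);
    ++g_destroyed;
}

// Hypothetical demo: runs `n` threads that each lazily allocate their
// slot value, then reports how many destructor calls were observed.
int destructor_runs_for(int n) {
    pthread_key_t key;
    int rc = pthread_key_create(&key, free_slot);
    assert(rc == 0);
    (void)rc;
    g_destroyed = 0;
    for (int i = 0; i < n; ++i) {
        std::thread worker([key] {
            // A thread that has never set the slot reads back null, so
            // "first use on this thread" is easy to detect.
            if (pthread_getspecific(key) == nullptr) {
                pthread_setspecific(key, new int(0));
            }
        });
        worker.join();          // destructors have run once join returns
    }
    pthread_key_delete(key);
    return g_destroyed.load();
}
```

Each worker allocates its value once and never frees it explicitly; the runtime calls free_slot for it at thread exit.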

I use a simple template class to provide thread local storage. It simply wraps a std::map and a critical section, so it doesn't suffer from any platform-specific thread local problems; the only platform requirement is a way to get the current thread id as an integer. It might be a little slower than native thread local storage, but it can store any data type.

Below is a cut-down version of my code. I have removed the default value logic to simplify the code. As it can store any data type, the increment and decrement operators are only available if T supports them. The critical section is only required to protect looking up and inserting into the map. Once a reference is returned, it is safe to use unprotected, as only the current thread will use this value.

template <class T>
class ThreadLocal {
public:
    operator T() {
        return value();
    }

    T & operator++() {
        return ++value();
    }

    T operator++(int) {
        return value()++;
    }

    T & operator--() {
        return --value();
    }

    T operator--(int) {
        return value()--;
    }

    T & operator=(const T& v) {
        return (value() = v);
    }

private:
    // Returns the calling thread's value, inserting a default one on first use.
    T & value() {
        LockGuard<CriticalSection> lock(m_cs);
        return m_threadMap[Thread::getThreadID()];
    }

    CriticalSection m_cs;
    std::map<int, T> m_threadMap;
};

To use this class, I generally declare a static member inside a class, e.g.:

class DBConnection {
public:
    DBConnection() {
        ++m_connectionCount;
    }

    ~DBConnection() {
        --m_connectionCount;
    }

    // ...
    static ThreadLocal<unsigned int> m_connectionCount;
};

ThreadLocal<unsigned int> DBConnection::m_connectionCount;

It might not be perfect for every situation but it covers my need and I can easily add any features it is missing as I discover them.

bdonlan is correct: this example doesn't clean up after threads exit. However, manual cleanup is very easy to add.

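template <class T>
class ThreadLocal {
public:
    // Removes the calling thread's entry from the map.
    static void cleanup(ThreadLocal<T> & tl) {
        LockGuard<CriticalSection> lock(tl.m_cs);
        tl.m_threadMap.erase(Thread::getThreadID());
    }

    class AutoCleanup {
    public:
        AutoCleanup(ThreadLocal<T> & tl) : m_tl(tl) {}
        ~AutoCleanup() {
            cleanup(m_tl);
        }

    private:
        ThreadLocal<T> & m_tl;
    };

    // ...
};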

Then a thread that knows it makes explicit use of the ThreadLocal can use ThreadLocal::AutoCleanup in its main function to clean up the variable.

Or in the case of DBConnection:

~DBConnection() {
    if (--m_connectionCount == 0)
        ThreadLocal<unsigned int>::cleanup(m_connectionCount);
}

The cleanup() method is static so as not to interfere with operator T(). A global function that infers the template parameters can be used to call it.
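
For anyone who wants to try the idea without the LockGuard, CriticalSection and Thread helpers, here is a self-contained sketch using only the C++11 standard library. The substitutions are mine rather than part of the code above: std::mutex stands in for the critical section and std::thread::id replaces the integer thread id. The size() method and entries_after_worker are hypothetical additions for demonstration.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>

template <class T>
class ThreadLocal {
public:
    // Returns the calling thread's value, default-constructing it on
    // first use. The reference stays valid after the lock is released
    // because std::map never relocates existing elements on insert.
    T &value() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_threadMap[std::this_thread::get_id()];
    }

    // Drops the calling thread's entry; call before the thread exits.
    static void cleanup(ThreadLocal<T> &tl) {
        std::lock_guard<std::mutex> lock(tl.m_mutex);
        tl.m_threadMap.erase(std::this_thread::get_id());
    }

    // RAII guard that calls cleanup() when it goes out of scope.
    class AutoCleanup {
    public:
        explicit AutoCleanup(ThreadLocal<T> &tl) : m_tl(tl) {}
        ~AutoCleanup() { ThreadLocal<T>::cleanup(m_tl); }
    private:
        ThreadLocal<T> &m_tl;
    };

    // Number of threads that currently hold an entry.
    std::size_t size() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_threadMap.size();
    }

private:
    std::mutex m_mutex;
    std::map<std::thread::id, T> m_threadMap;
};

// Hypothetical demo: a worker bumps its own counter under an AutoCleanup
// guard; returns the number of entries left in the map afterwards.
std::size_t entries_after_worker(ThreadLocal<int> &tl) {
    std::thread worker([&tl] {
        ThreadLocal<int>::AutoCleanup guard(tl);
        ++tl.value();
    });
    worker.join();
    return tl.size();
}
```

The guard has to be declared before the first use of value() in the thread function, so that its destructor runs last and erases the entry only after the thread is done with it.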

Category:c# Views:0 Time:2009-09-22
