Discussion:
YA question on thread identifiers
Ronald Landheer-Cieslak
2003-08-12 11:39:41 UTC
Hello all,

I would be kinda surprised if the answer is "no", but I want to be
sure anyway: is the return value of pthread_self() guaranteed to
persistently point to the thread?

I.e. is this guaranteed to work:
{
    pthread_t handle1 = pthread_self();

    // do lots and lots and lots of stuff

    assert(pthread_equal(handle1, pthread_self()));
}
?

Like I said, I'd be surprised if the answer is "no", but it wouldn't
be the first time I'd be surprised..

Thx in advance for any comments :)

rlc

NB: please Cc me in replies - I don't have access to NNTP except
through Google.
David Butenhof
2003-08-12 15:32:45 UTC
<posted & mailed>
Post by Ronald Landheer-Cieslak
I would be kinda surprised if the answer is "no", but I want to be
sure anyway: is the return value of pthread_self() guaranteed to
persistently point to the thread?
{
pthread_t handle1 = pthread_self();
// do lots and lots and lots of stuff
assert(pthread_equal(handle1, pthread_self()));
}
Yes, that ought to work. However, assert (handle1 == pthread_self()) might
NOT have worked, even on implementations where pthread_t is a scalar type
that C can compare directly. That is, pthread_self() is not required to
return the same value each time it's called within a given thread... but if
not then pthread_equal() needs to be able to adjust for that. (I know of no
current implementation where pthread_self() will ever return more than one
value for a given thread, though there are some where pthread_t isn't a
scalar type.)
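
A minimal sketch of the distinction, assuming a C++11 compiler and a working pthreads library. The helper names (is_calling_thread, check_in_other_thread, handle_is_thread_specific) are made up for illustration. The comparison always goes through pthread_equal(), never through ==:

```cpp
#include <pthread.h>

// True if `saved` names the calling thread. pthread_equal() is the only
// portable comparison: pthread_t may be a struct, so `saved == pthread_self()`
// might not even compile.
bool is_calling_thread(pthread_t saved)
{
    return pthread_equal(saved, pthread_self()) != 0;
}

// Runs in a second thread. Returns its argument if the creator's handle
// (wrongly) compares equal to this thread, and a null pointer otherwise.
static void *check_in_other_thread(void *arg)
{
    pthread_t *creator = static_cast<pthread_t *>(arg);
    return is_calling_thread(*creator) ? arg : nullptr;
}

// True if a handle saved in the calling thread does not match a different,
// concurrently running thread.
bool handle_is_thread_specific()
{
    pthread_t me = pthread_self();
    pthread_t worker;
    if (pthread_create(&worker, nullptr, check_in_other_thread, &me) != 0)
        return false;
    void *result = nullptr;
    pthread_join(worker, &result);
    return result == nullptr;
}
```

Calling is_calling_thread(handle1) at the end of a long routine is equivalent to the assertion in the question.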

On the other hand, this is a pretty silly thing to assert except possibly if
you're developing a user-mode context switching "pthread emulation", in
which case you ought to have a better way to do this. If, on the other
hand, you suspect that you're USING someone else's implementation that's so
badly broken that thread identity within a single routine is realistically
in question... what makes you think that pthread_self() or pthread_equal()
is any more reliable than the rest of the implementation? (To put it
another way: if you really need to ask, can you trust the answer?)
--
/--------------------[ ***@hp.com ]--------------------\
| Hewlett-Packard Company Tru64 UNIX & VMS Thread Architect |
| My book: http://www.awl.com/cseng/titles/0-201-63392-2/ |
\----[ http://homepage.mac.com/dbutenhof/Threads/Threads.html ]---/
Ronald Landheer-Cieslak
2003-08-13 09:18:08 UTC
Post by David Butenhof
<posted & mailed>
Post by Ronald Landheer-Cieslak
I would be kinda surprised if the answer is "no", but I want to be
sure anyway: is the return value of pthread_self() guaranteed to
persistently point to the thread?
{
pthread_t handle1 = pthread_self();
// do lots and lots and lots of stuff
assert(pthread_equal(handle1, pthread_self()));
}
Yes, that ought to work. However, assert (handle1 == pthread_self()) might
NOT have worked, even on implementations where pthread_t is a scalar type
that C can compare directly. That is, pthread_self() is not required to
return the same value each time it's called within a given thread... but if
not then pthread_equal() needs to be able to adjust for that. (I know of no
current implementation where pthread_self() will ever return more than one
value for a given thread, though there are some where pthread_t isn't a
scalar type.)
Of course.. :)
Post by David Butenhof
On the other hand, this is a pretty silly thing to assert except possibly if
you're developing a user-mode context switching "pthread emulation", in
which case you ought to have a better way to do this. If, on the other
hand, you suspect that you're USING someone else's implementation that's so
badly broken that thread identity within a single routine is realistically
in question... what makes you think that pthread_self() or pthread_equal()
is any more reliable than the rest of the implementation? (To put it
another way: if you really need to ask, can you trust the answer?)
The actual code will not have the assertion explicitly, because there
will be no way for me to store the thread ID somewhere in a
thread-safe manner: I'm (still) implementing a thread-local pointer. I
need to be able to count on the handle being returned for my thread to
be valid at all times - at least until the end of the thread's life.

Do I really need to ask? Well.. just to make sure I haven't overlooked
something in the standard on the transient-ness of the return value of
pthread_self, yes.

Can I trust the answer? I'd have been very surprised if the answer
were "no" but that would have made the whole concept of my
thread-local pointer impossible. I now construct an ID unique to the
thread from a handle given to me by pthread_self (i.e. I hash the
count of "ID"s I've already generated) and put the pair in a linked
list. For each ID I'm asked to generate, I go through the list and
compare handles with pthread_equal - if they're the same, I hand back
the same hash, if not, I generate a new one. It's not a very efficient
algorithm, but I think it'll work *if* I can count on the pthread_self
value being valid for the entire lifetime of the thread..
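
Roughly, the scheme described above might look like the sketch below. It rests on several assumptions: the function name current_thread_id is made up, std::list stands in for the hand-rolled linked list, the running count is used directly as the ID instead of being hashed, and a mutex guards the list. Entries are never removed, so a pthread_t recycled after its thread exits would inherit a stale ID.

```cpp
#include <pthread.h>
#include <cstddef>
#include <list>
#include <utility>

namespace {
pthread_mutex_t g_list_lock = PTHREAD_MUTEX_INITIALIZER;
// (handle, generated ID) pairs for every thread seen so far.
std::list<std::pair<pthread_t, std::size_t> > g_known_threads;
}

// Returns a small process-wide ID for the calling thread, generating one
// the first time the thread asks. IDs start at 1.
std::size_t current_thread_id()
{
    pthread_t self = pthread_self();
    pthread_mutex_lock(&g_list_lock);
    for (const auto &entry : g_known_threads) {
        // Handles must be compared with pthread_equal(), not ==.
        if (pthread_equal(entry.first, self)) {
            std::size_t id = entry.second;
            pthread_mutex_unlock(&g_list_lock);
            return id;
        }
    }
    // Not seen before: the new ID is simply the count of IDs handed out + 1.
    std::size_t id = g_known_threads.size() + 1;
    g_known_threads.push_back(std::make_pair(self, id));
    pthread_mutex_unlock(&g_list_lock);
    return id;
}
```

Every lookup is a linear scan under a lock, which is the inefficiency mentioned above.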

I'll also have to deal with removing old handles (for threads that no
longer exist) from the hash's "memory" in a portable way (using
pthread_cleanup_*) but I'll cross that bridge when I get to it (it's
just over the horizon at the moment).

Thanx,

rlc
Ronald Landheer-Cieslak
2003-08-13 10:56:59 UTC
Post by Ronald Landheer-Cieslak
The actual code will not have the assertion explicitly, because there
will be no way for me to store the thread ID somewhere in a
thread-safe manner: I'm (still) implementing a thread-local pointer. I
need to be able to count on the handle being returned for my thread to
be valid at all times - at least until the end of the thread's life.
Since your answer suggests you're using an existing thread library rather
than creating your own... why aren't you using the standard thread-specific
data instead of trying to invent your own "thread-local" data mechanism?
That IS what it's there for. Or is this more an academic exercise?
What I need is a portable and transparent way to store data
local to a thread in an instance of a class that is not local to the
thread. The transparency means the code using my class must be able to
use it as any other (raw) pointer. That means that I can't just ask for
a key and hand it back to the user - the class must be able to handle
its own keys internally. Hence, the easiest way would be to map the
pointer I'm storing against the thread itself - either by getting a
unique ID of the thread and mapping against that (which is what I do
for Windows) or by creating a unique ID from the handle of the thread
(because I can't use the handle itself to map against) and map against
the ID.

The problem is mainly that there isn't a portable way to store the key
one uses for thread-local storage anywhere that I'm sure I can get at
it from my thread: I can't hand it back to the user because he's not
expecting that, I can hardly put it in thread-local storage by itself
(as then I'd have YA key to point to the key). I can't expect to get
the same key each time I ask for a new one, I can't just put it on the
thread's stack because I don't know how long it will stay alive, etc.
The only way I see to do this is to map against something that is
unique and non-transient for the thread itself: either the ID (Windows)
or an ID I create (POSIX).

thx,

rlc
David Butenhof
2003-08-13 13:57:02 UTC
Post by Ronald Landheer-Cieslak
Post by Ronald Landheer-Cieslak
The actual code will not have the assertion explicitly, because there
will be no way for me to store the thread ID somewhere in a
thread-safe manner: I'm (still) implementing a thread-local pointer. I
need to be able to count on the handle being returned for my thread to
be valid at all times - at least until the end of the thread's life.
Since your answer suggests you're using an existing thread library rather
than creating your own... why aren't you using the standard
thread-specific data instead of trying to invent your own "thread-local"
data mechanism? That IS what it's there for. Or is this more an academic
exercise?
What I need is a portable and transparent way to store data
local to a thread in an instance of a class that is not local to the
thread. The transparency means the code using my class must be able to
use it as any other (raw) pointer. That means that I can't just ask for
a key and hand it back to the user - the class must be able to handle
its own keys internally. Hence, the easiest way would be to map the
pointer I'm storing against the thread itself - either by getting a
unique ID of the thread and mapping against that (which is what I do
for Windows) or by creating a unique ID from the handle of the thread
(because I can't use the handle itself to map against) and map against
the ID.
The problem is mainly that there isn't a portable way to store the key
one uses for thread-local storage anywhere that I'm sure I can get at
it from my thread: I can't hand it back to the user because he's not
expecting that, I can hardly put it in thread-local storage by itself
(as then I'd have YA key to point to the key). I can't expect to get
the same key each time I ask for a new one, I can't just put it on the
thread's stack because I don't know how long it will stay alive, etc.
The only way I see to do this is to map against something that is
unique and non-transient for the thread itself: either the ID (Windows)
or an ID I create (POSIX).
I'm not sure whether you don't understand thread-specific data, or whether
you haven't thoroughly explained your constraints. To me, this sounds like
thread-specific data would be a good fit for your requirements.

You create a thread-specific data key (pthread_key_create) ONCE; possibly
storing it as a "static" (class) member. Any object of that class can then
pass the key to pthread_setspecific() to set a thread-specific VALUE
(presumably a pointer) for the global KEY; and pthread_getspecific() to
retrieve the thread's value... which it can return to the caller for use as
a normal pointer. (Of course there are restrictions on what the caller
should be allowed to do with that pointer behind your back... but that
doesn't distinguish it from your model.)

I assume you actually want a unique per-thread value for each OBJECT of the
class, and that's just a little more complicated. You probably don't want
to create a new key with each object constructor, because POSIX allows for
a fixed number of keys and you might run out. (Though the assumption of
unlimited keys would just make your code slightly less portable, not
"wrong"... in POSIX terms, "conforming" but not "strictly conforming".) You
could instead use the single static key for the class, but make the value a
list (or hash table) of per-object values for that thread, which at least
would simplify (and speed) your lookup over doing everything yourself.
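
A minimal sketch of that last arrangement, assuming C++11 and pthreads: one key for the whole class, whose per-thread value is a table mapping each object to that thread's pointer. The class name thread_local_ptr and all helper names are hypothetical, std::map stands in for "a list (or hash table)", and the key is created with pthread_once() and never deleted.

```cpp
#include <pthread.h>
#include <map>

namespace {
// Per-thread table: object address -> this thread's value for that object.
typedef std::map<const void *, void *> slot_table;

pthread_key_t g_table_key;
pthread_once_t g_table_once = PTHREAD_ONCE_INIT;
int g_table_key_status = -1;

// Key destructor: frees a thread's table when that thread exits.
void destroy_table(void *p) { delete static_cast<slot_table *>(p); }

void create_table_key()
{
    g_table_key_status = pthread_key_create(&g_table_key, destroy_table);
}

// Returns the calling thread's table, creating it on first use.
slot_table *table_for_this_thread()
{
    pthread_once(&g_table_once, create_table_key);
    if (g_table_key_status != 0)
        return nullptr;
    slot_table *t = static_cast<slot_table *>(pthread_getspecific(g_table_key));
    if (t == nullptr) {
        t = new slot_table;
        if (pthread_setspecific(g_table_key, t) != 0) {
            delete t;
            return nullptr;
        }
    }
    return t;
}
}

// Hypothetical class: each object holds one raw pointer per thread.
class thread_local_ptr {
public:
    // The calling thread's pointer for this object, or null if none was set.
    void *get() const
    {
        slot_table *t = table_for_this_thread();
        if (t == nullptr)
            return nullptr;
        slot_table::const_iterator it = t->find(this);
        return it == t->end() ? nullptr : it->second;
    }

    // Sets the calling thread's pointer for this object.
    bool set(void *value)
    {
        slot_table *t = table_for_this_thread();
        if (t == nullptr)
            return false;
        (*t)[this] = value;
        return true;
    }
};
```

The sketch does not erase an object's entries from other threads' tables when the object is destroyed. A real class would need some policy for that.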
Ronald Landheer-Cieslak
2003-08-13 15:21:36 UTC
[...]
% > I'm not sure whether you don't understand thread-specific data, or whether
% > you haven't thoroughly explained your constraints. To me, this sounds like
% > thread-specific data would be a good fit for your requirements.
% >
% > You create a thread-specific data key (pthread_key_create) ONCE; possibly
% > storing it as a "static" (class) member.
% Can't do that - there's no way of knowing (beforehand) from which thread the
% class will be called, so there's no way I know where to put the key so I can
% find it back again and be sure I'm in the same thread as when I put it there..
I'm sure that you don't understand thread-specific data. I assume you're
going to have functions to set and return this thread-specific value,
since the linked-list look-up you describe requires it. The `get' function
whatever * myclass::mydata()
{
    return (whatever *)pthread_getspecific(mykey);
}

int myclass::setmydata(whatever * data)
{
    return pthread_setspecific(mykey, (void *)data);
}
mydata() returns the data for the current thread associated with mykey
(presumed to be a member of myclass). setmydata() sets the data for
the current thread associated with mykey. If you call mydata() from
two threads at the same time, they will get different pointers as return
values. This sounds to me like it's exactly what you want.
Even if it isn't, use thread-specific data to store the hash value that
will give you what you want. It will almost certainly be faster than
searching a linked list using pthread_equal() et al.
And where would you propose I store the key?
I'll need to store it somewhere where I can
* get at it from the current thread
* know for sure that it's the key for the current thread
when the only thing I have as a parameter is the current thread.

If you can figure that out, please do.

I'll repeat:
* There is only one instance of my class in the process.
* The class must be used as a raw pointer, and can therefore only take the raw
pointer as parameter, and hand it back. Any other information (such as
reference counts, keys and somesuch) must be stored within the class itself -
because there's nowhere else I can store it.
* The caller must be sure that
a. only its own thread has access to its raw pointer
b. the raw pointer it will get will always belong to him
* The caller can't handle anything but the raw pointer - no keys outside of
the class.

The linked list is there to emulate the Windows call GetCurrentThreadId(),
which returns a system-wide unique ID for the thread - I only need process-wide
but the same mechanism would work for system-wide if the list were stored in
shared memory with accompanying pid.. but I digress..

If you know where to store a key for TLS somewhere within my one instance of
the class, in a way that I can portably get that key back by just handing
the handle to the current thread to the getter, please elucidate. Otherwise,
please take back your statement of me not understanding TLS.

rlc
Patrick TJ McPhee
2003-08-13 20:48:35 UTC
In article <***@localhost.localdomain>,
Ronald Landheer-Cieslak <***@landheer.com> wrote:
% In article <bhdjlc$hep$***@news.eusc.inter.net>, Patrick TJ McPhee wrote:

% > whatever * myclass::mydata()
% > {
% > return (whatever *)pthread_getspecific(mykey);
% > }
% >
% > int myclass::setmydata(whatever * data)
% > {
% > return pthread_setspecific(mykey, (void *)data);
% > }
% >
% >
% > mydata() returns the data for the current thread associated with mykey
% > (presumed to be a member of myclass).

[...]

% And where would you propose I store the key?

As I say, in a member of myclass called mykey. This value is the same
in all threads. The thread-specific value associated with it is returned
by pthread_getspecific().
--
Patrick TJ McPhee
East York Canada
***@interlog.com
Ronald Landheer-Cieslak
2003-08-13 15:34:50 UTC
Apparently I didn't do a good job at explaining my model: there is only
one instance of my thread-specific pointer class at any time (because the
class that contains that instance is unique in the process, stored by a
singleton) but that instance (or rather: its owner) can be called from any
thread at any time. What I need is a mechanism to see from which thread
it's being called to retrieve the data for that thread - which is always
the current thread, but hardly ever the main thread.
As Patrick already said, this sounds exactly like you're trying to re-invent
thread-specific data. The value returned from pthread_getspecific() on a
key has nothing to do with the thread that allocated the key (or the class
or object containing it). It's strictly related to the identity of the
thread making the current call to pthread_getspecific(). Just like your
class and object, the pthread_key_t value is GLOBAL, but the thread library
will maintain a unique VALUE for the key in each thread that calls
pthread_setspecific().
I think what you want is a 'static pthread_key_t' member in your class (or
on the single global instance -- it doesn't matter a bit in this case).
Each thread uses pthread_setspecific() to set its own unique data pointer
for that key, and pthread_getspecific() to retrieve a value previously set
in the same thread.
If the pthread_getspecific() returns NULL then you haven't yet set a value
in that thread, and you can new/malloc whatever you want and pass that
value into pthread_setspecific() to establish the value that will be
returned by subsequent calls to pthread_getspecific() in that thread.
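
The get-or-create pattern just described can be sketched as follows, assuming C++11 and pthreads. The payload type per_thread_state and the function names are hypothetical. The key is created once with pthread_once(), and a destructor frees each thread's value when that thread exits.

```cpp
#include <pthread.h>

// Hypothetical per-thread payload.
struct per_thread_state { int calls; };

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;
static int state_key_status = -1;

// Key destructor: runs at thread exit for threads holding a non-null value.
static void free_state(void *p) { delete static_cast<per_thread_state *>(p); }

static void make_state_key()
{
    state_key_status = pthread_key_create(&state_key, free_state);
}

// Returns the calling thread's state, allocating it on first use.
per_thread_state *state_for_this_thread()
{
    pthread_once(&state_once, make_state_key);
    if (state_key_status != 0)
        return nullptr;
    void *p = pthread_getspecific(state_key);
    if (p == nullptr) {
        // NULL means this thread has not set a value for the key yet.
        per_thread_state *fresh = new per_thread_state();
        if (pthread_setspecific(state_key, fresh) != 0) {
            delete fresh;
            return nullptr;
        }
        p = fresh;
    }
    return static_cast<per_thread_state *>(p);
}

// Runs in a second thread: returns its argument if this thread's state is
// the creator's state (which should never happen), null otherwise.
static void *compare_in_worker(void *creators_state)
{
    return state_for_this_thread() == creators_state ? creators_state : nullptr;
}

// True if a second thread sees a different state object than the caller.
bool threads_get_distinct_state()
{
    per_thread_state *mine = state_for_this_thread();
    pthread_t worker;
    if (pthread_create(&worker, nullptr, compare_in_worker, mine) != 0)
        return false;
    void *clash = nullptr;
    pthread_join(worker, &clash);
    return mine != nullptr && clash == nullptr;
}
```

Note that the key itself is shared by all threads; only the value looked up through it differs per thread.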
OK.. doubt is setting in..

re-reading..

hmm..

"Different threads may bind different values to the same key"
indeed..

OK, seems I was reading a bit too quickly there :(

Sorry - I guess I can de-complicate things a bit.. :|

Thanks y'all - and please forgive my stubbornness..

rlc
David Butenhof
2003-08-14 11:50:35 UTC
If the pthread_getspecific() returns NULL then you haven't yet set a
value in that thread, and you can new/malloc whatever you want and pass
that value into pthread_setspecific() to establish the value that will be
returned by subsequent calls to pthread_getspecific() in that thread.
A bit off subject, but related to another discussion I was reading through
a while ago, and didn't seem to find the "final" answer... If a pthread
implementation allows keys to be reused, must the implementation ensure a
pthread_getspecific() returns NULL (assuming pthread_setspecific() has not
been done on the newly / reused key instance)? If the answer is yes (or
if the answer is undefined, but most implementations do anyway) what would
a typical method be to ensure the value returned was NULL (if not
"previously" set in that key instance). I am guessing a versioning of
some kind.
Yes, "versioning of some kind" is one good answer. Generally better (and
less expensive, though still awkward) than synchronizing TSD access so that
either deletion or reuse can scan through all active threads to safely zap
any current values they may hold.

The standard does call for the value of a newly created key to be NULL in
all threads. A late addition to the standard was the ability to delete a
key, and while the working group had never INTENDED to require either
versioning or the "asynchronous zap", that's the way it came out because
nobody paid enough attention to the consequences of adding delete.

The safest answer is: don't delete keys. Then you never have to worry about
it at all. ;-)
Alexander Terekhov
2003-08-14 19:59:51 UTC
David Butenhof <***@hp.com> wrote in message news:<***@usenet01.boi.hp.com>...
[...]
Post by David Butenhof
Yes, "versioning of some kind" is one good answer.
I don't think so.
Post by David Butenhof
Generally better (and
less expensive, though still awkward) than synchronizing TSD access so that
either deletion or reuse can scan through all active threads to safely zap
any current values they may hold.
It's probably less expensive than a versioning thing.
Post by David Butenhof
The standard does call for the value of a newly created key to be NULL in
all threads. A late addition to the standard was the ability to delete a
key, and while the working group had never INTENDED to require either
versioning or the "asynchronous zap", that's the way it came out because
nobody paid enough attention to the consequences of adding delete.
A compromise solution has been proposed for TC2. IIRC,
working group sort of 'agreed'. Right?
Post by David Butenhof
The safest answer is: don't delete keys. Then you never have to worry about
it at all. ;-)
Nah. It's time to fix the standard.

regards,
alexander.
