To: comp.unix.solaris@cs.columbia.edu, alt.solaris.x86@cs.columbia.edu
In-reply-to: dupuy@smarts.com's message of 11 Mar 1996 16:23:28 -0500
Subject: Re: Problems with multithreaded floating point in Solaris/x86 (or PPC)
References: <9603112123.AA15659@just.smarts.com>
Distribution: world
--text follows this line--
In comp.unix.solaris, article <9603112123.AA15659@just.smarts.com> I wrote:

>I've run into a few problems with fairly simple multithreaded floating point
>code in Solaris/x86 which have convinced me that nobody (including Sun) has
>ever tried to do this before.

>The first problem is quite simple to reproduce, and quite fatal:

>#define _REENTRANT
>#include <stdio.h>
>
>int main (void)
>{
>  float a;
>
>  sscanf ("2.1", "%f", &a);
>  printf ("%f\n", a);
>
>  return 0;
>}
>
>#ifdef WORKAROUND
>int getc (FILE *stream)
>{
>  return getc_unlocked (stream);
>}
>#endif

>If you compile and link this program without -lthread, it runs fine.  Linked
>with -lthread, it dumps core.  If you add -DWORKAROUND, it doesn't.  The
>problem seems to be that file_to_decimal(3) on x86 is the same in all versions
>of Solaris 2.  The difficulty is that between 2.1 and 2.4, this little feature
>called threads was added, but file_to_decimal wasn't changed to take this into
>account.

>The second problem is a bit more difficult to reproduce; I see the problem
>when running the Tcl expr test suite in multiple threads.  Basically, what
>seems to happen is that the "common" (overflow, division, invalid) IEEE-754
>exceptions are mysteriously getting traps enabled, causing SIGFPE when one of
>these exceptions occurs (normally, an exception flag is set, and the result is
>Infinity or NaN).  This one is really pretty much of a mystery; I haven't been
>able to identify the culprit.  For now, I just set up a SIGFPE handler that
>tries to fix up the IEEE-754 exception modes and flags, and put a vaguely
>reasonable looking exceptional result on the top of the FP stack.
>This is far from 100%, but it's good enough for Tcl.

I've come up with a better workaround for the scanf problem, which is usable
in real multithreaded code (i.e. it doesn't disable locking for all getc()
calls).  Note that this problem with scanf is not x86-specific; it occurs on
all non-SPARC Solaris 2 platforms (including PowerPC).  The workaround should
be fine on any architecture (even SPARC, although it's not necessary there).

#include <stdio.h>
#include <floatingpoint.h>	/* file_to_decimal, func_to_decimal */

/*
 * We intercept the internal _-prefixed name, because that's what sscanf uses
 * (via doscan_u).  To get the original version, we use the unprefixed name.
 */

/* The FILE that sscanf is reading from; protected by sscanf_lock below. */
static FILE *sscanf_static;

/* Read the next character straight from the string buffer, no locking. */
static int sscanf_getc (void)
{
  return (*sscanf_static->_ptr == '\0') ? EOF : *sscanf_static->_ptr++;
}

/* Push a character back, unless we are already at the start of the buffer. */
static int sscanf_ungetc (int c)
{
  return (sscanf_static->_ptr == sscanf_static->_base)
    ? EOF : (*--sscanf_static->_ptr = (unsigned char) c);
}

void _file_to_decimal (char **pc, int nmax, int fortran_conventions,
		       decimal_record *pd, enum decimal_string_form *pform,
		       char **pechar, FILE *pf, int *pnread)
{
  static si_mutex_t sscanf_lock;

  if (pf->_flag & _IOWRT)	/* this must be a call from sscanf */
    {
      si_mutex_lock (&sscanf_lock);
      sscanf_static = pf;
      func_to_decimal (pc, nmax, fortran_conventions, pd, pform, pechar,
		       sscanf_getc, pnread, sscanf_ungetc);
      si_mutex_unlock (&sscanf_lock);
    }
  else
    file_to_decimal (pc, nmax, fortran_conventions, pd, pform, pechar,
		     pf, pnread);
}

@alex