----------------------------------------------------------------------------
The Florida SunFlash

             SunProgrammer Newsletter Vol 1 #1 (3 of 3)

SunFLASH Vol 45 #11                                          September 1992
----------------------------------------------------------------------------

 **********************************************************************
 *                                                                    *
 *                          COPYRIGHT NOTICE                          *
 *                                                                    *
 *  The copyright for this publication is held by Sun Microsystems    *
 *  Inc. All rights reserved. No part of this publication can be      *
 *  reprinted, modified, or otherwise reproduced without permission   *
 *  from the publisher (David.Reim@Sun.Com). However, you are free    *
 *  to forward unmodified sections via Email as long as you also      *
 *  include this copyright notice.                                    *
 *                                                                    *
 **********************************************************************

-----------------------------------------------------------------------

"Performance Tuning"
by Keith Bierman

Working with Compilers

SunPro compilers are wonderful, but no technology and no programmer is
perfect. Here are a few trouble spots developers have run into, and some
suggestions for how to work around them.

Code produces different answers at different optimization levels.

Welcome to the wonderful world of computer arithmetic! As optimizers move
your code around (in perfectly legal ways) you may get results other than
you expect. Hopefully you restricted yourself to numerically stable
algorithms, and your data is not ill-conditioned. Try to use some savvy; if
the results only differ in the last couple of digits, it's probably
harmless. If it's in the most significant digits, use the optimizer
selectively. When you locate the module which "breaks" under optimization,
take it to your local numerical guru, or start breaking it up into
submodules. Continue until happy, or exhausted.

You may wish to try the f77 -r8 option, which promotes all floats to
doubles. If this makes your problem go away, you probably have a subtle
flaw in your code or your algorithm.
If you do take the time to track down your problem, and you find that
there is some computation which differs only by a mite -- but this error
grows -- your algorithm and/or your implementation is not numerically
stable.

Another common cause of getting different, wrong, or no answers when
optimization is turned up is using a variable before it is initialized.
Since UNIX (like most other operating systems) zeros memory at program
load time, at low optimization levels one will find uninitialized
variables are typically preinitialized to zero. As the optimizer
eliminates unnecessary memory references, it is quite possible that
instead of zero, one gets the result of some arbitrary previous
computation instead. Using a variable before it is set is illegal, so such
transformations are in fact reasonable for the compiler to perform. To
prevent such "accidents" one should always lint C code, always use
function prototypes in ANSI C, and procure a FORTRAN static analysis tool.
I have personally never run across a large code (100,000+ lines) which
doesn't have at least one bug this sort of tool can uncover. Most codes
turn out to have many unsuspected latent bugs.

Running out of register windows.

Register windows are a good thing, but sometimes you can deplete the
supply. When that happens, the OS emulates more register windows. This is
fine, but slow. If you find yourself with large system times (and you've
taken care of FP problems), and you have complex call graphs (see your
gprof report), you may find it worth your while to in-line some modules.
To accomplish this, find the most expensive caller and place the callee in
the same file. If there are several modules which employ the same
module(s), co-locate them all. Compile that file at -O4.

Core dumps with -fast.

When code runs fine without optimization but dumps core with -fast, it is
probably due to -fast doing a -dalign for you.
Crack open your manual (or call up the man page) and specify the compiler
options you want manually (or use adb or dbx to track down what in your
program is poorly aligned!). In FORTRAN this often comes about from having
INTEGER and DOUBLE PRECISION variables in the same common block (which is,
of course, legal). On a CDC or a Cray the code probably had REALs and
INTEGERs together, which the standard specifies should occupy the same
space. If you converted the code by hand to 32-bit machines you may have
introduced this alignment problem. Use -r8 instead of changing the code
(f77 v1.3 and beyond). C programmers (since you use typedefs more than
FORTRAN users employ structured variables) have lots more ways to
introduce non-doubleword alignment. The speedup -dalign offers (on current
machines) tends to be 10%-30%.

Compiling with optimization takes too long.

Use perfmeter to check for thrashing if you don't have enough RAM. Most
often, the modules which cause optimizers the most grief get little
benefit from optimization (e.g., a FORTRAN main program with tons of
COMMONs, EQUIVALENCEs, and subroutine calls, but little compute actually
performed). Using the profiler to guide your optimization usage clears
much of this up. If you do have a module which does suck up time and is
still a dog to compile, consider breaking it up into smaller chunks. If
you have exhausted all other avenues and can bear to part with your code,
please file a bug report. Optimization shortfalls may be hard to track
down, or require a while to solve; wrong answers will be dealt with more
swiftly.

Slow system performance.

This is often a symptom of poorly-handled floating-point exceptions.
FORTRAN users get an automatic call to ieee_retrospective, so they get
some warning (ignore inexact, unless you are really hip to the IEEE spec),
but C programmers should add a call to ieee_retrospective to their exit
code. What you don't know can hurt you!
If the underflow exception has been raised (and the only other exception
raised was inexact), you have a lot of computation taking place very close
to zero. Non-IEEE machines typically just return zero when you get "close
enough." This isn't the best modern technology can offer. (IEEE's gradual
underflow can enable you to get better [more accurate] answers in many
cases, but if it is not what you want, add a call to
nonstandard_arithmetic() to your program. See the Numerical Computation
Guide for details.)

To track down other problems, add a call to ieee_handler("set", "common",
SIGFPE_ABORT); details are to be found in the documents mentioned above.
You will probably want to compile with the -g option to get an exact line
number. (My approach is to use the optimized code, find the module, then
recompile just it. This can produce misleading results in pathological
cases, but for big jobs there may not be any other practical choice,
except adb.)

*Keith Bierman is a rabble rouser attached to the SunPro Floating Point
and Performance Group. He can be reached at khb@chiba.eng.sun.com.

-----------------------------------------------------------------------

"QUALITY MATTERS
Implementing Code Reviews"
by Donald G. Miller, Jr.

Reviewing your software product with technical peers can be a useful -- if
not critical -- step to take before significant testing and integration
occurs. This article offers a few suggested guidelines for implementing
code reviews. It is primarily the result of research, although group
discussion and individual interviews guided the content focus. It is not
the gospel on the topic, nor is it the last word. It is an attempt to look
at a methodology which some software development efforts have been able to
use with significant success.

What?

A code review is an analysis of a software product by technical peers of
the producer.
A walkthrough is characterized by the producer of the reviewed material
guiding the progression of the inspection process. A walkthrough is a
method of rapidly evaluating material by confining attention to a few
selected aspects, one at a time. The terms walkthrough, inspection, and
review are used somewhat synonymously in the literature, although they
imply varying degrees of formality. The relative formality of the process
is dictated to a degree by resource and schedule constraints, although the
effort to perform reviews in some form is seen to be valuable.

Why?

Simply stated, code reviews are an attempt to answer the question: Will
the product do the job it's supposed to do? Code reviews find bugs earlier
and more economically than other defect identification techniques. Studies
have shown that code walkthroughs can remove between 38% and 89% of the
errors prior to the execution of the first test. The cost of rework, or
defect correction, was found to be 10 to 100 times higher when done in the
later phases of a project. In addition to cost and quality improvements,
code reviews have been found to facilitate training and technical
information exchange, as well as assuring that the product will be
maintainable.

Who?

Research literature recommends three to seven technically competent
reviewers who are not responsible for evaluating the producer of the code.
Typically, the review team is made up of members of the producer's
project. The following roles are identified as important to the review
process:

* A review leader, or moderator, provides focus on the process (in
  addition to technical input).

* A recorder is responsible for documenting information for an accurate
  report of the review. This information consists of defect descriptions
  and issue summaries.

* Other reviewers focus on ensuring the product does what it's supposed
  to do. Yourdon proposes that some of these reviewers take on a specific
  slant:
* A maintenance oracle (Fagan calls this person the tester), who reviews
  the product from the perspective of future maintainability.

* A standards bearer, who ensures that any existing group standards are
  adhered to.

The role or roles assumed by each reviewer can be decided upon implicitly.
Making them known, however, ensures that pertinent viewpoints and
functions are represented.

When?

Code reviews should be held early enough that identified issues can be
addressed without major impact to the product. They should be late enough
that the product under review is in a reasonably stable design state. This
indicates that a good time frame for review is sometime after the first
clean compile of the program and before significant testing and
integration has been done.

How?

Steps in the review process are:

Pre-Review

1. The producer decides peer review of his or her code would be valuable.
   A reviewable portion of the code is selected.

2. An appropriate group of reviewers is identified and solicited.

3. A copy of the source to be reviewed, along with any additional
   informative data, is made available to the review team with sufficient
   lead time.

4. The team reviews the code independently and prepares comments for the
   review meeting.

Review Meeting

5. The producer provides introduction and clarification as necessary.

6. All criticisms, comments, suggestions, and issues are raised and
   noted, not resolved.

7. The review team accepts the material "as is," with minor
   modifications, or decides that the changes necessary are sufficient to
   require another review.

Post-Review

8. The minutes, issues, and recommendation are documented and
   distributed. The recommendation should be directed at management,
   while the issues and comments list will interest the development team.

9. The producer addresses the documented issues.

We at SunPro have found that following these steps in the review process
is extremely valuable in helping us ensure product quality.
References

For additional information on this subject, we highly recommend the
following reference sources, which were also helpful in preparing this
article:

Freedman, Daniel P. and Weinberg, Gerald M. Handbook of Walkthroughs,
Inspections, and Technical Reviews -- Evaluating Programs, Projects, and
Products. Third edition. Little, Brown and Company, 1982.

Boddie, John. Crunch Mode -- Building Effective Systems on a Tight
Schedule. Prentice-Hall, 1987.

Yourdon, Edward. Structured Walkthroughs. Fourth edition. Prentice-Hall,
1989.

Fagan, Michael. "Design and Code Inspections to Reduce Errors in Program
Development," IBM Systems Journal, Vol. 15 No. 3 (July 1976).

Dickinson, Brian. Developing Quality Systems. Second edition. McGraw-Hill,
Inc., 1989.

*Don Miller is a SunPro Software Quality Engineer responsible for testing
tools and techniques. He can be reached at donm@eng.sun.com.

------------------------------------------------------------------------

"THE FRIDAY HACK
Qrgfwi-ing Your Workstation"
by Peter van der Linden

They say that "real programmers don't eat quiche," but I happen to know
that's a bunch of hooey. I threw a party last week, all the real SunPro
programmers came, and quiche was one of the most popular dishes. And so it
goes in life. Sometimes there's a big gap between what "they" say and the
way things really work out.

For example, last time your workstation was taken off the ether to be
moved, upgraded or reinstalled, how long did it take you to get
operational again? Ah ha! And how long would it have taken if you had
recorded the vital statistics of your workstation in advance? Over the
years I've been bitten by this too many times; that's why I jotted down
the Quick Reference Guide for Workstation Interrogation -- Qrgfwi, for
short. This guide is an easy reminder of the commands to run to find out
just about anything of interest on your workstation. Next time you need to
upgrade, just think Qrgfwi!
One point to watch: you can't run commands like ypwhich or ypmatch unless
you are using the Network Information Service. If you're not using NIS,
take a look at the ifconfig command. For more information on any command,
look at its man page. Finally, any time you need a boost, just type in the
command:

    who is smart

and see what your loyal workstation replies!

*Peter van der Linden is a programming gourmet as well as a software
engineering manager at SunPro. He can be reached at linden@eng.sun.com.

To Find This             Type This Command        Sample Output
-----------------------  -----------------------  ----------------------------
Name and OS version      /usr/bin/uname -a        SunOS adapt 4.1.1 1 sun4c
Processor type           /usr/bin/mach            sparc
Architecture type        /usr/bin/arch            sun4
Host ID                  /usr/bin/hostid          510061f8
NIS domain name          /usr/bin/domainname      EBB.Eng.Sun.COM
Ethernet address         /etc/dmesg|grep Ether    Ether addr=8:0:30:7:e4:33
Host name                /usr/bin/hostname        adapt
NIS server               /usr/bin/ypwhich         vampira
Physical memory          /etc/dmesg|grep mem      mem = 8192K (0x800000)
TCP/IP host address      /usr/etc/ifconfig -a     129.144.127.37
Virtual memory           /etc/pstat -s            928k used, 6852k available
Floating point HW        fpversion                A SPARC-based CPU is available
Local disk               /usr/bin/df |grep /dev   (output needs interpretation)
GX graphics accelerator  /etc/dmesg|grep six      cgsix0 at SBus slot 1 0x0 pri 7
Color monitor            /etc/dmesg|grep cg       cgsix0 at SBus slot 1 0x0 pri 7
eeprom information       /usr/etc/eeprom          (output has more eeprom stuff
                                                  than you ever dreamed of)

-------------------------------------------------------------------------

"PUZZLE"
by Vijay Tatkar

Much of human discovery comes through unexpected, inexplicable mental
leaps. "Aha, insight," says Martin Gardner of this phenomenon. Kekule's
benzene ring, Archimedes' principle of fluid displacement, and many other
breakthroughs are well-known results of this curious brand of human
thought that we fondly describe as "lateral thinking." In the puzzles that
follow, such lateral thinking is both encouraged and sometimes crucial for
finding solutions.
Here are two puzzles to get you started.

(1) First an easy one to start you off: Find a number ABCDEFGHIJ such that
A is the count of how many 0's are in the number, B is the number of 1's,
and so on. Here's an example of a 4-digit number: 1210 (1 zero, 2 ones, 1
two, 0 threes). Think of such numbers when the number of digits is > 4.

(2) In little Fooeyland, there are only two kinds of people: those with
blue eyes and those with brown eyes. Fooeyland folks have some rather
interesting properties and constraints by which they lead their lives.

A. A brown-eyed person will commit suicide when (and if) (s)he finds out
   that (s)he is brown-eyed.

B. No one tells anyone what color their eyes are, for fear of these
   consequences.

C. They do not have the means, or the desire, to look at shiny objects to
   find out what color their own eyes are.

D. The group is pretty social, and everyone gets to meet everyone else
   exactly once every day.

Notice that at time zero the population is stable: the blue-eyed folks are
OK, and the brown-eyed folks don't know that they are brown-eyed. One day
an outsider comes to Fooeyland and, before leaving, tells everyone that
there is at least one brown-eyed person amongst them. Now the question is:
Will the population remain stable? If not, on what day will instability
creep in, and how? (No credit for partial answers.) [Hint: Use induction.]

*Vijay Tatkar is a member of the C compiler team. He can be reached at
tatkar@Eng.Sun.Com.

-------------------------------------------------------------------------

SOLUTIONS

(1) For number of digits = 5:  21200
    For number of digits >= 7: (n-4), 2, 1, 0, 0, ...(n-7 zeroes in
    all)..., 0, 1, 0, 0, 0

(2) If there is only one brown-eyed person, he or she will know by the end
of the first day that no one else is brown-eyed, and will commit suicide.
If there are n+1 brown-eyed folks, it will take them n+1 days to realize
that each of them is brown-eyed just as n others are.
On the (n+1)st day, they will each commit suicide, because only once the
nth day has passed without anyone committing suicide do they know that
they themselves must be brown-eyed. This is the inductive part.

--------------------------------------------------------------------------

"Dangling Pointers
Superpipelining, Flightless Bees, and Varying Mileage"

Greetings, denizens of the labs! Welcome to the first Dangling Pointers
column. Following a long tradition of infamous characters such as John
Dvorak, Mac the Knife, and Matco, the Dangling Pointers column is
dedicated to uncovering and reporting on the truth in our industry,
whether or not we can prove it!

Superpipelining vs. Superscalar

A general question these days is, "Is superpipelining better than
superscalar?" Well kids, engineering is full of trade-offs. Choosing a
very simple pipeline (and possibly stretching it out) enables one to use
whatever the newest chipmaking technology is. This is, in and of itself, a
Good Thing, because being first enables one to ship first. However, new
process technology tends to have ramp-up problems (the exact nature of
which is often unpredictable). Thus it may not be practical to ship
systems in quantity for a considerable period after announcement.

Superscalar systems have more complex pipelines; the basic bet here is
that at any given process technology level, everyone will start from about
the same point... but the clever ones can eke out more performance. Which
is the "right" bet? It all depends on how fast new process technology
becomes available. There is no simple "right" answer. In the long run,
machines will run at very high clock rates and do a lot of things in
parallel. Both fine-grain and very large-grain parallelism will be
exploited. But the long run may be a long time down the road.
Instruction Parallelism: Taking It to the Limit

During this summer's Hot Chips conference, at the night panel session,
participants debated how much instruction parallelism there is to exploit.
The range was from nearly none to nearly unlimited. Joseph Fisher of HP
Labs provided one of the best arguments for "we don't know the limit yet."

Fisher: "The various `proofs' of N amount of parallelism are inherently
flawed; they are akin to proofs that bees can't fly, that dial-up modems
can't beat 4800 baud and that CDs would need 50KHz sampling rates... in
fact they are worse, as they don't have an underlying theory to appeal
to."

An alternative came from James Smith, the honorable representative from
Cray:

Smith: "There is clearly room for more than five; they've done it for
years. What's the discussion all about? Also, whose workload is important?
Mine is to me, yours is to you. Don't get confused! Trying to find a
single workload ensures doing none of them well."

Which is a very good point. The fallacy of all standard benchmarks is that
there is no reason to believe that any of them represents anything faintly
akin to what a particular user uses a computer for. This is a much harder
task than benchmarking a car's MPG... and a large fraction of car buyers
don't get anywhere near the performance they had been "promised"... which
leads to the now-old saying: "Your Mileage May Vary." There is no good
replacement for running your own workload on prospective machines.

And so, loyal readers, this concludes this edition of Dangling Pointers.
Remember, any theorem in Analysis can be fitted onto an arbitrarily small
piece of paper if you are sufficiently obscure.

------------------------------------------------------------------------

(c) 1991 Sun Microsystems, Inc. All rights reserved. No part of this
publication may be reprinted or otherwise reproduced without permission
from the publisher.
Sun Microsystems, Sun, the Sun logo, SunPro, the SunPro logo, SunSoft,
SunExpress, Solaris, SPARCworks Professional, SunOS, OpenWindows, DeskSet,
and ToolTalk are trademarks or registered trademarks of Sun Microsystems,
Inc. All SPARC trademarks are trademarks or registered trademarks of SPARC
International, Inc. SPARCompiler and SPARCworks are licensed exclusively
to Sun Microsystems, Inc. Products bearing the SPARC trademark are based
on an architecture developed by Sun Microsystems, Inc. UNIX and OPEN LOOK
are registered trademarks of UNIX System Laboratories, Inc. All other
products or services mentioned herein are trademarks of their respective
owners.

SunPro's mention or review of third-party products in this newsletter is
for informational purposes only and is not an endorsement or
recommendation of those products. The vendors or manufacturers of the
products have provided all specifications, descriptions and other claims
concerning the products. SunPro assumes no responsibility regarding the
selection, purchase, performance or use of such products. Prospective
purchasers of such products must deal directly with the vendors or
manufacturers to create any understandings, agreements or warranties
concerning those products.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

For information send mail to info-sunflash@sunvice.East.Sun.COM.
Subscription requests should be sent to
sunflash-request@sunvice.East.Sun.COM. Archives are on solar.nova.edu,
paris.cs.miami.edu, uunet.uu.net, src.doc.ic.ac.uk and
ftp.adelaide.edu.au

All prices, availability, and other statements relating to Sun or third
party products are valid in the U.S. only. Please contact your local Sales
Representative for details of pricing and product availability in your
region. Descriptions of, or references to products or publications within
SunFlash does not imply an endorsement of that product or publication by
Sun Microsystems.
John McLaughlin, SunFlash editor, flash@sunvice.East.Sun.COM. (305) 776-7770.