----------------------------------------------------------------------------
The Florida SunFlash

             SunProgrammer Newsletter Vol 1 #1 (3 of 3)

SunFLASH Vol 45 #11                                          September 1992
----------------------------------------------------------------------------

 **********************************************************************
 *                                                                    *
 *                          COPYRIGHT NOTICE                          *
 *                                                                    *
 *  The copyright for this publication is held by Sun Microsystems    *
 *  Inc. All rights reserved. No part of this publication can be      *
 *  reprinted, modified, or otherwise reproduced without permission   *
 *  from the publisher (David.Reim@Sun.Com). However, you are free    *
 *  to forward unmodified sections via Email as long as you also      *
 *  include this copyright notice.                                    *
 *                                                                    *
 **********************************************************************

-----------------------------------------------------------------------

"Performance Tuning"
by Keith Bierman

Working with Compilers

SunPro compilers are wonderful, but no technology and no programmer is
perfect. Here are a few trouble spots developers have run into, and some
suggestions for how to work around them.

Code produces different answers at different optimization levels.

Welcome to the wonderful world of computer arithmetic! As optimizers move
your code around (in perfectly legal ways) you may get results other than
you expect. Hopefully you restricted yourself to numerically stable
algorithms, and your data is not ill-conditioned. Try to use some savvy; if
the results only differ in the last couple of digits, it's probably
harmless. If it's in the most significant digits, use the optimizer
selectively. When you locate the module which "breaks" under optimization,
take it to your local numerical guru, or start breaking it up into
submodules. Continue until happy, or exhausted.

You may wish to try the f77 -r8 option, which promotes all floats to
doubles. If this makes your problem go away, you probably have a subtle
flaw in your code or your algorithm.
If you do take the time to track down your problem, and you find that
there is some computation which differs only by a mite -- but this error
grows -- your algorithm and/or your implementation is not numerically
stable.

Another common cause of getting different, wrong, or no answers when
optimization is turned up is using a variable before it is initialized.
Since UNIX (like most other operating systems) zeros memory at program
load time, at low optimization levels one will find uninitialized
variables are typically preinitialized to zero. As the optimizer
eliminates unnecessary memory references, it is quite possible that
instead of zero, one gets the result of some arbitrary previous
computation instead. Using a variable before it is set is illegal, so such
transformations are in fact reasonable for the compiler to perform. To
prevent such "accidents" one should always lint C code, always use
function prototypes in ANSI C, and procure a FORTRAN static analysis tool.
I have personally never run across a large code (100,000+ lines) which
doesn't have at least one bug this sort of tool can uncover. Most codes
turn out to have many unsuspected latent bugs.

Running out of register windows.

Register windows are a good thing, but sometimes you can deplete the
supply. When that happens, the OS emulates more register windows. This is
fine, but slow. If you find yourself with large system times (and you've
taken care of FP problems), and you have complex call graphs (see your
gprof report), you may find it worth your while to in-line some modules.
To accomplish this, find the most expensive caller and place the callee in
the same file. If there are several modules which employ the same
module(s), co-locate them all. Compile that file at -O4.

Core dumps with -fast.

When code runs fine without optimization but dumps core with -fast, it is
probably due to -fast doing a -dalign for you.
Crack open your manual (or call up the man page) and specify the compiler
options you want manually (or use adb or dbx to track down what in your
program is poorly aligned!). In FORTRAN this often comes about from having
INTEGER and DOUBLE PRECISION variables in the same common block (which is,
of course, legal). On a CDC or a Cray the code probably had REALs and
INTEGERs together, which the standard specifies should occupy the same
space. If you converted the code by hand to 32-bit machines you may have
introduced this alignment problem. Use -r8 instead of changing the code
(f77 v1.3 and beyond). C programmers (since you use typedefs more than
FORTRAN users employ structured variables) have lots more ways to
introduce non-doubleword alignment. The speedup -dalign offers (on current
machines) tends to be 10%-30%.

Compiling with optimization takes too long.

Use perfmeter to check for thrashing if you don't have enough RAM. Most
often, the modules which cause optimizers the most grief get little
benefit from optimization (e.g., a FORTRAN main program with tons of
COMMONs, EQUIVALENCEs, and subroutine calls, but little compute actually
performed). Using the profiler to guide your optimization usage clears
much of this up. If you do have a module which does suck up time and is
still a dog to compile, consider breaking it up into smaller chunks. If
you have exhausted all other avenues and can bear to part with your code,
please file a bug report. Optimization shortfalls may be hard to track
down, or require a while to solve; wrong answers will be dealt with more
swiftly.

Slow system performance.

This is often a symptom of poorly-handled floating-point exceptions.
FORTRAN users get an automatic call to ieee_retrospective, so they get
some warning (ignore inexact, unless you are really hip to the IEEE spec),
but C programmers should add a call to ieee_retrospective to their exit
code. What you don't know can hurt you!
If the underflow exception has been raised (and the only other exception
raised was inexact), you have a lot of computation taking place very close
to zero. Non-IEEE machines typically just return zero when you get "close
enough." This isn't the best modern technology can offer. (IEEE's gradual
underflow can enable you to get better [more accurate] answers in many
cases, but if it is not what you want, add a call to
nonstandard_arithmetic() to your program. See the Numerical Computation
Guide for details.)

To track down other problems, add a call to ieee_handler("set", "common",
SIGFPE_ABORT); details are to be found in the documents mentioned above.
You will probably want to compile with the -g option to get an exact line
number. (My approach is to use the optimized code, find the module, then
recompile just it. This can produce misleading results in pathological
cases, but for big jobs there may not be any other practical choice,
except adb.)

*Keith Bierman is a rabble rouser attached to the SunPro Floating Point
and Performance Group. He can be reached at khb@chiba.eng.sun.com.

-----------------------------------------------------------------------

"QUALITY MATTERS
Implementing Code Reviews"
by Donald G. Miller, Jr.

Reviewing your software product with technical peers can be a useful -- if
not critical -- step to take before significant testing and integration
occurs. This article offers a few suggested guidelines for implementing
code reviews. It is primarily the result of research, although group
discussion and individual interviews guided the content focus. It is not
the gospel on the topic, nor is it the last word. It is an attempt to look
at a methodology which some software development efforts have been able to
use with significant success.

What?

A code review is an analysis of a software product by technical peers of
the producer.
A walkthrough is characterized by the producer of the reviewed material
guiding the progression of the inspection process. A walkthrough is a
method of rapidly evaluating material by confining attention to a few
selected aspects, one at a time. The terms walkthrough, inspection, and
review are used somewhat synonymously in the literature, although they
imply varying degrees of formality. The relative formality of the process
is dictated to a degree by resource and schedule constraints, although the
effort to perform reviews in some form is seen to be valuable.

Why?

Simply stated, code reviews are an attempt to answer the question: Will
the product do the job it's supposed to do? Code reviews find bugs earlier
and more economically than other defect identification techniques. Studies
have shown that code walkthroughs can remove between 38% and 89% of the
errors prior to the execution of the first test. The cost of rework, or
defect correction, was found to be 10 to 100 times higher when done in the
later phases of a project. In addition to cost and quality improvements,
code reviews have been found to facilitate training and technical
information exchange, as well as assuring that the product will be
maintainable.

Who?

Research literature recommends three to seven technically competent
reviewers who are not responsible for evaluating the producer of the code.
Typically, the review team is made up of members of the producer's
project. The following roles are identified as important to the review
process:

* A review leader, or moderator, provides focus on the process (in
  addition to technical input).

* A recorder is responsible for documenting information for an accurate
  report of the review. This information consists of defect descriptions
  and issue summaries.

* Other reviewers focus on ensuring the product does what it's supposed
  to do. Yourdon proposes that some of these reviewers take on a specific
  slant:
* A maintenance oracle (Fagan calls this person the tester), who reviews
  the product from the perspective of future maintainability.

* A standards bearer, who ensures that any existing group standards are
  adhered to.

The role or roles assumed by each reviewer can be decided upon implicitly.
Making them known, however, ensures that pertinent viewpoints and
functions are represented.

When?

Code reviews should be held early enough that identified issues can be
addressed without major impact to the product. They should be late enough
that the product under review is in a reasonably stable design state. This
indicates that a good time frame for review is sometime after the first
clean compile of the program and before significant testing and
integration has been done.

How?

Steps in the review process are:

Pre-Review

1. The producer decides peer review of his or her code would be valuable.
   A reviewable portion of the code is selected.

2. An appropriate group of reviewers is identified and solicited.

3. A copy of the source to be reviewed, along with any additional
   informative data, is made available to the review team with sufficient
   lead time.

4. The team reviews the code independently and prepares comments for the
   review meeting.

Review Meeting

5. The producer provides introduction and clarification as necessary.

6. All criticisms, comments, suggestions, and issues are raised and
   noted, not resolved.

7. The review team accepts the material "as is," with minor
   modifications, or decides that the changes necessary are sufficient to
   require another review.

Post-Review

8. The minutes, issues, and recommendation are documented and
   distributed. The recommendation should be directed at management,
   while the issues and comments list will interest the development team.

9. The producer addresses the documented issues.

We at SunPro have found that following these steps in the review process
is extremely valuable in helping us ensure product quality.
References

For additional information on this subject, we highly recommend the
following reference sources, which were also helpful in preparing this
article:

Freedman, Daniel P. and Weinberg, Gerald M. Handbook of Walkthroughs,
Inspections, and Technical Reviews -- Evaluating Programs, Projects, and
Products. Third edition. Little, Brown and Company, 1982.

Boddie, John. Crunch Mode -- Building Effective Systems on a Tight
Schedule. Prentice-Hall, 1987.

Yourdon, Edward. Structured Walkthroughs. Fourth edition. Prentice-Hall,
1989.

Fagan, Michael. "Design and Code Inspections to Reduce Errors in Program
Development," IBM Systems Journal, Vol. 15 No. 3 (July 1976).

Dickinson, Brian. Developing Quality Systems. Second edition. McGraw-Hill,
Inc., 1989.

*Don Miller is a SunPro Software Quality Engineer responsible for testing
tools and techniques. He can be reached at donm@eng.sun.com.

------------------------------------------------------------------------

"THE FRIDAY HACK
Qrgfwi-ing Your Workstation"
by Peter van der Linden

They say that "real programmers don't eat quiche," but I happen to know
that's a bunch of hooey. I threw a party last week, all the real SunPro
programmers came, and quiche was one of the most popular dishes. And so it
goes in life. Sometimes there's a big gap between what "they" say and the
way things really work out.

For example, last time your workstation was taken off the ether to be
moved, upgraded or reinstalled, how long did it take you to get
operational again? Ah ha! And how long would it have taken if you had
recorded the vital statistics of your workstation in advance? Over the
years I've been bitten by this too many times; that's why I jotted down
the Quick Reference Guide for Workstation Interrogation -- Qrgfwi, for
short. This guide is an easy reminder of the commands to run to find out
just about anything of interest on your workstation. Next time you need to
upgrade, just think Qrgfwi!
One point to watch: you can't run commands like ypwhich or ypmatch unless
you are using the Network Information Service. If you're not using NIS,
take a look at the ifconfig command. For more information on any command,
look at its man page. Finally, any time you need a boost, just type in the
command:

    who is smart

and see what your loyal workstation replies!

*Peter van der Linden is a programming gourmet as well as a software
engineering manager at SunPro. He can be reached at linden@eng.sun.com.

To Find This             Type This Command        Sample Output
-----------------------  -----------------------  ----------------------------
Name and OS version      /usr/bin/uname -a        SunOS adapt 4.1.1 1 sun4c
Processor type           /usr/bin/mach            sparc
Architecture type        /usr/bin/arch            sun4
Host ID                  /usr/bin/hostid          510061f8
NIS domain name          /usr/bin/domainname      EBB.Eng.Sun.COM
Ethernet address         /etc/dmesg|grep Ether    Ether addr=8:0:30:7:e4:33
Host name                /usr/bin/hostname        adapt
NIS server               /usr/bin/ypwhich         vampira
Physical memory          /etc/dmesg|grep mem      mem = 8192K (0x800000)
TCP/IP host address      /usr/etc/ifconfig -a     129.144.127.37
Virtual memory           /etc/pstat -s            928k used, 6852k available
Floating point HW        fpversion                A SPARC-based CPU is available
Local disk               /usr/bin/df |grep /dev   (output needs interpretation)
GX graphics accelerator  /etc/dmesg|grep six      cgsix0 at SBus slot 1 0x0 pri 7
Color monitor            /etc/dmesg|grep cg       cgsix0 at SBus slot 1 0x0 pri 7
eeprom information       /usr/etc/eeprom          (output has more eeprom stuff
                                                  than you ever dreamed of)

-------------------------------------------------------------------------

"PUZZLE"
by Vijay Tatkar

Much of human discovery comes through unexpected, inexplicable mental
leaps. "Aha, insight," says Martin Gardner of this phenomenon. Kekule's
benzene ring, Archimedes' principle of fluid displacement, and many other
breakthroughs are well-known results of this curious brand of human
thought that we fondly describe as "lateral thinking." In the puzzles that
follow, such lateral thinking is both encouraged and sometimes crucial for
finding solutions.
Here are two puzzles to get you started.

(1) First an easy one to start you off: Find a number ABCDEFGHIJ such that
A is the count of how many 0's are in the number, B is the number of 1's,
and so on. Here's an example of a 4-digit number: 1210 (1 zero, 2 ones, 1
two, 0 threes). Think of such numbers when the number of digits is > 4.

(2) In little Fooeyland, there are only two kinds of people: those with
blue eyes and those with brown eyes. Fooeyland folks have some rather
interesting properties and constraints by which they lead their lives.

A. A brown-eyed person will commit suicide when (and if) (s)he finds out
   that (s)he is brown-eyed.

B. No one tells anyone what color their eyes are, for fear of these
   consequences.

C. They do not have the means, or the desire, to look at shiny objects to
   find out what color their own eyes are.

D. The group is pretty social, and everyone gets to meet everyone else
   exactly once every day.

Notice that at time zero the population is stable: the blue-eyed folks are
OK, and the brown-eyed folks don't know that they are brown-eyed. One day
an outsider comes to Fooeyland and, before leaving, tells everyone that
there is at least one brown-eyed person amongst them. Now the question is:
Will the population remain stable? If not, on what day will instability
creep in, and how? (No credit for partial answers.) [Hint: Use induction.]

*Vijay Tatkar is a member of the C compiler team. He can be reached at
tatkar@Eng.Sun.Com.

-------------------------------------------------------------------------

SOLUTIONS

(1) For number of digits = 5:  21200
    For number of digits >= 7: (n-4), 2, 1, 0, 0, ...(n-7 zeroes in
    all)..., 0, 1, 0, 0, 0

(2) If there is only one brown-eyed person, he or she will know by the end
of the first day that no one else is brown-eyed, and will commit suicide.
If there are n+1 brown-eyed folks, it will take them n+1 days to realize
that each of them is brown-eyed just as n others are.
On the (n+1)st day, they will each commit suicide, because only once the
nth day has passed without anyone committing suicide do they know that
they themselves must be brown-eyed. This is the inductive part.

--------------------------------------------------------------------------

"Dangling Pointers
Superpipelining, Flightless Bees, and Varying Mileage"

Greetings, denizens of the labs! Welcome to the first Dangling Pointers
column. Following a long tradition of infamous characters such as John
Dvorak, Mac the Knife, and Matco, the Dangling Pointers column is
dedicated to uncovering and reporting on the truth in our industry,
whether or not we can prove it!

Superpipelining vs. Superscalar

A general question these days is, "Is superpipelining better than
superscalar?" Well kids, engineering is full of trade-offs. Choosing a
very simple pipeline (and possibly stretching it out) enables one to use
whatever the newest chipmaking technology is. This is, in and of itself, a
Good Thing, because being first enables one to ship first. However, new
process technology tends to have ramp-up problems (the exact nature of
which is often unpredictable). Thus it may not be practical to ship
systems in quantity for a considerable period after announcement.

Superscalar systems have more complex pipelines; the basic bet here is
that at any given process technology level, everyone will start from about
the same point... but the clever ones can eke out more performance. Which
is the "right" bet? It all depends on how fast new process technology
becomes available. There is no simple "right" answer. In the long run,
machines will run at very high clock rates and do a lot of things in
parallel. Both fine-grain and very large-grain parallelism will be
exploited. But the long run may be a long time down the road.
Instruction Parallelism: Taking It to the Limit

During this summer's Hot Chips conference, at the night panel session,
participants debated how much instruction parallelism there is to exploit.
The range was from nearly none to nearly unlimited. Joseph Fisher of HP
Labs provided one of the best arguments for "we don't know the limit yet."

Fisher: "The various `proofs' of N amount of parallelism are inherently
flawed; they are akin to proofs that bees can't fly, that dial-up modems
can't beat 4800 baud and that CDs would need 50KHz sampling rates... in
fact they are worse, as they don't have an underlying theory to appeal
to."

An alternative came from James Smith, the honorable representative from
Cray:

Smith: "There is clearly room for more than five; they've done it for
years. What's the discussion all about? Also, whose workload is important?
Mine is to me, yours is to you. Don't get confused! Trying to find a
single workload ensures doing none of them well."

Which is a very good point. The fallacy of all standard benchmarks is that
there is no reason to believe that any of them represents anything faintly
akin to what a particular user uses a computer for. This is a much harder
task than benchmarking a car's MPG... and a large fraction of car buyers
don't get anywhere near the performance they had been "promised"... which
leads to the now-old saying: "Your Mileage May Vary." There is no good
replacement for running your own workload on prospective machines.

And so, loyal readers, this concludes this edition of Dangling Pointers.
Remember, any theorem in Analysis can be fitted onto an arbitrarily small
piece of paper if you are sufficiently obscure.

------------------------------------------------------------------------

(c) 1991 Sun Microsystems, Inc. All rights reserved. No part of this
publication may be reprinted or otherwise reproduced without permission
from the publisher.
Sun Microsystems, Sun, the Sun logo, SunPro, the SunPro logo, SunSoft,
SunExpress, Solaris, SPARCworks Professional, SunOS, OpenWindows, DeskSet,
and ToolTalk are trademarks or registered trademarks of Sun Microsystems,
Inc. All SPARC trademarks are trademarks or registered trademarks of SPARC
International, Inc. SPARCompiler and SPARCworks are licensed exclusively
to Sun Microsystems, Inc. Products bearing the SPARC trademark are based
on an architecture developed by Sun Microsystems, Inc. UNIX and OPEN LOOK
are registered trademarks of UNIX System Laboratories, Inc. All other
products or services mentioned herein are trademarks of their respective
owners.

SunPro's mention or review of third-party products in this newsletter is
for informational purposes only and is not an endorsement or
recommendation of those products. The vendors or manufacturers of the
products have provided all specifications, descriptions and other claims
concerning the products. SunPro assumes no responsibility regarding the
selection, purchase, performance or use of such products. Prospective
purchasers of such products must deal directly with the vendors or
manufacturers to create any understandings, agreements or warranties
concerning those products.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

For information send mail to info-sunflash@sunvice.East.Sun.COM.
Subscription requests should be sent to
sunflash-request@sunvice.East.Sun.COM. Archives are on solar.nova.edu,
paris.cs.miami.edu, uunet.uu.net, src.doc.ic.ac.uk and
ftp.adelaide.edu.au

All prices, availability, and other statements relating to Sun or third
party products are valid in the U.S. only. Please contact your local Sales
Representative for details of pricing and product availability in your
region. Descriptions of, or references to products or publications within
SunFlash does not imply an endorsement of that product or publication by
Sun Microsystems.
John McLaughlin, SunFlash editor, flash@sunvice.East.Sun.COM. (305) 776-7770.