sunflash-Distributed to mailing list sun/NC/north-carolina
sunflash-Send requests, problems to owner-sunflash@suntri.east.sun.com
----------------------------------------------------------------------------
The Florida SunFlash

Sunergy Newsletter #14 (April, 1994) Part 2 of 2

SunFLASH Vol 64 #61 April 1994
----------------------------------------------------------------------------
64.61 Sunergy Newsletter #14 (April, 1994) Part 2 of 2
From Vicki.Pedretti@Corp.Sun.COM

Sunergy is a program designed to inform and educate computer users worldwide. Sunergy brings the great minds of the world to its audience through satellite television broadcasts, electronic newsletters and a library of whitepapers and other associated online documents. In the coming weeks, Sunergy will be accessible through the Mosaic network browser.

Sunergy broadcasts are now being downlinked by over 1000 sites in over 40 countries. The broadcasts raise awareness of new technologies in existing, new and emerging markets.

(( The Sunergy group also produces this great newsletter. Note that I have dropped a couple of parts that have already been posted to SunFlash.
You can send for back issues of SunFlash by placing the vol.issue number in the subject line of mail to flashadm@sun.com -johnj ))

*4* Advanced Systems Magazine Excerpt: "When Enough is Not Enough"
*5* Internet Services List
    (64.37 Internet Services List (4/1/94))
*6* Of Interest from Sun
    -Information Highway Pilot Projects
     (SunFlash 63.67 Sun Chosen for "Information Highway" Projects)
    -Fast Ethernet Seminar Series
     (SunFlash 63.91 Fast Ethernet Alliance Sponsors Seminar Series)
*7* SMCC Announcements
    (SunFlash 63.60 SMCC Introduces SPARCstation Voyager)
    (SunFlash 63.69 SPARCstorage Array Model 100 Series)
    (SunFlash 63.101 Sun Introduces Two New Desktop Families SPARCstation 5 and SPARCstation 20)
*8* O'Reilly & Associates
*9* Sunergy Update
*10* Sunergy ftp site login instructions
*11* Sunergy enrollment and contact info

(564 lines)
----------------------------------------------------------------------------

*********************************************************
*4* Advanced Systems Excerpt: When Enough is Not Enough *
*********************************************************

From Michael McCarthy, Editor-in-Chief
mac@advanced.com
(415) 267-1727

When Enough Is Not Enough

Lots of disk space won't make everyone happy unless it's configured properly.

By Hal Stern

SysAdmin, April 1994, adapted from SunWorld Supplement to Advanced Systems Magazine, April 1994, copyright Advanced Systems, used with permission. Advanced Systems can be reached at editors@advanced.com.

Work anywhere for a few months and you'll learn the early warning signs of a manager meltdown. When the danger vein on your boss's head starts to stick out enough to earn its own ZIP code, you know he's painfully aware of a problem. "Why did you put this small database on a single 2-gigabyte disk? The users hate the performance!" he blurts out. Somehow, the connotations of a small database can create a not-so-small configuration nightmare.
Explaining that you were being nice by donating the extra room for growth of the 300-megabyte database won't explain away sluggish performance. Byte capacity and I/O-operation capacity are two distinct issues. Using a small portion of a disk will actually help because most seeks are track-to-track and take only a few milliseconds versus the typical 10-plus millisecond cross-platter seek. But put 500 megabytes on a 2-gigabyte disk, and you are seeking at average speed. With 100 users accessing the database, if each user does one transaction every 10 seconds and each transaction is big (like 10 I/O operations), the single disk will not handle the load.

Determining the best configuration of disks is never an easy task. Frequently, users ask for just enough disk space to contain a database or set of data files, without considering the ability of the disks to handle the number and size of I/O requests generated by their applications. Estimating demands on the I/O subsystem forms the justification for a disk-farm configuration. You need to be able to show that the disks can store the data as well as provide adequate access to it in any access pattern an application might create.

So how do you build your case? Having a sample application up and running is a luxury. You can measure I/O statistics and extrapolate usage of the proposed system, based on users' expectations. A new system poses new problems, because your estimates will be only as valid as the analyses made by the design team. To fairly project disk usage, you need to understand the transaction models and usage patterns of the application in question and create a model that maps the business world into disks and I/O buses.

This month, we'll review some basics of SCSI subsystems and disk mechanics, which covers the I/O architecture of most current desktop and server systems.
A survey of I/O access patterns and acceleration techniques, such as disk striping, presents the problems and the array of tools that will combat these problems. Finally, we'll cover a couple more things they don't teach at Harvard Business School: how to decompose a business transaction into a series of I/O operations, and how to do some basic capacity planning.

Seek and ye shall find

Most vendors offer SCSI subsystems as their primary disk and tape I/O interconnects. Several years ago, SCSI disks were markedly slower and smaller than their IPI or SMD brethren, but today SCSI disks have similar transfer rates, low seek times, and leading capacities and price/capacity ratios. SCSI disks spin at 3600 to 7200 RPM, with most doing a swift 5400 turns a minute. Seek times for the highest-capacity, small-form-factor disks range from 10 to 12 milliseconds.

Seek time and rotational speed are the major factors for disk-performance evaluations because they govern the maximum random and sequential I/O rates, respectively. Most SCSI disks have peak transfer rates of 4 1/2 to 5 megabytes per second, clamping sequential access speeds. On the other hand, if you're seeking from block to block on every disk access, you can't do more than 50 to 80 operations per second per disk. (Divide 1 second by the sum of the average seek and rotational latencies.)

Although your mileage varies by disk and application, 60 random I/O operations per second per disk is a fair lower bound, and it's the speed limit we will use for constructing models. At this rate, the actual sustained I/O transfer rate isn't that large. Assuming an 8-kilobyte transfer per operation, you're moving less than 500 kilobytes per second from the disk, which is about 10 percent of the disk's maximum transfer rate.

Stripes on the data highway

Performance mavens stress the importance of performing I/O operations from memory rather than disk; however, I/O from disk is not necessarily a bad thing.
If you're running a 30-gigabyte database, some of it will come in from disk, and the system will execute a steady stream of I/O requests under average load. The trick is to keep the I/O requests well balanced so all of your disk resources are used within the bounds of their I/O rate and byte capacities.

You should be able to keep your processes supplied with data, even if it means keeping the disks busy. Under the load of a TPC benchmark, some database servers show as little as 25 percent I/O wait time (the time CPUs wait for disks to transfer data), while the disks are clicking away. This isn't bad, since it means you're moving the disks at full speed but losing only 25 percent of your CPU cycles. Given that CPU cycles are in the nanosecond range and seeks are in the millisecond range, this implies a good interleaving of work. In contrast, reading a large file from only one disk might load the system with 1 percent CPU busy time and 99 percent I/O wait time.

Access patterns for the data are the high-order design input. Sequential accesses open a file (or raw device) and read it from beginning to end. Batch file transfers, FTP, imaging applications, GIS or seismic code, and most numerically intense processes do some amount of sequential I/O. Random activities include most database work and NFS traffic, since requests from all clients are streamed together.

The typical access size is the next input. SunOS 4.1.x and Solaris 2.x use a version of the Unix filesystem called UFS+, in which the block-placement algorithms have been modified to increase the typical access size. Most BSD-based filesystem accesses are done in 8-kilobyte chunks, while UFS+ works in 56-kilobyte clusters of blocks. The larger effective access size yields better sequential I/O performance, and integration of the filesystem and virtual memory system lets the kernel further turbocharge the filesystem by doing read-ahead.
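
The per-disk limits described under "Seek and ye shall find" can be written out as a rough Python sketch. The function names are made up for illustration, and the code assumes that average rotational latency is half a revolution, which the article does not state explicitly.

```python
def random_iops(seek_ms, rpm):
    """Estimate random I/O operations per second for a single disk.

    Assumption: average rotational latency is half of one revolution.
    """
    ms_per_revolution = 60_000 / rpm
    rotational_latency_ms = ms_per_revolution / 2
    # One second divided by the sum of the seek and rotational latencies.
    return 1000 / (seek_ms + rotational_latency_ms)


def throughput_kb_per_s(iops, transfer_kb):
    """Sustained transfer rate when every operation moves transfer_kb kilobytes."""
    return iops * transfer_kb


if __name__ == "__main__":
    for seek_ms in (10, 12):
        rate = random_iops(seek_ms, 5400)
        print(f"{seek_ms} ms seek at 5400 RPM: about {rate:.0f} ops/s")
    print(f"60 ops/s at 8 KB each: {throughput_kb_per_s(60, 8)} KB/s")
```

With seeks of 10 to 12 milliseconds at 5400 RPM the estimate lands near the working figure of 60 operations per second, and 60 operations of 8 kilobytes each come to 480 kilobytes per second, a small fraction of the disk's peak transfer rate.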
Filesystem access speed is gated by single-disk transfer rates, even if the filesystem imposes little or no overhead. (For example, with a 4 1/2-megabyte-per-second disk, top transfer rates might run at about 3 3/4 megabytes per second.) To get improved sequential access throughput, consider striping disks together to aggregate the transfer rates from individual disks. Striping can be done explicitly in an application that knows to read from multiple files, or by the underlying filesystem using a product like SMCC's OnLine:DiskSuite.

Match the size of the stripe to the filesystem or the raw-device access size, so each access hits all of the disks in the stripe and moves them in parallel. For example, if you want to stripe a filesystem across two disks, use an interleave of 28 kilobytes per disk so each 56-kilobyte cluster access hits both disks. Striping over two disks can achieve almost double the transfer rate of a single disk, although the benefits diminish as the striping overhead increases for more drives.

Optimizing random I/O involves spreading the load over as many independent disks as possible. Instead of maximizing the transfer rate, you want to minimize average seek time. The worst scenario is to have one I/O stack up behind another random I/O on the same disk, increasing the effective disk latency.

You can attempt to spread the I/O load over more than one disk through careful partitioning of the data. On an NFS server, this means separating users into different filesystems, with one or more filesystems per disk. As usage patterns change, and disk capacity utilization ebbs and flows, partitioning may be hard to manage.

Disk striping comes to the rescue again. For random I/O, match the interleave to the size of a typical I/O operation so successive operations go to different disks in the stripe.
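
The effect of the interleave can be sketched with a simple round-robin layout in Python. This is an illustrative model only: it assumes chunks of one interleave each are dealt to the disks in rotation, the function names are hypothetical, and it is not a description of how OnLine:DiskSuite or any other product actually places data.

```python
KB = 1024


def locate(offset, interleave, ndisks):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes a round-robin stripe: chunk 0 on disk 0, chunk 1 on disk 1, ...
    """
    chunk = offset // interleave
    disk = chunk % ndisks
    row = chunk // ndisks  # how many full stripes precede this chunk
    return disk, row * interleave + offset % interleave


def disks_touched(offset, length, interleave, ndisks):
    """Return the sorted list of disks that one access of `length` bytes hits."""
    first_chunk = offset // interleave
    last_chunk = (offset + length - 1) // interleave
    return sorted({chunk % ndisks for chunk in range(first_chunk, last_chunk + 1)})


if __name__ == "__main__":
    # Sequential case: one 56 KB cluster over two disks, 28 KB interleave.
    print(disks_touched(0, 56 * KB, 28 * KB, 2))
    # Random case: successive 8 KB operations with an 8 KB interleave.
    for i in range(4):
        print(disks_touched(i * 8 * KB, 8 * KB, 8 * KB, 4))
```

In this model the 56-kilobyte access touches both disks at once, while each of the successive 8-kilobyte operations lands on a different disk.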
If you're accessing all the data in random fashion, you're likely to hit all of the disks and benefit from the maximum disk I/O operations per second you can deliver.

Random I/O operations usually are smaller than sequential transfers, with filesystem accesses in 8-kilobyte blocks and databases in 2-kilobyte blocks. NFS uses an 8-kilobyte block size as a maximum. You should stripe an NFS server disk set so that each 8-kilobyte access is directed to a different disk. A 4-disk stripe will use 32-kilobyte blocks interleaved with 8 kilobytes per disk.

Some database systems do the interleaving and striping when you designate multiple disks or filesystems to hold the tables and indexes. If you have multiple database tables on the same machine, you will have to allocate disks to each table based on the size of the table and expected usage patterns.

Exemplary planning

Capacity planning deserves a detailed example. Let's suppose you're configuring a machine to handle an 8-gigabyte database that will be accessed by about 200 users. Each user will submit one transaction every 10 seconds, given the typical screen composition time. Customer-service representatives fire off transactions in the first 10 seconds of a telephone call, especially when dealing with an irate customer. Servicing those first few transactions in a timely manner will set the stage for the remainder of the call. Keep in mind, the mental health of your support staff depends on a good configuration.

A short interview with the database designer turns up more clues. There is little locality of reference in the transactions, and successive transactions are expected to occur at completely random points in the database tables. The workload is 50 percent updates, 20 percent inserts, and 30 percent lookups, using four different database tables. Each table has one index field.
In the worst case, that's three disk operations for an update, two for an insert, and two for a lookup, for an average of 3 1/2 I/Os per table or 14 per transaction. You are likely to do fewer disk operations because indexes and database-table rows are cached by the database. You should figure that most of the index operations are satisfied out of memory, reducing your average demand to about 10 I/O operations per transaction, although peaks of 14 or more are to be expected.

Multiplying the number of users by the number of transactions each submits per second gives a transaction rate of about 20 per second. Net result: You need to produce an average of 200 I/O operations per second to sustain this transaction rate. Adding in the overhead of transaction logging, 220 I/O operations per second is a safe minimum.

Here's the 64-kilobyte question: Will four disks of 2 gigabytes each handle this load, since they provide enough storage capacity for the database? Or do you have to explain to a budget-conscious manager why you need another two disks, which appear to be wasted disk space?

Four disks may meet the average I/O requirement, providing a minimum of 240 disk I/O operations per second. But you'll be utilizing the disk subsystem at about 90 percent of its operational capacity. Any variation in usage, or contention for one part of the database, will introduce queuing and latency.

A good rule of thumb is to divide the number of I/O operations per second by 60, determining a lower bound on the number of disks. If your proposed configuration is close to this threshold, add more disks. In many cases, several smaller-capacity drives are better than one large drive because the bundle of small drives delivers a higher random I/O rate. Load distribution also offers you flexibility if a database designer decides to add an index field to speed up accesses for certain types of transactions.
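
The arithmetic in this example can be sketched in Python as follows. The function names are made up for illustration, the 20 operations per second allowed for logging is inferred from the step from 200 to 220 in the text, and 60 operations per second per disk is the rule-of-thumb speed limit used throughout.

```python
import math


def required_iops(users, seconds_per_txn, ios_per_txn, logging_iops=20):
    """Average I/O operations per second the disk subsystem must sustain.

    logging_iops is an assumed allowance for transaction logging.
    """
    transactions_per_second = users / seconds_per_txn
    return transactions_per_second * ios_per_txn + logging_iops


def min_disks(iops, per_disk_iops=60):
    """Lower bound on the number of disks: divide the load by 60 and round up."""
    return math.ceil(iops / per_disk_iops)


def utilization(iops, disks, per_disk_iops=60):
    """Fraction of the stripe's random I/O capacity that the load consumes."""
    return iops / (disks * per_disk_iops)


if __name__ == "__main__":
    load = required_iops(users=200, seconds_per_txn=10, ios_per_txn=10)
    print(f"required: {load:.0f} ops/s, lower bound: {min_disks(load)} disks")
    for disks in (4, 6):
        print(f"{disks} disks: {utilization(load, disks):.0%} of I/O capacity")
```

For the 200-user example, four disks run at roughly 92 percent of their I/O capacity, while six disks bring that down to about 61 percent, which is the case for the two "extra" drives.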
It's better to be able to react to changes on the development side, rather than having to go back to the budgeting process because of what a designer deems a minor change in the database schema.

Measure and score your capacity planning successes by watching system and I/O utilization with tools like sar and iostat. If you notice disk hot spots where you have one disk that's doing more than 70 percent of its I/O-rate capacity, consider moving parts of the data around to load-balance access to the disks.

Aggressive performance management sounds like the stuff of MBAs, but modeling your environment isn't all theory. Put together accurate estimates of workload demands, and others will take faith in your budgets and requisition requests. Accurate estimates will eliminate at least a few managerial visits.

Author Hal Stern, an area technology manager for Sun, can be reached at hal.stern@east.sun.com.

SIDEBAR

For those about to block

Once you know something about gross access patterns and have chosen an organization for the disks, all that remains is to determine the number of disks needed. Before breaking out the spreadsheets, though, consider these questions and issues.

What is the typical transaction? Read, modify, insert, or write? Consider transaction types for NFS and time-sharing servers as well as database engines. If a typical user is going to type make all day long, determine what kind of file accesses are attributable to each make. A complex build might examine 300 header files, read in three libraries of 3 megabytes each, and write out 400 kilobytes of object code.

Read-intensive workloads benefit from caching on the server. Database records and indexes are cached by the database engine, and file pages are cached by NFS and time-sharing servers. Write operations have to update disk storage and almost always generate at least one physical I/O operation. The ratio of cache size to total data size determines the average number of data accesses that turn into disk accesses.
For example, a machine with 128 megabytes of RAM running a 1-gigabyte database may see seven out of eight transactions go to disk, since less than an eighth of the total database can be cached. Conversely, accesses with good locality of reference will generate few physical disk requests.

Look at the supporting data structures like indexes and logs in a database, and inodes and indirect block pointers in the filesystem. Inserting a record with an index requires logical operations to write the data block, read the index segment, rewrite the index and write the log record. Most of the index segments may be cached, so an insert may result in only one or two physical I/O operations. It's helpful to have worst-case figures as well as expected or average-load values in order to evaluate a proposal.

How many operations are contained in the scope of a single transaction? Industry benchmarks like TPC-A are misleading in their definitions of transaction. A TPC-A transaction is lightweight, doing only a single lookup and update of two databases. Common business transactions might look up or join several records from multiple tables, and insert or update records with multiple indexes. Define the scope of a transaction in terms of the average and worst-case number of I/O operations, remembering to count index operations for databases and inode/indirect-block updates for Unix filesystem work.

What is your timetable? Transactions may come in a steady stream or in several peaks. What is the typical think time? Are you looking at a development environment where everyone sits down after lunch and immediately tries to fix a bug? What about a university environment where the class break at 2:20 leads to a mad rush for the workstation lab at 2:25? How much latency can you handle under peak load? Can you design for the average case or must you satisfy the peak demand periods with no response-time discrimination from the typical load periods?
How big is the total dataset, and how much will it grow? This value determines the minimum number of disks.

*****************************
*5* Internet Services List *
*****************************

Excerpts from this extensive listing of services available via the Internet appear here with the permission of Scott Yanoff, who compiled the information. The list in its entirety can be retrieved from the Sunergy ftp site. Available as /pub/sun-info/sunergy/misc/internet.services.list [see login instructions at the end of this newsletter].

(( removed from SunFlash version. This document is posted on a regular basis to SunFlash. e.g. 63.45 Internet Services List (Mar-01-94) and 64.37 Internet Services List (4/1/94) -john ))

---------------------------------------------------------------------------
- INDUSTRY OUTLOOK -
---------------------------------------------------------------------------

***************************
*6* Of Interest from Sun *
***************************

Excerpts from recent Sun press/business releases.

SUN CHOSEN AS TECHNOLOGY SUPPLIER IN FOUR INFORMATION HIGHWAY PILOT PROJECTS

(( SunFlash 63.67 Sun Chosen for "Information Highway" Projects ))

FAST ETHERNET ALLIANCE ANNOUNCES NATIONWIDE FAST ETHERNET SEMINAR SERIES

(( SunFlash 63.91 Fast Ethernet Alliance Sponsors Seminar Series ))

PORTLAND, Ore., March 21, 1994 -- The Fast Ethernet Alliance

---------------------------------------------------------------------------
- PRODUCT UPDATES -
---------------------------------------------------------------------------

Excerpts from recent Sun press/business releases.
************************
*7* SMCC Announcements *
************************

BREAKTHROUGH SUN SYSTEM REDEFINES WORKSTATION PACKAGING
First Full-featured SPARCstation with Nomadic Computing Capabilities

(( SunFlash 63.60 SMCC Introduces SPARCstation Voyager ))

SUN UNVEILS NEW HIGHLY INTEGRATED RAID MASS STORAGE SUBSYSTEM

(( SunFlash 63.69 SPARCstorage Array Model 100 Series Introduced ))

SUN REDEFINES DESKTOP PRICE/FUNCTIONALITY WITH TWO NEW DESKTOP FAMILIES
Fully Equipped, Low-Cost Systems Aimed at Networked Enterprises

(( SunFlash 63.101 Sun Introduces Two New Desktop Families SPARCstation 5 and SPARCstation 20 ))

***************************
*8* O'Reilly & Associates *
***************************

On the most recent live Sunergy satellite broadcast, Gina Blaber (O'Reilly & Associates) introduced an upcoming O'Reilly & Associates product, "Internet In A Box". Following is a small excerpt from the FAQ available from O'Reilly. To receive a copy of the FAQ, send email to ibox@ora.com and ask to get the Internet In A Box fact sheet.

What is Internet In A Box?

Internet In A Box is the first shrink-wrapped package to provide a total solution for PC users to get onto the Internet. Internet In A Box provides instant connectivity, a multimedia Windows interface, a full suite of applications, and the first interactive guide to the Internet. There will be two versions: a single-user, dial-up version for use with a modem and a LAN version providing Internet connectivity for corporate networks. Internet In A Box is a joint venture of O'Reilly and Associates, Inc., and SPRY, Inc.

---------------------------------------------------------------------------
- SUNERGY INFORMATION -
---------------------------------------------------------------------------

*********************
*9* Sunergy Update *
*********************

Sunergy is a program designed to inform and educate computer users worldwide.
Sunergy brings the great minds of the world to its audience through satellite television broadcasts, electronic newsletters and a library of whitepapers and other associated online documents. In the coming weeks, Sunergy will be accessible through the Mosaic network browser.

Sunergy broadcasts are now being downlinked by over 1000 sites in over 40 countries. The broadcasts raise awareness of new technologies in existing, new and emerging markets. Alexei Mednikov from the State Technical University of St. Petersburg in Russia writes: "Here in Russia we have lack of information especially about modern technologies, trends (like distributed multimedia systems etc.) and what is most important - watching Sunergy broadcasts we may look at experts in "real time"."

In many areas, Sunergy programs are being rebroadcast over local cable television networks. As well, various colleges and universities worldwide are using Sunergy broadcasts as campus-wide educational programming and course material. From Cecile Thornton, Dartmouth College: "Already the requests are coming in for us to make Sunergy #9 available for classroom use! Your programs are very relative for our faculty, administration, and student viewing."

User groups are using the broadcasts as an informal means of getting together. Peter Hollands of Sun in Bagshot, Surrey, England writes: "Please can you send me the coordinates for the Half Moon Pub, Bagshot, Surrey, England. (The pub has a sophisticated satellite receiver set up.). Hopefully we can enjoy a quality pint of beer while listening in."

Sunergy is being used in many creative ways. And Sunergy services are provided to you free by Sun Microsystems.

Thank you to all who have sent in comments, suggestions and ideas. These are always very welcome. If you have an article you would like to be considered for inclusion in an upcoming issue of the Sunergy newsletter, please email it to sunergy@sun.com.
*******************************************
*10* Sunergy ftp Site Login Instructions *
*******************************************

White papers and other information from the broadcasts can be found at the Sunergy ftp site. Back issues of the Sunergy newsletter, as well as miscellaneous white papers referenced by Sunergy are also available. To access, type the following:

$ ftp (or Iftp) sunsite.unc.edu
username: anonymous
password:

**FOR WHITE PAPER RETRIEVAL AND SPEAKER INFORMATION FROM THE SUNERGY SATELLITE BROADCASTS**:

ftp> cd /pub/sun-info/sunergy/broadcast_docs/march_94
ftp> get

**FOR BACK ISSUES OF THE SUNERGY NEWSLETTER**:

ftp> cd /pub/sun-info/sunergy/newsletters
ftp> get

**FOR MISCELLANEOUS WHITEPAPERS and INFO REFERENCED BY SUNERGY**:

ftp> cd /pub/sun-info/sunergy/misc
ftp> get

**************************
*11* Sunergy Enrollment *
**************************

If you are not already a member of Sunergy and would like to join, simply fill out and return this form. If you are already enrolled in Sunergy, please feel free to pass this along.

--------------------------------cut here----------------------------------

*****SUNERGY SIGN-UP FORM*****

NAME:
TITLE:
COMPANY:
ADDRESS:
CITY:
STATE:
ZIP/POSTAL CODE:
COUNTRY:
PHONE:
FAX:
**E-MAIL:

RETURN COMPLETED FORM TO:
sunergy@sun.com
415/336-5847

**An e-mail address is mandatory for enrollment in Sunergy, as Sunergy information is distributed on an electronic basis only.

========================================================================
(c) 1994 Sun Microsystems, Inc. Sun, Sun Microsystems, SunWorld, SunSoft, SunPro, SunATM, SunFastEthernet, Sun Microsystems Computer Corporation (SMCC) and Solaris are trademarks or registered trademarks of Sun Microsystems, Inc. Sunergy is a service trademark of Sun Microsystems, Inc. SPARCstation is licensed exclusively to Sun Microsystems, Inc. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.
UNIX and OPEN LOOK are registered trademarks of UNIX System Laboratories, Inc. All other product or service names mentioned herein are trademarks of their respective owners.