Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!hookup!uwm.edu!math.ohio-state.edu!howland.reston.ans.net!gatech!ncar!kiowa.scd.ucar.edu!ilana
From: ilana@kiowa.scd.ucar.edu (Ilana Stern)
Newsgroups: sci.data.formats,news.answers,sci.answers
Subject: Scientific Data Format Information FAQ
Followup-To: sci.data.formats
Date: 12 Apr 1995 11:00:06 GMT
Organization: NCAR/UCAR
Lines: 460
Approved: news-answers-request@MIT.Edu
Distribution: world
Expires: Wed, 26 Apr 1995 07:00:00 GMT
Message-ID: <3mgbrm$7f1@ncar.ucar.edu>
Reply-To: ilana@ncar.ucar.edu
NNTP-Posting-Host: niwot.scd.ucar.edu
Summary: Where to find information on scientific data formats
Xref: senator-bedfellow.mit.edu sci.data.formats:906 news.answers:41769 sci.answers:2426

Archive-name: sci-data-formats
Last-modified: 6 Apr 1995

Recent changes:
==within last two weeks==
   Added CIF (Crystallographic Information File)
   Added JCAMP format for chemical spectra
   Added information about CDF mailing list
   Added CXF Chemical eXchange Format
   Changed subscription procedure for netCDF mailing list
==within last four weeks==

This is the FAQ for the sci.data.formats newsgroup.

Contents:
   -2) How to use this document
   -1) How to get a current copy of this document
    0) Resources for format information
    1) Resources for visualization software information
    2) How to use the data retrieval methods
    3) Why isn't my favorite format on this list?

Each (major) section has a "Subject:" line, so you can search on the subject title above to find the section quickly.

This article is copyright (c) 1993 by Ilana Stern. It may be freely distributed provided that this copyright notice and the information on retrieving a current copy are not removed.

Comments, corrections, or additions should be sent to Ilana Stern at ilana@ncar.ucar.edu.

---------------

Subject: How to use this document

Most FAQ (Frequently Asked Questions) documents list many questions and their answers.
This FAQ is (mostly) devoted to answering only one question: "Where can I find documentation and software for [X] data format?"

As the amount of information available over the networks has been increasing, so have the methods by which this information can be obtained. No longer is direct usage of FTP the only, or even the most frequent, method of obtaining data; we now have Gopher, Wais, and WWW, as well as many site-specific interfaces.

Because the information itself may be accessible in many different ways, this FAQ will identify resources in terms of URLs (Uniform Resource Locators). This will also help us convert this FAQ to a hypertext document, so that it can be used with a WWW browser to go directly to any of the listed sources.

Here's a glossary, so you can decode the URLs if necessary to reach the sites:

   ftp://host.name.domain/directory/[filename]     ftp site
   http://host.name.domain/directory/[filename]    www server
   telnet://host.name.domain                       telnet site
   gopher://host.name.domain                       gopher server
   wais://host.name.domain                         wais server
   news:newsgroup.name                             newsgroup

So, for example, if a document is available at ftp://ncardata.ucar.edu/ it means that you should ftp to ncardata.ucar.edu, and the information is in the top-level directory.

If you don't know what these information retrieval methods are, see the section "How to use the data retrieval methods".

---------------

Subject: How to get a current copy of this document

If you are reading this document after 26 Apr 1995, you are reading an outdated copy.

A current copy of this document can be obtained by anonymous FTP to ftp://rtfm.mit.edu/pub/usenet/news.answers/sci-data-formats. If you don't know what FTP is, see the section "How to use the data retrieval methods".

If you can't use FTP, send email to mail-server@rtfm.mit.edu with

   send /pub/usenet/news.answers/sci-data-formats

as the only text in the message (leave the subject blank).
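
An aside on the URL glossary in "How to use this document": if you would rather decode the URLs with a program than by eye, the sketch below shows one way to do it in Python with the standard urllib.parse module. It is only an illustration and not part of any tool mentioned in this FAQ; the names METHODS and decode_url are made up for this example.

```python
from urllib.parse import urlsplit

# Access methods from the glossary, keyed by URL scheme.
# (METHODS and decode_url are hypothetical names used only in this sketch.)
METHODS = {
    "ftp": "ftp site",
    "http": "www server",
    "telnet": "telnet site",
    "gopher": "gopher server",
    "wais": "wais server",
    "news": "newsgroup",
}


def decode_url(url):
    """Return (method, host, path) for a URL written in the glossary's style."""
    parts = urlsplit(url)
    method = METHODS.get(parts.scheme, "unknown")
    if parts.scheme == "news":
        # news:newsgroup.name has no host; the group name is in the path.
        return method, None, parts.path
    # An empty path and a bare "/" both mean the top-level directory.
    return method, parts.hostname, parts.path or "/"
```

For the example given in the glossary, decode_url("ftp://ncardata.ucar.edu/") returns ("ftp site", "ncardata.ucar.edu", "/"): ftp to ncardata.ucar.edu and look in the top-level directory.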

A current hypertext version of this document can be obtained from http://fits.cv.nrao.edu/traffic/scidataformats/faq.html, or (for European users in particular) from http://info.mcc.ac.uk:80/CGU/Visualisation/sdf.html.

If you would like to archive this FAQ in either hypertext or plaintext format, and want to receive a new copy automatically at every update, please send me email.

---------------

Subject: Resources for format information

    1) CDF
    2) FITS
    3) GRIB
    4) HDF
    5) netCDF
    6) VICAR
    7) PDS
    8) Miscellaneous graphics formats
    9) SAIF
   10) SDTS
   11) HDS
   12) MedFileS
   13) CXF
   14) JCAMP
   15) CIF

1. CDF

CDF (Common Data Format) is a library and toolkit for storing, manipulating, and accessing multi-dimensional data sets. The basic component of CDF is a software programming interface that is a device independent view of the CDF data model.

A user's guide and software are on ftp://nssdca.gsfc.nasa.gov/cdf.dir/ for VMS and ftp://ncgl.gsfc.nasa.gov/pub/cdf/ for all others.

Some general information on CDF, including a FAQ, is available from http://nssdc.gsfc.nasa.gov/cdf/cdf_home.html.

A recent paper for CDF is available from ftp://ncgl.gsfc.nasa.gov/pub/cdf/doc/papers/CDF-nssdc.ps.Z.

A mailing list, cdf-users@nssdc.gsfc.nasa.gov, exists for discussion of CDF. To subscribe, please send email to "Majordomo@nssdc.gsfc.nasa.gov" with the command "SUBSCRIBE cdf-users" in the body of your message.

Questions can be directed to cdfsupport@nssdca.gsfc.nasa.gov.

A client-server software layer called CSCDF, which can be used with the CDF library to provide applications access to remote CDF datasets, can be obtained from its author, Hillel Steinberg, by email at zeus@cs.umd.edu.

2. FITS

FITS (Flexible Image Transport System) is the standard data interchange and archival format of the worldwide astronomy community.

The NOST Standard and User's Guide, some software, and test files are available from ftp://nssdc.gsfc.nasa.gov/pub/fits.
The site ftp://fits.cv.nrao.edu/fits (accessible via WWW at http://fits.cv.nrao.edu/) has other software and a different set of test files, and electronic copies of FITS proposals that are under development or in the international approval process.

Archives of news:sci.data.formats and news:sci.astro.fits (which is devoted to discussion of FITS) that are of interest to astronomers can be found in ftp://fits.cv.nrao.edu/fits/traffic/.

A WAIS index that can be searched for FITS information is at http://info.cern.ch:8001/fits.cv.nrao.edu:210/nrao-fits.

If you've searched all these resources and still have questions, you can direct them to fits@nssdca.gsfc.nasa.gov.

3. GRIB

GRIB (GRid In Binary) is the World Meteorological Organization (WMO) standard for gridded meteorological data. Unfortunately it is still not very "standard", as some organizations use their own versions.

A format description for WMO GRIB can be found at ftp://ncardata.ucar.edu/datasets/ds084.5/format_grib.new, and read code is in the file access_grib.f in the same directory. The format description can also be found at ftp://nic.fb4.noaa.gov/pub/nws/nmc/docs/gribguide/guide.txt.

If you need GRIB to read ECMWF data, the above format description, along with the ECMWF-specific parameter table, and a list of differences between WMO and ECMWF GRIB, is in ftp://ncardata.ucar.edu/datasets/ds111.2/format. Read code can be found in ftp://ncardata.ucar.edu/datasets/ds111.2/software.

If all else fails, contact Ilana Stern at ilana@ncar.ucar.edu.

4. HDF

HDF (Hierarchical Data Format) is a self-defining file format for transfer of various types of data between different machines. The HDF library contains interfaces for storing and retrieving compressed or uncompressed raster images with palettes, and an interface for storing and retrieving n-Dimensional scientific datasets together with information about the data, such as labels, units, formats, and scales for all dimensions.

Source code and documentation are on ftp://ftp.ncsa.uiuc.edu/HDF.

Some general information on HDF, including a FAQ, is available from http://www.ncsa.uiuc.edu/SDG/Software/HDF/HDFIntro.html.

The HDF World Wide Web (WWW) information server, with links to the above plus an in-progress HTML reference manual, is on http://hdf.ncsa.uiuc.edu:8001/.

5. netCDF

NetCDF (Network Common Data Form) is an interface for scientific data access which implements a machine-independent, self-describing, extendible file format.

All netCDF information is available via the WWW site http://www.unidata.ucar.edu/packages/netcdf/.

Source code and documentation for the netCDF data access library are available from ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf.tar.Z.

A FAQ is available from http://www.unidata.ucar.edu/packages/netcdf/faq.html or in text from ftp://ftp.unidata.ucar.edu/pub/netcdf/FAQ. Past netCDF support inquiries have been archived and can be searched from gopher://groucho.unidata.ucar.edu/7waissrc%3a/systems/netcdf/unidata-support-netcdf.src.

The netCDF User's Guide is available as a hypertext (HTML) document from http://www.unidata.ucar.edu/packages/netcdf/guide.txn_toc.html, in compressed PostScript at ftp://ftp.unidata.ucar.edu/pub/netcdf/guide.ps.Z, or in source form with the netCDF source distribution.

A recent paper (Jenter and Signell, 1992) which provides a good introduction to netCDF is available as ftp://crusty.er.usgs.gov/pub/netcdf.asce.ps.

A visual browser for netCDF format data files is available from ftp://ftp.unidata.ucar.edu/pub/netcdf/contrib/ncview.tar.Z.

A mailing list, netcdfgroup@unidata.ucar.edu, exists for discussion of the netCDF interface, and for announcements of netCDF news. To subscribe, send a message to majordomo@unidata.ucar.edu containing the line "subscribe netcdfgroup". The archives of netcdfgroup are available from ftp://ftp.unidata.ucar.edu/mail-archives/netcdfgroup, and can be searched at wais://wais.unidata.ucar.edu:210/netcdf-group.src.

For more information, contact support@unidata.ucar.edu.

6. VICAR

VICAR (Video Image Communication and Retrieval) is a collection of image processing programs supported by the Multimission Image Processing Laboratory (MIPL) at the Jet Propulsion Laboratory (JPL), for use in manipulating and analyzing spacecraft images. The image format used by VICAR programs, and for all or most data from JPL-managed missions, is referred to as VICAR format.

An independent third-party description of the VICAR image format is available at ftp://lager.geo.brown.edu/pub/doc/vicar_fmt.txt.

A much more comprehensive and official description of the VICAR image format was recently spotted at http://www-mipl.jpl.nasa.gov/vic_file_fmt.html.

Contact Bob_Deen@iplmail.jpl.nasa.gov for more information.

7. PDS

In recent years, the Planetary Data System (PDS) has been responsible for archiving space mission data on CD-ROM media, using its own self-describing data format, variously known as PDS or ODL (Object Description Language). At least some of the current projects (e.g. Magellan, Galileo) are using the PDS format as a "pointer" to detached VICAR-format imagery on the mission CD-ROM volumes.

The PDS Standards Reference Document can be found at http://stardust.jpl.nasa.gov/stdref/stdref.htm.

For more information, contact pds_operator@jplpds.jpl.nasa.gov.

8. Miscellaneous graphics formats

These formats for storing graphics files -- TIFF, GIF, JPEG, FLI, CGM, and so on -- are more properly discussed in news:comp.graphics.

A small amount of documentation on these and other graphics formats is on ftp://zamenhof.cs.rice.edu/pub/graphics.formats; other archive sites are ftp://ftp.ncsa.uiuc.edu/misc/file.formats/graphics.formats, and ftp://telva.ccu.uniovi.es/pub/graphics/Image.

The site http://www.crs4.it/HTML/LUIGI/MPEG/mpegfaq.html has information on the MPEG format.

The comp.graphics FAQ and resource file have more information on where to find read and conversion programs for these formats.
You can find them at ftp://rtfm.mit.edu/pub/usenet/news.answers.

A good (hardcopy) reference for graphics formats is _Graphics File Formats_, by David C. Kay and John R. Levine (Windcrest Books, ISBN 0-8306-3060-0, about US$30.00 in paperback).

9. SAIF

SAIF (Spatial Archive and Interchange Format) is a Canadian standard for the exchange of geographic data. It uses an object oriented data model, and consists of definitions of the underlying building blocks, including tuples, sets, lists, enumerations, and primitives.

A company has formed to provide tools and training for the SAIF data standard. Safe Software may be contacted by email at infosafe@safe.com or by phone at either (604) 241-4424 or (604) 583-2016. They maintain a WWW page for SAIF at http://www.wimsey.com/~infosafe/saif/saifHome.html which will be continually updated.

The SAIF specification is also available by FTP at ftp://s2k-ftp.cs.berkeley.edu/pub/sequoia/schema/STANDARDS/SAIF and ftp://moon.cecer.army.mil/ogis/related/SAIF3.1.

There is a SAIF Mailing List: send email to "infosafe@safe.com" with the subject "SAIF Request" to be added to the list.

10. SDTS

SDTS (Spatial Data Transfer Standard) is a Federal standard (Federal Information Processing Standard (FIPS) 173) for transfer of geologic and other spatial data.

Documentation and examples are available from the USGS at ftp://sdts.er.usgs.gov/pub/sdts/www/html/sdts.html (for WWW users; this is an html interface to the ftp site, which can also be accessed directly, although not as nicely, at ftp://sdts.er.usgs.gov/pub/sdts).

For more information, contact sdts@usgs.gov.

11. HDS

HDS (Hierarchical Data System) is a freely available database system. It is particularly suited to the storage of large multi-dimensional arrays (with their ancillary data) where efficiency of access is a requirement. It is presently used in astronomy, for storing (in particular) images, spectra and time series.

Documentation, and information on obtaining the source code, is available at http://star-www.rl.ac.uk/ or in a LaTeX document at ftp://starlink-ftp.rl.ac.uk/pub/doc/star-docs/sun92.tex.

12. MedFileS

The Medical File Standard (MedFileS) is a global project coordinated via the internet to provide a standard for the recording of clinical medical information. Anyone may participate in the project or obtain the current standard by e-mail to "medfiles@delphi.com".

Information is obtained by sending commands in the subject line of e-mail messages. The command "send distrib." will provide a full description of the e-mail distribution system. The command "send overview." will provide a document detailing the MedFileS project.

NOTE: an attempt on 19 Dec 1994 to obtain MedFileS failed.

13. CXF

CXF provides representation of chemical substances and queries, including atoms, fragments, molecules, and reactions. Also available are various substance types, including organics, inorganics, polymers, salts, hydrates, multi-component mixtures and biosequences.

The specification is available at ftp://info.cas.org/pub/cxf.

For more information, interested users should contact Thomas Steckert (tsteckert@cas.org) or Joseph Mockus (jmockus@cas.org). Questions and comments also are welcome.

14. JCAMP

JCAMP is a draft standard for spectral data (IR & NMR) and chemical stuff which is related to netCDF.

Some references:

   JCAMP-DX for NMR, A. N. Davies, P. Lampen, Applied Spectroscopy, 1993, 47, 1093-1099

   A proposed European Implementation of the JCAMP-DX Format, D. N. Rutledge, P. Mcintyre, Chemometrics and Intelligent Laboratory Systems, 1992, 16, 95-101

   JCAMP-DX, A standard format for exchange of infrared-spectra in computer readable form, J. G. Grasselli, Pure and Applied Chemistry, 1991, 63, 1781-1792

   JCAMP-CS, A standard exchange format for chemical-structure information, J. Gasteiger, B. M. P. Hendriks, P. Hoever, C. Jochum, H. Somberg, Applied Spectroscopy, 1991, 45, 4-11

Also, see the Dec 1994 issue of Applied Spectroscopy.

A viewer is at http://wwwchem.uwimona.edu.jm:1104/software/jcampdx.html.

The mass spectrometry standard is available at ftp://ftp.pe-nelson.com/pub/andi-MS/ms_doc.zip (192.52.153.11).

15. CIF

CIF (Crystallographic Information File) is becoming standard in the crystallography world and related fields: http://www.icur.ac.uk/cif/home.html

---------------

Subject: Resources for visualization software information

Many visualization software packages exist which are intended to be used with data in one or more of these standard formats. Here are pointers to some lists of information about this software. (Note that this is somewhat outside the scope of this document, which is really only intended to discuss data formats, but I think this will be useful to many.)

Brief descriptions of, and pointers to, software that can be used with netCDF are at http://www.unidata.ucar.edu/packages/netcdf/utilities.html.

A page of links to many scientific visualization and graphics software packages is at http://www.msi.umn.edu/SciVis/Packages/packages.html.

A page of links to both graphics software and various scientific data format descriptions is at http://sslab.colorado.edu:2222/sw_list.html.

An article comparing several scientific visualization techniques and packages is available at http://www.sara.nl/Consumer.Report/Report.html.

---------------

Subject: How to use the data retrieval methods

This section only describes FTP and telnet in any detail; for other methods, FTP sites are given, so you can get information on them yourself.

   1) How to use FTP
   2) How to use telnet
   3) Gopher information
   4) Wais information
   5) WWW information

1. How to use FTP

FTP (File Transfer Protocol) allows transfer of files between two computers which are on the Internet.

To access the FTP areas listed here, at your system prompt type "ftp" followed by the name of the desired system.
For example, to access ncardata.ucar.edu you'd type

   ftp ncardata.ucar.edu

Use "anonymous" as your login and your email address as the password (if requested).

[Note: quotes ("like this") are used to set off names of directories and files, or commands you'd type, and are not part of these names.]

Not all FTP systems accept the same commands, but here's a list of the most useful:

   ls       list files in the current directory.
   cd       change directory, e.g. "cd wx" changes to the wx directory.
   binary   sets binary mode
   ascii    sets ascii mode (the default). Use for retrieving text.
   get      retrieves a file, e.g. "get readme" gets a file called readme.
   bye      exits FTP.

If you can't seem to connect to the site, check to see if it is a telnet site. If it is, follow the instructions in the following section instead.

If you can't FTP from your site, use one of the following ftp-by-mail servers:

   ftpmail@decwrl.dec.com
   ftpmail@src.doc.ic.ac.uk
   ftpmail@cs.uow.edu.au
   ftpmail@grasp.insa-lyon.fr
   ftpmail@ftp.uni-stuttgart.de

Send an e-mail message to the closest address, with the lines:

   reply your_address@some.where     <- with your email address
   connect ncardata.ucar.edu         <- for example
   cd datasets/ds111.2/software
   get access_sun.f
   quit

For complete instructions, send a one-line message reading "help" to the server. Please don't ask me for help!

2. How to use telnet

Type "telnet" followed by the name or IP number of the desired system. These publicly accessible systems generally allow you to log in but put you in a restricted shell, from which only a certain menu of commands is available. The description for the site will include the login to use.

If you can't seem to connect to the site, re-check its description in the document; if it's an FTP site, follow the instructions in the previous section instead.

3. Gopher information

Available by ftp at ftp://rtfm.mit.edu/pub/usenet/news.answers/gopher-faq.

4. Wais information

Available by ftp at ftp://rtfm.mit.edu/pub/usenet/news.answers/wais-faq/getting-started.

5. WWW information

Available by ftp at ftp://rtfm.mit.edu/pub/usenet/news.answers/www/faq. WWW is so easy to use that you might as well just hop in and try it, so ask your sysadmin if you have a WWW browser such as NCSA Mosaic.

---------------

Subject: Why isn't my favorite format on this list?

If you don't see a format you're interested in here, it could be one of three reasons.

First of all, there are a lot of formats which are out of the scope of this newsgroup: it ain't named *sci*.data.formats for nuthin', you know. Formats used in commercial spreadsheet and word-processing software aren't scientific data formats, and aren't discussed in this group.

Second, it may be that nobody has given the FAQ organizer any information on sources for information on that format. So ask the newsgroup -- and if you do get a response, please let me know what it is!

Finally, you may ask on the net, and hear nothing, because the data format description just *isn't* publicly available. For most scientific data formats, this is a Bad Thing, and most archivists and scientists *want* to have their format information available. If you have such information, but don't have resources to make it available, please ask around and see if you can get it into an FTP area or other resource. Please don't publicize private or proprietary formats without the permission of the author, though.

--
  /\     | The immense vacuum of space is neither canister nor upright, and
  \_][   | has no upholstery attachments. -- Bob Rhubart
  \___http://www.ucar.edu/dss/ilana.html ilana@ncar.ucar.edu | Ilana Stern