DRAFT DRAFT DRAFT
Intro to Scientific Programming and Data Processing
Kurt Schwehr
http://schwehr.org
$Id: intro-programming.html,v 1.20 2005/10/16 14:59:41 schwehr Exp $
Oct 2005 UPDATE: - Hi All, I have not had any time to work on
this document in a long time. If you have an improvement and want to
contribute, that is great, but I won't get to make any additions
myself until after I finish my PhD thesis. But I will incorporate
things that people send me (thanks Joost!)
Ah yes... it turns out that trying to write a large document in
straight html is no fun. This thing is getting painful.
-kurt
NOTE: Perhaps I should be writing this in docbook format
instead of raw html circa 1994!
The idea is that this book would be an introductory text for a
class that I would like to teach, and that once students have gone
through such a class, the text would be a starting point for their
own data processing. It is essential that this book not be set
in stone. Paper editions are fine, but this topic will never end.
For most projects you will want to go out and find other resources
after you have worked through the introductory material.
This book will not make you a master programmer, but I hope it will
give you the tools to process some data, spend more time on your
research, and maybe even save you from Fortran.
Hopefully you will come away with tools and some background that you
can build your skill set on in your own direction.
Table of Contents
FIX: this needs a little structure so it is not such a blob.
- Quick Start
- Introduction
- Why Open Source Software
- Cost and Relearning
- The Scientific Method
- What books/references to buy?
- Packages for common tasks
- Choosing which Operating System
- Choosing which Programming Language(s)
- Finding software
- Useful commercial software that I have ignored
- Using the bash shell and Essential Unix commands
- Editing with Emacs
- Revision Control with RCS and CVS
- Document what you do and work on
- Where to get help?
- Basic scripting with bash
- Makefiles and compiling
- Documenting your code with doxygen
- General C++
- C++ Standard Template Library
- C++ Complex Template
- C++ String Template
- C++ Vector Template and File Parsing
- Linking to Fortran 77 code
- Python
- Python unittest
- Python pydoc
- SQL (Structured Query Language) using sqlite
- Basic HTML
- Command line arguments - Using gengetopt
- Making manual pages for programs - help2man
- autoconf - mastering the configure beast
- GNU Scientific Library
- Parallel Processing
- How to write a bug report
- Creating 3D geometry with OpenInventor/Coin
Larger systems/packages - Volume 2?
- gnuplot
- octave - matlab like system
- gdl - similar to IDL
- r
- open dx
Processing datasets by example
Quick Start
Okay, so let's make a couple of quick programs that do something.
Anything will do! So open a shell and start typing... We will first
use some shell tricks to avoid using a text editor. Start by
compiling and running hello world in C and then C++.
cat << EOF > first_c.c
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char *argv[]) {
  printf ("Hello World\n");
  return (EXIT_SUCCESS); /* C style comment */
}
EOF
gcc first_c.c -o first_c -g -Wall
./first_c
You should then see:
Hello World
Time to do the same thing in C++:
cat << EOF > first_cplusplus.C
#include <cstdlib>
#include <iostream>
using namespace std;
int main (int argc, char *argv[]) {
  cout << "Hello World" << endl;
  return (EXIT_SUCCESS); // C++ style comment
}
EOF
g++ first_cplusplus.C -o first_cplusplus -g -Wall
./first_cplusplus
This will again print out:
Hello World
Now a quick jump to a more complicated example in C++ where we load
some data, sort it, and find the smallest and largest values. It
uses the standard template library (STL) vector data type.
cat << EOF > 2_complicated.C
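#include <algorithm>  // sort
#include <cstdlib>    // EXIT_SUCCESS
#include <iostream>
#include <vector>
using namespace std;
int main (int argc, char *argv[]) {
  vector<int> data;
  int new_value;
  // Read integers from standard input until end of file
  while (cin >> new_value) {
    data.push_back(new_value);
  }
  cout << "This is what you entered:" << endl;
  for (size_t i=0; i<data.size(); i++) {
    cout << data[i] << endl;
  }
  sort(data.begin(), data.end());
  cout << "This is the data sorted" << endl;
  for (vector<int>::iterator i=data.begin(); i!=data.end();i++) {
    cout << *i << endl;
  }
  // Once sorted, the first and last elements are the extremes.
  // First with iterators...
  cout << "Minimum value: " << *(data.begin()) << endl
       << "Maximum value: " << *(data.end()-1) << endl;
  // ...and then with array style indexing
  cout << "Minimum value: " << data[0] << endl
       << "Maximum value: " << data[data.size()-1] << endl;
  return (EXIT_SUCCESS);
}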
EOF
cat << EOF > data.int
3
6
1
4
8
EOF
make 2_complicated CXXFLAGS="-g -Wall"
./2_complicated < data.int
Here is what you should get back:
This is what you entered:
3
6
1
4
8
This is the data sorted
1
3
4
6
8
Minimum value: 1
Maximum value: 8
Minimum value: 1
Maximum value: 8
That gives those of you who like to jump right in a few programs to
look at and run. The last example is much more advanced, so do not
worry if it looks kind of crazy.
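For comparison, here is a rough sketch of the same load, sort, and
min/max steps in Python (assuming Python 3). The function names
parse_ints and summarize are made up for this illustration, and the
sum is an extra that the C++ listing does not compute.

```python
def parse_ints(text):
    """Parse whitespace-separated integers, roughly like `cin >> new_value`."""
    return [int(token) for token in text.split()]


def summarize(values):
    """Return the sorted values plus their minimum, maximum, and sum."""
    ordered = sorted(values)
    return ordered, ordered[0], ordered[-1], sum(ordered)


# Same numbers as data.int in the C++ example
data = parse_ints("3\n6\n1\n4\n8\n")
ordered, smallest, largest, total = summarize(data)
print("This is the data sorted:", ordered)
print("Minimum value:", smallest)
print("Maximum value:", largest)
print("Sum:", total)
```

To read from a file instead, pass the contents of the file to
parse_ints, for example parse_ints(open("data.int").read()).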
Introduction
Check out the C tutorial by Peter Shearer: http://mahi.ucsd.edu/shearer/COMPCLASS/c.txt
C is a subset of C++ about 99% of the time. So you can use all of
that document for more help.
This text will hopefully become a beginning tutorial to programming
for geology and geophysics students. There are an infinite number of
ways to approach this topic, so this will reflect my take on how new
students should approach learning how to write programs that will help
with data reduction, analysis, and presentation.
For this document, I will focus on unix style software on Mac OSX.
This will be applicable to working with Linux, NetBSD, FreeBSD, SGI's
IRIX, Sun's SunOS/Solaris, and cygwin. There may be differences if
you are using a system other than Darwin/Mac OSX and it will be up to
you to adapt the material here to those systems.
As a scientist or engineer (I presume that's who you are if you are
reading this), I presume that your goal is to make discoveries and be
able to support and prove your results. A big part of the scientific
method is creating reproducible results. "It works for me" is
definitely not good enough. Just ask the cold fusion folks from
1989. With data analysis and interrogation,
you can strive to make the process repeatable. It may not always be
possible, but that should be the target. If you can give a tar
archive of raw data and scripts to someone, they should be able to
completely follow what you did and end up with the same results.
Others may not be able to get the same raw data. For example, if your
study is on the measurements of a particular supernova and no one else
used the same type of instrument, it will be impossible to go back in
time. However, what you do with the measurements needs to be
repeatable.
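One possible way to let a colleague confirm that they are starting
from the same raw data and scripts is to ship a list of checksums
with the archive. The following is only a sketch under that
assumption (Python 3, standard library only); the function names
are hypothetical and nothing here is specific to any one data format.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=65536):
    """Checksum one file, reading it in chunks so large files are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its checksum."""
    manifest = {}
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            full_path = os.path.join(root, name)
            relative = os.path.relpath(full_path, directory)
            manifest[relative] = sha256_of_file(full_path)
    return manifest
```

Anyone who unpacks the archive can run build_manifest on their copy
and compare the result to the one you sent along.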
A part of this is to try to avoid GUI type systems whenever possible
or to take note of all the parameters and methods used throughout. An
example of non-repeatable processing that is currently
essential is swath sonar ping editing. We can all apply the exact
same mbclean to the data, but when it comes to deleting bad pings by
hand, two people may delete 90% of the same pings, while the other
10% is a judgement call that people make differently.
Why Open Source Software
There is often a tug of war between commercial software and free
software. You almost always have a restricted budget (wow, is it
crazy when you don't!). So choices must be made as to where to put
the money. You will have to balance a number of factors, some of
which are listed here:
- overhead - keeping the lights on and the network up
- staff - people cost money. what is the right number of people
for a project?
- commercial software
- computer hardware
- research consumables (ship time, chemicals, instruments, etc.)
- time - if I throw a large amount of money at some company, can
they make my problem go away?
Do you buy the commercial software? Hire a consultant to adapt
open source software to your needs? Pay the commercial vendor to add
a needed feature? Write something in house from scratch? Buy a
better computer or optimize the software? Always tough choices.
Cost and Relearning
As a practical measure, I will try to avoid commercial software as
much as possible. Each scientist will have a different budget
situation which may preclude purchasing and maintaining expensive
software packages, and this situation can vary dramatically during your
career. By choosing free and open software, I hope to maximize the
number and quality of tools available to you while minimizing the
number of times that you will have to learn a new tool to re-solve an
old problem. Early in my career, I was able to use a number of expensive
software tools and libraries. I then tried to convince a number of
universities to use the software that I built on top of them. Their
response was that licensing the libraries and tools
needed by my free code would cost more than their graduate students
cost them. As a result, that software was long ago shelved, never to
see the light of day again.
Then there are the days that you find out some critical piece of
commercial software you now rely on is gone. The reasons for this
happening are numerous. For example:
- The vendor will not support your OS
- The vendor went out of business or was swallowed by a not so
friendly business
- Your vendor went insane (e.g. SCO)
- They redid the software in a way that is not compatible and you
can't upgrade.
When you are working with open source software, you have the option to
do with the code as you need. If you need to hire a software engineer
to maintain some old piece of critical code, you have the ability to
do so. If the vendor is gone, you, a colleague, or the community can
take responsibility for a body of code.
The Scientific Method
Parallel to the arguments of cost and troubles with commercial vendors
is the scientific method. Other scientists need to be able to
reproduce your results and know exactly how the data were processed.
There is nothing like a binary only software package or library to
hide what is really going on. In what version of the software was a
critical bug really fixed? With open source, you know you have the
option to see how the algorithm works under the hood. You probably
really don't want to see in there, but when the day comes that you
must, you have the option. When I think there is something wrong
inside of Matlab or IDL, I can only bug the vendor and cross my
fingers that they will give me a decent answer.
I think this section deserves an entire essay, but that is all I will
say for now.
What books/references to buy?
It would be nice to say that everything you need is online in
electronic form, but that just isn't so. Sometimes you just have to
go with dead trees. Computer screens still don't have the utility of
a good book. Buying books is also a great way to support authors who
put huge amounts of energy into writing software and documenting it.
- Programming with
GNU Software - This is a good summary book that covers emacs,
C/C++, make, etc. This book is what I would have liked to have
written here. So I guess I'll have to go above and beyond that book
and focus more on scientific programming.
- GNU Emacs yellowbook
- Bash book
- C++ books
- Python books
- Python: Visual QuickStart Guide, by Chris Fehily, 2001. And
this series of books is inexpensive.
- Unix books
- Advanced Programming in the Unix Environment, Stevens, 1992. He
passed away in 1999, but his book will be a classic for a long time.
- The Art of UNIX Programming, Eric Raymond, 2003
- General System Administration: In order to survive
working with unix like systems, it is good to have a reference to get
you through system tasks. Many days man pages are just not enough.
I learned system admin on my own using the 2nd edition of this book.
I now have the 3rd edition too.
- UNIX System Administration Handbook (3rd Edition) by
Nemeth, Snyder, Seebass, Hein, 2000. (The Purple Book. Last
edition was the Red Book.)
More advanced books:
- Effective C++, 2nd Ed., Scott Meyers, 1998.
- Effective STL, Scott Meyers, 2001.
- Expert C Programming, Deep C Secrets, by Peter van der Linden,
1994. Who has my copy of this book?!?!?
Packages for common tasks
This section will talk about what programs you can use for which
tasks.
- su - seismic unix for processing seismics on land and at sea
- xsonar - sidescan sonar
- gmt/mbsystem - multibeam sonar, towed magnetics, and gravity
- gimp - image editing
Choosing which Operating System
This text focuses on Mac OSX 10.3 and newer.
Often this will not be your choice. You are stuck with what you have
for any number of reasons or you don't want to learn anything new. If
you have the opportunity to switch, here is my take on the options.
Caution! Opinionated sections! Not that the rest of this text isn't
heavily laced with opinions.
- Microsoft - Only use this if you are forced into it.
- MS-DOS - Yup, this is still in lots of devices like sonars (ugh)
- Windows 3.x - Yes, people still use these
- Windows 95 - Yes, people still use this
- Windows 98 - For those who like crashing
- Windows NT - Clunky but stable
- Windows Me - Ouch
- Windows 2000 - Lots of people swear by this
- Windows XP - Not my choice, but many use 2003 and XP Pro
- Windows CE (Wince) - For little devices.
- Linux - The swiss army knife. Embedded to the top super
computers of the world. Pretty cool. I like Knoppix and Mandrake. 400+ Linux Distributions
- Mac OSX 10.3 and newer - Currently my personal preference.
FreeBSD, Fink/DebianTools, plus Photoshop and Illustrator. Like the NeXT,
but way better. If you need to run Windows software, there is always
VirtualPC (reboot the PC without losing everything else you were doing!)
- SunOS - Shoot me now. Please do not make me administer another
solaris box ever again.
- SGI/IRIX - Used to be my favorite till 2000. They sold
OpenInventor and Explorer. Their compilers are frustrating. But it
works well for the 21x8 foot screen in the viz center. Hardware
tech support folks are top notch.
- HPUX, OpenVMS, Ultix, Tru64 - Not a fan of Carly. She
kills good technology.
- AIX - Only for those who already know they need it.
- BeOS - Almost the coolest os. What is the new name now that Be collapsed?
- NetBSD - Very portable
- OpenBSD - Very secure
- FreeBSD - Very stable
- PalmOS - Useful for in the field
- Are there any other major choices. Not including VxWorks, QNX, and?
- Don't forget embedded microcontrollers with no OS or uClinux
Finding software
So you have a shiny new computer or inherited some old beast from the
dark ages of the last century. How do you find the software to make
it do your thesis for you???
- Fink - Only for Mac
OSX. FinkCommander is the bomb. It would not be right to leave out
DarwinPorts
- Fresh Meat - summaries and
searches for software.
- FSF Free Software
Directory - The home of GNU and Richard Stallman
- VersionTracker - More
commercially oriented, but some free software
- Yes, there is Google too.
Choosing which Programming Language
There are hundreds of programming languages out there. Here are my
opinions on programming languages. There are many reasons for
choosing one, so take my list and descriptions with a salt dome.
First, if you want to see the simplest of comparisons between
languages, look at the hello world
page. This page has about 200 programming languages. You'll see
pretty quickly that it is missing many a language. For example, there
is no Arc Macro Language (AML) that is used by ESRI's Arc/Info.
If you want or must choose a different set of programming languages
than described here, you will need to go get some different docs.
This is not necessarily a bad thing, just you won't get much help
beyond the introduction section.
Common programming languages and eventually my take on each one. They
have many strengths and weaknesses. My main philosophy is to not lock
yourself into one platform. Learn general skills that apply no matter
what you end up doing. If you just learn the Microsoft world of
VC++, C#, and VB, you are missing out.
- python - My personal favorite. Getting to be really nice for
data processing when you have fink to help install all the nice
libraries that exist.
- c++ - This is the language you use for heavy hitting monster
projects. Just don't abuse the language or you will end up with a
mess. Worthwhile to know.
- bash/sh/zsh/ksh - Bash is my shell of choice. Good for quick
tasks and system scripts. MUCH better than tcsh. I am not sure why
you would want zsh or ksh over bash.
- SQL - You MUST learn some basic SQL.
- lisp - Good to learn a bit if you are a big emacs user or if you are
an AI nerd. cdr, cdr, cdr.
- perl - Often ends up with crazy code. I get tired of $variable
names. But it does get the job done.
- ruby - fancy schmancy. Supposed to be interesting.
- php - The swiss army knife of web programming. Easy to learn
and easy to talk to databases.
- c - Unless you are doing super low level OS stuff or embedded
systems, use C++ instead.
- objc - If you want to be a hard core Mac OSX / Cocoa
programmer, you should learn Objective-C. Otherwise, just ignore it.
- java - Not a bad language. Lots of helpful libraries. This is
not the end all, be all language that many claim. But it has a place.
- f77 - Stop writing Fortran 77 code. Wrap your fortran and call
it from python. Then start writing python.
- f90/f95 - Until recently, you paid a lot of money for really
horrible compilers and ended up with code that will not build on
other compilers. Guess what, with proper data structures, C and
C++, beat fortran for speed any day. GNU f90 brings the possibility
of decent fortran in the future, but go learn something else.
- ada - Don't use ADA unless the DOD makes you.
- csh/tcsh - STOP. Do not bother learning csh/tcsh. Do yourself
a favor and switch to bash RIGHT NOW.
- AppleScript - it is worthwhile to learn a little if you are a
mac user, but go learn python if you want to do a lot. AppleScript
is not portable, so do not depend on it heavily.
- Basic - Put this in the trash right now.
That is enough languages for now. You will get different opinions on
the above list depending on who you talk to. There are also a billion
languages specific to different programs like Matlab, Mathematica,
IDL, SAS, Arc Macro Language, etc.
Useful commercial software that I have ignored
These are programs and libraries that can be extremely powerful and
useful for certain problems, but that I have not covered. You will
need to look elsewhere for information on them. They are included
just for completeness. I may have given some arguments above for why
not to use commercial software, but I still do use a ton of it. I
keep trying to estimate how much has been spent on me personally by my
employers. My current estimate is on the order of greater than half a
million dollars. It all adds up! There is no particular order to the
madness...
- Commercial compilers for C, C++, Fortran, Java, VisualJunk, C#,
etc. - Just because you pay lots of money, do not expect consistency
in quality or consistency between compilers.
- GIS software
- Arc/Info, Arc/View, etc
- MapInfo
- Math software
- Matlab
- IDL
- Mathematica
- Maple
- S, S/Plus stats stuff
- 3D Graphics: TGS Inventor, NAG Explorer (formerly SGI Explorer)
- Visualization: Fledermaus
- Rendering/Design: Maya, Lightwave, Renderman, etc.
- CAD: ProEngineer, AutoCAD, etc.
- Geology/geophysics specific
- Seismic: Focus, &, &, &
- Rockware tools
- etc.
Using the bash shell and Essential Unix commands
In this section, we need to cover how to get around in the shell, run
some programs, and look at files. You really need to go get and read
a basic unix book. FIX: I need to find one that is affordable, short,
and "easy". Does anything like that exist? However, to keep this
text self contained, we will cover the basics of unix shells. If you
remember using PC-DOS or MS-DOS, delete that knowledge from your brain
right now. Better to start from scratch!
Dealing with basic files. How to copy, move/rename, and remove
files.
There are some dangerous things about naming files. Here should be
some guidelines on how not to get in trouble. Stick to [a-zA-Z_-] in
your filenames. Do NOT use spaces in filenames. Do not give
files names that differ only in capitalization. This works on many
systems, but it will kill you on Mac OS X and Windows, where the
filesystem usually ignores case.
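As a rough illustration of these guidelines, the following Python 3
sketch flags risky names in a list of filenames. The helper names are
hypothetical. Note one assumption: the pattern also allows digits and
the dot, which go beyond the character set listed above but appear in
this document's own example filenames such as 2_complicated.C.

```python
import re

# Letters, digits, dot, underscore, and dash only (digits and dot are
# an assumption added for this sketch).
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def unsafe_names(names):
    """Return the names containing spaces or other risky characters."""
    return [name for name in names if not SAFE_NAME.fullmatch(name)]


def case_collisions(names):
    """Return groups of names that differ only in capitalization."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]
```

You could feed either function the result of os.listdir(".") to check
the current directory.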
How to view files with less. "Less is more"
Grep grep grep. It would be a scary world without grep.
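To show what grep does at its core, here is a tiny Python 3 sketch
that keeps the lines matching a regular expression. It is only an
illustration of the idea (the function name is made up for this
example) and covers none of grep's options.

```python
import re


def grep(pattern, lines):
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


output = ["Minimum value: 1", "Maximum value: 8"]
for line in grep("Max", output):
    print(line)
```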
Dealing with columns and rows of data. tail, head, awk'ing of
columns. If your awk is longer than one line, stop now. Put down awk
and go use python or perl.
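When the awk one-liner stops being one line, the same column and row
handling might look like this in Python 3. This is a minimal sketch
with hypothetical helper names; it assumes whitespace-separated
fields unless a separator is given.

```python
def column(lines, index, sep=None):
    """Return field `index` (0-based) from every line that has it.

    Roughly what awk '{print $N}' does, with N = index + 1.
    """
    values = []
    for line in lines:
        fields = line.split(sep)
        if len(fields) > index:
            values.append(fields[index])
    return values


def head(lines, count=10):
    """Return the first `count` lines."""
    return lines[:count]


def tail(lines, count=10):
    """Return the last `count` lines."""
    return lines[-count:] if count > 0 else []
```

For a file on disk, open("data.txt").read().splitlines() gives a list
of lines suitable for all three functions (data.txt is just a
placeholder name).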
Editing files with emacs
Emacs and vi are both tricky editors when you are just starting out.
However, after a few days in emacs, things will get easier and the
power of having emacs as your text editor is amazing. I know many
people really favor integrated development environments, but I have
been able to use emacs for many languages over the last 14 years,
while I have learned lots of GUIs for different development
environments that were only good for certain platforms or languages.
Yes, emacs is way more powerful than vi (even vim).
In addition to using the drop down menus from the top of the window,
you will want to know some basic emacs key commands. When you see a
C-, that means hold down the CTRL key and press the key
that follows. The M- is the META key. Don't see a meta
key on the keyboard? Then you can press and RELEASE the ESC
key. Then type the letter that follows the "-".
- C-x C-s -- Save File
- C-x C-c -- Exit Emacs
- C-x C-f -- Open a new file
- C-x b -- Jump to a different buffer
- M-x compile -- Start compiling programs
- C-x ` -- Jump to the next compiler error
- C-x v v -- Check in and out revision control (RCS or CVS)
- C-c C-c -- Finish a revision control comment
- M-x gdb -- Start the debugger
- C-s -- Search for a string
- M-$ -- Spell check the current word
- M-x tetris -- Waste some time
- M-x doctor -- Talk to a shrink
- M-x shell -- Start a shell in an emacs window
- C-x 2 -- Split the window
- C-x 1 -- Go back to just one window
- C-x 0 -- Close the current window
- C-x 5 2 -- Open a new frame
- C-x 5 0 -- Close current frame
Need to talk about creating a .emacs file.
Revision Control with RCS and CVS
There is a CVS book that is available in print or here.
Here is the
CVS
manual.
See my document here for now: Revision Control
using CVS (Concurrent Versions System)
Of course, Aurelio will tell me that I really need to get on the
Subversion (svn) band wagon.
Where to get help?
RTFM == Read The F'ing Manual. This is usually what you do not want
to hear from someone that you are asking help from. But if you are
stuck with figuring it out for yourself, here are some things you can do.
- man and man -k
- info - works but clunky
- google
- Usenet groups (aka google groups to many)
- FAQ's
- Mailing lists
- Bribery
- AIM - AOL Instant Messenger
- IRC - Internet Relay Chat
Document what you do and work on
Create a text file in which you log what you do. Make an entry, in
order, for each time or day that you work. Watch out for proprietary
programs. If you do use a proprietary program, make sure that you can
export all your logs to flat ascii. Programs go away and file formats
change. Your research can easily still be important 40 years after
you did it.
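If typing the date by hand gets old, a few lines of Python 3 can
stamp each entry for you. This is only a sketch: the function names
and the timestamp layout are choices made for this example, and the
log stays a flat ascii file.

```python
import time


def format_entry(text, when=None):
    """Return one log line: a timestamp, two spaces, then the note."""
    stamp = time.strftime("%Y-%m-%d %H:%M", when or time.localtime())
    return "%s  %s\n" % (stamp, text)


def append_entry(path, text):
    """Add a timestamped note to the end of the log file."""
    with open(path, "a") as log:
        log.write(format_entry(text))
```

Calling append_entry("worklog.txt", "ran mbclean on line 12") would
add one line to a file named worklog.txt (both the file name and the
note are placeholders).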
Keep a journal/notebook too. I highly recommend the bound art books
that you can get from your campus bookstore. What you do is valuable.
Treat your work and yourself with respect. This is a place to draw,
doodle, write ideas/frustrations/successes.
Basic scripting with bash
- csh - Do NOT use csh. You have tcsh. See
tcsh for why you shouldn't even use tcsh. csh is for masochists.
- tcsh - Why would you use tcsh when you have bash? If you
are going to learn a shell, learn one that will really work well for
you. At first glance, tcsh and bash are the same, but deep down, bash
is so much better than tcsh. I switched from tcsh to bash in 2000.
You should switch too!
- sh - Use sh when you have to write scripts that must be
portable no matter what. Do not make it your shell unless you like
pain.
- korn - (pdksh too) Okay, so I haven't used ksh really.
Might be okay, but it's not used by many that I've run into.
- python/perl - If you're adventurous, these could be
pretty productive to use.
Here is the official
manual for bash.
So now that we have all agreed to use bash, we can go on with
life. How about some simple examples? Here is a handy one
that I use a lot. I often need to batch convert digital images from
one format to another. ImageMagick has a nice program called
convert that makes it pretty easy. Now we just need to call
that program for each image in a directory. Here is the code:
for image in *.tif; do
  # ${image%%.tif} strips the trailing .tif from the name
  echo "Converting $image to ${image%%.tif}.png"
  convert "$image" "${image%%.tif}.png"
done
Here is what you might see if you type these commands into the bash
shell with 3 tif files in the current directory called 1.tif, 2.tif,
and 3.tif:
Converting 1.tif to 1.png
Converting 2.tif to 2.png
Converting 3.tif to 3.png
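For comparison, the loop above could be sketched in Python 3 as
follows. The function names are hypothetical, and the sketch assumes
that ImageMagick's convert program is installed and on your PATH.
Nothing runs until you call convert_all yourself.

```python
import glob
import os
import subprocess


def png_name(tif_name):
    """Mirror ${image%%.tif}.png: swap a trailing .tif for .png."""
    root, extension = os.path.splitext(tif_name)
    if extension != ".tif":
        raise ValueError("not a .tif file: %s" % tif_name)
    return root + ".png"


def convert_all(directory="."):
    """Run ImageMagick's convert on every .tif file in a directory."""
    for image in sorted(glob.glob(os.path.join(directory, "*.tif"))):
        target = png_name(image)
        print("Converting %s to %s" % (image, target))
        # Passing a list avoids shell quoting problems with odd names
        subprocess.check_call(["convert", image, target])
```

For a four line job like this one the bash version is hard to beat.
The Python form starts to pay off once you need error handling or
more complicated file naming.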
Makefiles and compiling
Here is the official
manual for GNU make.
Write lots about how cool make is.
Documenting your code with doxygen
Here is the official
manual for doxygen.
Doxygen makes great documentation of your code much easier. No
fortran support yet, but it is great for lots of other languages!
General C++
The text is not going to teach you in depth about programming in C++.
It will just get you going with a few hints and ideas of what to look
for. A good online resource for C++ programming is C++
Annotations Version 6.1.1b
C++ Standard Template Library
C++ Complex Template
C++ String Template
C++ Vector Template and File Parsing
Linking to Fortran 77 code
Python
Basic HTML
You really should know at least the basics of HTML, enough to make
a quick web page. Graphical programs can make pretty pages, but for
pure content, consider writing just a little html yourself.
First a couple ways to make html without having to know about html
tags. The one that most people know is MS Word and Powerpoint.
Command line arguments - Using gengetopt
Here is the official
manual for gengetopt.
Seen programs like GNU grep and tar that have --help? Want that for
your programs? Then gengetopt is the easiest way to go.
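gengetopt generates C code, so it does not help with scripts. To show
the same idea of declaring your options once and getting --help for
free, here is a sketch using argparse from the Python 3 standard
library. The program name despike and its options are invented for
this example.

```python
import argparse


def build_parser():
    """Describe the command line of a hypothetical 'despike' program."""
    parser = argparse.ArgumentParser(
        prog="despike",
        description="Remove spikes from a column of numbers.")
    parser.add_argument("infile", help="file with one value per line")
    parser.add_argument("-t", "--threshold", type=float, default=3.0,
                        help="spike threshold in standard deviations")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print progress messages")
    return parser


# Parse an example command line instead of sys.argv for the demo
args = build_parser().parse_args(["--threshold", "2.5", "data.int"])
print(args.infile, args.threshold, args.verbose)
```

In a real script you would call build_parser().parse_args() with no
arguments so that it reads sys.argv, and running the script with
--help would then print the generated usage text.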
Making manual pages for programs - help2man
Here is the official
manual for help2man.
Writing groff based man pages is no fun, so let's use help2man to make
life easier! help2man uses the program's --help to get the basic man
page and then you can insert small pieces.
autoconf - mastering the configure beast
I found out you link to http://mdcc.cx/autobook from
http://schwehr.org/papers/intro-programming.html . That's cool :)
Perhaps you could add links to copies of recent autoconf, automake and
libtool info files: http://www.gnu.org/software/autoconf/manual/
http://sources.redhat.com/automake/automake.html
and
http://www.gnu.org/software/libtool/manual.html
. These are more
actively maintained. (The Unofficial Autobook text is getting obsolete,
just as fast as the official text... :( )
Your document looks very promissing, thanks for your work!
Bye,
Joost
The Autobook
describes the whole automake/autoconf/libtool process. However this
book is getting "old" and has not been updated since 2001. That is a
long time for tools like this that have to adapt to every
operating system and tool release from every major vendor.
You might prefer to use the Unofficial AutoBook that is being
more actively maintained or the autotut (which seems to be
a cranky link so consider the google cache option).
Getting started with autoconf can be overwhelming, but if you are
writing a lot of programs, this will help you get them running on lots
of platforms with ease. However, the initial setup is a big task.
I have yet to get really good at this, so a section on it may be a
long time coming.
GNU Scientific Library (GSL)
Here is the GSL
manual.
This should be the staple starting point for data analysis
algorithms.
Basic debugging with GDB
FIX: Write up a little sample program that hits an assert and debug
the assert.
gdb foo
run
# you get some assert triggered
backtrace
up # do this until you are in a stack frame that has something
# useful... the bottom couple will be the assert mechanisms
print myTroubleSomeVariable
Parallel Processing
This is a monster can of worms with a multitude of solutions. I
recommend using one of pthreads and/or lam-mpi. Try to avoid locking
yourself into a solution. Some programs will run faster on
a single machine than on multiple machines, and you may not be able
to have more than one CPU at a particular time. The overhead can
really hurt some applications.
Do not let vendors lock you into proprietary APIs.
How to write a bug report
In the process of writing and using your software, you will run into
times when there is probably a bug in a library or program by
someone else. Here are some guidelines that will help you to resolve
the problem faster.
- Be polite The problem may turn out to be in your code and
other people are giving you their time to look at this bug report.
With open source software, they very likely are not paid for their
work.
- Write clearly Correct English is very important. Make
sure to spell check your report. There is nothing like reading through
a grammatically incorrect and misspelled email to make you not want to
help someone. (Hey, maybe I should spell check this doc???)
- Describe the problem up front Describe the problem in a
few short sentences right up front in the report. Don't bury it.
- Identify your environment Many bugs are specific to the
environment you are running. Make sure to specify the operating
system and version, what type of CPU and any major changes or unusual
circumstances that there are with the system. How much physical RAM
is installed. Note that your hard disk space is NOT RAM!!!
Common utilities to tell about your system are: top, uname -a, hinv
(SGI's only), sysinfo (Solaris sometimes), cat /etc/redhat-release
(redhat, fedora, mandrake).
- How is the program linked If you are reporting a problem
against a linked library, give them the context that your program is
in with all the libraries the program is linked against.
- Mac OSX otool -L programname
- Other Unix's ldd programname
- Stack trace If the system is crashing down in the
library in question, include a stack trace. Start up the GNU debugger
like this: gdb programname. Run the program and when it
fails, type: backtrace. This lists all the function/method
calls and their arguments down to where the system failed.
- Example Case Try to create the smallest test case that
causes the system to fail. You want a fellow developer to be able to
recreate the same problem on their system if at all possible. You are
more likely to get a quick solution then.
- There has got to be some more good ideas, right?
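For the environment part of a report, a short script can collect the
basics so that you do not forget any. This Python 3 sketch uses the
standard platform module, which reports much the same information as
uname -a. The report layout and function name are just choices made
for this example.

```python
import platform
import sys


def environment_report():
    """Return a few lines describing the machine this runs on."""
    info = platform.uname()
    lines = [
        "System:   %s %s" % (info.system, info.release),
        "Machine:  %s" % info.machine,
        "Python:   %s" % sys.version.split()[0],
    ]
    return "\n".join(lines)


print(environment_report())
```

Paste the output near the top of the report, and still add anything
it cannot know about, such as installed RAM or unusual hardware.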
That is it for now, folks. -kurt