What is the best way to analyze crashes on Linux?
We expect to build the software and deliver a release version to testers. The testers may not be able to remember how to reproduce the crash, or the crash may be totally intermittent. They also will not have a development environment on their machines. The software is written in C/C++ and compiled into native machine code for distribution.
I believe what you're looking for is this: http://stackoverflow.com/questions/77005/how-to-generate-a-stacktrace-when-my-gcc-c-app-crashes
If you have space on the disk, let the application create its coredump when it crashes.
ulimit -c unlimited
Later you can debug it with GDB.
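Since testers may forget to run ulimit in their shell, the application can also raise the limit itself at startup. Here is a minimal sketch using getrlimit/setrlimit; the function name enable_core_dumps is just something I made up for the example.

```cpp
#include <sys/resource.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Raise the soft core-file size limit to the hard limit for this process.
// Returns true on success. (enable_core_dumps is a hypothetical helper name.)
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        std::fprintf(stderr, "getrlimit: %s\n", std::strerror(errno));
        return false;
    }
    // An unprivileged process cannot go above the hard limit, so use it as-is.
    lim.rlim_cur = lim.rlim_max;
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        std::fprintf(stderr, "setrlimit: %s\n", std::strerror(errno));
        return false;
    }
    return true;
}
```

Call it early in main(). If the hard limit is 0 this will not help, and where the core file ends up depends on the system's /proc/sys/kernel/core_pattern setting. Once you have the file, load it together with the matching binary: gdb <executable> <corefile>.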
Aside from coredumping and stacktracing as already noted - make sure you can easily identify which versions of your executable people are running, and be able to answer what version of each source file goes into what binary version (i.e. spend some time with your source code control system and your build scripts). Otherwise neither a core file nor a stack trace is going to help.
Core dumps are helpful, but they don't always tell you everything you want to know about how you ended up in the error condition.
Logging actions, inputs, and events can be very helpful. If each run of your program is logged in such a way that, after a crash, a developer can get the log and reproduce the error, tracking down the cause becomes much easier.
If possible, build your programs with full debug symbols and then strip them if you don't want, or aren't allowed, to ship them in your release versions. Keep a copy of each released version with its debug symbols, so that you can pair it with the core file when you need to debug a crash.
In addition to generating a stacktrace in a SIGSEGV handler and/or generating a core dump, it may also be useful to find where an uncaught C++ exception is thrown.
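One way to do that is a terminate handler that prints a backtrace. This is only a sketch and assumes glibc (for execinfo.h); the function names are made up for the example. Whether the stack still contains the throw site when the handler runs is implementation-defined. With GCC/libstdc++ it usually does, because the stack is not unwound when no matching handler exists.

```cpp
#include <execinfo.h>
#include <unistd.h>
#include <cstdlib>
#include <exception>

// Write the current call stack to stderr using glibc's backtrace facilities.
// Returns the number of frames captured.
int dump_stack_to_stderr() {
    void* frames[64];
    const int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    return count;
}

// Runs when an exception escapes without being caught.
[[noreturn]] void on_terminate() {
    dump_stack_to_stderr();
    std::abort();  // still produces a core dump if core dumps are enabled
}

// Call once, early in main().
void install_terminate_handler() {
    std::set_terminate(on_terminate);
}
```

Linking with -rdynamic gives backtrace_symbols_fd function names instead of bare addresses. If you can run the program under GDB, the catch throw command stops at the throw itself, which is often simpler.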