If you think you have found a problem in OPS or Nanos, the first thing to do is verify where the bug lies. Generally, if you turn on the -d flag and study the bottom of the output, you can tell whether it is in user or kernel:
general protection fault in user mode, rip 0x13783afe7
Clearly in this example we see a GPF in user.
At this moment, we do not have interactive debugging support with ops, but we plan on integrating it soon. For now, if you wish to have symbol access within your user program, the following conditions must be met:
Ensure your program is statically linked.
Ensure you have debugging symbols to begin with:
With C, you can do this by compiling with the -g flag, as in the example below.
Example
For this example we will examine a segfault (that we purposely injected):
#include <stdio.h>
#include <stdlib.h>

void mybad() {
    int x = 1;
    char *stuff = "asdf";
    printf("about to die\n");
    *(int*)0 = 0;
}

int main(void) {
    mybad();
    printf("should not get here\n");
    return 0;
}
We compile with debugging symbols and link statically:
cc main.c -static -g -o main
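Before going further, it can be worth confirming that the binary really is statically linked and carries debug information. The following is a minimal sketch, assuming `file` and `readelf` (from binutils) are installed on the build host; `check_debug_binary` is a hypothetical helper name and not part of ops:

```shell
#!/bin/sh
# check_debug_binary: hypothetical helper, not part of ops.
# Reports whether an ELF binary is statically linked and has DWARF debug info.
check_debug_binary() {
  bin="$1"
  if [ ! -f "$bin" ]; then
    echo "missing: $bin"
    return 2
  fi
  status=0
  # `file` prints "statically linked" for binaries built with -static.
  if file "$bin" | grep -q 'statically linked'; then
    echo "static: yes"
  else
    echo "static: no"
    status=1
  fi
  # Binaries compiled with -g contain a .debug_info section.
  if readelf -S "$bin" 2>/dev/null | grep -q '\.debug_info'; then
    echo "debug info: yes"
  else
    echo "debug info: no"
    status=1
  fi
  return "$status"
}

# Usage, once main has been built: check_debug_binary ./main
```

If either check reports "no", rebuild with both -static and -g before continuing.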
First (since we are missing interactive debug support in ops) you need to modify ops to manually add the noaslr flag in lepton/image.go:
m.AddDebugFlag("noaslr", 't')
This is important because otherwise we randomize the location of the .text and other parts of your program.
Next, we'll run without accel support:
ops run --accel=false main
Then we let it crash.
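Once the crash line has been printed, the reported rip can often be mapped back to a source line without a debugger. This is only a sketch: it assumes the crash line has the same shape as the one shown at the top of this page, that `addr2line` (binutils) is available, and that the program was loaded at its link address (which is what the noaslr flag is for). `extract_rip` and `resolve_addr` are hypothetical helper names:

```shell
#!/bin/sh
# Pull the faulting instruction pointer out of a crash line such as
#   "general protection fault in user mode, rip 0x13783afe7"
extract_rip() {
  sed -n 's/.*rip \(0x[0-9a-fA-F]*\).*/\1/p'
}

# Map an address to a function name and file:line using the debug info
# in the binary given as the first argument.
resolve_addr() {
  addr2line -f -C -e "$1" "$2"
}

rip=$(printf '%s\n' 'general protection fault in user mode, rip 0x13783afe7' | extract_rip)
echo "$rip"

# With the crashing binary at hand: resolve_addr ./main "$rip"
```

If the address does not fall inside the binary's text section, `addr2line` prints `??`, which usually means the address came from a run where ASLR was still enabled.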
Now let's manually start qemu with gdb support: (Not all of this is necessary but definitely ensure your 'drive file' line matches where your disk image is)
Does this work on the nightly build? Running '-n' will run ops with whatever was in the master branch last night.
Reproducible steps
GUI Debugger in VSCode
This part of the debugging guide shows how to use the GNU Debugger (GDB) in combination with VSCode to get better visualization of the debugging process. It requires that the Native Debug extension is installed.
Prerequisite
Launch the application in debug mode with ops:
$ ops run -d main
booting ~/.ops/images/main.img ...
You have disabled hardware acceleration
Waiting for gdb connection. Connect to qemu through "(gdb) target remote localhost:1234"
See further instructions in https://nanovms.gitbook.io/ops/debugging
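To confirm that the gdb stub is reachable before configuring VSCode, the same connection can be made from a plain gdb session. The sketch below writes a small gdb command file; the file name `attach.gdb` and the choice of `mybad` as the breakpoint (taken from the example program above) are assumptions for illustration:

```shell
#!/bin/sh
# Write a gdb command file that loads symbols from the local binary,
# connects to the qemu gdb stub, and stops at the mybad function.
cat > attach.gdb <<'EOF'
file main
target remote localhost:1234
break mybad
continue
EOF

# Then, in the directory containing main: gdb -x attach.gdb
```

Breakpoints set by symbol name only land in the right place when the program is loaded at the addresses recorded in the binary, so the earlier note about address randomization applies here as well.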
Attach the Debugger
Click the Run icon on the left sidebar (alternatively use Ctrl+Shift+D) and then create a launch.json file.
Select GDB as the environment. This will create an autogenerated launch.json file.
Replace the contents of the launch.json file with the following:
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug",
            "type": "gdb",
            "request": "attach",
            "executable": "${workspaceFolder}/main",
            "target": "localhost:1234",
            "remote": true,
            "cwd": "${workspaceRoot}",
            "valuesFormatting": "parseText"
        }
    ]
}
Set a breakpoint in the source file (main.c), open the Run view from the left sidebar (alternatively use Ctrl+Shift+D), and click the > Debug icon to start the debugging session.
It is now possible to use the debugging palette to debug the application code.
Dump
Ops provides a tool that allows you to inspect image crash logs and image manifests. The dump tool binary is inside the ops version directory (~/.ops/<ops_version>/dump). Make sure you use the dump tool from the same ops version you used to build the image you want to analyze.
If the application crashes, the unikernel writes the error stack to a log file before exiting. You can view the log content by running the command dump -l <image_path>.
Nanos has a FUSE driver which allows us to mount a TFS image on the host filesystem. This helps with rapid development, makes debugging easier, and opens the door to other interesting tools.
If you'd like to read more check out https://nanovms.com/dev/tutorials/nanos-unikernel-has-fuse-driver-for-tfs .
To try it out on Linux:
sudo apt-get install libfuse-dev
To try it out on Mac:
brew install macfuse
Then you simply create a mount point and mount your desired image:
Nanos supports core dumps. They are turned off by default and are enabled by setting the 'coredumplimit' config variable to a value greater than 0. Also ensure that the volume is large enough to hold the core dump.
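A sketch of what such a configuration might look like follows. The placement of coredumplimit under ManifestPassthrough, the use of BaseVolumeSz to size the volume, and both size values are assumptions; check them against the ops configuration reference for your version:

```shell
#!/bin/sh
# Write an ops config that enables core dumps (assumed key placement)
# and enlarges the base volume so the dump has room (assumed sizes).
cat > config.json <<'EOF'
{
  "ManifestPassthrough": {
    "coredumplimit": "64M"
  },
  "BaseVolumeSz": "200M"
}
EOF

# Then run the program with this config: ops run -c config.json main
```

The volume size should comfortably exceed the core dump limit, since the image also has to hold the program and its files.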
➜ gdb -ex bt -ex quit main core
GNU gdb (Ubuntu 11.1-0ubuntu2) 11.1
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main...
[New LWP 2]
Core was generated by `main'.
Program terminated with signal SIGABRT, Aborted.
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=761004893632) at pthread_kill.c:44
44 pthread_kill.c: No such file or directory.
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=761004893632) at pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=761004893632) at pthread_kill.c:80
#2 __GI___pthread_kill (threadid=761004893632, signo=signo@entry=6) at pthread_kill.c:91
#3 0x0000007a1af59476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x0000007a1af3f7b7 in __GI_abort () at abort.c:79
#5 0x000000000056d185 in main () at main.c:6
Tracing
If you are using '--trace' you can set an optional 'notrace' variable to exclude certain output such as: