
Klibs

Klibs are plugins which provide extra functionality to unikernels.
As of Nanos b66be2b there are 13 klibs in the kernel source tree:
  • cloud_init - used for Azure check-in and for file/env configuration at boot
    • cloud_azure - used to check in to the Azure meta-data service - not available in config (auto-included by cloud_init)
  • cloudwatch - used to implement cloudwatch agent for AWS
    • aws - not available in config (auto-included by cloudwatch)
  • firewall - used to implement network firewall
  • gcp - logging and memory metrics for GCP
  • ntp - used for clock syncing
  • radar - telemetry/crash data report using the external Radar APM service
  • sandbox - provides OpenBSD-style pledge and unveil syscalls
  • special_files - provides nanos-specific pseudo-files
  • syslog - used to ship stdout/stderr to an external syslog - useful if you can't/won't modify code
  • tls - used for radar/ntp and other klibs that require it
  • tun - supports tun devices (eg: vpn gateways)
  • test - a simple test/template klib
Not all of these can be included in your config (cloud_azure and aws cannot). Only the ones found in your ~/.ops/NANOS-VERSION/klibs folder can be specified, where NANOS-VERSION is the version of Nanos you are using, e.g.:
  • 0.1.48 - ~/.ops/0.1.48/klibs
  • nightly - ~/.ops/nightly/klibs
Some of these are auto-included as they provide support that is required by certain clouds/hypervisors. You should only include a klib if you think you need it. The ones that are required for certain functionality will be auto-included by ops. For instance if you are deploying to Azure we'll auto-include the cloud_init and tls klibs.

Cloudwatch

The cloudwatch klib implements two functionalities that can be used on AWS cloud:
  1. logging - console driver that sends console output to AWS CloudWatch
  2. metrics - emulates some functions of AWS CloudWatch agent to send memory utilization metrics to AWS CloudWatch

Cloudwatch logging

The cloudwatch klib implements a console driver that sends log messages to AWS CloudWatch when Nanos runs on an AWS instance. This feature is enabled by loading the cloudwatch and tls klibs and adding a "logging" tuple to the "cloudwatch" tuple in the root tuple. The "logging" tuple may contain the following attributes:
  • "log_group": specifies the CloudWatch log group to which log messages should be sent; if not present, the log group is derived from the image name (taken from the environment variables), or from the name of the user program if no IMAGE_NAME environment variable is present
  • "log_stream": specifies the CloudWatch log stream to which log messages should be sent; if not present, the log stream is derived from an instance identifier (e.g. 'ip-172-31-23-224.us-west-1.compute.internal')
The log group and the log stream are created automatically if they do not exist.
In order for the cloudwatch klib to retrieve the appropriate credentials needed to communicate with the CloudWatch Logs server, the AWS instance on which it runs must be associated to an IAM role with the CloudWatchAgentServerPolicy, which must grant permissions for the following actions, as described in https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/iam-access-control-overview-cwl.html :
  • logs:PutLogEvents
  • logs:CreateLogGroup
  • logs:CreateLogStream
Example contents of Ops configuration file:
{
"Klibs": ["cloudwatch", "tls"],
"ManifestPassthrough": {
"cloudwatch": {
"logging": {
"log_group": "my_log_group",
"log_stream": "my_log_stream"
}
}
}
}
Note: If log_group is not set, the "IMAGE_NAME" environment variable, if present, will be used to set the log_group, while the program name is used as a fallback setting for log_group.
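The fallback order for the log group can be sketched as a small Go function. This is an illustration of the order described above, not the klib's code: the function name resolveLogGroup is hypothetical, an empty IMAGE_NAME is treated as absent (an assumption), and the klib may further transform the name it derives.

```go
package main

import "fmt"

// resolveLogGroup mirrors the fallback order described above:
// explicit "log_group" setting, then IMAGE_NAME, then the program name.
// Assumption: an empty IMAGE_NAME is treated the same as a missing one.
func resolveLogGroup(configured string, env map[string]string, program string) string {
	if configured != "" {
		return configured
	}
	if name, ok := env["IMAGE_NAME"]; ok && name != "" {
		return name
	}
	return program
}

func main() {
	env := map[string]string{"IMAGE_NAME": "my-image"}
	fmt.Println(resolveLogGroup("my_log_group", env, "webapp")) // explicit setting wins
	fmt.Println(resolveLogGroup("", env, "webapp"))             // falls back to IMAGE_NAME
	fmt.Println(resolveLogGroup("", nil, "webapp"))             // falls back to program name
}
```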

Cloudwatch metrics

The cloudwatch klib implements sending memory utilization metrics to AWS CloudWatch. Analogously to the implementation in the Linux CloudWatch agent, the metrics being sent are under the "CWAgent" namespace, and have an associated dimension whose name is "host" and whose value is an instance identifier formatted as in the following example: "ip-111-222-111-222.us-west-1.compute.internal".
The list of supported metrics is:
  • mem_used
  • mem_used_percent
  • mem_available
  • mem_available_percent
  • mem_total
  • mem_free
  • mem_cached
A description for each of these metrics can be found at https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/metrics-collected-by-CloudWatch-agent.html, in the section for the Linux CloudWatch agent. In order for the cloudwatch klib to retrieve the appropriate credentials needed to communicate with the CloudWatch server, the AWS instance on which it runs must be associated to an IAM role with the CloudWatchAgentServerPolicy, as described in https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/create-iam-roles-for-cloudwatch-agent.html. Memory metrics are defined as standard-resolution metrics, i.e. they are stored with a resolution of 60 seconds.
The cloudwatch klib is configured via a cloudwatch tuple in the image manifest. Sending of memory metrics is enabled by specifying a sending interval (expressed in seconds) via a mem_metrics_interval attribute inside the cloudwatch tuple. The cloudwatch klib depends on the tls klib for cryptographic operations (which are needed in order to sign the requests being sent to the CloudWatch server).
Example Ops configuration to send memory metrics with a 60-second interval:
{
"Klibs": ["cloudwatch", "tls"],
"ManifestPassthrough": {
"cloudwatch": {
"mem_metrics_interval": "60"
}
}
}

GCP

The gcp klib implements two functionalities that can be used on GCP cloud:
  1. logging - console driver that sends console output to GCP logs
  2. metrics - emulates some functions of GCP ops agent to send memory usage metrics to the GCP monitoring service
Note: When executing the ops instance create command to create a GCP instance, if the CloudConfig.InstanceProfile configuration parameter is a non-empty string, the instance being created is associated to the service account identified by this string, with cloud-platform access scope. GCP service accounts are identified by an email address; the special string "default" indicates the default service account for a given project. The service account specified in the configuration must exist in the GCP project when an instance is being created, otherwise an error is returned. For more information about GCP service accounts, see https://cloud.google.com/compute/docs/access/service-accounts.

GCP logging

The gcp klib implements a console driver that sends console output to GCP logs. Instance-specific information that needs to be known in order to interface with the GCP logging API is retrieved from the instance metadata server, which is reachable at address 169.254.169.254. GCP logging is enabled by loading the gcp and tls klibs and adding to the root tuple a "gcp" tuple that contains a "logging" tuple. The "logging" tuple may optionally contain a "log_id" attribute that specifies the [LOG_ID] string that is sent in the "logName" parameter (which is formatted as "projects/[PROJECT_ID]/logs/[LOG_ID]") associated to GCP log entries; if not present, the log ID is derived from the instance hostname as retrieved from the metadata server.
In order for the GCP klib to retrieve the appropriate credentials needed to communicate with the GCP logging server, the instance on which it runs must be associated to a service account (see https://cloud.google.com/compute/docs/access/service-accounts).
Instances created by Ops are associated to a service account via the CloudConfig.InstanceProfile configuration parameter.
Example contents of Ops configuration file:
{
"CloudConfig" :{
"Platform" :"gcp",
"ProjectID" :"prod-1000",
"Zone": "us-west1-a",
"BucketName":"my-s3-bucket",
"InstanceProfile":"default"
},
"Klibs": ["gcp", "tls"],
"ManifestPassthrough": {
"gcp": {
"logging": {
"log_id": "my_log"
}
}
}
}

GCP metrics

The gcp klib can be configured for sending memory and disk usage metrics to the GCP monitoring service, thus emulating the GCP ops agent. The ID of the running instance and the zone where it is running (which are necessary to be able to send API requests to the monitoring server) are retrieved from the instance metadata server.
  • memory usage metrics being sent (see https://cloud.google.com/monitoring/api/metrics_opsagent#agent-memory), are the bytes_used and bytes_percent metric types; for each type, a value is sent for each of the "cached", "free" and "used" states.
  • disk usage metrics being sent (see https://cloud.google.com/monitoring/api/metrics_opsagent#agent-disk), are the bytes_used and bytes_percent metric types; for each type, a value is sent for each of the "free" and "used" states.
In order for the gcp klib to retrieve the appropriate credentials needed to communicate with the GCP monitoring server, the instance on which it runs must be associated to a service account (see https://cloud.google.com/compute/docs/access/service-accounts).
Instances created by Ops are associated to a service account via the CloudConfig.InstanceProfile configuration parameter.
The allowed configuration properties are:
  • metrics - enable metrics. By default, only memory metrics are sent, disk metrics are disabled.
    • interval - value expressed in seconds to modify the time interval at which metrics are sent. The default (and minimum allowed) value is 60 seconds.
    • disk - enable disk metrics. By default, only read-write mounted disk(s) metrics are sent, read-only disk metrics are disabled.
      • include_readonly - enable also read-only disk metrics.
Example Ops configuration to enable sending memory only metrics every 2 minutes:
{
"CloudConfig": {
"Platform": "gcp",
"ProjectID": "prod-1000",
"Zone": "us-west1-a",
"BucketName": "my-s3-bucket",
"InstanceProfile": "default"
},
"Klibs": ["gcp", "tls"],
"ManifestPassthrough": {
"gcp": {
"metrics": {
"interval":"120"
}
}
}
}
Example Ops configuration to enable sending memory metrics and disk metrics (read-write only) every 2 minutes:
{
"CloudConfig": {
"Platform": "gcp",
"ProjectID": "prod-1000",
"Zone": "us-west1-a",
"BucketName": "my-s3-bucket",
"InstanceProfile": "default"
},
"Klibs": ["gcp", "tls"],
"ManifestPassthrough": {
"gcp": {
"metrics": {
"interval": "120",
"disk": {}
}
}
}
}
Example Ops configuration to enable sending memory metrics and disk metrics (read-write and read-only) every 2 minutes:
{
"CloudConfig": {
"Platform": "gcp",
"ProjectID": "prod-1000",
"Zone": "us-west1-a",
"BucketName": "my-s3-bucket",
"InstanceProfile": "default"
},
"Klibs": ["gcp", "tls"],
"ManifestPassthrough": {
"gcp": {
"metrics": {
"interval": "120",
"disk": {
"include_readonly": "true"
}
}
}
}
}

NTP

The ntp klib lets you set the configuration properties used to synchronize the unikernel clock with an NTP server.
The allowed configuration properties are:
  • ntp_servers - array of ntp servers, with each server specified using the format <address>[:<port>]. The <address> string can contain an IP address or a fully qualified domain name; if it contains a numeric IPv6 address, it must be enclosed in square brackets, as per RFC 3986 (example: "ntp_servers": ["[2610:20:6f97:97::4]", "[2610:20:6f97:97::5]:1234"]). The default value is pool.ntp.org:123.
  • ntp_poll_min - the minimum poll time, expressed as a power of two. The default (and minimum allowed) value is 4, corresponding to 16 seconds (2^4 = 16).
  • ntp_poll_max - the maximum poll time, expressed as a power of two. The default value is 10, corresponding to 1024 seconds (2^10 = 1024). The maximum allowed value is 17, corresponding to 131072 seconds (2^17 = 131072 = ~36.4 hours).
  • ntp_reset_threshold - difference threshold, expressed in seconds, that decides whether to step/jump the clock or to smear it. The default is 0, meaning the clock is never stepped. If the difference is over this threshold then step/jump will be used, allowing correction over much longer periods.
  • ntp_max_slew_ppm - maximum slewing rate for clock offset error correction, expressed in PPM; default value: 83333
  • ntp_max_freq_ppm - maximum clock frequency error rate, expressed in PPM; default value: 25000
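The ntp_servers entry format can be checked before building an image. The following Go sketch parses the address-with-optional-port form described above, including bracketed IPv6 addresses. It is an illustrative parser, not the klib's own; the function name parseServer is hypothetical, and the default port 123 is taken from the default server value above.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

const defaultNTPPort = 123

// parseServer splits an ntp_servers entry into address and port.
// A numeric IPv6 address must be enclosed in square brackets.
func parseServer(s string) (host string, port int, err error) {
	h, p, splitErr := net.SplitHostPort(s)
	if splitErr != nil {
		// No usable port suffix: treat the whole string as the address.
		bracketed := strings.HasPrefix(s, "[") && strings.HasSuffix(s, "]")
		h = s
		if bracketed {
			h = s[1 : len(s)-1]
		}
		if h == "" || strings.ContainsAny(h, "[]") || (!bracketed && strings.Contains(h, ":")) {
			return "", 0, fmt.Errorf("invalid server %q", s)
		}
		return h, defaultNTPPort, nil
	}
	n, convErr := strconv.Atoi(p)
	if h == "" || convErr != nil || n < 1 || n > 65535 {
		return "", 0, fmt.Errorf("invalid server %q", s)
	}
	return h, n, nil
}

func main() {
	servers := []string{
		"pool.ntp.org",
		"127.0.0.1:1234",
		"[2610:20:6f97:97::4]",
		"[2610:20:6f97:97::5]:1234",
	}
	for _, s := range servers {
		host, port, err := parseServer(s)
		fmt.Println(host, port, err)
	}
}
```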
The ntp klib needs to collect some data samples from ntp server(s) before deciding the actions needed to update the clock (if any).
  • It needs a minimum of 4 samples to analyze the needed changes - MIN_SAMPLES 4
  • It keeps a maximum of 30 samples to analyze the needed changes - MAX_SAMPLES 30
It is important to understand that until ntp has collected at least 4 samples, no actions will be taken to update the clock. During that time the system will operate under the clock value provided by the hypervisor, which may or may not be correct. This means that, with default config setup and no ntp request failures, it will take about 53 seconds after boot for the klib to update the system clock if needed.
This can be observed from a debug build of the klib:
en1: assigned 10.0.2.15
[0.319779, 0, ntp] adding server 0.us.pool.ntp.org (port 123)
en1: assigned FE80::447A:E3FF:FECF:B36F
[5.697626, 0, ntp] selecting 0.us.pool.ntp.org as current server
[5.700948, 0, ntp] insert 0: 1618876805.261140, off=67692786.324026320, rtd=0.317320199, jit=67692786.324026320
[21.595184, 0, ntp] insert 1: 1618876821.208357, off=67692786.382279612, rtd=0.211364274, jit=0.058253291
[37.591244, 0, ntp] insert 2: 1618876837.206483, off=67692786.380658678, rtd=0.207233731, jit=-0.001620934
[53.585795, 0, ntp] insert 3: 1618876853.203833, off=67692786.377978558, rtd=0.201635557, jit=-0.002680119
[53.591257, 0, ntp] packet offset=67692786.377978558 est_offset(total)=67692786.376225217(67692786.376225217) est_freq(total)=0.000087361(0.000087361) offset_sd=0.008933535 skew=0.118788699
[69.587293, 0, ntp] insert 4: 1686569655.580903, off=0.000968804, rtd=0.202942327, jit=-0.000793821
[69.592481, 0, ntp] packet offset=0.000968804 est_offset(total)=0.003349950(0.003349950) est_freq(total)=-0.000005103(0.000082258) offset_sd=0.009849557 skew=0.040548864
[85.590427, 0, ntp] insert 5: 1686569671.587333, off=-0.001286386, rtd=0.205845955, jit=0.001095304
[85.595407, 0, ntp] packet offset=-0.001286386 est_offset(total)=-0.000979362(-0.000979362) est_freq(total)=-0.000023029(0.000059228) offset_sd=0.007714219 skew=0.012092993
...
[758.312460, 0, ntp] insert 28: 1686570344.311070, off=-0.011912567, rtd=0.269852664, jit=-0.002127423
[758.318662, 0, ntp] packet offset=-0.011912567 est_offset(total)=-0.000798835(-0.000798835) est_freq(total)=-0.000001312(0.000040926) offset_sd=0.003480576 skew=0.000028015
[774.309514, 0, ntp] insert 29: 1686570360.309574, off=-0.010015399, rtd=0.266696978, jit=0.001098517
[774.312045, 0, ntp] packet offset=-0.010015399 est_offset(total)=-0.000703166(-0.000703166) est_freq(total)=-0.000001130(0.000039796) offset_sd=0.003435532 skew=0.000027003
[790.313905, 0, ntp] insert 0: 1686570376.311817, off=-0.006775507, rtd=0.270922496, jit=0.002536878
[790.315987, 0, ntp] packet offset=-0.006775507 est_offset(total)=-0.001349425(-0.001349425) est_freq(total)=-0.000003542(0.000036254) offset_sd=0.002574216 skew=0.000019820
[806.313070, 0, ntp] insert 1: 1686570392.310801, off=-0.005923203, rtd=0.269834293, jit=-0.000496634
[806.316524, 0, ntp] packet offset=-0.005923203 est_offset(total)=-0.000628994(-0.000628994) est_freq(total)=-0.000001222(0.000035031) offset_sd=0.002575573 skew=0.000019651
...
Use the configuration file to enable the ntp klib and set up its settings.
{
"Klibs": ["ntp"],
"ManifestPassthrough": {
"ntp_servers": ["127.0.0.1:1234"],
"ntp_poll_min": "5",
"ntp_poll_max": "10",
"ntp_reset_threshold": "0",
"ntp_max_slew_ppm": "83333",
"ntp_max_freq_ppm": "25000"
}
}

Verifying your NTP is working:

To verify that NTP is working you can run this sample go program and then manually adjust the clock on qemu:
➜ g cat config.json
{
"Klibs": ["ntp"],
"ManifestPassthrough": {
"ntp_servers": ["0.us.pool.ntp.org:123"],
"ntp_poll_min": "4",
"ntp_poll_max": "6"
}
}
➜ g cat main.go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Print the current time every 8 seconds so clock corrections are visible.
	for i := 0; i < 10; i++ {
		fmt.Println("Current Time in String: ", time.Now().String())
		time.Sleep(8 * time.Second)
	}
}
Run it once to build the image:
GOOS=linux go build
ops run -c config.json g
Then grep the ps output for the qemu command that 'ops run' started, and tack the clock modifier below on to the end of it. The resulting command should look something like the one that follows:
-rtc base="2021-04-20",clock=vm
➜ g qemu-system-x86_64 -machine q35 -device
pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x3.0x1
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x3.0x2
-device virtio-scsi-pci,bus=pci.2,addr=0x0,id=scsi0 -device
scsi-hd,bus=scsi0.0,drive=hd0 -vga none -smp 1 -device isa-debug-exit -m
2G -accel hvf -cpu host,-rdtscp -no-reboot -cpu max,-rdtscp -drive
file=/Users/eyberg/.ops/images/g.img,format=raw,if=none,id=hd0 -device
virtio-net,bus=pci.3,addr=0x0,netdev=n0,mac=e6:44:9c:ca:fd:ca -netdev
user,id=n0,hostfwd=tcp::8080-:8080 -display none -serial stdio -rtc base="2021-04-20",clock=vm
en1: assigned 10.0.2.15
Current Time in String: 2021-04-20 00:00:00.093924867 +0000 UTC m=+0.001076816
en1: assigned FE80::E444:9CFF:FECA:FDCA
Current Time in String: 2021-04-22 12:06:53.91767521 +0000 UTC m=+8.008671695

Radar

The radar klib sends telemetry/crash data to an external Radar APM service.
The radar klib is pre-configured to send data to https://radar.relayered.net:443 and requires an API key, Radar-Key, which is provided as an environment variable. The configuration items that can be passed as environment variables are:
  • RADAR_KEY - mandatory, used to set Radar-Key and enable radar klib functionality
  • RADAR_IMAGE_NAME - optional, used to set an "imageName" that will be sent to the APM
The radar klib depends on the tls klib for cryptographic operations.
Sample config to activate the klib functionality:
{
"Env": {
"RADAR_KEY": "RADAR_KEY_xyz",
"RADAR_IMAGE_NAME": "RADAR_IMAGE_NAME_xyz"
},
"Klibs": ["radar", "tls"]
}

Radar boot report

Upon boot, the radar klib will send some initial data to the Radar server and expects to get back a numeric "id" that will be assigned to this boot event as "bootID".
The information sent looks like this:
POST /api/v1/boots HTTP/1.1
Host: radar.relayered.net
Content-Length: 116
Content-Type: application/json
Radar-Key: RADAR_KEY_xyz
{
"privateIP": "10.0.2.15",
"nanosVersion": "0.1.45",
"opsVersion": "master-dc83aee",
"imageName": "RADAR_IMAGE_NAME_xyz"
}
The radar server should return a unique id:
{"id":1234}
Note: If there is a crash dump to be sent, it will be sent before sending the boot report.
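If you want to exercise a Radar-compatible endpoint by hand (for example a mock server used in tests), an equivalent boot report request can be built in Go. This is a sketch only: the klib itself is kernel code and does not use Go, and the names bootReport, newBootRequest and parseBootID are hypothetical. The field names, path and header come from the sample above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// bootReport mirrors the JSON body of the boot report shown above.
type bootReport struct {
	PrivateIP    string `json:"privateIP"`
	NanosVersion string `json:"nanosVersion"`
	OpsVersion   string `json:"opsVersion"`
	ImageName    string `json:"imageName"`
}

// newBootRequest builds (but does not send) an equivalent POST request.
func newBootRequest(baseURL, key string, r bootReport) (*http.Request, error) {
	body, err := json.Marshal(r)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/v1/boots", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Radar-Key", key)
	return req, nil
}

// parseBootID extracts the numeric "id" from a reply such as {"id":1234}.
func parseBootID(reply []byte) (int64, error) {
	var resp struct {
		ID int64 `json:"id"`
	}
	if err := json.Unmarshal(reply, &resp); err != nil {
		return 0, err
	}
	return resp.ID, nil
}

func main() {
	req, err := newBootRequest("https://radar.relayered.net", "RADAR_KEY_xyz", bootReport{
		PrivateIP:    "10.0.2.15",
		NanosVersion: "0.1.45",
		OpsVersion:   "master-dc83aee",
		ImageName:    "RADAR_IMAGE_NAME_xyz",
	})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL)

	id, err := parseBootID([]byte(`{"id":1234}`))
	fmt.Println(id, err)
}
```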

Radar metrics report

The radar klib retrieves from the kernel information about:
  • memory usage - the total memory usage in bytes is retrieved from the kernel every minute, and the collected data is sent to the server every 5 minutes. The 5 samples are sent under the "memUsed" attribute.
  • disk usage - retrieved and sent every 5 minutes. Data is sent under the "diskUsage" attribute, whose value is an array of JSON objects (one for each disk mounted by the instance). Each array element contains 3 attributes:
    • "volume": a string that identifies the volume. For the root volume, the string value is "root", for any additional volumes, the string value is the volume label if present, or the volume UUID if the label is not present
    • "used": the number of bytes used by the filesystem (file contents and meta-data) in the volume
    • "total": the total number of bytes in the storage space of the volume, i.e. the upper limit for the "used" attribute value
The information sent looks like this:
POST /api/v1/machine-stats HTTP/1.1
Host: radar.relayered.net
Content-Length: 138
Content-Type: application/json
Radar-Key: RADAR_KEY_xyz
{
"bootID": 1234,
"memUsed": [
68329472,
68329472,
68329472,
68329472,
68329472
],
"diskUsage": [
{
"volume": "root",
"used": 5516288,
"total": 39779328
},
{
"volume": "2375a5a1-2d36-15cf-ea6b-2fa01df2ccde",
"used": 12345,
"total": 39779328
}
]
}
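On the receiving side, the metrics payload maps naturally onto a pair of Go structs. The sketch below decodes a payload shaped like the sample above and derives a per-volume usage percentage; the type and function names are hypothetical, and the percentage is a derived value, not something the klib sends.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// diskUsage mirrors one element of the "diskUsage" array.
type diskUsage struct {
	Volume string `json:"volume"`
	Used   uint64 `json:"used"`
	Total  uint64 `json:"total"`
}

// machineStats mirrors the JSON body of the metrics report.
type machineStats struct {
	BootID    int64       `json:"bootID"`
	MemUsed   []uint64    `json:"memUsed"`
	DiskUsage []diskUsage `json:"diskUsage"`
}

// decodeStats parses a metrics report body.
func decodeStats(body []byte) (machineStats, error) {
	var s machineStats
	err := json.Unmarshal(body, &s)
	return s, err
}

// usedPercent returns used/total as a percentage (0 when total is 0).
func usedPercent(d diskUsage) float64 {
	if d.Total == 0 {
		return 0
	}
	return float64(d.Used) / float64(d.Total) * 100
}

func main() {
	body := []byte(`{
		"bootID": 1234,
		"memUsed": [68329472, 68329472, 68329472, 68329472, 68329472],
		"diskUsage": [{"volume": "root", "used": 5516288, "total": 39779328}]
	}`)
	stats, err := decodeStats(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("boot", stats.BootID, "memory samples:", len(stats.MemUsed))
	for _, d := range stats.DiskUsage {
		fmt.Printf("%s: %.1f%% used\n", d.Volume, usedPercent(d))
	}
}
```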

Radar crash report

A "crash" is defined as either a user application fatal error (e.g. anything that causes the application to terminate with a non-zero exit code), or a kernel fatal error. When any of these happens, before shutting down the VM the kernel dumps on disk a trace of log messages (printed by both the user application and the kernel). At the next boot, the radar klib detects the log dump and sends it to the radar server as a crash report. The value of the "id" field of a crash report is the same as the "boot id" (sent right after each boot), so that a crash can be unequivocally associated to a boot.
Note: The log dump is saved in a section of the disk that is being carved between stage2 and the boot filesystem.
The information sent looks like this:
POST /api/v1/crashes HTTP/1.1
Host: radar.relayered.net
Content-Length: 213
Content-Type: application/json
Radar-Key: RADAR_KEY_xyz
{
"bootID":1234,
"nanosVersion":"0.1.45",
"opsVersion":"master-dc83aee",
"imageName":"RADAR_IMAGE_NAME_xyz",
"dump":"en1: assigned 10.0.2.15\nen1: assigned FE80::2885:1DFF:FE7E:9D71\nProcess crash message in here\n"
}
Note: For the submission to be considered successful, the radar server needs to respond with a "specific" confirmation message; otherwise the crash dump submission will be attempted over and over.

SpecialFiles

The 'special_files' klib provides a set of pseudo-files that are Nanos specific.
In particular, the config below populates the path /sys/devices/disks with the name of each attached disk, along with the volume name and UUID of the disk:
{
"RunConfig": {
"QMP": true
},
"Klibs": ["special_files"],
"ManifestPassthrough": {
"special_files": {
"disks": {}
}
},
"Mounts": {
"bob": "/bob"
}
}
You can test this behavior with the below program:
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	// Poll both pseudo-files every 2 seconds for about a minute, so that
	// volumes attached while the program runs show up in the output.
	for i := 0; i < 30; i++ {
		body, err := os.ReadFile("/proc/mounts")
		if err != nil {
			fmt.Println(err)
		}
		fmt.Print(string(body))

		body, err = os.ReadFile("/sys/devices/disks")
		if err != nil {
			fmt.Println(err)
		}
		fmt.Print(string(body))

		time.Sleep(2 * time.Second)
	}
}
This provides lsblk-like functionality similar to what you might find on Linux. However, different cloud providers do not put the same uniquely identifiable information into the serial/id of the device, so we implemented this instead.

Syslog

If you can point your app at a syslog we encourage you to do that:
https://nanovms.com/dev/tutorials/logging-go-unikernels-to-papertrail
However, if you have no control over the application, then you can direct Nanos to use the syslog klib and it will ship everything over.
Just pass in the desired syslog server along with the syslog klib in your config.
If the "IMAGE_NAME" environment variable is present, it is used to populate the APP_NAME field in syslog messages, while the program name is used as a fallback.
For example, if running locally via user-mode you can use 10.0.2.2:
{
"Env": {
"IMAGE_NAME": "app-name-in-syslog-msg"
},
"ManifestPassthrough": {
"syslog": {
"server": "10.0.2.2",
"server_port": "514"
}
},
"Klibs": ["syslog"]
}
If you need the logs to be also stored in a file:
{
"Env": {
"IMAGE_NAME": "app-name-in-syslog-msg"
},
"ManifestPassthrough": {
"syslog": {
"server": "10.0.2.2",
"server_port": "514",
"file": "/tmp/sys.log",
"file_max_size": "8M",
"file_rotate": "9"
}
},
"Klibs": ["syslog"]
}
If you are running on Linux you can use rsyslogd for this. By default rsyslogd will not listen on UDP 514 so you can un-comment the lines in /etc/rsyslog.conf:
module(load="imudp")
input(type="imudp" port="514")
and restart:
sudo service rsyslog restart
You can also disable the dupe filter or add a timestamp to skirt around that.

TUN

The tun klib is used to create TUN (network TUNnel) interfaces, which simulate a network-layer device operating at layer 3 of the OSI model and carrying IP packets.
It can be used when running VPN gateways (e.g. userspace WireGuard).
The allowed configuration properties for the tun interface(s) are:
  • interface base name - e.g. wg, tun
    • ipaddress - ip address of the interface
    • netmask - netmask of the interface
    • mtu - configures the mtu value of the interface, default value is 32768 (32KiB)
    • up - true/false manages the interface initial state
Sample config:
{
"Klibs": ["tun"],
"ManifestPassthrough": {
"tun": {
"wg": {
"ipaddress": "172.16.0.1",
"netmask": "255.255.255.0",
"up": "true",
"mtu": "1420"
}
}
}
}

Cloud Init - HTTP(s) to File/Env KLIB

Cloud Init has 3 functions.
  1. For Azure machines it is auto-included to check in to the meta server to tell Azure that the machine has completed booting. This is necessary, otherwise Azure will think it failed.
  2. One can include this on any platform to download one or more extra files to the instance for post-deploy config options. This is useful when security or ops teams are separate from dev or build teams and they might handle deploying tls certificates or secrets post-deploy. All files are downloaded before execution of the main program.
  3. One can include this on any platform to populate the environment of the user process with variables whose name and value are retrieved from an HTTP(S) server during startup.
The cloud_init klib supports a configuration option to overwrite previous files: it's called overwrite. If you specify this option for a given file, by inserting an overwrite JSON attribute with any string value, cloud_init will re-download the file at every boot.
The cloud_init klib also supports simple authentication mechanisms via the auth config option. It sends the HTTP request header Authorization: <auth-scheme> <authorization-parameters>, where both <auth-scheme> and <authorization-parameters> are configurable.
Certain caveats to be aware of:
  • Only direct download links are supported today. (no redirects)
  • HTTP chunked transfer encoding is not supported (don't try to download a movie). If the source server uses this encoding, a file download may never complete.
  • When cloud_init cannot download one or more files, the kernel does not start the user program. The rationale for this is that we want all files to be ready and accessible when the program starts.
  • When used to populate the user environment, only string-valued attributes are converted to environment variables (non-string-valued attributes are ignored).
  • The cloud_init klib (download_env functionality) is not meant to be used to set environment variables whose value is used in other klibs or in the kernel code (e.g. the "IMAGE_NAME" environment variable used by the syslog klib), because the code that uses an environment variable can be executed before cloud_init sets a value for the variable.
Also, make sure you set an appropriate minimum image base size to accommodate your files.
Example Go program:
package main

import (
	"fmt"
	"os"
)

func main() {
	// Print the file that cloud_init downloaded before the program started.
	body, err := os.ReadFile("/nanos.md")
	if err != nil {
		fmt.Println(err)
	}
	fmt.Print(string(body))
}
Example config - no overwrite - existing destination file won't be changed/overwritten:
{
"BaseVolumeSz": "20m",
"Klibs": ["cloud_init", "tls"],
"ManifestPassthrough": {
"cloud_init": {
"download": [
{
"src": "https://raw.githubusercontent.com/nanovms/ops-documentation/master/README.md",
"dest": "/nanos.md"
}
]
}
}
}
Example config - overwrite - existing destination file will be replaced/overwritten at every boot:
{
"BaseVolumeSz": "20m",
"Klibs": ["cloud_init", "tls"],
"ManifestPassthrough": {
"cloud_init": {
"download": [
{
"src": "https://raw.githubusercontent.com/nanovms/ops-documentation/master/README.md",
"dest": "/nanos.md",
"overwrite": "t"
}
]
}
}
}
Example config, basic access authentication - auth - Authorization header will be added to the request:
{
"BaseVolumeSz": "20m",
"Klibs": ["cloud_init", "tls"],
"ManifestPassthrough": {
"cloud_init": {
"download": [
{
"src": "https://httpbin.org/hidden-basic-auth/user/passwd",
"dest": "/basic_auth_test.txt",
"auth": "Basic dXNlcjpwYXNzd2Q="
}
]
}
}
}
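The "auth" value in the example above follows the standard HTTP Basic scheme: the word Basic followed by the base64 encoding of user:password. A small Go helper (the name basicAuth is hypothetical) can generate it for your own credentials:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// basicAuth returns an "auth" value using the HTTP Basic scheme:
// "Basic " followed by base64("user:password").
func basicAuth(user, password string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(user+":"+password))
}

func main() {
	fmt.Println(basicAuth("user", "passwd")) // prints "Basic dXNlcjpwYXNzd2Q="
}
```

Keep in mind that base64 is an encoding, not encryption, so the config file should be treated as a secret.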
Example config, with environment variables management - download_env config:
The configuration syntax in the manifest to use this functionality is as in the following example:
"ManifestPassthrough": {
"cloud_init": {
"download_env": [
{
"src": "http://10.0.2.2:8200/v1/secret/data/hello",
"auth": "Bearer hvs.6v6yY5yZf32uZzaa7HhiX6AZ",
"path": "attr1/attr2"
}
]
}
},
The download_env attribute is an array where each element specifies
  • src - download source URL
  • auth - optional authentication header
  • path - optional attribute path
For each element in the download_env attribute, the klib executes an HTTP request to the specified URL, and if the peer responds successfully, the response body is parsed as a JSON object, and from this object an "environment object" is retrieved.
If the attribute path (i.e. the path element in the manifest) is not present (or is empty), the environment object corresponds to the root JSON object of the response body, otherwise, for each element in the attribute path (where the element separator is the / character), a nested JSON object is retrieved from the response body by looking up an attribute named after the element (the corresponding attribute value must be a JSON object): the environment object is the nested object corresponding to the last element of the attribute path.
Once the environment object is identified, all string-valued attributes in this object are converted to environment variables (non-string-valued attributes are ignored).
Examples:
an empty attribute path can be used to retrieve the environment variables from the following response body:
{
"VAR1": "value1",
"VAR2": "value2"
}
an attribute path set to "obj1/obj2" can be used to retrieve the environment variables from the following response body:
{
"obj1": {
"obj2": {
"VAR1": "value1",
"VAR2": "value2"
}
}
}
"VAR1" and "VAR2", being non-string-valued attributes, will be ignored in the following response body:
{
"VAR1": 1,
"VAR2": true,
"VAR3": "value3"
}
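The lookup rules above can be sketched in Go as follows. This is an illustration of the described behaviour, not the klib's code: the function name envFromResponse is hypothetical, and returning an error when a path element is missing or is not a JSON object is an assumption about how such a case is handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// envFromResponse parses body as a JSON object, walks attrPath ("a/b/c")
// into nested objects, and returns the string-valued attributes of the
// final object. Non-string-valued attributes are ignored.
func envFromResponse(body []byte, attrPath string) (map[string]string, error) {
	var obj map[string]interface{}
	if err := json.Unmarshal(body, &obj); err != nil {
		return nil, err
	}
	if attrPath != "" {
		for _, elem := range strings.Split(attrPath, "/") {
			next, ok := obj[elem].(map[string]interface{})
			if !ok {
				// Assumption: a missing or non-object element is an error.
				return nil, fmt.Errorf("attribute %q is missing or not a JSON object", elem)
			}
			obj = next
		}
	}
	env := make(map[string]string)
	for k, v := range obj {
		if s, ok := v.(string); ok {
			env[k] = s
		}
	}
	return env, nil
}

func main() {
	nested := []byte(`{"obj1": {"obj2": {"VAR1": "value1", "VAR2": "value2"}}}`)
	env, err := envFromResponse(nested, "obj1/obj2")
	fmt.Println(env, err)

	mixed := []byte(`{"VAR1": 1, "VAR2": true, "VAR3": "value3"}`)
	env, err = envFromResponse(mixed, "")
	fmt.Println(env, err)
}
```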
Full example:
nanos config:
{
"BaseVolumeSz": "20m",
"Klibs": [
"cloud_init",
"tls"
],
"ManifestPassthrough": {
"cloud_init": {
"download": [
{
"src": "https://httpbin.org/hidden-basic-auth/user/passwd",
"dest": "/cloud_init_test.json",
"overwrite": "t",
"auth": "Basic dXNlcjpwYXNzd2Q="
}
],
"download_env": [
{
"src": "https://httpbin.org/hidden-basic-auth/user/passwd",
"auth": "Basic dXNlcjpwYXNzd2Q=",
"path": ""
}
]
}
}
}
go program sample:
package main

import (
	"fmt"
	"os"
)

const (
	AUTH_RESULT_FILE  = "cloud_init_test.json" // {"authenticated": true, "user": "user"}
	ENV_AUTHENTICATED = "authenticated"
	ENV_USER          = "user"
)

func main() {
	// ManifestPassthrough.cloud_init.download
	authResult, err := os.ReadFile(AUTH_RESULT_FILE)
	if err != nil {
		fmt.Printf("error reading file - %s (%s)\n", AUTH_RESULT_FILE, err.Error())
	} else {
		fmt.Printf("%s - content:\n", AUTH_RESULT_FILE)
		fmt.Printf("%s\n", string(authResult))
	}

	// ManifestPassthrough.cloud_init.download_env
	authenticated, ok := os.LookupEnv(ENV_AUTHENTICATED)
	if !ok {
		fmt.Printf("ENV: %q - not found\n", ENV_AUTHENTICATED)
	} else {
		fmt.Printf("ENV: %q = %q\n", ENV_AUTHENTICATED, authenticated)
	}

	// ManifestPassthrough.cloud_init.download_env
	user, ok := os.LookupEnv(ENV_USER)
	if !ok {
		fmt.Printf("ENV: %q - not found\n", ENV_USER)
	} else {
		fmt.Printf("ENV: %q = %q\n", ENV_USER, user)
	}
}
From the result, as expected, "authenticated", being a non-string-valued attribute, is not available in the environment:
booting /home/ops/.ops/images/cloudinit_test ...
...
cloud_init_test.json - content:
{
"authenticated": true,
"user": "user"
}
ENV: "authenticated" - not found
ENV: "user" = "user"
...

Firewall

This klib implements a network firewall. The firewall can drop IP packets received from any network interface, based on a set of rules defined in the manifest. Each rule is specified as a tuple located in the rules array of the firewall tuple in the manifest. Valid attributes for a firewall rule are the following:
  • ip: matches IPv4 packets, and is a tuple that can have the following attributes:
    • src: matches packets based on the source IPv4 address, specified in standard dotted notation (aaa.bbb.ccc.ddd). The address can be prefixed with an optional ! character, which makes the rule match packets whose address differs from the provided value, and can be suffixed with an optional netmask (with format /<n>, where <n> can have a value from 1 to 32), which matches packets based on the first <n> bits of the provided address
  • ip6: matches IPv6 packets, and is a tuple that can have the following attributes:
    • src: matches packets based on the source IPv6 address, which is specified with the standard notation for IPv6 addresses, and similarly to its IPv4 counterpart can be prefixed with a ! character and suffixed with a netmask (with allowed values from 1 to 128)
  • tcp: matches TCP packets, and is a tuple that can have the following attributes:
    • dest: matches packets based on the TCP destination port (whose value can be prefixed with a ! character to negate the logical comparison with the port value in a packet)
  • udp: matches UDP packets, and is a tuple that can have the following attributes:
    • dest: matches packets based on the UDP destination port (whose value can be prefixed with a ! character to negate the logical comparison with the port value in a packet)
  • action: indicates which action should be performed by the firewall when a matching packet is received: allowed values are "accept" and "drop"; if this attribute is not present, the default action for the rule is to drop matching packets
Firewall rules are evaluated for each received packet in the order they are defined, until a matching rule (i.e. a rule where all the attributes match the packet contents) is found and the corresponding action is executed. If a packet does not match any rule, it is accepted.
Example contents of Ops configuration file:
  • accept all TCP packets to port 8080, drop all other packets:
{
"Klibs": ["firewall"],
"ManifestPassthrough": {
"firewall": {
"rules": [
{"tcp": {"dest": "8080"}, "action": "accept"},
{"action": "drop"}
]
}
}
}
  • accept all packets coming from IP address 10.0.2.2, drop packets from other addresses unless they are to TCP port 8080:
{
"Klibs": ["firewall"],
"ManifestPassthrough":