Active Network Testing
DOCUMENT STATUS: DRAFT
The idea of a `netprobe` handler follows from lessons learned at the September 2022 Hackathon, where Cloudprober was successfully implemented as a new back end to the Orb agent. Though functional, the Cloudprober approach had a few shortcomings, most notably in the metrics it provided. The `netprobe` handler is intended as an alternative approach built on pktvisor.
The `netprobe` handler should support multiple test types and be extensible so that more test types can be added in the future. For an initial release, supporting the PING (ICMP Echo) test type should be sufficient.
The following test types should be considered for future development:
- PING with Jitter measurements
- TCP Connect
- HTTP/HTTPS
- UDP Echo
The following set of metrics should be supported across all test types:
Metric | Type | Unit of Measure | Metric Group | Description |
---|---|---|---|---|
attempts | counter | integer | Counters | Number of times a test was run in the reporting interval |
successes | counter | integer | Counters | Number of successful tests in the reporting interval |
response_max | gauge | time (µsecs) | Counters | Maximum response time measured in the reporting interval |
response_min | gauge | time (µsecs) | Counters | Minimum response time measured in the reporting interval |
response_quantiles | quantiles | time (µsecs) | Quantiles | Quantiles of test response times (P50/90/95/99) |
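
To make the table concrete, here is a minimal Python sketch of how the five metrics could be derived from the results of one reporting interval. It is an illustration only, not the pktvisor implementation: the function names, the `None`-for-failure convention, and the nearest-rank quantile method are all assumptions made for this example. A real handler would more likely use a streaming quantile estimator.

```python
import math
from typing import Optional, Sequence

# Percentiles named in the metrics table (P50/90/95/99).
PERCENTILES = (50, 90, 95, 99)


def nearest_rank(sorted_values: Sequence[int], percentile: int) -> int:
    """Nearest-rank percentile of a non-empty, ascending sequence."""
    rank = max(1, math.ceil(percentile * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(results: Sequence[Optional[int]]) -> dict:
    """Summarize one reporting interval.

    `results` holds one entry per attempt: the response time in
    microseconds, or None if the test failed (assumed convention).
    """
    ok = sorted(r for r in results if r is not None)
    summary = {
        "attempts": len(results),
        "successes": len(ok),
        "response_min": ok[0] if ok else None,
        "response_max": ok[-1] if ok else None,
        "response_quantiles": {},
    }
    if ok:
        summary["response_quantiles"] = {
            f"p{p}": nearest_rank(ok, p) for p in PERCENTILES
        }
    return summary
```

For example, `summarize([1200, None, 800, 1500, 1000])` reports 5 attempts, 4 successes, a minimum of 800 µs and a maximum of 1500 µs.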

Here are a few concepts to keep in mind when reviewing this:
- the `netprobe` tap provides the basic facilities to run network tests, but should only include host-specific configuration settings and default configuration overrides
- the `netprobe` input is where the network tests are defined and configured, as well as any default configuration overrides
- the `netprobe` handler is where the metrics that are to be measured and collected are configured
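
Both the tap and the input can carry default configuration overrides, so one way to think about the layering is as a chain of lookups. The Python sketch below is a hedged illustration of that idea, not the actual resolution logic. The precedence order (input over tap over built-in defaults) is an assumption, and the values in `BUILTIN_DEFAULTS` are hypothetical because this draft does not list real defaults.

```python
from collections import ChainMap

# Hypothetical built-in defaults; this draft does not specify real values.
BUILTIN_DEFAULTS = {
    "interval_msec": 5000,
    "timeout_msec": 2000,
    "packets_per_test": 1,
}


def effective_config(tap_config: dict, input_config: dict) -> dict:
    """Resolve settings with an assumed precedence:
    input config first, then tap config, then built-in defaults."""
    return dict(ChainMap(input_config, tap_config, BUILTIN_DEFAULTS))


# Example: the tap carries host-specific settings and one override,
# while the input defines the test itself.
tap = {"ip_source_binding": "127.0.0.1", "timeout_msec": 1500}
inp = {"test_type": "ping", "interval_msec": 2000}
resolved = effective_config(tap, inp)
```

Here `resolved` takes `interval_msec` from the input, `timeout_msec` from the tap, and `packets_per_test` from the hypothetical defaults.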
Here is a sample configuration for the `netprobe` tap:

    version: "1.0"
    visor:
      taps:
        default_netprobe:
          input_type: netprobe
          config:
            maximum_concurrent_tests: 10
            ip_source_binding: 127.0.0.1
          tags:
            virtual: true
            vhost: 1
Here is a sample policy for the `netprobe` input and handler:

    version: "1.0"
    visor:
      policies:
        basic_ping_policy:
          kind: collection
          description: "basic PING netprobe policy"
          input:
            tap: default_netprobe
            input_type: netprobe
            config:
              test_type: ping
              interval_msec: 2000
              timeout_msec: 1000
              packets_per_test: 10
              packets_interval_msec: 25
              packet_payload_size: 56
              disable_scout_packet: false
              disable_integrity_check: false
              targets:
                test_1_name:
                  target: foo.bar
                test_2_name:
                  target: 10.0.0.1
                  tos: EF
          handlers:
            config:
              num_periods: 2 # default is 5
            modules:
              default_ping:
                type: netprobe
                metric_groups:
                  enable:
                    - quantiles
                    - dns_resolution
                  disable:
                    - jitter
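
The timing values in the sample policy interact with each other, so a quick sanity check can be useful when reviewing a policy. The Python sketch below assumes that packets within a test are sent `packets_interval_msec` apart and that `timeout_msec` applies to each packet from the moment it is sent. Neither assumption is confirmed by this draft, and the function names are hypothetical.

```python
def worst_case_test_msec(packets_per_test: int,
                         packets_interval_msec: int,
                         timeout_msec: int) -> int:
    """Longest time one test can take under the assumptions above:
    the last packet leaves after (n - 1) send intervals and may then
    wait for the full timeout."""
    last_send_offset = (packets_per_test - 1) * packets_interval_msec
    return last_send_offset + timeout_msec


def fits_interval(config: dict) -> bool:
    """True if a test is guaranteed to finish before the next one starts."""
    worst = worst_case_test_msec(
        config["packets_per_test"],
        config["packets_interval_msec"],
        config["timeout_msec"],
    )
    return worst <= config["interval_msec"]


# Timing values copied from the sample policy.
sample = {
    "interval_msec": 2000,
    "timeout_msec": 1000,
    "packets_per_test": 10,
    "packets_interval_msec": 25,
}
worst = worst_case_test_msec(
    sample["packets_per_test"],
    sample["packets_interval_msec"],
    sample["timeout_msec"],
)
```

With the sample values, the last packet is sent 9 × 25 = 225 ms into the test, so the worst case is 225 + 1000 = 1225 ms, which fits within the 2000 ms interval.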