
Mellanox SDK and PRM Sniffer Utility CLI Design for SONiC

Kebo Liu edited this page Apr 18, 2018 · 2 revisions


Rev 0.2

Revision

Rev | Date | Author | Change Description
0.2 | | Liu Kebo | Initial version

About This Manual

This document describes the Mellanox SDK/FW sniffer utilities and how to implement a CLI that exposes these utilities in a SONiC system.

The SDK sniffer records the RPC calls from the Mellanox SDK user API library to the sx_sdk task into a .pcap file. This .pcap file can be replayed afterward to bring the SDK and FW into exactly the same state, which helps to reproduce and investigate issues.

When the interaction between the SDK and FW needs to be examined, the PRM sniffer can be enabled to record that communication to a log file in human-readable format. The MLNX support team can then analyze this log file to identify where the problem is.

These two sniffers are independent and can work simultaneously.

To enable either sniffer, specific environment variables must be set and the SDK must be restarted.

1. Functional Requirements

The new CLI shall provide a user interface that exposes the SDK and PRM sniffer debug utilities described above in a SONiC system.

2. Function Design

Enabling and disabling the SDK and PRM sniffers is controlled by environment variables that are passed to the SDK task during startup.

In SONiC, the SDK task resides in the syncd container. Enabling or disabling the sniffers therefore requires setting or unsetting environment variables inside the syncd container and passing them to the sx_sdk task. One possible way is to manipulate the supervisord configuration of the syncd container.

For convenience of debugging, the sniffer files shall be stored in the host file system instead of in the container. To achieve this, a volume will be used to bind a directory of the host file system to a directory of the container. This can be done by adding a volume bind option to the docker run command.

A new folder will be created to store the sniffer files: "/var/log/mellanox/sniffer/"

The SDK sniffer result is stored in a .pcap file whose name includes a timestamp of the start time, for example "sx_sdk_sniffer_20180224081306.pcap".

The PRM sniffer result file name also contains the start timestamp, for example "prm_recording_20180225111422.log".
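The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `sniffer_file_path` is hypothetical, and the timestamp layout (year, month, day, hour, minute, second) is inferred from the two example file names rather than stated by the SDK.

```python
import os
from datetime import datetime

SNIFFER_DIR = "/var/log/mellanox/sniffer/"  # directory named in this document

# Prefix and extension are taken from the two example file names above.
FILE_PATTERNS = {
    "sdk": "sx_sdk_sniffer_{}.pcap",
    "prm": "prm_recording_{}.log",
}

def sniffer_file_path(kind, now=None):
    """Return the result file path for the 'sdk' or 'prm' sniffer (hypothetical helper)."""
    now = now or datetime.now()
    # Assumed layout YYYYMMDDhhmmss, inferred from "20180224081306".
    stamp = now.strftime("%Y%m%d%H%M%S")
    return os.path.join(SNIFFER_DIR, FILE_PATTERNS[kind].format(stamp))

print(sniffer_file_path("sdk", datetime(2018, 2, 24, 8, 13, 6)))
# /var/log/mellanox/sniffer/sx_sdk_sniffer_20180224081306.pcap
```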

The main work of this CLI is therefore a set of actions that manipulate the supervisord configuration of the syncd docker container and restart the related services:

  1. Add or delete the related ENV variable configuration in the syncd supervisord configuration
  2. Restart the swss service to reload all related modules and services, including the SDK, so that the sniffer starts to work

2.1 Add new volume mapping for sniffer files storage

Sniffer files will be stored in the host directory "/var/log/mellanox/sniffer/". To set up the volume, one option needs to be added to the original syncd container create command in the file "/usr/bin/syncd.sh":

	-v /var/log/mellanox/sniffer:/var/log/mellanox/sniffer:rw   

2.2 Manipulate related ENV variable for sniffer

To enable the SDK/PRM sniffers, two sets of environment variables need to be passed to the syncd container so that the SDK starts with the sniffers enabled:

	SX_SNIFFER_ENABLE
	SX_SNIFFER_TARGET 
   
	PRM_SNIFFER
	PRM_SNIFFER_FILE_PATH

Each set controls the enable/disable state and the file path of the corresponding sniffer.

To enable a sniffer, save the related configuration to a supervisord configuration file and upload it to the folder "/etc/supervisor/conf.d" of the syncd container:

	[program:syncd]
	environment=SX_SNIFFER_ENABLE="1",SX_SNIFFER_TARGET="/var/log/mellanox/sniffer/sx_sdk_sniffer_20180224081306.pcap"
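A minimal sketch of how the CLI might render such a fragment is shown here. The function name `render_sniffer_conf` is hypothetical. The value "1" for SX_SNIFFER_ENABLE comes from the example above; using "1" for PRM_SNIFFER is an assumption, since this document does not state the value the SDK expects.

```python
def render_sniffer_conf(sdk_target=None, prm_target=None):
    """Build a supervisord fragment that sets sniffer variables for syncd.

    Hypothetical helper: pass a file path to enable the corresponding sniffer.
    """
    pairs = []
    if sdk_target:
        pairs += [("SX_SNIFFER_ENABLE", "1"), ("SX_SNIFFER_TARGET", sdk_target)]
    if prm_target:
        # Assumption: "1" enables the PRM sniffer; the value is not given here.
        pairs += [("PRM_SNIFFER", "1"), ("PRM_SNIFFER_FILE_PATH", prm_target)]
    if not pairs:
        return ""  # no sniffer enabled, so no fragment is needed
    # supervisord expects comma-separated KEY="value" pairs on one line.
    env = ",".join('{}="{}"'.format(key, value) for key, value in pairs)
    return "[program:syncd]\nenvironment={}\n".format(env)

print(render_sniffer_conf(
    sdk_target="/var/log/mellanox/sniffer/sx_sdk_sniffer_20180224081306.pcap"))
```

Disabling a sniffer would then amount to removing the fragment (or its variables) again before the restart.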

2.3 Reload SWSS service

To restart the SDK and have the whole system work properly afterward, some related modules and services also need to be restarted. Restarting the SWSS service guarantees that all impacted modules and services are restarted in the proper sequence. The command is:

service swss restart 
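From a Python CLI, the restart could be issued roughly as follows. This is a sketch only: `restart_swss` is a hypothetical name, and the injectable `run` parameter is an illustration device so the function can be exercised without touching a real system.

```python
import subprocess

SWSS_RESTART_CMD = ["service", "swss", "restart"]  # command given in this document

def restart_swss(run=subprocess.check_call):
    """Restart SWSS so that the SDK starts with the new environment.

    With the default `run`, a non-zero exit status raises
    subprocess.CalledProcessError, so a failed restart is not silently ignored.
    """
    run(SWSS_RESTART_CMD)
```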

2.4 New "config platform mlnx sniffer enable/disable" CLI command design

The sniffer CLI will be implemented to run the steps described above to enable or disable the SDK sniffer, the PRM sniffer, or both.

SONiC# config platform mlnx sniffer ?
Usage: sniffer [OPTIONS] COMMAND [ARGS]...

  SONiC command line - 'Sniffer' command

Options:
  -?, -h, --help  Show this message and exit.

Commands:
  sdk     sdk sniffer
  prm     prm sniffer
  all     all sniffers
  status  sniffer running status


SONiC# config platform mlnx sniffer sdk ?   
Usage: sniffer sdk [OPTIONS] COMMAND [ARGS]...

  SDK Sniffers

Options:
  -?, -h, --help  Show this message and exit.

Commands:
  enable   Enable SDK sniffer
  disable  Disable SDK sniffer


SONiC# config platform mlnx sniffer prm ?   
Usage: sniffer prm [OPTIONS] COMMAND [ARGS]...

  PRM Sniffers

Options:
  -?, -h, --help  Show this message and exit.

Commands:
  enable   Enable PRM sniffer
  disable  Disable PRM sniffer

SONiC# config platform mlnx sniffer all ?   
Usage: sniffer all [OPTIONS] COMMAND [ARGS]...

  SDK and PRM Sniffers

Options:
  -?, -h, --help  Show this message and exit.

Commands:
  enable   Enable SDK and PRM sniffer
  disable  Disable SDK and PRM sniffer

When a sniffer enable/disable command is issued, a prompt for the SWSS service restart will be shown, and the user needs to agree to proceed; otherwise the command will be canceled.

Sniffer file names will also be shown after issuing the command.
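The confirm-then-apply behavior could look roughly like the sketch below. All names (`run_sniffer_command`, `update_config`, `restart_swss`) and the prompt wording are hypothetical. The two callables stand in for the configuration change and the SWSS restart, and `ask`/`out` are injectable so the flow can be tried without a terminal.

```python
def run_sniffer_command(action, file_names, update_config, restart_swss,
                        ask=input, out=print):
    """Confirm with the user, then apply the change and report the files.

    Returns True if the command proceeded, False if the user declined.
    """
    prompt = "To {} the sniffer, the SWSS service will be restarted. Continue? [y/N]: "
    answer = ask(prompt.format(action))
    if answer.strip().lower() not in ("y", "yes"):
        out("Command canceled.")
        return False
    update_config()  # add or delete the ENV variables in the supervisord config
    restart_swss()   # reload SWSS so the SDK picks up the change
    for name in file_names:
        out("Sniffer file: " + name)
    return True
```

Defaulting to "no" keeps an accidental Enter from restarting SWSS, which interrupts traffic handling on the switch.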

3. Open Questions

Will log rotation be required?

The PRM sniffer generates a log file by default. Splitting the output across multiple files should not impact the analysis, so log rotation could be considered for the PRM sniffer file.
