Cisco NCS Polling #493

Open
pmawsonau opened this issue May 4, 2021 · 4 comments

Comments

@pmawsonau

We are running snmpcollector 0.8.0 (yes, we need to upgrade) to poll a number of devices. Our network is made up of Juniper and Cisco devices.

We are having issues with Cisco NCS in particular. Every 6 hours (which lines up with the full walk) it fails to poll for up to 20 minutes. The logs show this:

time="2021-05-04 01:25:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:25:00" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:25:00" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:25:00" level=info msg="STATS SNMP GET: snmp polling took [0.911181 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:25:00" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:25:00" level=info msg="STATS INFLUX: influx send took [0.000008 seconds]"
time="2021-05-04 01:30:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:30:00" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:30:00" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:30:00" level=info msg="STATS SNMP GET: snmp polling took [0.914380 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:30:00" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:30:00" level=info msg="STATS INFLUX: influx send took [0.000003 seconds]"
time="2021-05-04 01:35:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:35:01" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:35:01" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:35:01" level=info msg="STATS SNMP GET: snmp polling took [1.010677 seconds] SNMP: Gets [650] , Processed [340], Errors [0]"
time="2021-05-04 01:35:01" level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
time="2021-05-04 01:35:01" level=info msg="STATS INFLUX: influx send took [0.000002 seconds]"
time="2021-05-04 01:40:00" level=info msg="SNMPDEVICE [SNIP] Init gather cycle mode Concurrent [ true ]"
time="2021-05-04 01:40:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.6) get error: Request timeout (after 1 retries)
time="2021-05-04 01:41:00" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.10) get error: Request timeout (after 1 retries)
time="2021-05-04 01:41:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.18) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:00" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.15) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.2.2.1.8) get error: Request timeout (after 1 retries)
time="2021-05-04 01:42:30" level=info msg="MEASUREMENT [interface_counters] Not Oid CONDITIONEVAL metrics exist on this measurement"
time="2021-05-04 01:42:30" level=info msg="MEASUREMENT [interface_counters] Not EVAL metrics exist on this measurement"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:HundredGigE0/0/0/19]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:238289135 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[ifName:HundredGigE0/0/0/19 hostname:snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue: LINK | SNIP | et-0/0/0 | 0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[DC:SNIP STATE:NT ifName:HundredGigE0/0/0/19 snip AS:9669]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:100000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT ifName:HundredGigE0/0/0/19 snip ]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:1 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT ifName:HundredGigE0/0/0/19 snip ]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:54151875 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.82 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue: CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[ STATE:NT ifName:TenGigE0/0/0/41/3 snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:10000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[hostname:snip ifName:TenGigE0/0/0/41/3]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:2 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.128 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[DC:SNIP ifName:HundredGigE0/0/0/40 STATE:NT snip AS:9669]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.6.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCOutOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT snip ifName:HundredGigE0/0/0/40]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252090 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.10.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifAlias] from MEASUREMENT[ interface_counters ] with TAGS [map[ ifName:HundredGigE0/0/0/40 STATE:NT snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002523f0 Valid:false CookedValue: CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x9477f0 RealOID:.1.3.6.1.2.1.31.1.1.1.18.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHighSpeed] from MEASUREMENT[ interface_counters ] with TAGS [map[ ifName:HundredGigE0/0/0/40 STATE:NT snip]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252240 Valid:false CookedValue:100000 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.31.1.1.1.15.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifOperStatus] from MEASUREMENT[ interface_counters ] with TAGS [map[snip ifName:HundredGigE0/0/0/40 STATE:NT]] has obsolete data => See Metric Runtime [ &{cfg:0xc0002522d0 Valid:false CookedValue:2 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTime:0001-01-01 00:00:00 +0000 UTC ElapsedTime:0 Compute: Scale:0x9452f0 Convert:0x948ab0 SetRawData:0x945570 RealOID:.1.3.6.1.2.1.2.2.1.8.61 Report:1 re: mm:[] expr: condflt: log:0xc0003a43c0} ]"
time="2021-05-04 01:42:30" level=warning msg="MEASUREMENT [interface_counters] error in influx point creation :point without fields is unsupported"
time="2021-05-04 01:42:30" level=warning msg="Warning METRIC ID [ifHCInOctets] from MEASUREMENT[ interface_counters ] with TAGS [map[STATE:NT snip ifName:HundredGigE0/0/0/45]] has obsolete data => See Metric Runtime [ &{cfg:0xc000252000 Valid:false CookedValue:0 CurValue: LastValue: CurTime:2021-05-04 01:35:00.001019441 +0000 UTC m=+167157.416911017 LastTi

We have tweaked GetBulk, varying Max Repetitions between 30 and 100, and have even disabled it, with no change.

Is this a poller issue or a network device issue?
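
One rough way to check whether the failures really recur on a fixed period is to pull the walk-timeout lines out of the collector log and look at when each burst starts. The sketch below is only an illustration: the regular expression is inferred from the log excerpt above rather than from any documented log format, and `timeout_events`, `burst_starts`, and the 600-second burst gap are hypothetical names and values chosen for this example.

```python
import re
from datetime import datetime

# Pattern inferred from the timeout lines in the excerpt above (an assumption,
# not a documented snmpcollector log format).
TIMEOUT_RE = re.compile(
    r'time="(?P<ts>[^"]+)" level=error msg="MEASUREMENT \[(?P<measurement>[^\]]+)\] '
    r'SNMP WALK \((?P<host>[^)]+)\) for OID \((?P<oid>[^)]+)\) get error: Request timeout'
)


def timeout_events(lines):
    """Yield (timestamp, measurement, host, oid) for each walk-timeout line."""
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            yield ts, match.group("measurement"), match.group("host"), match.group("oid")


def burst_starts(events, gap_seconds=600):
    """Return the first timestamp of each burst of timeouts.

    Consecutive timeouts at most gap_seconds apart are treated as one burst
    (gap_seconds is a hypothetical threshold, adjust as needed).
    """
    starts, last = [], None
    for ts, *_ in events:
        if last is None or (ts - last).total_seconds() > gap_seconds:
            starts.append(ts)
        last = ts
    return starts


# Two lines copied from the excerpt above.
sample = [
    'time="2021-05-04 01:40:30" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.6) get error: Request timeout (after 1 retries)',
    'time="2021-05-04 01:41:00" level=error msg="MEASUREMENT [interface_counters] SNMP WALK (103.1.X.X) for OID (.1.3.6.1.2.1.31.1.1.1.10) get error: Request timeout (after 1 retries)',
]
for start in burst_starts(timeout_events(sample)):
    print("timeout burst starts at", start)
```

Run over a full day of logs (passing the open log file instead of `sample`), the differences between consecutive burst starts can be compared with the expected full-walk period.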

Thanks

@sbengo
Collaborator

sbengo commented May 7, 2021

Hi @pmawsonau, apologies for the late answer.

At first sight, it doesn't seem to be related to SNMPCollector itself, but we need the following information:

  • Total number of devices vs Cisco NCS devices
  • Gather frequency and Filter Frequency cycles (SNMP Device configuration)
  • Timeout/Retries configured on Cisco NCS devices
  • Number of measurements (indexed ones) and filters attached to the Cisco NCS devices
  • Average gather time when the polls succeed

@toni-moreno
Owner

Hello @pmawsonau

If there is no update in the next few hours, we will close this issue.
Thank you.

@pmawsonau
Author

Hi,
Apologies for the delay.
This issue occurs across all our collectors (we have a number of them), but for this example I will look at syd01 only.
Total Devices: 137
NCSs included in total: 15
Device Setting Timeout: 60
Device Setting Retries: 2
Polling Settings Freq: 300
Polling Settings UpdateFltFreq: 60

For one NCS that has the issue, the log file shows:

level=info msg="STATS SNMP GET: snmp polling took [0.351648 seconds] SNMP: Gets [662] , Processed [357], Errors [0]"
level=info msg="STATS SNMP FILTER: filter polling took [0.000000 seconds] "
level=info msg="STATS INFLUX: influx send took [0.000004 seconds]"

@sbengo
Collaborator

sbengo commented Oct 22, 2021

Hi @pmawsonau

According to your configuration (Freq 300 s, filter update every 60 gather cycles), the measurement index refresh (measurement walks + filters) runs every 300 s × 60 = 18,000 s, i.e. every 5 hours, not 6.

Anyway, we can't rule out an SNMPCollector issue, so please:

  • Update to version 0.11 (we have updated the core SNMP library and made other improvements)
  • Try creating a new instance with only one Cisco NCS, set UpdateFltFreq to a lower value (e.g. 6), and check whether the timeout period lines up with the "full walk" (i.e. every 30 minutes)
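
The arithmetic behind both figures can be checked with a small sketch. It assumes, as this comment implies, that the full walk runs once every N gather cycles, where N is the filter update frequency; `full_walk_interval` is a hypothetical helper name, not part of SNMPCollector, and it only handles positive values.

```python
from datetime import timedelta


def full_walk_interval(freq_seconds: int, update_flt_freq: int) -> timedelta:
    """Time between full walks, assuming one walk every update_flt_freq
    gather cycles of freq_seconds each (assumption taken from this thread)."""
    if freq_seconds <= 0 or update_flt_freq <= 0:
        raise ValueError("this sketch only handles positive values")
    return timedelta(seconds=freq_seconds * update_flt_freq)


print(full_walk_interval(300, 60))  # settings reported in this thread -> 5:00:00
print(full_walk_interval(300, 6))   # suggested test value -> 0:30:00
```

With the lower value the walk should come around every 30 minutes, so a single day of logs would be enough to see whether the timeouts follow it.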

Thanks,
Regards
