abigserve
Sep 13, 2009

this is a better avatar than what I had before

Methanar posted:

Does anybody have any sweet Grafana dashboards they like for keeping track of network utilization, or other metrics? I spent a good chunk of time yesterday getting Telegraf to poll some basic information like bytes_recv'd per interface to dump into InfluxDB, but my graphs suck!

There were so many ways of doing what I want, sFlow, SNMP, Telegraf, fancy EOS APIs and I really didn't know which to choose. Right now I'm going blind reading SNMP documentation to try that.

What does everybody else use?

We use statseeker currently but you could consider AKIPS which is by the guy that made statseeker:
https://www.akips.com/

The idea with these is no customization (basically) but you set em and forget em. Gives you all the basic metrics and the database is super fast so you get a lot of historical data without having to babysit it at all.

For more complicated stuff like monitoring BGP sessions or custom mibs or whatever I would probably still recommend Cacti.

Methanar
Sep 26, 2013

by the sex ghost


I made something useful! I'm so proud of myself.

What else should I graph!?

Sepist
Dec 26, 2005

FUCK BITCHES, ROUTE PACKETS

Gravy Boat 2k
How do you have negative bandwidth? Is it like solar power where you're generating grid free bandwidth and becoming a supplier instead of a consumer?

falz
Jan 29, 2005

01100110 01100001 01101100 01111010
If you graph egress and ingress it typically looks like that unless you want them to overlap.
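
For anyone wanting to build the same kind of mirrored panel, here is a minimal Python sketch of the arithmetic, assuming you have raw per-interface byte counters sampled at a fixed interval. The function names and sample numbers are made up for illustration, and this is not a claim about how Grafana, Telegraf, or RRDTool do it internally.

```python
def rate_bps(prev_bytes, curr_bytes, interval_s, counter_bits=64):
    """Convert two byte-counter samples into bits per second.

    Handles a single counter wrap by assuming the counter is an
    unsigned integer of `counter_bits` width (an assumption; check
    what your device actually exposes).
    """
    delta = curr_bytes - prev_bytes
    if delta < 0:  # counter wrapped around between samples
        delta += 2 ** counter_bits
    return delta * 8 / interval_s


def mirrored_series(rx_samples, tx_samples, interval_s):
    """Return (ingress, egress) rate lists with egress negated,
    so the two directions plot above and below the zero line."""
    ingress = [rate_bps(a, b, interval_s)
               for a, b in zip(rx_samples, rx_samples[1:])]
    egress = [-rate_bps(a, b, interval_s)
              for a, b in zip(tx_samples, tx_samples[1:])]
    return ingress, egress


if __name__ == "__main__":
    rx = [0, 1_250_000, 2_500_000]  # hypothetical byte counters, 10 s apart
    tx = [0, 625_000, 1_875_000]
    print(mirrored_series(rx, tx, 10))
```

The negative sign is purely a display choice: the egress rate itself is still a positive number, it is only flipped so the two directions don't overlap on the graph.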

Sepist
Dec 26, 2005

FUCK BITCHES, ROUTE PACKETS

Gravy Boat 2k
Oh, that makes sense. I haven't used an ops dashboard in a long time. I don't like that it shows a negative though, I would be worried people in the NOC would infer a traffic spike/dip backwards.

Thanks Ants
May 21, 2004

#essereFerrari


Yeah I've seen graphs like that before, but usually it doesn't use negative numbers for outbound traffic.

wolrah
May 8, 2006
what?
I'm pretty sure most RRDTool style bandwidth graphs I've seen have used negative numbers for outbound. I can confirm that pfSense does in 2.2 and prior (RRDTool is not used in 2.3 and beyond).

ate shit on live tv
Feb 15, 2004

by Azathoth
Honestly the "negative" bandwidth helps a lot with the directionality of the flows. Observium does negative bandwidth for outbound, for example.

If you have any firewalls that are handling NAT/encryption/etc., graphing their CPU/memory usage is useful. Same if you have any software routers, Cisco ISRs/G2s/6500s if you still have those for some reason. Our Juniper SRXs have their flow tables graphed, both connections per second and total flows, as well as the Packet Forwarding Engine CPU/memory. The Routing Engine doesn't get hit that hard, but the PFE can get brick walled pretty easily from our crawler cluster.

jwh
Jun 12, 2002

Methanar posted:



I made something useful! I'm so proud of myself.

What else should I graph!?

Forgive me, I've been out of the game for a while, but what tool is this?

Thanks Ants
May 21, 2004

#essereFerrari


https://grafana.com/

Partycat
Oct 25, 2004

This comment rolled across the cisco voip list the other day which could be of interest if you're working with a partner or are a service provider working with legacy UCM customers:

Charles Goldsmith posted:

Thanks for that Anthony, but something else that was in that presentation that I didn't know about, was the 8.6 or older installs. If the license isn't migrated prior to Dec 1, the customer will have to purchase the licenses again when they upgrade to 9.x or higher. Basically, license migrations for 8.6 and older will no longer be available after November, unless they have SWSS on 8.x and the slide says very few people will have that.

doomisland
Oct 5, 2004

Interface errors are helpful if you have a lot of interfaces. There is a way to just graph the top X based on deltas if I remember correctly. Observium has something similar where it just shows you interfaces with recent errors. If it works well enough you can turn on grafana alerts to email you or something.
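
The "top X based on deltas" idea can be sketched in a few lines of Python. This is only an illustration of the ranking logic, assuming you already have two polls of cumulative error counters keyed by interface name; the function name and the reset handling are assumptions, not how Grafana or Observium implement it.

```python
def top_error_interfaces(previous, current, n=5):
    """Rank interfaces by how much their error counter grew.

    `previous` and `current` map interface name -> cumulative error
    count from two polls. Interfaces missing from `previous` are
    treated as starting at zero; counters that went backwards (e.g.
    after a reboot) are treated as having restarted from zero.
    """
    deltas = {}
    for name, now in current.items():
        before = previous.get(name, 0)
        delta = now - before if now >= before else now
        if delta > 0:
            deltas[name] = delta
    # Sort by delta descending, then by name for stable output.
    ranked = sorted(deltas.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]
```

Interfaces whose counters did not move are dropped entirely, so a quiet network yields an empty list rather than a wall of zeros.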

Methanar
Sep 26, 2013

by the sex ghost

quote:

core-switch-4(config-if-Et1-48)#switchport trunk allowed vlan except 4090, 4093, 104







core-switch-4(config-if-Et1-48)#

This command took 3 times as long as I expected it to take and I'm pretty sure my heart stopped near the end.

I don't have anything important to say, networks are just scary.

edit:




Jesus christ, the instant I pasted in my command to the second place it had to be, my VPN dropped, then my home network connection dropped. Entirely unrelated to what I had just done.

Now my heart stopped for real.

edit:

quote:

core-switch-3(config-if-Et1-48)#switchport trunk allowed vlan except 4090, 4093, 104
core-switch-3(config-if-Et1-48)#

Finally got my nerve back to finish what I was doing. Worked fine without any existential terror this time

Methanar fucked around with this message at 17:51 on Aug 31, 2017

Judge Schnoopy
Nov 2, 2005

dont even TRY it, pal
VAR did the Call Manager and Unity upgrade (migration, not in-place) last night. Besides the new router not being set for the correct call codec, everything was pretty peachy. A couple snags here and there because the CUCM environment is riddled with circular logic and lovely transform masks, but it got ironed out quick.

The biggest issue was the guy didn't set the Unity 11.5 pin requirements to the same as 8.6, so everybody's pin got wiped. 95% of people use the email function anyway so it wasn't the worst.

Overall I'm pretty satisfied with this VAR for doing their due diligence and getting the cutover done in one night with about 1 hour of phone downtime.

tortilla_chip
Jun 13, 2007

k-partite
Non-atomic commits. Womp.

GreenNight
Feb 19, 2006
Turning the light on the darkest places, you and I know we got to face this now. We got to face this now.

Judge Schnoopy posted:

VAR did the Call Manager and Unity upgrade (migration, not in-place) last night. Besides the new router not being set for the correct call codec, everything was pretty peachy. A couple snags here and there because the CUCM environment is riddled with circular logic and lovely transform masks, but it got ironed out quick.

The biggest issue was the guy didn't set the Unity 11.5 pin requirements to the same as 8.6, so everybody's pin got wiped. 95% of people use the email function anyway so it wasn't the worst.

Overall I'm pretty satisfied with this VAR for doing their due diligence and getting the cutover done in one night with about 1 hour of phone downtime.

Did you go to 11.6(1) or 11.5? Not sure if you use Finesse but 11.6 fixes some issues with Chrome.

Partycat
Oct 25, 2004

11.6 is only a thing for CCX, and yeah "finesse" is not the label I would give it.

Not sure how your PINs were lost unless those didn't migrate.

AXL integrate your users with the UCM and enable 'PIN Sync'. They can set the PIN in the UCM Self Care portal and it replicates to UCXN. Much nicer than the PCA.

If you came from Unity, which I thought died at 7, PINs had an MD5 hash so you were advised to reset them so they'd be set as SHA1 for Connection, otherwise they would not be usable at some point. Maybe that's what happened?

GreenNight
Feb 19, 2006
Turning the light on the darkest places, you and I know we got to face this now. We got to face this now.

All our call center users love Finesse way more than the old CAD client. gently caress that and Tapilink.

Judge Schnoopy
Nov 2, 2005

dont even TRY it, pal
Went from 8.6 call manager and unity to 11.5 call manager, unity, and imp.

The tech doesn't really know what happened to the pins either but I had to reset 15 today, and everybody else uses the email function exclusively.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Methanar posted:

This command took 3 times as long as I expected it to take and I'm pretty sure my heart stopped near the end.

I don't have anything important to say, networks are just scary.

edit:




Jesus christ, the instant I pasted in my command to the second place it had to be, my VPN dropped, then my home network connection dropped. Entirely unrelated to what I had just done.

Now my heart stopped for real.

edit:


Finally got my nerve back to finish what I was doing. Worked fine without any existential terror this time

Working with Juniper it's less scary. Things aren't in production as soon as you hit enter and configs only apply when you commit them.

Which brings me to 'commit confirmed 5' which basically will commit the config for 5 minutes and roll it back automatically (and instantly) if you don't commit one more time.

If you're on Cisco then you'll need to get into the habit of saving your config; doing a 'reload in 5'; apply your change and if things go south it will reboot in 5 minutes. Otherwise if things still work 'reload cancel' and move on with life.
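
To make the 'commit confirmed 5' behaviour concrete, here is a toy Python model of the workflow described above: staged edits, a timed activation, and an automatic rollback unless a second commit arrives. The class and method names are hypothetical and this says nothing about how Junos implements it; time is passed in by hand so the example stays deterministic.

```python
class CandidateConfig:
    """Toy model of a 'commit confirmed' workflow.

    Time is passed in explicitly (in seconds) so the example is
    deterministic; a real device runs its own timer.
    """

    def __init__(self, running):
        self.running = dict(running)    # what the device is doing now
        self.candidate = dict(running)  # edits staged but not live
        self._rollback = None           # config to restore on timeout
        self._deadline = None

    def set(self, key, value):
        # Staging a change does not touch the running config.
        self.candidate[key] = value

    def commit_confirmed(self, now, minutes):
        # Activate the candidate, but remember how to undo it.
        self._rollback = dict(self.running)
        self.running = dict(self.candidate)
        self._deadline = now + minutes * 60

    def commit(self):
        # A plain commit makes the candidate live and, if a confirmed
        # commit was pending, cancels the rollback.
        self.running = dict(self.candidate)
        self._rollback = None
        self._deadline = None

    def tick(self, now):
        # Called periodically; restores the old config if the
        # confirmation window expired without a second commit.
        if self._deadline is not None and now >= self._deadline:
            self.running = self._rollback
            self.candidate = dict(self._rollback)
            self._rollback = None
            self._deadline = None
```

The point of the pattern is that losing your management session is the same as never confirming, so the device puts itself back without anyone touching it.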

falz
Jan 29, 2005

01100110 01100001 01101100 01111010
To be fair, IOS did wedge in "conf term revert timer 5" years ago. It requires archive config setup in advance and I've had mixed results with it so still prefer reload in. Junos commit confirmed is still wayyyy better.

As far as vlan pruning all but 3 on IOS, couldn't you "sw trunk allowed vlan all" then "sw trunk allowed vlan remove xxx" one at a time? Seems safer than using a single command with commas in it.
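
A small Python sketch of the one-at-a-time approach, assuming "sw" above is the usual abbreviation of "switchport" and using the three VLAN IDs from the earlier paste. The helper name is made up, and the exact command spelling should be checked against your platform before pasting anything into a production switch.

```python
def one_at_a_time_prune(vlans):
    """Build one 'remove' command per VLAN instead of a single
    comma-separated command, so each step can be checked before
    moving on."""
    commands = ["switchport trunk allowed vlan all"]
    for vlan in sorted(set(vlans)):
        if not 1 <= vlan <= 4094:
            raise ValueError(f"VLAN {vlan} is outside 1-4094")
        commands.append(f"switchport trunk allowed vlan remove {vlan}")
    return commands


if __name__ == "__main__":
    for line in one_at_a_time_prune([4090, 4093, 104]):
        print(line)
```

Generating the lines up front also gives you something to read back and sanity check before any of it touches the trunk.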

FatCow
Apr 22, 2002
I MAP THE FUCK OUT OF PEOPLE

1000101 posted:

Working with Juniper it's less scary.

Hah, we just had an outage because our SRXs pegged their CPUs for 5 minutes after a commit. It caused instability on our 911 system so we rolled the change back...causing another 5 minute CPU peg. We're still figuring out how to get out of this one.

Also -XR has Junos style commits.

Thanks Ants
May 21, 2004

#essereFerrari


One of our providers has to keep announcing maintenance windows because of memory leaks on their Juniper switches. I don't know if this is because they are just throwing new software at things without really testing them properly or if there's a particular issue in certain Juniper products.

Methanar
Sep 26, 2013

by the sex ghost

falz posted:

As far as vlan pruning all but 3 on IOS, couldn't you "sw trunk allowed vlan all" then "sw trunk allowed vlan remove xxx" one at a time? Seems safer than using a single command with commas in it.

This was supposed to have been a very safe thing to do. Those vlans had absolutely no business running on the switch ports I pruned from. But you're right, definitely not doing multiple vlan prunes at once again.

What I was really doing was making a sandbox because there was a fun thing where when CARP was being set up, it somehow completely killed prod in a bad way and I had no idea why. Once CARP was up in a way that wasn't immediately trashcanning prod a few tcpdumps showed that CARP was fighting with the VRRP groups I had set up on our WAN edges for our public address spaces.

https://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#Incompatibility_with_IANA_standards
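
For context on why the two fight: as I understand the linked section, CARP reuses the IP protocol number and the virtual MAC range that were assigned to VRRP, so a CARP VHID and a VRRP VRID with the same number on the same segment end up claiming the same virtual MAC. A rough Python sketch for spotting overlaps ahead of time, with hypothetical function names and example IDs:

```python
def virtual_mac(group_id):
    """Virtual MAC for an IPv4 VRRP group (00:00:5e:00:01:<VRID>).

    CARP derives its virtual MAC from the VHID in the same range,
    so the same function is used for both here.
    """
    if not 1 <= group_id <= 255:
        raise ValueError("group id must be 1-255")
    return f"00:00:5e:00:01:{group_id:02x}"


def colliding_ids(vrrp_vrids, carp_vhids):
    """Return the IDs that would produce the same virtual MAC for a
    VRRP group and a CARP group on a shared L2 segment."""
    return sorted(set(vrrp_vrids) & set(carp_vhids))


if __name__ == "__main__":
    for group in colliding_ids([1, 2, 10], [10, 1, 7]):
        print(group, virtual_mac(group))
```

Picking VHIDs that don't overlap with any VRID in the same broadcast domain avoids the shared MAC, though it does not change the fact that both protocols use the same protocol number.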

1000101 posted:

Working with Juniper it's less scary. Things aren't in production as soon as you hit enter and configs only apply when you commit them.

Which brings me to 'commit confirmed 5' which basically will commit the config for 5 minutes and roll it back automatically (and instantly) if you don't commit one more time.

If you're on Cisco then you'll need to get into the habit of saving your config; doing a 'reload in 5'; apply your change and if things go south it will reboot in 5 minutes. Otherwise if things still work 'reload cancel' and move on with life.

A reboot on a wan edge is still a pretty horrible outcome. I suppose 5 minutes + reboot time to recover is better than broken until I get someone into the DC with a serial cable.

Thanks Ants
May 21, 2004

#essereFerrari


If it's critical then out-of-band management with console servers is a lifesaver

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Methanar posted:

This was supposed to have been a very safe thing to do. Those vlans had absolutely no business running on the switch ports I pruned from. But you're right, definitely not doing multiple vlan prunes at once again.

What I was really doing was making a sandbox because there was a fun thing where when CARP was being set up, it somehow completely killed prod in a bad way and I had no idea why. Once CARP was up in a way that wasn't immediately trashcanning prod a few tcpdumps showed that CARP was fighting with the VRRP groups I had set up on our WAN edges for our public address spaces.

https://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol#Incompatibility_with_IANA_standards


A reboot on a wan edge is still a pretty horrible outcome. I suppose 5 minutes + reboot time to recover is better than broken until I get someone into the DC with a serial cable.

A worse outcome is having to travel to a remote data center or having to deal with remote hands.

quote:

One of our providers has to keep announcing maintenance windows because of memory leaks on their Juniper switches. I don't know if this is because they are just throwing new software at things without really testing them properly or if there's a particular issue in certain Juniper products.

Probably throwing new software at things. Juniper's been pretty solid for a lot of our customers.

abigserve
Sep 13, 2009

this is a better avatar than what I had before
Anyone experimenting with Openflow? I've had a number of people tell me it's "dead", which confuses me because the other side of SDN (ACI/NSX) is hot garbage but is being pushed extremely hard by the vendors. You literally can't talk to a Vmware person without them mentioning YOU SHOULD CONSIDER OUR SDN SOLUTION NSX.

Sheep
Jul 24, 2003
Can't be that poo poo if Google is deploying it in large scales, no?

Slickdrac
Oct 5, 2007

Not allowed to have nice things
We were looking at Openflow before, from networking perspective it was better than ACI but integration of it with vendor devices was (at least as of a year ago) pretty much garbage. ACI isn't too bad once you have a decent collection of scripts for it that can do about 90-99% of typical actions depending on your network usage. I love it on the corporate side where things are stable, but there's always something going on over on the projects network that is some wonky one off.

tortilla_chip
Jun 13, 2007

k-partite
Most folks are/were using OpenFlow to solve bin packing problems... and it turns out PCEP is better and lets you leverage that nice sunk capex cost in your MPLS infrastructure.

abigserve
Sep 13, 2009

this is a better avatar than what I had before

Slickdrac posted:

We were looking at Openflow before, from networking perspective it was better than ACI but integration of it with vendor devices was (at least as of a year ago) pretty much garbage. ACI isn't too bad once you have a decent collection of scripts for it that can do about 90-99% of typical actions depending on your network usage. I love it on the corporate side where things are stable, but there's always something going on over on the projects network that is some wonky one off.

Apparently takeup on ACI is pretty low - my account manager was saying they have very few customers buying it (in AU) and even fewer rolling it out.

And I definitely agree with the integration point - hopefully at some point Openflow becomes a ratified standard because at the moment support is all over the place which is probably why it's stuck mostly in academia (although we have a bigswitch install in prod that works a treat).

Docjowles
Apr 9, 2009

Is anyone aware of an emulator (a la GNS3) that supports Brocade ICX series ethernet switches, or some sort of VM image for that OS? My Google searches are telling me there's nothing out there, which is a bummer. I have to troubleshoot some issues with a site running this gear. I'm not super familiar with it and there are no spare devices available to lab on. And they don't have active support contracts, naturally :rip:

falz
Jan 29, 2005

01100110 01100001 01101100 01111010
Doubt it exists. There are a lot of ICX models, which do you have?

Docjowles
Apr 9, 2009

ICX 6650, would happily settle for 6450 too.

Thanks Ants
May 21, 2004

#essereFerrari


Depending on the issue you're looking to troubleshoot you might get away with any switch that runs FastIron. Nobody seems to want Brocade ethernet switches so they cost close to gently caress all on eBay, which might be an option.

Docjowles
Apr 9, 2009

Heh that is exactly my Plan B for the future. I need to work this particular issue out before an ebayed switch is likely to show up, but it would be good to have a lab box on hand going forward.

I was holding off since these are getting decommed in favor of some shiny new Arista poo poo in the near future. But hell there's an auction right now for a 6650 for less than $1k, seems worth it.

Bigass Moth
Mar 6, 2004

I joined the #RXT REVOLUTION.
:boom:
he knows...
To the other VOIP guys out there, are you seeing much work with Spark or newer tech? Anything with immersive telepresence? We do some with my company but most seem to stick with CUCM/Unity/IM&P bread and butter.

ate shit on live tv
Feb 15, 2004

by Azathoth

FatCow posted:

Hah, we just had an outage because our SRXs pegged their CPUs for 5 minutes after a commit. It caused instability on our 911 system so we rolled the change back...causing another 5 minute CPU peg. We're still figuring out how to get out of this one.

Also -XR has Junos style commits.

What kind of SRXs?

We just upgraded to 4200's and putting them in a cluster breaks either new sessions, or OSPF adjacencies. So we had to turn one off :/

falz
Jan 29, 2005

01100110 01100001 01101100 01111010

Docjowles posted:

ICX 6650, would happily settle for 6450 too.
We have a handful of these in production but fortunately most have been pulled. Previous eng opted to try to move org from J to brocade when he started and yeah, not a good idea.

Anyway we have those, ssh login is terrible (10 second delay to log in), DOM is the dumbest, I hate them. Anyway I hope you're not running 8.30, 8.10 is much more stable.

Partycat
Oct 25, 2004

Bigass Moth posted:

To the other VOIP guys out there, are you seeing much work with Spark or newer tech? Anything with immersive telepresence? We do some with my company but most seem to stick with CUCM/Unity/IM&P bread and butter.

Cisco's sales are pushing Spark and the Spark Flex plan very heavily. It is cheap compared to Webex but cannot be used as a conf bridge at the moment - you can SIP call into Spark rooms I think though.

Their presentations are flashy but the effort to stand up the infra and the costs for on prem are still insane. The CMS is super cool but not for $10k for a license to have a call on it.

We were going to use Spark for education to get students in there and have online classes but the 25 person cap in the meetings and no features kills it.

They have some work to get done yet on services interoperability with on prem and external resources. Preferably without it costing $99999999999 to kit it out.
