Pendent
Nov 16, 2011

The bonds of blood transcend all others.
But no blood runs stronger than that of Sanguinius
Grimey Drawer
I've got something I'm hoping you all can help me with here. I'm fairly new to the Mikrotik world with most of my experience being with Cisco. Like many of you I work for a WISP.

In this situation I have a specific client that's seeing bandwidth overages, and someone offered to have us take a look at where their data is going over the course of several days. We do not have any external appliances that would be able to track this, so the tracking will have to be done in Mikrotik. My boss is under the impression that I can use torch, but I'm not really seeing how this is possible given that it seems built to monitor traffic in realtime, given the max one hour timeout. The other options I see are using the sniffer tool to do a PCAP, or traffic flow. The former creates files that are too large to maintain the pcap for more than an hour or so, and I'm not personally familiar with how I would track bandwidth usage from a PCAP. I'm understanding traffic flow to basically be Netflow, so with an external server of some sort it seems like it may do the trick, but it would be a headache for a few reasons due to how we're set up.

How would you all deal with a similar request?
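
Whichever way the flows end up getting exported, the accounting step itself is small. A minimal sketch, assuming the collector can dump records to CSV with `src`, `dst`, and `bytes` columns (those column names and the sample rows are made up for illustration, not something Mikrotik emits as-is):

```python
import csv
import io
from collections import Counter


def bytes_by_destination(flow_csv):
    """Sum bytes per destination address from exported flow records.

    Assumes a CSV with hypothetical columns: src, dst, bytes.
    """
    totals = Counter()
    for row in csv.DictReader(io.StringIO(flow_csv)):
        totals[row["dst"]] += int(row["bytes"])
    return totals


# Made-up sample records (documentation address ranges), for illustration only
sample = """src,dst,bytes
10.0.0.5,203.0.113.10,1500000
10.0.0.5,198.51.100.7,250000
10.0.0.5,203.0.113.10,3500000
"""

# Largest destinations first
top = bytes_by_destination(sample).most_common(2)
for dst, total in top:
    print(f"{dst}\t{total / 1e6:.1f} MB")
```

Run over several days of records, the same tally should show which remote addresses account for the overage.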

Pendent
I work for a WISP and we just did a big upgrade in our core a few months ago- 4 CCR-1072 running in full mesh with iBGP. The company has been using Mikrotik in the core with minimal issue for years as far as I'm aware. I personally don't have a ton of experience with the gear though outside of the past year that I've been with the company.

The new core runs ok for a couple weeks at a time but I get these intermittent reboots on the core routers. To clarify, I have two "edge" routers that peer with our upstreams and are entirely focused on doing so, and the core routers peer with them and our internal OSPF network where our customers reside. For whatever unholy reason the latter will, as I said, reboot entirely at random.

This has immediately led to me discovering that there's a lot of information hidden in the autosupout.rif that can only be read by Mikrotik Support, which frankly makes me never want to use their gear again. They have been taking forever to respond and have been entirely unhelpful after doing so, generally just wanting me to update the firmware without any clear reason as to why this may help.

There's really no point to this post aside from me venting to people that might understand my pain, unless somebody in here is either really knowledgeable about this stuff or has access to view the full autosupout.rif file. I'm just getting really frustrated and regretting our decision to not spring for Cisco or Juniper gear.

Pendent

redeyes posted:

The thing about Mikrotik gear is you save (a poo poo LOAD) money upfront but don't even pretend you will get enterprise class support. Is it a tradeoff that is worth it? Up to your company.

I think the Cloudcore routers are different (ARM) SOC based. Of course this means they have a lot more horsepower, but I think this is a new arch for Mikrotik and there are bugs. This is probably why they want you to upgrade firmwares without reason. Bugs and more bugs. If I am completely full of poo poo, feel free to chastise this statement.

This actually reminds me of another little factoid I've discovered that is really turning me off to using these things in our core. Given that the 1072 is the most badass router that they produce, with 72 cores, one would expect you would mostly see them in places where BGP might be a major consideration. I found out to my utter horror that these things will only run the BGP process on a single CPU core, which is now pegged at all times with only a small handful of upstream peers. We tried tweaking our timers a bit and it absolutely wrecked everything across the core, with constantly flapping sessions. I'm looking at trying to use BFD to improve recovery times in a failover situation but I'm afraid this will just break things even worse.

This was never a real issue in the past because we only had a pair of upstream providers and no access to peering exchanges or anything. We put a lot of effort into getting access to some of the local exchanges for cheaper transit etc and are finding that our hardware just doesn't work that well for what we want.

I ended up having a long conversation with one of their official certified trainers about the issues we were having and he told me (to paraphrase since it's been a few months) "Yeah, don't tell anybody I told you this since I'm supposed to be one of their advocates but I would never trust these things with anything important. Tower sites? They work great. Anything bigger than that is just asking for trouble though."

Pendent fucked around with this message at 00:30 on Sep 13, 2017

Pendent

redeyes posted:

HAHAHA :( Condolences

jesus, I really only use them in SOHO poo poo, minor non-critical situations

For SOHO I feel like they'd be really solid, in all fairness.

I'm just bitter right now because I killed myself for literally months designing, building and migrating to this new core and now we're having issues not due to any flaws with my design or configuration but due to lovely software. On the plus side I guess this might convince my boss to make the leap to Cisco sooner rather than later.

Pendent
Yeah, now that I know to look for it there's a bunch out there about how poo poo these things are for BGP. I had an opportunity to push harder for Cisco and I really regret not doing so. It just seemed at the time like our CCR-1036 was doing a pretty solid job and there is obviously a considerable price difference. It turns out that having two really stable peers on a single router is not nearly as hard as having several peers over some surprisingly unstable DWDM transport circuits.

We're slowly coming to the conclusion that we're just going to buy an ASR. Live and learn, I guess.

Pendent

CuddleChunks posted:

Or just upgrade to their latest half-tested firmware and roll the dice! :v:

I actually just did this the other night, funnily enough. :v:

It definitely feels like we're going to be upgrading sooner rather than later. We have some more money available than we did previously so it really feels like the big name gear is in the cards.

Pendent fucked around with this message at 03:06 on Sep 25, 2017

Pendent
Oh hey, you guys might be interested to hear this.


I'm that stupid rear end in a top hat that put a bunch of CCR-1072's in my core as an engineer for a small ISP and was dealing with random reboots. The issue appears to have been caused by connection tracking, which was enabled due to one router doing a really specific NAT and the other doing a bit of firewalling.

It's frankly a really weird fix to me since each router would generally only be handling about 500 Mb/s of traffic at peak (for a bit over 1 Gb/s aggregate). These are fairly badass routers so I'm weirded out to see what looks like a performance issue at such low throughput. I do have the thought that the reboots may have been caused by DDOS attacks and am working on coming up with better ways to monitor or prevent such issues in the future.

In the next six months we're still going to move to an ASR since as an organization we just don't feel like we can trust Mikrotik for anything really important anymore.

Pendent

unknown posted:

I'm frankly surprised that the router didn't die sooner/more often doing 500+mbps of connection tracking.

It's one of those situations where it's just the way things had always worked and I went along with it during configuration because I didn't fully understand how Mikrotik deals with connection tracking. There's other changes I've been wanting to make to our firewalling that feel a bit more pressing these days.

SamDabbers posted:

If you're doing less than 1Gbps, why not use some generic x86 servers with VyOS or some other Linux on them? You don't even need hardware offload for that amount of traffic.

Because our bandwidth usage has grown by like 30% in the past 8 months or so and the increase is only likely to accelerate- this is only looking at IP transit as well and ignores other services we're offering like transport for AWS Direct Connect. Given some of the clients we're onboarding I wouldn't be surprised if our transit usage alone is at 2-3 gbps by this time next year. We've been able to leverage a sort of unique city fiber buildout to start picking up some really big clients.
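
Back-of-envelope version of that, as a sketch. It assumes a starting point of roughly 1 Gb/s (the aggregate figure from my earlier post) and that the 30% per 8 months rate simply keeps compounding; both the function name and those inputs are just for illustration:

```python
def projected_gbps(current_gbps, growth, period_months, horizon_months):
    """Compound `growth` (e.g. 0.30) per `period_months` over `horizon_months`."""
    return current_gbps * (1 + growth) ** (horizon_months / period_months)


# Assumed inputs: ~1 Gb/s today, 30% growth per 8 months, 12 months out
baseline = projected_gbps(1.0, 0.30, 8, 12)
print(f"{baseline:.2f} Gb/s")  # prints "1.48 Gb/s"
```

Steady compounding on its own only gets to about 1.5 Gb/s, so the 2-3 range depends on growth accelerating from the new clients rather than on the existing trend.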

There's also some more complicated business stuff where we sold ourselves to a larger company in the area but are still basically independent. They'll be functionally using us as an upstream for stuff destined for the bay area and I believe they're up near the 5 Gb/s range.

Pendent

PUBLIC TOILET posted:

Alright, one more thing. I cocked up my MikroTik and had to reset/manually reconfigure. Sadly my last backup was from May of 2017. Now that I have it back in working order, what are folks doing for maintaining MikroTik backups? Specifically compact exports (gently caress actual backups as they're clearly useless.) I'm looking around on Google at people who are using elaborate scripts that e-mail themselves scheduled backups. I'm not sure I want something that elaborate, maybe just something that uses the scheduler to do a compact export to internal storage?

Rancid has a Mikrotik device type and has been completely amazing since I got it set up a few months ago. The initial setup is sort of a pain but after that it's incredibly easy to manage.
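
The basic idea is to pull the text config on a schedule and keep track of what changed between pulls. This is not how Rancid itself is implemented, just a sketch of the diff half using Python's `difflib`; the two export snippets are made up, and fetching them (for example by running `/export compact` over SSH) is left out:

```python
import difflib


def config_diff(old_export, new_export):
    """Return unified-diff lines between two text config exports."""
    return list(difflib.unified_diff(
        old_export.splitlines(),
        new_export.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    ))


# Hypothetical export fragments, for illustration only
old = "/ip service\nset winbox address=192.0.2.0/24\n"
new = "/ip service\nset winbox address=192.0.2.0/24,198.51.100.0/24\n"

# Keep only added/removed lines, skipping the ---/+++ file headers
changed = [
    line for line in config_diff(old, new)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
for line in changed:
    print(line)
```

Because compact exports are plain text, the same diffs also work fine in any version control system.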

Pendent
That's low enough speed that something like speedtest.net should still be pretty accurate tbh

Pendent

Atreus posted:

:parrot: ROS 7 anytime now. :parrot:

:negative:

Back when I was looking into our BGP issues I remember finding a post from something like 2012 - 2013 talking about how they'd be resolving the issue with the BGP process being single threaded in 7.0. That was pretty eye opening for someone who runs/ran a lot of their production network on this hardware, let me tell you.

Pendent

Atreus posted:

That reminds me, how do you guys handle BGP if you're using these for full tables? ASR1k/9k?

I've currently got a pair of CCR-1072s handling my BGP which is fine when you've only got one or two transit providers but when you start getting access to the big internet exchanges you start seeing some severe scalability issues. Within the next 2-3 months that will be a single ASR-9006 with dual supervisors. My fancy design for my core OSPF routers with multiple 1072s and stacking switches will be replaced by a single Nexus 7k, also with redundant supervisors.

Finally having the income to justify buying big-boy gear makes life so much simpler and so, so much more reliable.

Pendent fucked around with this message at 21:06 on Mar 19, 2018

Pendent
Is there a reason you don’t just use the ACLs that are tied to the services? Those seem to work pretty well to me

Pendent

unknown posted:

New security vulnerability found in all recent versions (v6.29+) over winbox allowing remote download of the users file.

https://forum.mikrotik.com/viewtopic.php?f=21&t=133533


DON'T ROLL YOUR OWN CRYPTO

My ASR 9006 was ordered last Friday :)

Pendent

thebigcow posted:

The people that made IPv6 didn't think DNS was necessary for a working network setup, so everything that distributes DNS information with IPv6 is a hacked up poo poo show.

Imagine I went and found my Hurricane Electric shirt before typing this for maximum effect.

I spent some time looking into migrating my network to IPv6 before eventually throwing up my hands and saying "Screw it, I can get a /22 for like $12k."

Pendent

redeyes posted:

I have one of these new fiber ISPs which has 'carrier grade NAT'. It's basically double NAT. BUT they are fully IPv6 enabled so I set up my network with it. It's pretty cool to have so many publicly available addresses.

Right now I'm just manually entering Google's DNS on my Windows 10 box. I just cannot get it to pull DNS except over v4 and my ISP has bad v4 routing for whatever reason. My Linux box pulls DNS fine as does Android.

I may or may not be one of those ISPs doing CGNAT, actually. If I can make it work at least.

Pendent

Binary Badger posted:

I keep hearing RouterOS doesn't take advantage of multicore CPUs well or at all, it's still single threaded ITYOOL 2018, is this true or something made up by an ER-X fanboy?

My CCR-1072s reboot themselves with more than like 2-3 BGP peers doing full routes because the BGP process is single threaded and gets overwhelmed.

Pendent
I've seen people talking about how certain features will be introduced in RouterOS 7 in forums going back to like 2014 or earlier. Definitely vaporware.

Pendent

redeyes posted:

My ISP does carrier grade NAT and has their network configured like poop. I can run Winbox on a computer connected directly to the ISP and it finds their CCR1016 Cloud Core router. *sigh*

And the firmware is older, from last year. Oh boy.

Name and shame imo

Pendent
Did you guys see this: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-14847

Feels like a constant stream of obnoxious exploits these days. This one fortunately doesn't appear to apply if you're using the access list on the winbox service to restrict who can connect, at least.

Pendent
We have explicitly permitted management IPs, which is really the standard for most networking gear in my experience. If that isn't good enough here, well I'll be probably 90% Mikrotik free by early next year.
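
The logic is nothing exotic: a source address either falls inside a permitted management range or the service won't talk to it. A minimal sketch of that check using Python's `ipaddress` module; the ranges are documentation prefixes I picked for illustration, not our actual management networks, and `is_permitted` is a hypothetical helper:

```python
import ipaddress

# Hypothetical management ranges (documentation prefixes), not a real config
ALLOWED = [
    ipaddress.ip_network(prefix)
    for prefix in ("192.0.2.0/24", "198.51.100.16/28")
]


def is_permitted(source):
    """Return True if the source address is inside any permitted range."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in ALLOWED)


for candidate in ("192.0.2.44", "198.51.100.20", "203.0.113.9"):
    verdict = "permit" if is_permitted(candidate) else "deny"
    print(f"{candidate}: {verdict}")
```

This only helps as long as the check happens before the vulnerable code path, which is the open question with these exploits.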

Pendent

Methylethylaldehyde posted:

Basically everyone who makes internet facing networking gear is getting the poo poo hammered out of them now. Mikrotik is just the latest round of casualties. Cisco had some amazing as gently caress vulnerabilities a while back, and new ones keep getting discovered.

I am not aware of any recent Cisco vulnerabilities that allowed an attacker complete access to their devices with no authentication necessary.

Having worked with both types of gear for years I trust Cisco significantly more when it comes to this stuff.

Pendent

im depressed lol posted:

I just listen to the Security Now! podcast as background noise, but I remembered a tidbit regarding this. Sorry for the whole wall of text, but I'm not a genius (to put it lightly) when it comes to this stuff and wouldn't want to miss anything relevant.

https://www.grc.com/sn/sn-667.htm
June 12th, 2018

I have a lot less to say about their software offerings like WAAS, aside from saying that I agree they're a lot less good than their core products. I'm referring to their mainline routing and switching gear though, stuff like Nexus series switches or ASRs. There are vulnerabilities to be sure but I haven't seen anything as egregious as this Winbox vuln from them in a long time.

Pendent

EssOEss posted:

I guess having a hardcoded root password or two does not count as no authentication, in a way.

These were not specifically routers but there is no reason to believe their router department is any different.

There is actually a ton of reason to expect that the routing and switching gear is better. People have had literally decades to attack IOS/NXOS etc and it's not like they aren't huge targets. When I look around at other ISP's racks in our various colos I see a hell of a lot of Cisco gear and there's a reason for that. I would be absolutely shocked if some sort of amateur hour exploit like this Mikrotik bug was discovered with the software in an ASR.

These other products are generally things they've bought and are not used nearly as widely. Something like the denial of service bug that was recently announced is annoying. Allowing unauthenticated access to routing equipment is quite another matter and is completely unacceptable for anyone that gives the slightest poo poo about what these devices are doing.

Thanks Ants posted:

If there's an exploit that can bypass basic ACLs then we're hosed

I feel like it's just a matter of time and I am trying to pull my migration timeline forward as a result.

Pendent
I decommissioned the CCR-1072s from my edge over the weekend. :rip: in piss

Pendent

jeeves posted:

I weep for anyone using Mikrotiks on their edge/BGP.

They're great for internal networks though. I mean, stub networks. I mean networks you don't care as much about.

Two days before they were set to go away a flapping peer caused one of my edge routers to poo poo itself so badly it actually physically bounced a bonded interface.

Please learn from my mistakes everyone. Do not trust mikrotik with anything you care about.

Pendent
I've had some buggy OSPF behavior even, mostly around route advertisements for directly connected networks. Then there's the random stability issues where they'll reboot more or less at random with a message about kernel failure in the log. I've still got 30-40 of various models in the field but their days are numbered.
