evol262
Nov 30, 2010
#!/usr/bin/perl

The whole point of systemd in a lot of ways is that it is not a single-threaded, synchronous pile of garbage.

systemctl enable debug-shell.service (if necessary, it'll be on VT12)
reboot
systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M

Run through the logs to find out what it's starting and in what order, and when it finishes.

Yes, this is annoying. But you can use this in conjunction with confirm_spawn on multi-user.target (or whatever) to poke into the system as much as you want while you start services.


anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum
What does "stealing 8GB of ram" mean?

LochNessMonster
Feb 3, 2005

I need about three fitty


anthonypants posted:

What does "stealing 8GB of ram" mean?

Probably that it's using 8gb and he can't figure out what for.

ToxicFrog
Apr 26, 2008


anthonypants posted:

What does "stealing 8GB of ram" mean?

In this case, 4000 "huge pages" were getting reserved at boot. I don't really know anything about huge pages except the obvious, but from this I can infer that (a) they're 2MB each on Harik's system and (b) they show up as "allocated memory" even if no process is actually using them, presumably because not all memory allocations are eligible to be fulfilled by a huge page.
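As a rough check of that arithmetic: 4000 pages at 2 MB each is 8000 MB, or about 7.8 GiB, which lines up with the "8GB" figure. Here is a minimal Python sketch that does the same sum from /proc/meminfo-style text. The sample values are made up to match the numbers in this post, and the function name is only for illustration.

```python
import re

def reserved_hugepage_bytes(meminfo_text):
    """Return bytes reserved for huge pages, based on /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "HugePages_Total:    4000" or "Hugepagesize:    2048 kB"
        match = re.match(r"(\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    # HugePages_Total is a page count; Hugepagesize is reported in kB.
    return fields.get("HugePages_Total", 0) * fields.get("Hugepagesize", 0) * 1024

# Hypothetical sample matching the numbers quoted above, not real output.
SAMPLE = """\
HugePages_Total:    4000
HugePages_Free:     4000
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    print(reserved_hugepage_bytes(SAMPLE) / 2**30)  # size in GiB
```

On a real box you would pass in the contents of /proc/meminfo instead of SAMPLE.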

Odette
Mar 19, 2011

I just set up my own mail server with Dovecot/Postfix using virtual hosts/domains/etc. Have installed ClamAV, SpamAssassin, OpenDKIM & set up a few extras (SPF/DKIM/DMARC)

I've got Roundcube up with some plugins too. It's been an interesting experience just learning about everything involved.

Does anyone know any other cool things I can do in regards to mail? I've got iptables & fail2ban up, so that part is covered.

mike12345
Jul 14, 2008

"Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries."





Just curious, how does Ubuntu (or Debian, for that matter) deal with a manually upgraded kernel when doing dist-upgrade? Say my own kernel is more recent than the one in the release I'm updating to, will it just skip dealing with the kernel, or install its kernel alongside?

ExcessBLarg!
Sep 1, 2001

mike12345 posted:

Say my own kernel is more recent than the one in the release I'm updating to, will it just skip dealing with the kernel, or install its kernel alongside?
If you have the kernel meta-package installed (linux-image-$arch or linux-image-generic) then dist-upgrade will install the latest kernel "ABI" package for your distribution (e.g., xenial will install linux-image-4.4.0-59-generic). Usually this means the new kernel will be installed alongside your existing kernel and they'll both have menu entries in GRUB with priority given to the kernel with the highest version.

But the exact behavior depends on how your manual kernel was installed. If you built it from upstream git or a tarball and manually copied it to /vmlinuz or something I'd hope the kernel image install script would recognize that and bail out, but I haven't done that in a long time. If you built a kernel .deb then it should still be installed on your machine after a dist-upgrade so long as the ABI number (and package name) aren't the same as the distro version you're upgrading to, which they're most likely not.

Honestly though if you're running a custom kernel on a machine you may want to remove the kernel meta-package so you don't have to worry about booting a "wrong" kernel version. If you still want to keep the stock kernel around then just keep the .deb of your manual build around and reinstall it after dist-upgrade if needed. Unfortunately you may have to monkey with the /etc/default/grub GRUB_DEFAULT line or the grub menu directly (/boot/grub/grub.cfg) if your kernel version is older than the distro version. If it's newer you should be fine.

xzzy
Mar 5, 2009

Odette posted:

I just set up my own mail server with Dovecot/Postfix using virtual hosts/domains/etc. Have installed ClamAV, SpamAssassin, OpenDKIM & set up a few extras (SPF/DKIM/DMARC)

I've got Roundcube up with some plugins too. It's been an interesting experience just learning about everything involved.

Does anyone know any other cool things I can do in regards to mail? I've got iptables & fail2ban up, so that part is covered.

postgrey is really good too.

xtal
Jan 9, 2011

by Fluffdaddy
Can I take this opportunity to ask an email thing too? My email provider provides POP and IMAP but instead of backing it all up, I want to use POP to automatically move all the emails to my home server and then connect my devices to that (don't worry, they are not classified.) Does having a POP downloader and an IMAP server at home make the most sense? Would I use Dovecot and what else?

I'm still going to use my email provider's SMTP servers and only fetch mail from them after they've received it, so I don't think I need to be concerned about SPF and all that crap.

telcoM
Mar 21, 2009
Fallen Rib

xtal posted:

Can I take this opportunity to ask an email thing too? My email provider provides POP and IMAP but instead of backing it all up, I want to use POP to automatically move all the emails to my home server and then connect my devices to that (don't worry, they are not classified.) Does having a POP downloader and an IMAP server at home make the most sense? Would I use Dovecot and what else?

I'm still going to use my email provider's SMTP servers and only fetch mail from them after they've received it, so I don't think I need to be concerned about SPF and all that crap.

Yes, Fetchmail to pull mail from the ISP, then Dovecot for your own IMAP service.

You'll probably also need a local SMTP server (I found Postfix rather easy to set up for this kind of purpose) as "infrastructure" between Fetchmail and Dovecot, but you don't need to expose it to Internet nor even use it for sending emails, if you don't want to. However, if you have devices that might want to send you hardware status reports or similar as email, you might want to use that in your local network.

Just make sure you block TCP ports 25 and 587 from the Internet-side.
Block those ports for IPv6 too: if your Internet service provider has not enabled IPv6 yet, they might do it someday. My ISP did it in 2015.

Optional extra: set up a web server that can provide some email autoconfiguration information for your network. It can be tedious, but might be worth your while if you have a lot of devices that can use the autoconfiguration or your home mail server is also used by a non-technical person (spouse, kids...).
http://www.ullright.org/ullWiki/show/providing-email-client-autoconfiguration-information-from-moens-ch

Optional extra challenge for later, if you have a public IP address and not too onerous inbound port restrictions by your ISP: make your own SSL/TLS CA, make a certificate for the home server (or get one from some cheap/free certificate service), then configure for IMAPS and authenticated TLS-protected mail submission in port 587, set good passwords, test carefully within your home network, and then you might allow access to port 587 and the IMAPS port to the internet... and now your smartphone can access your home email server from anywhere.

xtal
Jan 9, 2011

by Fluffdaddy
Thanks a lot for the super helpful and informative reply!

xzzy
Mar 5, 2009

For the ultimate challenge drop $10/month on a digitalocean droplet and run email under your own domain. :v:

xtal
Jan 9, 2011

by Fluffdaddy

xzzy posted:

For the ultimate challenge drop $10/month on a digitalocean droplet and run email under your own domain. :v:

I actually did that for a while after Lavabit shut down my email account, but now my paranoia doesn't extend far enough for me to deal with SPF / DKIM / etc

xzzy
Mar 5, 2009

SPF and DKIM are not all that hard.. just take notes as you set it up so you can remember the details later.

It helps that there's a billion tutorials floating around out there these days.

jre
Sep 2, 2011

To the cloud ?



xtal posted:

I actually did that for a while after Lavabit shut down my email account, but now my paranoia doesn't extend far enough for me to deal with SPF / DKIM / etc

SPF and DKIM have nothing to do with paranoia and everything to do with avoiding all your email being blackholed

xtal
Jan 9, 2011

by Fluffdaddy

jre posted:

SPF and DKIM have nothing to do with paranoia and everything to do with avoiding all your email being blackholed

Right, I mean being paranoid enough to self-host entails that or your email isn't going anywhere.

Bob Morales
Aug 18, 2006


Just wear the fucking mask, Bob

I don't care how many people I probably infected with COVID-19 while refusing to wear a mask, my comfort is far more important than the health and safety of everyone around me!

Alright, I bought a refurbished Dell laptop. Installed Ubuntu 16.04, spent an hour configuring it how I wanted, copying some music over, installed a couple programs... Then I packed up and went to a coffee shop and turned the laptop back on. After about 5 minutes Chrome crashed, I couldn't start any programs, and running terminal commands to try to find out what the gently caress happened just dumped garbage to my terminal. Restarted, got a GRUB prompt, then I said gently caress it and went home.

Got home, booted it back up, said the drive wasn't found. So I booted up Fedora 25, installed it, put my music and apps back on, hooked up my external monitors (HDMI and MiniDP outputs both worked for triple screens!) and played around some more. After dinner I fired the laptop back up, took it out to the living room and it crashed while I was sitting on the couch. My programs just quit running, I couldn't open new ones, couldn't log out...Rebooted and I got an error while booting saying my filesystems weren't checked, booted off the LiveUSB, fsck'ed /dev/fedora/root, then I restarted and boom...no disk found.

I took the factory SSD out since I didn't want to gently caress up the windows install, and replaced it with a Samsung 256GB SSD which I was using in a couple other machines without issues. Is my SSD crapping out? Are the later kernels just being stupid?

I put the factory SSD in and just re-installed Ubuntu - if it fucks up again I'm going to put my first guess as the SATA cable or motherboard itself, being a refurb and all.

Bob Morales
Aug 18, 2006


Just wear the fucking mask, Bob

I don't care how many people I probably infected with COVID-19 while refusing to wear a mask, my comfort is far more important than the health and safety of everyone around me!

Been running all day on the last Ubuntu install, must have been the drive.

Is there a reason why using the volume control in the menubar doesn't have any effect on individual applications like Audacious or RhythmBox?

apropos man
Sep 5, 2016

You get a hundred and forty one thousand years and you're out in eight!

Bob Morales posted:

Been running all day on the last Ubuntu install, must have been the drive.

Is there a reason why using the volume control in the menubar doesn't have any effect on individual applications like Audacious or RhythmBox?

The one in the menubar might be just affecting the wrong sound card. See what you can achieve by running alsamixer in the terminal. You can specify which sound card to tweak by using "alsamixer -c0", "alsamixer -c1" etc.

I find it handy if I've got a radio stream playing on my machine across the room and I want to turn the volume up or down. I just ssh from my laptop and use alsamixer :hurr:

Bob Morales
Aug 18, 2006


Just wear the fucking mask, Bob

I don't care how many people I probably infected with COVID-19 while refusing to wear a mask, my comfort is far more important than the health and safety of everyone around me!

apropos man posted:

The one in the menubar might be just affecting the wrong sound card. See what you can achieve by running alsamixer in the terminal. You can specify which sound card to tweak by using "alsamixer -c0", "alsamixer -c1" etc.

I find it handy if I've got a radio stream playing on my machine across the room and I want to turn the volume up or down. I just ssh from my laptop and use alsamixer :hurr:

I installed KDE and the volume mixer there works. I see now in GNOME/unity whatever you call it how you can control the sound for different sources/outputs, thanks.

Docjowles
Apr 9, 2009

Bob Morales posted:

Been running all day on the last Ubuntu install, must have been the drive.

Is there a reason why using the volume control in the menubar doesn't have any effect on individual applications like Audacious or RhythmBox?

Yeah, bizarro behavior like that is almost always a hardware issue in my experience. RAM or disk. If a popular distro shipped a kernel that was causing extreme instability like that, you'd have heard about it from tech news sites / Twitter / other broader audiences than this goony rear end thread.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
Vim's quickfix window keeps stealing my focus when I open it. Can I make it not do that?

e: ended up using tpope/vim-dispatch and firing off builds in a tmux split instead

Vulture Culture fucked around with this message at 06:20 on Jan 17, 2017

Droo
Jun 25, 2003

I have a Synology NAS with a RAID 6 mdadm volume. I recently noticed that the mismatch_cnt is 528. I am running data scrubbing (sync_action repair) and it appears to be working like it should - mismatch_cnt was reset to 0 and is incrementing slowly as the resync runs, and I think I expect it to finish back at 528 when the whole array is finished. I plan to run it again immediately afterwards, and hope it resets to 0 and stays there during the second scrub.

I have run extended smart tests on the drives, poked around logs, and the Synology itself doesn't report any problems so most people wouldn't ever have known about the issue I think. I scrolled through the dmesg output and didn't see anything of note. It's not a very comprehensive version of linux that they use so my troubleshooting options are pretty limited, and I'm not sure what else to do.

How worried should I be about the mismatch_cnt having the 528 value in it? Is there possible data corruption, or bad memory, or a bad drive I'm not detecting?
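For keeping an eye on the counter between scrubs, here is a minimal Python sketch that reads sync_action and mismatch_cnt for every md array from sysfs. It assumes the standard md sysfs layout (/sys/block/mdX/md/) and that Python is available on the box, which may not hold on a stock Synology. The function name is made up for illustration.

```python
from pathlib import Path

def read_md_status(sys_block="/sys/block"):
    """Return {device: (sync_action, mismatch_cnt)} for each md array found."""
    status = {}
    for md_dir in sorted(Path(sys_block).glob("md*/md")):
        # sync_action is e.g. "idle", "check" or "repair"
        action = (md_dir / "sync_action").read_text().strip()
        # mismatch_cnt is the count reported by the last check/repair pass
        mismatches = int((md_dir / "mismatch_cnt").read_text().strip())
        status[md_dir.parent.name] = (action, mismatches)
    return status

if __name__ == "__main__":
    for dev, (action, mismatches) in read_md_status().items():
        print(f"{dev}: sync_action={action} mismatch_cnt={mismatches}")
```

Running it before and after the second scrub would show whether the count really goes back to 0 and stays there.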

RFC2324
Jun 7, 2012

http 418

Ok, this is making me crazy. I am fixing a script that is causing RHEL 7.1 boxes to not work with memory reporting in cacti. I changed the script, got it working with one box, and everything looked good. I then changed several other boxes to use the edited script, and it will work for one or 2 checks, then go back to giving values of -1.

I run the script from the CLI, and get -1 again. SSH into the box in question, run free -wb, and it is working (the script in question uses php to ssh in and execute the command). Once I do that, I can go back to the cacti box and it will work correctly for another cycle of checking, then go back to reporting -1.

I have no idea what could be doing this, any suggestions on where to look?

E: I just noticed it works consistently if called by the fqdn, so there is a start. Shame I can't redo the whole thing to go by fqdns

E2: it appears to be a memcache issue. loving php

RFC2324 fucked around with this message at 22:57 on Jan 18, 2017

other people
Jun 27, 2004
Associate Christ
Put some debug statements in your script so you can see why it returns the value it does???????

I suppose if you get desperate you could strace it but hopefully if you understand what it is trying to do it will be relatively clear where the bad return code is coming from.

other people
Jun 27, 2004
Associate Christ
oh so the script fails to run then? neat :/.

RFC2324
Jun 7, 2012

http 418

other people posted:

Put some debug statements in your script so you can see why it returns the value it does???????

I suppose if you get desperate you could strace it but hopefully if you understand what it is trying to do it will be relatively clear where the bad return code is coming from.

It's in php, which doesn't appear to have real good debugging for this sort of thing, since everything technically runs, it just doesn't return the values I expect :v:

xzzy
Mar 5, 2009

You need to find out what part of the php script is failing. Is the ssh successful? Is it running with the correct permissions (check selinux too)? Is the data returned in a sensible format?

Basically sprinkle your php script with echo statements and figure out where it's crapping.
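The same idea outside of PHP, as a minimal Python sketch: run the remote command yourself and log every stage (what was run, exit code, stderr, stdout) so you can see which step goes wrong. This is only an illustration of the approach, not part of the Percona script, and the function name is made up.

```python
import subprocess
import sys

def run_and_report(cmd, timeout=30):
    """Run cmd, log each stage to stderr, and return stdout or None on failure."""
    print(f"running: {cmd}", file=sys.stderr)
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Covers a missing binary, permission problems, and hangs
        print(f"failed to run: {exc}", file=sys.stderr)
        return None
    print(f"exit code: {proc.returncode}", file=sys.stderr)
    print(f"stderr: {proc.stderr!r}", file=sys.stderr)
    print(f"stdout: {proc.stdout!r}", file=sys.stderr)
    # Treat a non-zero exit or empty output as a failure worth looking at
    if proc.returncode != 0 or not proc.stdout.strip():
        return None
    return proc.stdout
```

You could call it as the cacti user with something like run_and_report(["ssh", "-o", "BatchMode=yes", "monitored-host", "free", "-wb"]), where monitored-host is a placeholder. BatchMode makes ssh fail instead of sitting at a password or host key prompt.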

RFC2324
Jun 7, 2012

http 418

xzzy posted:

You need to find out what part of the php script is failing. Is the ssh successful? Is it running with the correct permissions (check selinux too)? Is the data returned in a sensible format?

Basically sprinkle your php script with echo statements and figure out where it's crapping.

Here is the thing... It works. For one server it works every time, for others it works sometimes. For one it only appears to work when manually invoked(even invoked as non-root), but never when invoked by cacti itself. It TECHNICALLY works. It just isn't returning the expected values(results from free -wb on various servers).

And it wasn't memcached.

:suicide:

theperminator
Sep 16, 2009

by Smythe
Fun Shoe
So it's graphing -1's not NaN right? because NaN can mean a whole bunch of poo poo ranging from the columns not being output to the script exiting nonzero etc etc
When run manually, how long does it take to return a result?
Does the cacti user trust the host key of the monitored system(s)?

Things I would try:
1. Turn up poller logging in Settings > General > Poller logging level, wait for the next polling interval and observe the logs
2. Run the php script as the cacti user and observe any error output

Would it be possible to share the script in question?

other people
Jun 27, 2004
Associate Christ
I don't know what cacti is but can it just ssh in and run some bash or python script directly to get what it needs?

I work in support so i have tons more mostly unhelpful suggestions if you want...


but srsly you need to keep drilling down til you find the lowest/most specific piece which is failing. if you just have a generic "hey why does this custom script not work" kind of question then no one can do anything but give (probably useless) random guesses. i suppose if you want to share the script we may be able to say more?

im not bitter

xzzy
Mar 5, 2009

RFC2324 posted:

Here is the thing... It works. For one server it works every time, for others it works sometimes. For one it only appears to work when manually invoked(even invoked as non-root), but never when invoked by cacti itself. It TECHNICALLY works. It just isn't returning the expected values(results from free -wb on various servers).

And it wasn't memcached.

:suicide:

My first guess is the credentials are wrong. Like an ssh-agent exits and it can't log in anymore, a kerberos ticket is being purged, known_hosts is getting wiped.. something like that. Without knowing your environment it's just guessing.

RFC2324
Jun 7, 2012

http 418

theperminator posted:

So it's graphing -1's not NaN right? because NaN can mean a whole bunch of poo poo ranging from the columns not being output to the script exiting nonzero etc etc
When run manually, how long does it take to return a result?
Does the cacti user trust the host key of the monitored system(s)?

Things I would try:
1. Turn up poller logging in Settings > General > Poller logging level, wait for the next polling interval and observe the logs
2. Run the php script as the cacti user and observe any error output

Would it be possible to share the script in question?

Keys work, I've been trying as both root and the cacti user. Rrd is getting NaN, the script is returning -1 for everything.

The script can be easily found by googling ss_get_by_ssh.php (it's the percona script) and all i changed is the free command in the memory section and the way it parses the results(rhel 7.1 has a new free command with slightly different output). My changes work since it gives correct output when it gets any values at all

E: Diff showing my changes below, original can be found at https://github.com/percona/percona-monitoring-plugins/blob/master/cacti/scripts/ss_get_by_ssh.php

code:
<    return "free -ob";
---
>    return "free -wb";
890,891c892,894
<             $result['STAT_memused']   = sprintf('%.0f',
<                $words[2] - $words[4] - $words[5] - $words[6]);
---
>           $result['STAT_memused']   = $words[2];
> #            $result['STAT_memused']   = sprintf('%.0f',
> #               $words[2] - $words[4] - $words[5] - $words[6]);

RFC2324 fucked around with this message at 23:36 on Jan 18, 2017

anthonypants
May 6, 2007

by Nyc_Tattoo
Dinosaur Gum

RFC2324 posted:

Keys work, I've been trying as both root and the cacti user. Rrd is getting NaN, the script is returning -1 for everything.

The script can be easily found by googling ss_get_by_ssh.php (it's the percona script) and all i changed is the free command in the memory section and the way it parses the results(rhel 7.1 has a new free command with slightly different output). My changes work since it gives correct output when it gets any values at all

E: Diff showing my changes below, original can be found at https://github.com/percona/percona-monitoring-plugins/blob/master/cacti/scripts/ss_get_by_ssh.php

code:
<    return "free -ob";
---
>    return "free -wb";
890,891c892,894
<             $result['STAT_memused']   = sprintf('%.0f',
<                $words[2] - $words[4] - $words[5] - $words[6]);
---
>           $result['STAT_memused']   = $words[2];
> #            $result['STAT_memused']   = sprintf('%.0f',
> #               $words[2] - $words[4] - $words[5] - $words[6]);
Is there a reason you got rid of that sprintf()?

jre
Sep 2, 2011

To the cloud ?



RFC2324 posted:

Keys work, I've been trying as both root and the cacti user. Rrd is getting NaN, the script is returning -1 for everything.

The script can be easily found by googling ss_get_by_ssh.php (it's the percona script) and all i changed is the free command in the memory section and the way it parses the results(rhel 7.1 has a new free command with slightly different output). My changes work since it gives correct output when it gets any values at all

E: Diff showing my changes below, original can be found at https://github.com/percona/percona-monitoring-plugins/blob/master/cacti/scripts/ss_get_by_ssh.php

code:
<    return "free -ob";
---
>    return "free -wb";
890,891c892,894
<             $result['STAT_memused']   = sprintf('%.0f',
<                $words[2] - $words[4] - $words[5] - $words[6]);
---
>           $result['STAT_memused']   = $words[2];
> #            $result['STAT_memused']   = sprintf('%.0f',
> #               $words[2] - $words[4] - $words[5] - $words[6]);

You've taken out the code which strips the decimal places off, does free -wb always return an int ?

theperminator
Sep 16, 2009

by Smythe
Fun Shoe
The current version of the script doesn't use "free", it uses /proc/meminfo (probably exactly to cover the issue you had)
https://github.com/percona/percona-monitoring-plugins/pull/9
https://github.com/percona/percona-monitoring-plugins/commit/9db2767a18573a5ee3fb99d71e1eaececea792b2#diff-8e4aa3316b9e3f198f1754ff4a124bc2
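To give a feel for the /proc/meminfo approach, here is a minimal Python sketch that parses meminfo-style text and approximates "used" as total minus free, buffers and page cache. This is one common approximation and is not a copy of what the Percona script does. The sample numbers are invented.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value}, converting kB to bytes."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts or not parts[0].isdigit():
            continue
        number = int(parts[0])
        # Most fields carry a "kB" suffix; plain counts (e.g. HugePages_Total) do not.
        if len(parts) > 1 and parts[1] == "kB":
            number *= 1024
        values[name.strip()] = number
    return values

def used_bytes(values):
    """Approximate 'used' as total minus free, buffers and page cache."""
    return values["MemTotal"] - values["MemFree"] - values["Buffers"] - values["Cached"]

# Invented sample values, not real output.
SAMPLE = """\
MemTotal:        8000000 kB
MemFree:         1000000 kB
Buffers:          500000 kB
Cached:          2500000 kB
"""

if __name__ == "__main__":
    print(used_bytes(parse_meminfo(SAMPLE)))
```

The upside over parsing free is that the field names are looked up by key, so a change in column order or count can't silently shift the numbers.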

theperminator fucked around with this message at 00:17 on Jan 19, 2017

RFC2324
Jun 7, 2012

http 418

anthonypants posted:

Is there a reason you got rid of that sprintf()?

Mostly because i wasn't 100% sure what it did, other than some math that seemed off. The numbers i get out of it match reality tho so it doesn't seem to be the break point.

And when it returns numbers cacti handles them ok.


Oh hell. I'll see if i can just upgrade the percona plug-in tomorrow if my last round of things doesn't work overnight.

Thanks!

jre
Sep 2, 2011

To the cloud ?



RFC2324 posted:

Mostly because i wasn't 100% sure what it did, other than some math that seemed off. The numbers i get out of it match reality tho so it doesn't seem to be the break point.

And when it returns numbers cacti handles them ok.

quote:

Cacti currently supports four types of data that RRDTool can represent for any given data source:

COUNTER: is for continuous incrementing counters like the ifInOctets counter in a router. The COUNTER data source assumes that the counter never decreases, except when a counter overflows. It is always a whole INTEGER, floating point numbers are invalid. The update function takes the overflow into account. The counter is stored as a per-second rate. When the counter overflows, RRDTool checks if the overflow happened at the 32bit or 64bit border and acts accordingly by adding an appropriate value to the result.

GAUGE: numbers that are not continuously incrementing, e.g. a temperature reading. Floating point numbers are accepted.

ABSOLUTE: counters that are reset upon reading

DERIVE: like COUNTER but without overflow checks

Is the thing you're returning definitely a gauge ? Otherwise your script will work when the result is 10 but not when it's 9.5

theperminator
Sep 16, 2009

by Smythe
Fun Shoe
It's definitely a GAUGE as the other ones are used for counters that only increment & are derived to a 5 minute average etc.

If the script itself is returning -1's then the issue has nothing to do with cacti.

"free -wb" turns on wide output and counters in bytes so there should be no floats anyway, the only thing I can guess is that this:
code:
if ( preg_match_all('/\S+/', $line, $words) ) {
which splits the line up into an array of words, is sometimes getting confused by the non-whitespace characters, depending on the length of the output fields or something?
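One way to test that guess is to do the same split outside of PHP and check the field count before trusting any index. Below is a minimal Python sketch using re.findall with the same \S+ pattern. The column order (total, used, free, shared, buffers, cache, available) is an assumption about the wide output of the newer free, and the sample output is invented.

```python
import re

NAMES = ["total", "used", "free", "shared", "buffers", "cache", "available"]

def parse_free_wide(output):
    """Extract fields from the 'Mem:' line of `free -wb` output, or None."""
    for line in output.splitlines():
        words = re.findall(r"\S+", line)  # same split as the PHP preg_match_all
        if not words or words[0] != "Mem:":
            continue
        # Refuse to guess if the line does not have exactly the expected columns
        if len(words) != len(NAMES) + 1 or not all(w.isdigit() for w in words[1:]):
            return None
        return dict(zip(NAMES, map(int, words[1:])))
    return None

# Invented sample in the assumed wide layout, not real output.
SAMPLE = """\
              total        used        free      shared     buffers       cache   available
Mem:     8191934464  2147483648  1073741824   104857600   209715200  4760993792  5637144576
Swap:    2147483648           0  2147483648
"""

if __name__ == "__main__":
    print(parse_free_wide(SAMPLE))
```

If this returns None for output captured from a box that misbehaves, the problem is in what free printed or in what came back over ssh, and not in the split itself.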


RFC2324
Jun 7, 2012

http 418

jre posted:

Is the thing you're returning definitely a gauge ? Otherwise your script will work when the result is 10 but not when it's 9.5

It's a gauge, and what data it gets is graphed just fine... the script is just not returning data consistently. This is happening before it even gets to cacti.

E: gonna check for approval to just update the scripts, I prefer the approach of gathering the info from /proc anyway, and it might fix the other issues we are having before I even need to dig into them(crap with DBs, I really don't want to deal with scripts that log into a db to collect performance metrics)
