|
Thanks, I should have posted the full code instead of a snippet but I was accounting for that. It ended up being an issue with another part of my code. Much appreciated, hopefully my questions will improve as I do :P
|
# ? Apr 23, 2023 15:58 |
|
Apologies if this is more of a Linux question. I've heard that using fcntl() to set stdin to non-blocking can affect other processes in weird ways. Is that true? I'd test it myself but I don't know what I'd be looking for in terms of weird behavior.

If so, if you dup() stdin and set the new file descriptor to non-blocking, does that affect stdin in the same way? The reason I ask is that while the fds are said to be interchangeable, they don't share flags and I don't know if that covers nonblocking.

edit: it seems like the recommended way is to not even try and just use select, poll, epoll, etc. but for something extremely simple I was wondering if I could get away with just a nonblocking fd.

BattleMaster fucked around with this message at 22:46 on Apr 25, 2023 |
# ? Apr 25, 2023 22:41 |
|
BattleMaster posted:
Apologies if this is more of a Linux question. I've heard that using fcntl() to set stdin to non-blocking can affect other processes in weird ways. Is that true? I'd test it myself but I don't know what I'd be looking for in terms of weird behavior.

If you make stdin nonblocking and a child process inherits it from you, they're going to be very confused if they're expecting to block when they read from it.

quote:
If so, if you dup() stdin and set the new file descriptor to non-blocking, does that affect stdin in the same way? The reason I ask is that while the fds are said to be interchangeable, they don't share flags and I don't know if that covers nonblocking.

File descriptors don't share fd flags (F_GETFD/F_SETFD), but the underlying open file they point to does share file status flags (F_GETFL/F_SETFL), so a dup()ed descriptor sees the same O_NONBLOCK setting as the original. The only fd flag that exists on Linux is FD_CLOEXEC; everything else (O_NONBLOCK, O_APPEND, etc.) is a file flag.

quote:
edit: it seems like the recommended way is to not even try and just use select, poll, epoll, etc. but for something extremely simple I was wondering if I could get away with just a nonblocking fd.

If nothing's inheriting stdin from you, do whatever you want. (But how exactly are you expecting to use a nonblocking fd without using select/poll/epoll? Just eating up an entire core and spinning when you don't have any input available to read?)
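For anyone who wants to see the flag sharing for themselves, here is a minimal sketch. The function name is made up for illustration, and it assumes the descriptor starts out blocking (otherwise the answer is trivially "yes"). It sets O_NONBLOCK on a dup() of the descriptor, checks whether the original now reports it too, and then restores the flags.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns true if setting O_NONBLOCK on a dup() of `fd`
// also shows up on `fd` itself, i.e. the two descriptors share file status
// flags. Returns false on any error, to keep the sketch short.
bool nonblock_leaks_through_dup(int fd) {
    int copy = dup(fd);
    if (copy < 0) return false;

    bool shared = false;
    int flags = fcntl(copy, F_GETFL);
    if (flags >= 0 && fcntl(copy, F_SETFL, flags | O_NONBLOCK) == 0) {
        // Read the flags back through the ORIGINAL descriptor.
        int orig_flags = fcntl(fd, F_GETFL);
        shared = orig_flags >= 0 && (orig_flags & O_NONBLOCK) != 0;
        // Put the flags back the way we found them.
        fcntl(copy, F_SETFL, flags);
    }
    close(copy);
    return shared;
}
```

Calling it on one end of a fresh pipe() should report that the flag is shared, which is the behavior described above.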
|
# ? Apr 28, 2023 01:21 |
|
Sleep between retries
|
# ? Apr 28, 2023 01:42 |
|
I was thinking about something where I need to check stdin periodically before immediately going back to other stuff, but in that case select() with a zero timeout (a timeval set to 0, not a NULL pointer, which would block indefinitely) is probably superior anyway.
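A minimal sketch of that kind of check, assuming the descriptor number is below FD_SETSIZE. The helper names are invented for this example.

```cpp
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: returns true if `fd` has something to read right now.
// Never blocks. Returns false on error as well.
bool readable_now(int fd) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    // A zeroed timeval means "check and return immediately".
    // Passing a null pointer instead would block until fd becomes readable.
    struct timeval tv = {0, 0};
    int n = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    return n > 0 && FD_ISSET(fd, &readfds);
}

// Convenience wrapper for the stdin case discussed above.
bool stdin_readable_now() {
    return readable_now(STDIN_FILENO);
}
```

The fd itself stays in blocking mode, so nothing that shares it is affected. The read() that follows a true result can still block in rare cases if something else drains the input first.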
|
# ? Apr 28, 2023 04:58 |
|
Planning to work with some std::atomic<> here, especially after C++20 introduced std::atomic<>::wait() and std::atomic<>::notify_one()/notify_all(). Has anybody worked with that and has some idea or experience w.r.t. thread cancellation? wait() is noexcept, so it's not like it will throw upon the destruction of its std::atomic...
|
# ? May 22, 2023 20:04 |
|
Thread cancellation is an extremely bad idea that mostly doesn't work.
|
# ? May 23, 2023 16:08 |
|
And condition variables are a great example of a situation where thread cancellation is just kind of unfixable.
|
# ? May 23, 2023 19:26 |
|
Wipfmetz posted:
Planning to work with some std::atomic<> here, especially after C++20 introduced std::atomic<>::wait() and std::atomic<>::notify_one()/notify_all().

c++20 introduced "cooperative thread cancellation" which is supposed to somehow work better than that crazy pthread thread cancelling bullshit but i haven't actually tested it yet because i know i'll spend a week playing with it and will probably decide i hate it in the end. it might work and be good though. thread cancellation in general though is a bad idea, i agree.
|
# ? May 23, 2023 20:36 |
|
The stop_token stuff is basically just some helpers for manually implementing task cancellation when you need something more complicated than a bool you check occasionally. I haven't had a chance to actually use it yet but it looks sane.
|
# ? May 23, 2023 20:49 |
|
So i guess i'd better use a common condition variable to hand over work items to worker threads and use a good old-fashioned bool to indicate "no more work incoming, just return from your thread's mainloop, thank you".
|
# ? May 24, 2023 07:23 |
|
The simpler, the better. Prefer coordinating with running threads through shared-memory data structures. A shared stop variable for telling workers to exit works well because it's monotonic: it only ever goes from "keep running" to "stop". Just make sure, with an atomic counter, that all threads have finished before resetting it. A barrier, basically.
|
# ? May 24, 2023 09:20 |
|
Wipfmetz posted:
So i guess i'll better use a common condition variable to handover work items to worker threads and use a good oldfashioned bool to indicate "no more work incoming, just return from your thread's mainloop, thank you".

(And you shouldn't need any extra objects to wait for the threads to all finish, because there's join for that, assuming when you reach that point you're waiting for all the worker threads or specific worker threads to complete.)
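As a rough sketch of the design being discussed (one condition variable, a plain bool guarded by the same mutex, and join at the end), something like the following would do. The class and function names are invented for illustration, and it deliberately sticks to C++11 facilities rather than C++20's stop_token.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical minimal work queue; names are illustrative only.
class WorkQueue {
public:
    void push(int item) {
        {
            std::lock_guard<std::mutex> lock(m_);
            items_.push(item);
        }
        cv_.notify_one();
    }

    // "No more work incoming." The flag is set under the mutex so a worker
    // cannot miss the wakeup between checking it and going to sleep.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            done_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until an item is available, or the queue is closed and drained.
    // Returns false when the worker should leave its main loop.
    bool pop(int& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return done_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> items_;
    bool done_ = false;
};

// Sums 1..n using `workers` threads, then joins them all.
long sum_with_workers(int n, int workers) {
    WorkQueue q;
    std::mutex total_m;
    long total = 0;

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&] {
            int item;
            while (q.pop(item)) {
                std::lock_guard<std::mutex> lock(total_m);
                total += item;
            }
        });
    }
    for (int i = 1; i <= n; ++i) q.push(i);
    q.close();
    for (auto& t : pool) t.join();  // join is the only "all done" signal needed
    return total;
}
```

Because the stop flag never gets reset in this sketch, there is no need for the atomic counter mentioned earlier; that only matters if the same queue is reused for another round.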
|
# ? May 24, 2023 13:39 |
|
Anyone have a profiler for C that runs on Windows that they like? The software we write at work has been running slower and slower all year. I'd like to find out where the problem code is and go streamline it. My backup plan is to set up a Linux VM and use gprof. I'd really prefer to find a Windows solution, though. If I have to use a Linux VM I'll be the only one able to run it.
|
# ? May 26, 2023 21:28 |
|
LLSix posted:
Anyone have a profiler for C that runs on Windows that they like? The software we write at work has been running slower and slower all year. I'd like to find out where the problem code is and go streamline it

It’s been a few years, but last I did this on Windows it was with Intel VTune and it worked OK. I’ve also heard good things about https://github.com/VerySleepy/verysleepy but haven’t used it myself.
|
# ? May 26, 2023 21:34 |
|
If you do set up Linux, don't use gprof; it's pretty much useless for stuff that's not completely blatant. linux-perf is a far more sophisticated tool. Also I guess oprofile, though I think that's semi-deprecated in favor of linux-perf?

I haven't used it myself[1], but I think MS provides ETW (Event Tracing for Windows) tooling for free, which is supposedly super-sophisticated, though a lot of its functionality is whole-system. There are some instructions on https://randomascii.wordpress.com/2015/09/01/xperf-basics-recording-a-trace-the-ultimate-easy-way/ (blog of a Chrome person who mostly seems to use it to find Windows bugs).

[1]... though I may be suppressing an unsuccessful attempt.
|
# ? May 26, 2023 22:06 |
|
Subjunctive posted:
It’s been a few years, but last I did this on Windows it was with Intel VTune and it worked OK.

VTune improved a ton in the last few years and is now completely free as part of the oneAPI basekit.
|
# ? May 26, 2023 23:15 |
|
Beef posted:
VTune improved a ton in the last few years and is now completely free as part of the oneAPI basekit.

Yeah I was just noticing that I didn’t really recognize any of the screenshots on the site. Good that they’re still investing in it, hope they aren’t doing anything shady with AMD processors like that library they had that checked CPUID and chose to suck for non-Intel.
|
# ? May 27, 2023 00:02 |
|
I found something that is making me tear my hair out using gcc and glibc 2.31 - if that even matters, maybe I'm doing it wrong. For a program that deals with TCP/IP sockets and deals with a lot of socket file descriptors, I am using search.h's tree functions to map file descriptors to associated data. So I malloc the structures and put them on the tree (non-essentials and error checking, which I do, omitted): code:
code:
However, the problem is that I malloced the event and I should free it at some point otherwise this thing would leak memory for every client. If I uncomment the free statement the whole thing goes haywire. The tree gets jacked up and things in it are no longer reliably found, and sometimes the whole thing just blows up entirely with free complaining about me double-freeing something. Here's it running into a double free: code:
code:
I just don't understand why freeing stuff I malloced myself goes back and mangles the tree!
|
# ? May 28, 2023 03:16 |
|
Try with valgrind (or asan/msan)?
|
# ? May 28, 2023 03:25 |
|
tfind returns a pointer to the internal tree node (the first element of which is the pointer that you passed in). After you call tdelete, that internal node is no longer valid, so you can't then go accessing its first element and expect to get your original data pointer out.
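To make the ordering concrete, here is a small sketch against the POSIX search.h tree functions. The Event struct and helper names are invented for this example and are not the original poster's code. The key point is in remove_fd: the data pointer is copied out of the node before tdelete runs, and only that saved copy is freed.

```cpp
#include <search.h>
#include <stdlib.h>

struct Event {  // hypothetical per-fd record
    int fd;
};

static int compare_events(const void* a, const void* b) {
    const Event* ea = static_cast<const Event*>(a);
    const Event* eb = static_cast<const Event*>(b);
    return (ea->fd > eb->fd) - (ea->fd < eb->fd);
}

// Adds a record for `fd`. Returns false on allocation failure or duplicate.
bool add_fd(void** root, int fd) {
    Event* ev = static_cast<Event*>(malloc(sizeof *ev));
    if (!ev) return false;
    ev->fd = fd;
    void* node = tsearch(ev, root, compare_events);
    if (!node || *static_cast<Event**>(node) != ev) {
        free(ev);  // out of memory, or an entry for this fd already existed
        return false;
    }
    return true;
}

bool has_fd(void** root, int fd) {
    Event key = {fd};
    return tfind(&key, root, compare_events) != nullptr;
}

// Removes and frees the record for `fd`. Returns false if it was not found.
bool remove_fd(void** root, int fd) {
    Event key = {fd};
    void* node = tfind(&key, root, compare_events);
    if (!node) return false;
    // Copy our pointer out of the node BEFORE tdelete invalidates the node.
    Event* ev = *static_cast<Event**>(node);
    tdelete(&key, root, compare_events);
    free(ev);
    return true;
}
```

Reading `*node` after the tdelete call is the bug described above: at that point the node belongs to the allocator again, so whatever comes back is not reliably the original pointer.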
|
# ? May 28, 2023 03:29 |
|
What's tdelete actually do? If I was getting those kinds of errors with a linked list I'd think the delete function was missing a found->prev->next = found->next. i.e. it sounds kind of like however the items are being stored still has a pointer to the memory being "deleted" and is accessing it after the free call. That you're sometimes having it find something already deleted, like in your double free example, is what is making me the most suspicious that delete isn't working right.

Subjunctive posted:
Removing fd 5
|
# ? May 28, 2023 03:34 |
|
Jabor posted:
tfind returns a pointer to the internal tree node (the first element of which is the pointer that you passed in). After you call tdelete, that internal node is no longer valid, so you can't then go accessing its first element and expect to get your original data pointer out.

gently caress, this was it. Thank you so much! (and thanks to everyone else, I even had valgrind installed but didn't think to use it before, and it was helping me edge slowly closer to this)

I need to dereference the pointer to my pointer before tdelete removes the node. This works: code:

LLSix posted:
That you're sometimes having it find something already deleted like in your double free example is what is making me the most suspicious delete isn't working right.

This quoted part is actually working properly even if the output looks weird - it reports the FDs before it checks them: code:
BattleMaster fucked around with this message at 03:52 on May 28, 2023 |
# ? May 28, 2023 03:48 |
|
You are closing all these fds too at some point, right?
|
# ? May 28, 2023 05:18 |
|
Presto posted:
You are closing all these fds too at some point, right?

Oh yeah, that was the easy part. I wrote an echo server that has backends that use select, poll, epoll, and io_uring (with and without liburing) and loving tsearch ended up being the hardest thing for me to figure out apparently.

Here's a strace of one connection using the epoll backend: code:
I'm sure that people who do this for a living know all this but it's interesting to me as an amateur who bangs their head against it until it works
|
# ? May 28, 2023 05:31 |
|
BattleMaster posted:
Oh yeah, that was the easy part. I wrote an echo server that has backends that use select, poll, epoll, and io_uring (with and without liburing) and loving tsearch ended up being the hardest thing for me to figure out apparently.

If this isn’t just to mess with sockets, you don’t need to relate these things yourself. Epoll lets you attach an epoll_data to each fd, which it hands back to you from epoll_wait: https://man7.org/linux/man-pages/man2/epoll_ctl.2.html

Pretty sure io_uring also supports this; don’t think poll does.

Sweeper fucked around with this message at 12:14 on May 28, 2023 |
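A small sketch of the epoll user-data round trip, assuming Linux. The Conn struct and the helper names are hypothetical; the epoll calls themselves are the standard ones from the man page linked above.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

struct Conn {  // hypothetical per-connection state
    int fd;
    int user_id;
};

// Registers conn->fd for readability and stores `conn` as the user data.
bool watch(int epfd, Conn* conn) {
    struct epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.ptr = conn;  // epoll hands this pointer back untouched
    return epoll_ctl(epfd, EPOLL_CTL_ADD, conn->fd, &ev) == 0;
}

// Waits up to timeout_ms for one event. Returns the Conn that was attached
// to the ready fd, or nullptr on timeout or error.
Conn* wait_one(int epfd, int timeout_ms) {
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? static_cast<Conn*>(ev.data.ptr) : nullptr;
}

// Removes the fd from the watch list first, then closes it, so epoll is
// never left tracking a descriptor the program has already given up.
void unwatch_and_close(int epfd, Conn* conn) {
    epoll_ctl(epfd, EPOLL_CTL_DEL, conn->fd, nullptr);
    close(conn->fd);
}
```

With the pointer stored in the event, no separate fd-to-data lookup is needed on the hot path. A lookup structure is still useful for operations that start from an fd rather than from an event.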
# ? May 28, 2023 12:12 |
|
Yeah in my first epoll version, which is also the multiplexing API I learned first, I used its userdata to store information on what the event was and what user id it belonged to. The user id was also the array index of where I stored the sockets and other information for each user. So I had an enum like code:
code:
Essentially: code:
In the end, I made a poll list big enough for the listening socket and the max number of users. The first entry would always be the listening socket. Whenever users disconnected I repacked it using memmove to make sure it never had any holes (not necessary except for performance, maybe - the fds could just be set to negative numbers to get poll to skip them). I would also keep a mapping between poll_list index and what user it belonged to: code:
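The poster's own listing is not preserved here. As an independent sketch of the repacking step just described, assuming slot 0 always holds the listening socket, the removal could look like this (function names are made up):

```cpp
#include <poll.h>
#include <string.h>

// Hypothetical helper: removes entry `index` from a packed pollfd array of
// `count` entries by sliding the tail down one slot. Returns the new count.
// Slot 0 is assumed to be the listening socket and is never removed here.
size_t remove_poll_entry(struct pollfd* list, size_t count, size_t index) {
    if (index == 0 || index >= count) return count;
    memmove(&list[index], &list[index + 1],
            (count - index - 1) * sizeof list[0]);
    return count - 1;
}

// The cheaper alternative mentioned above: leave the hole in place and make
// poll() skip it. poll() ignores entries whose fd is negative.
void disable_poll_entry(struct pollfd* list, size_t index) {
    list[index].fd = -1;
}
```

Any parallel array that maps poll-list index to user would have to be shifted with the same memmove so the two stay in step.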
Now the current reason I'm using tsearch along with epoll, though, is that I'm writing a more generic event loop that lets you add and remove fds and have them be polled without having to hardcode them in, as with that enum in my previous epoll setup. So I have functions that let you add and remove fds, so it needs a way to map those fds to the rest of its internal data, which I'm using tsearch for. The tree is where it keeps all the data associated with the fd, instead of a fixed-size structure like before. The actual event loop uses the userdata of epoll to contain a pointer directly to the data structure for that fd, which contains the callback function and other information. So my event handling looks like code:
Like all the criticism of epoll seems to be from people doing stupid boneheaded bullshit with it like sharing it between processes or closing FDs without removing them from epoll's watch list and expecting epoll to know they're no longer relevant. But in my experimentation it seems pretty slick and the inclusion of the userdata that lets you identify each event however you want (by the FD, by a pointer, or by an arbitrary integer) was a really good choice. tl;dr I/O multiplexing is a land of contrasts
|
# ? May 28, 2023 18:23 |
|
BattleMaster posted:
Yeah in my first epoll version, which is also the multiplexing API I learned first, I used its userdata to store information on what the event was and what user id it belonged to. The user id was also the array index of where I stored the sockets and other information for each user.

epoll has some non-trivial edge cases around the handling of level-triggered vs. edge-triggered modes, etc. It’s basically only safe to use from multiple threads in one configuration (edge-triggered with EPOLLONESHOT?) or something. I forget the details exactly, but it definitely has warts.

Generally I’ve found all of these APIs pretty slow, so we end up bypassing the kernel anyway and I don’t have to call poll; I just spin on an ef_vi handle.
|
# ? May 28, 2023 20:50 |
|
ef_vi is definitely a bit more than I think I can handle. I'm okay with janitoring system calls but I don't think I'm ready for doing the lower-layer protocols myself.

They added the EPOLLEXCLUSIVE flag for having multiple epoll FDs monitor the same FD, like a listening socket. If you open the socket and then make threads or fork off new processes and then they all make separate epoll FDs that monitor it with the EPOLLEXCLUSIVE flag ORed in, "one or more" of them will wake up. If you don't use this flag, more or all of them will likely wake up. However, the worst thing that happens is a thundering herd; you don't run into any weird bugs or edge cases with epoll.

Some people have run into problems making one epoll FD and then sharing it across threads or processes which seems like an outright bad idea. There's so little cost to making an epoll FD that I don't see why you wouldn't just make one per thread or process. Just utter madness. You try sharing one select fd_set or one poll event array between multiple threads and see how they like it.

Same with closing FDs but not informing epoll that you no longer want to monitor them. You try that with poll or select and they won't be happy either. epoll is better than those (at least for many fds) but it's not magic.

In my experimentation, the actual best way to do this is to do all your forking first, then have each process open its own listening socket, setting the SO_REUSEADDR and SO_REUSEPORT sockopts before binding the address. Each process can monitor that socket however they want, using whatever multiplexing scheme or even just doing blocking I/O on it (hundreds of processes with their own socket doing blocking I/O isn't even that bad in my experimentation but doesn't scale as well as any of the multiplexing schemes, even select), but the key thing is that if each process or thread has its own socket, the kernel only sends a given incoming connection to one of them.
No thundering herd problem, no weird fighting between epoll instances or anything like that. And the kernel is pretty good about load balancing. The disadvantage is that if you do this on a port above 1023, theoretically a rogue process could listen on the same port and steal a portion of your connections. But maybe you have worse problems if that's happening. BattleMaster fucked around with this message at 21:24 on May 28, 2023 |
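A minimal sketch of the per-process listener setup described above, assuming Linux and IPv4 loopback purely to keep it short. The function names are invented. Each worker would call make_reuseport_listener with the same port after forking.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a TCP listener on 127.0.0.1:`port` with
// SO_REUSEADDR and SO_REUSEPORT set before bind(). Returns the socket,
// or -1 on error.
int make_reuseport_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int one = 1;
    struct sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    // Both options must be set before bind() for the port sharing to apply.
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one) < 0 ||
        bind(fd, reinterpret_cast<struct sockaddr*>(&addr), sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Returns the port a bound socket ended up on (handy after binding port 0),
// or -1 on error.
int bound_port(int fd) {
    struct sockaddr_in addr = {};
    socklen_t len = sizeof addr;
    if (getsockname(fd, reinterpret_cast<struct sockaddr*>(&addr), &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Two sockets created this way on the same port can both listen at once, and the kernel then picks one of them for each incoming connection.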
# ? May 28, 2023 21:15 |
|
BattleMaster posted:
I rather like epoll compared to the other options, even the more experimental stuff I've tried with io_uring. io_uring is great for queuing up batches of I/O events but I don't really like it so far for actually just polling FDs. (io_uring has support for interacting with epoll, but it's now deprecated, no joke, because in a listserv message between io_uring's creator and Linus Torvalds, Linus said offhandedly that he hated epoll. So my fantasy of coming up with some way to mix epoll and io_uring in a way that is faster than either was dead before it started.)

https://lore.kernel.org/io-uring/20230501185240.352642-1-info@bnoordhuis.nl/T/#u

If you mean this, I think you're safe. I don't know what the dislike for it was based on.
|
# ? May 30, 2023 04:00 |
|
That's good news. Getting rid of it because of Torvalds' whim was pretty lame. I honestly don't know why he has such a bug up his rear end about it, but the original conversation I referred to is here. Torvalds busts into a thread about io_uring for no real reason other than to say that "epoll is the worst possible implementation of a horribly bad idea, and one of the things I would really want people to kill off" and that he hopes io_uring helps kill it off. So Axboe says (paraphrased) "well we can get rid of epoll support in io_uring to help kill off epoll." Like maybe Torvalds doesn't like the way it's implemented in the kernel or something, but I don't really see how the system calls could be any different and I don't see how polling FDs for readiness is a bad idea anyway. Maybe he doesn't like having the kernel manage the watchlist? io_uring is kind of worse at polling FDs than epoll, both in terms of API (io_uring polls are one-shot only so they need to be resubmitted every time they fire, and have fewer options like no ability to do edge triggering, and the poll results get mixed into the completion queue along with every other operation completion so you never get a list of just what FDs are ready) and benchmarks for things that replaced epoll with io_uring are often a little slower. From my outsider's view the epoll API is pretty solid and it does and works the way I expected it to. It's simpler and easier to effectively use than select and poll, although with the disadvantage that you need to make system calls to add/modify/remove FDs in the polling list. io_uring's support for epoll seems like it actually has the possibility to work well alongside epoll. The fact that you can queue up a large number of epoll_ctls and submit them with just one syscall (not to mention all the accepts, reads, writes, and closes you will be doing) helps mitigate that disadvantage. 
You can actually do a server with io_uring without doing polling at all, by queuing up blocking operations (accept when you don't know there's an incoming connection, read when you don't know there's waiting data, etc.) which io_uring will handle asynchronously, and you can handle them when they eventually show up in the completion queue. But that doesn't seem necessarily very good and it may be faster or less resource intensive to just poll them and queue up the operations when you know they'll be handled quickly instead of blocking in the io_uring shadow realm. BattleMaster fucked around with this message at 21:17 on May 30, 2023 |
# ? May 30, 2023 10:54 |
|
Subjunctive posted:
It’s been a few years, but last I did this on Windows it was with Intel VTune and it worked OK.

Thank you. VerySleepy is exactly what I was looking for. Best of all, it's already on the list of pre-approved software.

Surprisingly, it seems like almost all the slowness we've been seeing is due to I/O, mostly due to a deeply stupid core architecture decision and partly because we issue a couple hundred alarms on startup with our development data. I guess maybe we should fix the developer data so it stops issuing some of those alarms.
|
# ? May 31, 2023 00:54 |
|
Screaming into the void again, I'm doing some more stuff with io_uring. It has an opcode (IORING_OP_LINK_TIMEOUT, added with io_uring_prep_link_timeout in liburing) that attaches a timer to an entry being submitted and cancels that entry if the timer expires before it completes. It works fine but the return value is useless and also not properly documented. It says it returns 0 on a success but there's no scenario where it returns 0. If no actual error happens, it returns -ETIME if the timeout occurred or -ECANCELED if the timeout was canceled because the attached entry completed first. Also, -ETIME means a cancellation is ATTEMPTED, not that the cancellation succeeded. So the only way to find out if the cancellation went through is to find the entry for the thing you wanted to cancel in the completion queue and check if the return value was -ECANCELED. So the completion queue is crapped up with a useless entry with a useless return value.

Also it would have been really nice if the completion queue entries contained the associated opcode to provide context for what the userdata means, especially if I use a pointer for the userdata that could point to different things depending on what the opcode was. (I could have the userdata point to a structure that has an enum and a void pointer in it but ehhh)

edit: Also would have been cool if they implemented an opcode for recvfrom so I could use it for UDP without having to use recvmsg which has all kinds of features I don't want

BattleMaster fucked around with this message at 02:31 on Jun 2, 2023 |
# ? Jun 2, 2023 02:24 |
|
In code like C++ code:
If Foo wasn't a template, you'd do C++ code:
I'm bad at SFINAE and can't figure out the right way to write it
|
# ? Jun 2, 2023 03:28 |
|
C++ code:
|
# ? Jun 2, 2023 03:41 |
|
Foo<A,B>::ThingToFormat is a non-deduced context, so the straightforward version of that pattern isn't valid. See https://godbolt.org/z/E6EGd5MYG
|
# ? Jun 2, 2023 04:31 |
|
Ah right, because the template parameters are before the ::. I think you just have to unnest the types and it can't be done with nested types? Something like: C++ code:
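The code from the posts above is not preserved, so the following is an independent sketch of the unnesting idea rather than anyone's original listing. The `is_thing` trait is a made-up stand-in for whatever is being specialized (a formatter, for instance), and `Bar`/`BarThing` are hypothetical names.

```cpp
#include <type_traits>

// Nested version: the shape from the question.
template <class A, class B>
struct Foo {
    struct ThingToFormat {};
};

// Hypothetical trait standing in for the thing being specialized.
template <class T>
struct is_thing : std::false_type {};

// This partial specialization can never be selected, because A and B appear
// to the left of "::", which is a non-deduced context. Compilers either
// reject it or warn that it will never be used:
//
//   template <class A, class B>
//   struct is_thing<typename Foo<A, B>::ThingToFormat> : std::true_type {};

// Unnested version: the inner type becomes its own template...
template <class A, class B>
struct BarThing {};

// ...and the outer class just exposes it under the old nested name.
template <class A, class B>
struct Bar {
    using ThingToFormat = BarThing<A, B>;
};

// Now A and B are deducible, so the partial specialization works.
template <class A, class B>
struct is_thing<BarThing<A, B>> : std::true_type {};
```

With this layout `is_thing<Bar<int, char>::ThingToFormat>::value` is true, while the nested `Foo` version stays on the primary template.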
|
# ? Jun 2, 2023 05:01 |
|
BattleMaster posted:
Screaming into the void again, I'm doing some more stuff with io_uring. It has an opcode (IORING_OP_LINK_TIMEOUT, added with io_uring_prep_link_timeout in liburing) that has the ability to attach a timer to an entry being submitted that will cancel it if the timer ticks over before it's completed.

On the one hand I could joke about Meta and unfinished/unspecified behaviour, but on the other that looks really nice and I would have been extremely happy to have it some years back when I was last doing kernel stuff.
|
# ? Jun 2, 2023 09:44 |
|
Private Speech posted:
On the one hand I could joke about Meta and unfinished/unspecified behaviour, but on the other that looks really nice and I would have been extremely happy to have it some years back when I was last doing kernel stuff.

I have gripes with io_uring but it's really pretty awesome. It's an actually useful way to do asynchronous I/O and not only that but you can submit hundreds of I/O calls all at once with one syscall or even set it up to read the queues automatically and do a whole server with no system calls.
|
# ? Jun 3, 2023 03:26 |
|
I have this problem: when I run my program on different computers (x64), some results aren't the same for the same inputs when using "sin" (and probably other trigonometric functions and maybe more). By not the same I mean I need them to be identical, not just close enough. Is there something easy that can be done about this?

I've noticed that in Visual Studio, under code generation, I have the runtime library set to "Multi-threaded DLL (/MD)". This apparently means that it will use whatever is available on the user's computer and it can give slightly different results?

1. Will switching to /MT help?
2. If this helps and one of the libraries I use is closed source, does that mean I'm hosed if they only have their library built with /MD?

I wanted to ask before I spend my day rebuilding all the stuff. I have some recollection that I've changed between /MD and /MT before, but no idea why that was.
|
# ? Jun 26, 2023 07:43 |