ultrafilter
Aug 23, 2007

It's okay if you have any questions.


camoseven posted:

I usually code on OSX, but I've started doing a little on the Windows Subsystem for Linux (Ubuntu) for convenience sake. Is there a command like 'open' that I can use? On OSX that would open the Finder (File Explorer) GUI in the folder you are currently in, but it's not built-in to WSL and I know there's some weirdness about even manual navigating to those files in the GUI.

From the terminal explorer.exe $dirname will do what you want. I assume there's nothing weird about invoking it programmatically but Windows so who knows?

camoseven
Dec 30, 2005

RODOLPHONE RINGIN'

ultrafilter posted:

From the terminal explorer.exe $dirname will do what you want. I assume there's nothing weird about invoking it programmatically but Windows so who knows?

This is perfect, thanks so much!

e: just aliased open='explorer.exe' in my bash profile and life is GOOD!

camoseven fucked around with this message at 19:10 on May 16, 2022

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Zephirus posted:

Good azure info...

Thanks!

Computer viking
May 30, 2011
Now with less breakage.

In windows shell, "start" does the same thing - and in a typical linux desktop you have "xdg-open". The latter takes an argument, but it can be . to open the current directory - so if you have a linux desktop installed in WSL, "xdg-open ." should pop up a linux file manager window through the magic of WSLg.

Generally speaking, you can start windows programs from WSL by giving a program name with the .exe - so for your specific use you can try "explorer.exe ." to open the current directory in windows explorer.

E: f;b

qsvui
Aug 23, 2003
some crazy thing

ultrafilter posted:

From the terminal explorer.exe $dirname will do what you want. I assume there's nothing weird about invoking it programmatically but Windows so who knows?

There shouldn't be, this exact usage is mentioned in the WSL docs: https://docs.microsoft.com/en-us/windows/wsl/setup/environment#file-storage

PIZZA.BAT
Nov 12, 2016


:cheers:


This is actually gonna be several questions in one post:

Does anyone have a suggestion for a simple program that can be configured to watch a specified window and take a screenshot at some interval, say- once per second, and dumping it into a folder somewhere?

Next- I’ve been reading that when feeding images into a neural network it’s best to ‘denoise’ it as much as possible to just make things easier for training / running when you’re live as there will be less chance for weird errors. Does anyone know of a guide that provides some tips on what filters or whatever to run the images through to do that?

Finally- what’s the best or easiest way to get a java program to interact with another window? I’m assuming finding the target window and sending keystrokes is easiest but I’m seeing all sorts of stuff when I look this up and as is typical with google most of the results are from 2008

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

PIZZA.BAT posted:

This is actually gonna be several questions in one post:

Does anyone have a suggestion for a simple program that can be configured to watch a specified window and take a screenshot at some interval, say- once per second, and dumping it into a folder somewhere?

Next- I’ve been reading that when feeding images into a neural network it’s best to ‘denoise’ it as much as possible to just make things easier for training / running when you’re live as there will be less chance for weird errors. Does anyone know of a guide that provides some tips on what filters or whatever to run the images through to do that?

Finally- what’s the best or easiest way to get a java program to interact with another window? I’m assuming finding the target window and sending keystrokes is easiest but I’m seeing all sorts of stuff when I look this up and as is typical with google most of the results are from 2008

1. AutoIt I think can do this

2. Simplest thing is probably to reduce the resolution of your input image.

3. https://docs.microsoft.com/en-us/windows/win32/apiindex/windows-api-list https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-sendmessage
But maybe it's easier to just use AutoIt

Gothmog1065
May 14, 2009
Hey guys, want to make sure I'm not missing some obvious faster solution. Currently using BASH, but shell doesn't matter, it's simplistic enough.

I have a directory with files in it, and each file has a bunch of lines (HL7 messages for those who know). The 'header' or the beginning of the line basically looks like this:

code:
MSH|^~\&|<string>|<string>|||<date>|<string>|<string>|<ID>|<more strings>|<loads of other data separated by a pipe>
What I'm doing is looping over each file, then each line within the file, awking out the 10th item separated by |, and using that as a comparator (if it's before a specific number, throw it away; if it is that number or after, write it to a new file). Code:

code:
path="/a/path/here"
for file in ${path}/*; do
  while read -r msg; do
    id=$(echo ${msg} | awk -F '|' '{print $10}')
    if (( ${id} < 12345 )); then
      continue
    else
      echo $msg >> ${file}.new
    fi
  done < ${file}
done
Is there a more efficient way to do this? The biggest thing is having to pull that specific value from each line, where each line can be a few bytes to a few kb (or more) in size.

nielsm
Jun 1, 2009



AWK should already be working on the input line by line so you don't need to loop over the file contents in the shell. It can also do the conditional.

code:
path="/a/path/here"
for file in ${path}/*; do
  awk -F '|' '$10 >= 12345' "${file}" > "${file}.new"
done
(Note: Totally untested.)

ArcticZombie
Sep 15, 2010
AWK can itself do everything you're doing in bash:

code:
awk -F '|' '$10 >= 12345 { newfile = FILENAME ".new"; print $0 > newfile }' /path/to/files/*
You should get a significant speed up by not having to fork/exec over and over again and letting AWK do all the work. This could probably be faster if it didn't redefine the variable for every matching record every time and instead only updated it when a new file was encountered, but I don't know AWK well enough to do that off the top of my head.
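A minimal sketch of the once-per-file idea mentioned above, assuming a reasonably standard awk: FNR resets to 1 at the start of each input file, so the output name can be set there instead of on every matching record. The temp directory, the .hl7 file names, and the sample records are made up for the demo, and the comparison uses >= so the cutoff ID itself is kept.

```shell
#!/usr/bin/env bash
# Demo data (hypothetical): two small files whose 10th pipe-separated field is the ID.
demo_dir=$(mktemp -d)
printf 'MSH|^~\\&|a|b|||d|e|f|12344|x\nMSH|^~\\&|a|b|||d|e|f|12345|y\n' > "$demo_dir/one.hl7"
printf 'MSH|^~\\&|a|b|||d|e|f|20000|z\n' > "$demo_dir/two.hl7"

awk -F '|' '
  FNR == 1 {                              # first record of each input file
    if (newfile != "") close(newfile)     # release the previous output file
    newfile = FILENAME ".new"             # computed once per file
  }
  $10 + 0 >= 12345 { print > newfile }    # keep IDs at or after the cutoff
' "$demo_dir"/*.hl7

# Show what was kept.
cat "$demo_dir"/*.new
```

The close() call is there so a run over many files does not hold every output file open at once; whether the once-per-file assignment is measurably faster than the one-liner is something to time on real data rather than assume.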

ArcticZombie fucked around with this message at 19:00 on May 17, 2022

Gothmog1065
May 14, 2009

ArcticZombie posted:

AWK can itself do everything you're doing in bash:

code:
awk -F '|' '$10 >= 12345 { newfile = FILENAME ".new"; print $0 > newfile }' /path/to/files/*
You should get a significant speed up by not having to fork/exec over and over again and letting AWK do all the work. This could probably be faster if it didn't redefine the variable for every matching record every time and instead only updated it when a new file was encountered, but I don't know AWK well enough to do that off the top of my head.

Yeah, I've begun to realize how much more useful awk is, just haven't had the opportunity to dig into it

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Gothmog1065 posted:

Yeah, I've begun to realize how much more useful awk is, just haven't had the opportunity to dig into it

Just curious, how big of a file are you looking at? Are you wanting to optimize in an effort to better learn the tools, or is your script taking a non-negligible time to run on big files?

I really enjoy having problems that give me a chance to aim for meaningful optimizations.

ExcessBLarg!
Sep 1, 2001
Eh, skip awk and go straight to:
code:
ruby -e 'ARGV.each {|f| File.open("#{f}.new","w") {|n| File.foreach(f) {|l| l.split("|")[9].to_i >= 12345 and n.puts(l)}}}' "$path"/*
Note that it will create (and truncate) a .new file even if no records are inserted into it. But it does keep you from constantly reopening the file.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Someone give us some perl magic for that.

Gothmog1065
May 14, 2009

Hughmoris posted:

Just curious, how big of a file are you looking at? Are you wanting to optimize in an effort to better learn the tools, or is your script taking a non-negligible time to run on big files?

I really enjoy having problems that give me a chance to aim for meaningful optimizations.

It should have been a one-off thing in all actuality. I'll post back tomorrow, but some of the files are probably 3-400k lines long, I don't remember the file sizes off the top of my head. I just know the original code I posted took an hour or more to do. What I ended up with was a 'multi-thread' deal where I had a script call a second script with each of the filenames (a basic script2 <file> &) which proceeded to process all of the files at once (there were 15 or so). It took a considerably smaller amount of time to process.

I'd pass you some of the files, but can't due to HIPAA. Plus Netsec would probably break my fingers for pushing that much data off the network.

I'm going to play with the awk statement if I get some time tomorrow and people quit breaking poo poo long enough for me to do something other than put out fires.

ArcticZombie
Sep 15, 2010
For quick and dirty parallelism, you can use find and xargs, something like:

code:
find /path/to/files -type f -print0 | xargs -0 -n 1 -P 0 {awk script here}
This will take all of the files under that path and run the script on each file in parallel. You can fiddle with the xargs arguments if you’ve got hundreds of files and don’t want hundreds of processes running at once: limit the number of simultaneous processes with -P, and have each process handle more than one file with -n.
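For what it's worth, here is one way the placeholder could be filled in, as a sketch rather than a tested recipe for the real data. It assumes an xargs that supports -P (GNU and BSD versions do, but it is not POSIX), caps the job count at 4 instead of using 0, and wraps awk in sh -c so each job can redirect to its own .new file. The temp directory and sample records are invented for the demo.

```shell
#!/usr/bin/env bash
# Demo data (hypothetical): three files with three records each, two of which pass the cutoff.
demo_dir=$(mktemp -d)
for n in 1 2 3; do
  printf 'MSH|^~\\&|a|b|||d|e|f|%s|x\n' 12344 12345 "$((12345 + n))" > "$demo_dir/file$n.hl7"
done

# One awk process per file, at most 4 running at a time.
# "_" fills $0 of the inner shell, so the file name arrives as $1.
find "$demo_dir" -type f -name '*.hl7' -print0 |
  xargs -0 -n 1 -P 4 sh -c 'awk -F "|" "\$10 + 0 >= 12345" "$1" > "$1.new"' _

# Line counts of the filtered copies.
wc -l "$demo_dir"/*.new
```

The -name '*.hl7' filter is only there so a rerun does not pick up the .new files from a previous pass.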

lifg
Dec 4, 2000
<this tag left blank>
Muldoon

Hughmoris posted:

Someone give us some perl magic for that.

Phone posting, so this isn’t tested, but something like this gets close, but will rename the old file with a .old extension rather than give the new file a .new extension.

code:
perl -F'\|' -i.old -ane 'print if $F[9] >= 12345' *

Hughmoris
Apr 21, 2007
Let's go to the abyss!

lifg posted:

Phone posting, so this isn’t tested, but something like this gets close, but will rename the old file with a .old extension rather than give the new file a .new extension.

code:
perl -F'\|' -i.old -ane 'print if $F[9] >= 12345' *

:hfive:

*Disregard, need to learn a little javascript.

Hughmoris fucked around with this message at 13:20 on May 19, 2022

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender
I don't see a shell scripting thread, so hopefully this is the right place to ask this.

I have this zsh script I wrote to do processing on a bunch of files it searches for recursively. However, it fails if it runs into a file with a dollar sign in the filename, because it interprets part of the filename as a variable name. This happens, for example, on line 39.

I think switching to single quotes is out, since I do need to use the variables the file names are stored in, but I don't want the shell to try to parse the contents of the variables. Is there a way to do this that I'm missing?

Kreeblah fucked around with this message at 22:22 on May 22, 2022

Kuule hain nussivan
Nov 27, 2008

Kreeblah posted:

I don't see a shell scripting thread, so hopefully this is the right place to ask this.

I have this zsh script I wrote to do processing on a bunch of files it searches for recursively. However, it fails if it runs into a file with a dollar sign in the filename, because it interprets part of the filename as a variable name. This happens, for example, on line 39.

I think switching to single quotes is out, since I do need to use the variables the file names are stored in, but I don't want the shell to try to parse the contents of the variables. Is there a way to do this that I'm missing?

I think you're going to have to add some escaping to the filenames when reading them in. If you use \$ instead of $, it should no longer be interpreted as a variable.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

Kuule hain nussivan posted:

I think you're going to have to add some escaping to the filenames when reading them in. If you use \$ instead of $, it should no longer be interpreted as a variable.

Oh, right, I should have mentioned that. I tried that right after line 36 with this:

Zsh code:
tmpcue="${tmpcue//$/\\$}"
Problem is, the shell then can't find those files with dollar signs in their names, because it looks for one with a backslash in front of the dollar sign.

I expect you're probably right about there being some escaping I need to do, but I'm having a hard time figuring out what the right way to escape these strings is without the shell interpreting the backslash as part of the filename.

Kuule hain nussivan
Nov 27, 2008

Kreeblah posted:

Oh, right, I should have mentioned that. I tried that right after line 36 with this:

Zsh code:
tmpcue="${tmpcue//$/\\$}"
Problem is, the shell then can't find those files with dollar signs in their names, because it looks for one with a backslash in front of the dollar sign.

I expect you're probably right about there being some escaping I need to do, but I'm having a hard time figuring out what the right way to escape these strings is without the shell interpreting the backslash as part of the filename.

I am rubbish with terminal stuff, but could you assign with single quote marks? Shouldn't that handle it as a literal rather than interpret the $ as the start of a variable?

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

Kuule hain nussivan posted:

I am rubbish with terminal stuff, but could you assign with single quote marks? Shouldn't that handle it as a literal rather than interpret the $ as the start of a variable?

It would, but I've got the contents of the filename in a variable, so using strong quoting (single quotes) wouldn't expand that to get me the filename I need. Basically, I need to expand the filename variable without the shell trying to recursively expand the contents of it if they happen to contain a dollar sign.

cheetah7071
Oct 20, 2010

honk honk
College Slice
this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

I assume it uses a heap data structure but when I try to google it I get lots of articles about the advantages and disadvantages of dynamic memory, and about the heap data structure (which I assume is the data structure that the heap uses because of the name, though it's not clear to me how that data structure would be helpful for the problem), which are far more useful but don't satisfy my idle curiosity on a topic whose answer is essentially meaningless to me

e: and of course immediately after hitting post I stumble on the right term to google to get useful answers, though I'm still interested in hearing more easy-to-understand answers than technical papers

cheetah7071 fucked around with this message at 07:14 on May 24, 2022

Kuule hain nussivan
Nov 27, 2008

Kreeblah posted:

It would, but I've got the contents of the filename in a variable, so using strong quoting (single quotes) wouldn't expand that to get me the filename I need. Basically, I need to expand the filename variable without the shell trying to recursively expand the contents of it if they happen to contain a dollar sign.
Hmm, could it be that the replace operation you're doing is too late in the flow? When the script meets a filename with a dollar sign in it, what is the value of i? Does it still contain the dollar sign char?

My line of thought is, maybe the dollar sign is being removed when you do tmpcue="${i:r}" and is not present at tmpcue="${tmpcue//$/\\$}". So maybe doing...
code:
i="${i//$/\\$}" # Maybe i="${i/$/\\$}"
tmpcue="${i:r}"
...would work?

RPATDO_LAMD
Mar 22, 2013

🐘🪠🍆

cheetah7071 posted:

this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

I assume it uses a heap data structure but when I try to google it I get lots of articles about the advantages and disadvantages of dynamic memory, and about the heap data structure (which I assume is the data structure that the heap uses because of the name, though it's not clear to me how that data structure would be helpful for the problem), which are far more useful but don't satisfy my idle curiosity on a topic whose answer is essentially meaningless to me

e: and of course immediately after hitting post I stumble on the right term to google to get useful answers, though I'm still interested in hearing more easy-to-understand answers than technical papers

It's typically an intrusive linked list, or more often a collection of several linked lists containing free chunks of different sizes.
You might have one linked list each for memory chunks of 16, 32, 64, etc etc bytes and just hand out the smallest size that fits what the caller needs. If a caller allocates close to 512 bytes, you just pull an arbitrary node outta the 512-byte free list. Different allocators will have different strategies for how they choose this arbitrary node to achieve goals like keeping cache locality for back-to-back allocations or minimizing fragmentation or etc etc.

You also need logic for handling splitting up chunks of memory (e.g. if you only have 512-byte chunks left and a user tries to allocate 32 bytes, you don't want to waste tons of memory) and for coalescing adjacent small chunks back together into big chunks again (e.g. after the user allocates and frees a few thousand 32-byte objects, they might try to allocate a 512-byte object again, which means you need some efficient logic that can figure out when a bunch of smaller chunks of free memory are all adjacent and can be combined into one larger chunk). And if you're totally out of memory you grab another memory page from the OS/kernel.

I had to write a little toy memory allocator for my data structures class, although it only had to deal with one size of allocation so I didn't get into all the crunchy memory coalescing algorithms.

For example here is the struct glibc's malloc.c uses:
code:
/*
  This struct declaration is misleading (but accurate and necessary).
  It declares a "view" into memory allowing access to necessary
  fields at known offsets from a given base. See explanation below.
*/

struct malloc_chunk {

  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */

  struct malloc_chunk* fd;         /* double links -- used only if free. */
  struct malloc_chunk* bk;

  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};
It keeps free memory in a doubly-linked list. Note that the next and previous pointers are in the same memory area that the user would normally write to. That block of allocated memory just gets reinterpreted as a malloc_chunk when malloc actually touches it. So if you write to a pointer after freeing it, you will actually be clobbering the internal linked-list pointers and corrupting the heap, which is why use-after-free can be so catastrophic.

Glibc uses different "bins" (linked lists) for chunks of different sizes, according to this scheme:
code:
/*
   Indexing

    Bins for sizes < 512 bytes contain chunks of all the same size, spaced
    8 bytes apart. Larger bins are approximately logarithmically spaced:

    64 bins of size       8
    32 bins of size      64
    16 bins of size     512
     8 bins of size    4096
     4 bins of size   32768
     2 bins of size  262144
     1 bin  of size what's left

    There is actually a little bit of slop in the numbers in bin_index
    for the sake of speed. This makes no difference elsewhere.

    The bins top out around 1MB because we expect to service large
    requests via mmap.

    Bin 0 does not exist.  Bin 1 is the unordered list; if that would be
    a valid chunk size the small bins are bumped up one.
 */

RPATDO_LAMD fucked around with this message at 07:50 on May 24, 2022

yippee cahier
Mar 28, 2005

cheetah7071 posted:

this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

I assume it uses a heap data structure but when I try to google it I get lots of articles about the advantages and disadvantages of dynamic memory, and about the heap data structure (which I assume is the data structure that the heap uses because of the name, though it's not clear to me how that data structure would be helpful for the problem), which are far more useful but don't satisfy my idle curiosity on a topic whose answer is essentially meaningless to me

e: and of course immediately after hitting post I stumble on the right term to google to get useful answers, though I'm still interested in hearing more easy-to-understand answers than technical papers

You can browse glibc’s malloc (based on dlmalloc I think) if you want, but here’s a simple, well commented heap implementation from an embedded RTOS: https://github.com/sifive/FreeRTOS-metal/blob/master/FreeRTOS-Kernel/portable/MemMang/heap_4.c

Key to this implementation is that there’s a tracking structure preceding every pointer that’s handed out. Free backs up the pointer to recover this structure. The actual implementation is a linked list of memory blocks.

A standard implementation will be thread safe, though there are allocators that are meant to be thread local that can offer performance benefits by guaranteeing they will only be accessed from a single thread.

The simplest heap allocator I’ve seen is simply a pointer that is moved through a block of memory. Need 10 bytes? Save the global pointer address locally and increment it 10 bytes. The local copy is your allocated memory. Naturally there is no free with a system like this.

feb

cheetah7071
Oct 20, 2010

honk honk
College Slice
thanks to both of you

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

cheetah7071 posted:

this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

I assume it uses a heap data structure but when I try to google it I get lots of articles about the advantages and disadvantages of dynamic memory, and about the heap data structure (which I assume is the data structure that the heap uses because of the name, though it's not clear to me how that data structure would be helpful for the problem), which are far more useful but don't satisfy my idle curiosity on a topic whose answer is essentially meaningless to me

e: and of course immediately after hitting post I stumble on the right term to google to get useful answers, though I'm still interested in hearing more easy-to-understand answers than technical papers

Since you're talking about malloc and free, I assume you're thinking of application-level memory management instead of the system dividing up ram between applications.

There are many possible algorithms that could be used, but a common principle is to divide memory up into "chunks", where each chunk can be handed out to satisfy one allocation. Chunks are just slightly bigger than the allocation they contain, so that they have some extra bookkeeping data to identify how to properly free that allocation when the app is done with it.

One rather naive mechanism is to maintain a linked list of free chunks. The pointers for this don't take up any extra space - you can just use the space inside the chunk where all the data would live if it were actually allocated memory. When you need to allocate something, you can scan your linked list for a big enough chunk, split off a chunk (if necessary) to fulfill the allocation, and put the rest of the chunk back on the linked list. When a chunk is freed, simply return it to the linked list.

Real allocators are often more complex than this, because the naive solution has some pretty big shortcomings. But the fundamentals are pretty similar.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

Kuule hain nussivan posted:

Hmm, could it be that the replace operation you're doing is too late in the flow? When the script meets a filename with a dollar sign in it, what is the value of i? Does it still contain the dollar sign char?

My line of thought is, maybe the dollar sign is being removed when you do tmpcue="${i:r}" and is not present at tmpcue="${tmpcue//$/\\$}". So maybe doing...
code:
i="${i//$/\\$}" # Maybe i="${i/$/\\$}"
tmpcue="${i:r}"
...would work?

It does still contain it. Having looked at this more, I think it's an issue with how a command (chdman) I'm calling is dealing with escaped dollar signs.

For example, if I change those lines to this:

Zsh code:
		tmpcue="${i:r}"
		if [[ ! -f "${tmpcue}.chd" ]]; then
			echo "Processing: ${tmpcue}.cue"
			chdman createcd -np ${num_cpus} -i "${tmpcue//$/\\$}.cue" -o "${tmpcue//$/\\$}.chd"
Then with a directory setup that looks like this:

code:
ls Test\$Game\ Disc\ \(USA\)/
Test$Game Disc (USA).bin Test$Game Disc (USA).cue
I get these results:

code:
chdman is /opt/homebrew/bin/chdman
Processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
chdman - MAME Compressed Hunks of Data (CHD) manager 0.243 (unknown)
Error parsing input file (/Users/kreeblah/disctest/Test\$Game Disc (USA)/Test\$Game Disc (USA).cue: No such file or directory)

Fatal error occurred: 1
Error processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
If I don't do the dollar sign substitution there, I get these errors:

code:
Warning: osd_subst_env variable Millionaire not found.
Warning: osd_subst_env variable Millionaire not found.
I thought those were coming from the shell, but it turns out they're probably coming from chdman. It looks like it might try to do some sort of environment variable substitution on its own, which is kind of horrific, so I guess I get to talk to the MAME folks about it.

Computer viking
May 30, 2011
Now with less breakage.

For more malloc reading, you may also find jemalloc interesting - it's the one FreeBSD uses by default, but I think it shows up elsewhere, too.

The BSDCan paper they link to looks fairly readable.

ExcessBLarg!
Sep 1, 2001

Kreeblah posted:

I expect you're probably right about there being some escaping I need to do, but I'm having a hard time figuring out what the right way to escape these strings is without the shell interpreting the backslash as part of the filename.
I'm not familiar with zsh so I don't know if I can offer specific help, but I would make the point that trying to manually escape strings in shell is generally a futile exercise--every time you think you got it right some other character/not-8-bit-clean/unicode-garbage throws it off again. For example, just reviewing your script it would seem like the eval on "find_command_str" might blow up if any of the directories you add to it themselves contain shell-interpretable characters.

With bash you can often get around this by using arrays to build commands, and then use array interpolation that preserves quoting and white-space ("${arr[@]}").
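A small sketch of the array approach, assuming bash (zsh arrays behave similarly here). The directory and file names are made up to mimic the dollar-sign case, and find_cmd and search_dirs are hypothetical names, not ones from the script being discussed.

```shell
#!/usr/bin/env bash
# Demo tree (hypothetical) with a dollar sign, spaces and parentheses in the names.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/Test\$Game Disc (USA)"
touch "$demo_dir/Test\$Game Disc (USA)/Test\$Game Disc (USA).cue"

# Build the command as an array: one element per argument, no eval needed.
search_dirs=("$demo_dir/Test\$Game Disc (USA)")
find_cmd=(find)
for d in "${search_dirs[@]}"; do
  find_cmd+=("$d")                  # each directory stays a single argument
done
find_cmd+=(-type f -name '*.cue')

# "${find_cmd[@]}" expands to exactly the stored arguments, with no re-parsing.
found=$("${find_cmd[@]}")
printf '%s\n' "$found"
```

Because nothing is ever flattened into a string and re-parsed, there is no escaping step to get wrong.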

ExcessBLarg!
Sep 1, 2001

yippee cahier posted:

The simplest heap allocator I’ve seen is simply a pointer that is moved through a block of memory. Need 10 bytes? Save the global pointer address locally and increment it 10 bytes. The local copy is your allocated memory. Naturally there is no free with a system like this.
This scheme actually works pretty well if you can get away with freeing the entire heap at once.

For example, this is the allocator that Varnish Cache uses within a request workspace. When a request comes in, a fixed-size workspace (say, 8 kB) is allocated to satisfy "malloc" operations for any externally-supplied data (headers, etc.) for the request. Once the request is finished processing, the entire workspace is freed at once. It's a good tradeoff to allow for flexible memory usage within a request while also bounding the amount of memory a single request can use--to prevent malicious actors from memory-DOSing you.

nielsm
Jun 1, 2009



cheetah7071 posted:

this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

You already got several answers about the application level. I just want to add that at the OS kernel level there are generally some functions to request new (virtual) memory from the kernel. Usually you can only request whole pages (page size depends on the CPU architecture) at a time. When you do that the OS will set up some structures in your process to support the additional pages, and when your program then actually begins using the new memory (reading/writing it) the OS gets interrupts from the CPU about you using memory addresses that aren't valid, so it'll find some physical memory and map it into your process at the requested address.
Virtual memory paging is a big topic in itself, also worth having a basic understanding of. The main thing is that the act of allocating memory (getting some address space) and having that memory be in RAM (paging in) are different, and just because you allocated some memory doesn't mean it "physically exists" yet.

rjmccall
Sep 7, 2007

no worries friend
Fun Shoe
On top of what’s already been said, let me just confirm that the heap data structure is totally unrelated to anything about memory management. Heaps are an efficient data structure (because they can be implemented on top of an array) for extracting the minimum/maximum element from a set without having to fully sort it, which is nice if you’re continually adding new elements as you go.

ArcticZombie
Sep 15, 2010

Kreeblah posted:

It does still contain it. Having looked at this more, I think it's an issue with how a command (chdman) I'm calling is dealing with escaped dollar signs.

For example, if I change those lines to this:

Zsh code:
		tmpcue="${i:r}"
		if [[ ! -f "${tmpcue}.chd" ]]; then
			echo "Processing: ${tmpcue}.cue"
			chdman createcd -np ${num_cpus} -i "${tmpcue//$/\\$}.cue" -o "${tmpcue//$/\\$}.chd"
Then with a directory setup that looks like this:

code:
ls Test\$Game\ Disc\ \(USA\)/
Test$Game Disc (USA).bin Test$Game Disc (USA).cue
I get these results:

code:
chdman is /opt/homebrew/bin/chdman
Processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
chdman - MAME Compressed Hunks of Data (CHD) manager 0.243 (unknown)
Error parsing input file (/Users/kreeblah/disctest/Test\$Game Disc (USA)/Test\$Game Disc (USA).cue: No such file or directory)

Fatal error occurred: 1
Error processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
If I don't do the dollar sign substitution there, I get these errors:

code:
Warning: osd_subst_env variable Millionaire not found.
Warning: osd_subst_env variable Millionaire not found.
I thought those were coming from the shell, but it turns out they're probably coming from chdman. It looks like it might try to do some sort of environment variable substitution on its own, which is kind of horrific, so I guess I get to talk to the MAME folks about it.

A quick experiment:

code:
$ ls *
test

files:
baz         foo$BAR.cue foobar.cue

$ cat test 
#!/usr/bin/env zsh

BAR="bar"

for f in **/*.cue;
do
	file=${f:r}
	echo "$file"
done

$ ./test 
files/foo$BAR
files/foobar
Zsh isn't the cause of the expansion.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

ArcticZombie posted:

A quick experiment:

code:
$ ls *
test

files:
baz         foo$BAR.cue foobar.cue

$ cat test 
#!/usr/bin/env zsh

BAR="bar"

for f in **/*.cue;
do
	file=${f:r}
	echo "$file"
done

$ ./test 
files/foo$BAR
files/foobar
Zsh isn't the cause of the expansion.

Yeah, that's what I figured out eventually. It didn't occur to me that it might be the application I was calling, because who loving does their own variable expansion?

ExcessBLarg!
Sep 1, 2001

Kreeblah posted:

Yeah, that's what I figured out eventually. It didn't occur to me that it might be the application I was calling, because who loving does their own variable expansion?

A shell script that packs the filename into a larger string and calls eval on it.
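
The same class of bug shows up outside of shell scripts too. As a hedged illustration (this is not a claim about what chdman does internally), here is the Python analogue: pasting a filename into a command string that a shell then re-parses, versus passing it as a separate argument. It assumes a POSIX system with `sh` and `echo` available, reuses the `foo$BAR.cue` name from the experiment above, and the two function names are made up for the example.

```python
import os
import subprocess

filename = "foo$BAR.cue"
# Minimal environment for the child: BAR is set so the expansion is visible.
env = {"BAR": "bar", "PATH": os.environ.get("PATH", "/usr/bin:/bin")}


def via_shell_string(name):
    """Buggy: the name is pasted into a string that /bin/sh parses again."""
    result = subprocess.run(
        "echo " + name, shell=True, env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")


def via_argument_list(name):
    """Safe: the name is passed as one argument and no shell is involved."""
    result = subprocess.run(
        ["echo", name], env=env,
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip("\n")


print(via_shell_string(filename))   # the shell expands $BAR
print(via_argument_list(filename))  # the dollar sign survives untouched
```

The first call prints `foobar.cue` and the second prints `foo$BAR.cue`, which is the difference between a second round of expansion and none.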

Kuule hain nussivan
Nov 27, 2008

Kreeblah posted:

It does still contain it. Having looked at this more, I think it's an issue with how a command (chdman) I'm calling is dealing with escaped dollar signs.

For example, if I change those lines to this:

Zsh code:
		tmpcue="${i:r}"
		if [[ ! -f "${tmpcue}.chd" ]]; then
			echo "Processing: ${tmpcue}.cue"
			chdman createcd -np ${num_cpus} -i "${tmpcue//$/\\$}.cue" -o "${tmpcue//$/\\$}.chd"
Then with a directory setup that looks like this:

code:
ls Test\$Game\ Disc\ \(USA\)/
Test$Game Disc (USA).bin Test$Game Disc (USA).cue
I get these results:

code:
chdman is /opt/homebrew/bin/chdman
Processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
chdman - MAME Compressed Hunks of Data (CHD) manager 0.243 (unknown)
Error parsing input file (/Users/kreeblah/disctest/Test\$Game Disc (USA)/Test\$Game Disc (USA).cue: No such file or directory)

Fatal error occurred: 1
Error processing: /Users/kreeblah/disctest/Test$Game Disc (USA)/Test$Game Disc (USA).cue
If I don't do the dollar sign substitution there, I get these errors:

code:
Warning: osd_subst_env variable Millionaire not found.
Warning: osd_subst_env variable Millionaire not found.
I thought those were coming from the shell, but it turns out they're probably coming from chdman. It looks like it might try to do some sort of environment variable substitution on its own, which is kind of horrific, so I guess I get to talk to the MAME folks about it.

Good thing you figured it out! Sorry I couldn't be of more help.

lifg
Dec 4, 2000
<this tag left blank>
Muldoon

cheetah7071 posted:

this is an idle curiosity more than something truly important but: what is the actual algorithm for dynamic memory allocation? When I call new or malloc, what does the assembly that generates actually do?

I assume it uses a heap data structure but when I try to google it I get lots of articles about the advantages and disadvantages of dynamic memory, and about the heap data structure (which I assume is the data structure that the heap uses because of the name, though it's not clear to me how that data structure would be helpful for the problem), which are far more useful but don't satisfy my idle curiosity on a topic whose answer is essentially meaningless to me

e: and of course immediately after hitting post I stumble on the right term to google to get useful answers, though I'm still interested in hearing more easy-to-understand answers than technical papers

I once spent a couple weeks slowly reading papers on generational garbage collection. It’s a weirdly fun topic.
