Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
I need to install a scientific Python environment on our lab's Linux machines so that students can use it as an alternative to MATLAB. So one of my requirements is basically that I need an icon that says "Python" that users can double click to open an IDE with a place to run code, a place to edit code, analyze data, make plots, etc.

I use PyCharm but I'm not sure if their license allows me to do this for free and I'm actually worried that PyCharm has too many options/things going on that it would be too complicated. Is Spyder any good?

The other thing I was specifically told is that I need to manage the packages and dependencies at the system level. User level venvs and containers are out. They want everybody using the same versions, of everything. They don't want a situation where someone will use package_x v2.0, and someone will use package_x v2.5, and so on. Yes I know a container, or a predefined venv with specific version numbers would solve this problem since it wouldn't matter… but my boss was quite clear that one of us should micromanage packages for all users.

What's the best way to do that in a way that it would work with the IDE, and also allow people to just type "python" or "python script.py" at the command line to enter/run the same interpreter with the same available packages?


Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Anaconda would do all of that

Eela6
May 25, 2007
Shredded Hen

Munkeymon posted:

Anaconda would do all of that

This sounds like the perfect use case for Anaconda.

Spyder isn't great, but it will work just fine for students. It's as good of an 'IDE' for beginners as any, in my opinion.

(Spyder has the advantage of being configured by default so that you can slam F5 and see the results of your script in a live IPython session in the bottom right corner. This is very helpful for beginners, especially in a math/scientific computing context. As a developer IDE, it's not great, but I think it's just right for this.)

Dominoes
Sep 20, 2007

Thirding Anaconda; a stock installation should do.

QuarkJets
Sep 8, 2008

IIRC the people who made Spyder were actively trying to mimic the Matlab IDE, to act as an entry point to former Matlab users who aren't already fully consumed by Stockholm's Syndrome

Eela6
May 25, 2007
Shredded Hen

QuarkJets posted:

IIRC the people who made Spyder were actively trying to mimic the Matlab IDE, to act as an entry point to former Matlab users who aren't already fully consumed by Stockholm's Syndrome

Huh. That explains why I took to Spyder immediately as a new Python programmer (I learned to program in MATLAB)

numpy-in-spyder vs. MATLAB are remarkably similar, with the exception that numpy is good

Eela6 fucked around with this message at 00:22 on Sep 15, 2017

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

As a followup to that thing I posted from stackoverflow about how Python is the fastest growing language, they delve into why it's the fastest growing language.

it's data science

Ghost of Reagan Past
Oct 7, 2003

rock and roll fun

Thermopyle posted:

As a followup to that thing I posted from stackoverflow about how Python is the fastest growing language, they delve into why it's the fastest growing language.

it's data science
Good, good, maybe soon the ecosystem will rival R :getin:

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
Ok. I'm trying Anaconda now with Spyder. I used Anaconda before so I have some experience with that at least.

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
Is there a conda equivalent to "pip download -r requirements.txt"?

I have an environment set up and I'd like to export it along with all the required downloaded things to install on non networked computers.

I found this link

https://stackoverflow.com/questions/32956583/how-do-i-download-anaconda-packages-without-installing-them

Which links to the official FAQ but that question/answer has been removed.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Boris Galerkin posted:

Is there a conda equivalent to "pip download -r requirements.txt"?

I have an environment set up and I'd like to export it along with all the required downloaded things to install on non networked computers.

I found this link

https://stackoverflow.com/questions/32956583/how-do-i-download-anaconda-packages-without-installing-them

Which links to the official FAQ but that question/answer has been removed.

Just use pip inside your conda environment.

pubic void nullo
May 17, 2002


Boris Galerkin posted:

Is there a conda equivalent to "pip download -r requirements.txt"?

I have an environment set up and I'd like to export it along with all the required downloaded things to install on non networked computers.

I found this link

https://stackoverflow.com/questions/32956583/how-do-i-download-anaconda-packages-without-installing-them

Which links to the official FAQ but that question/answer has been removed.

Can you use the instructions for building identical conda environments and target the 'root' environment on the target machines? Root is supposed to be the name of the default environment and I know you said no virtual envs.

Then you can copy the cache directory that conda already keeps from the first machine and they should install from cache, unless there is some URL that it has to hit before it will install local packages.

pubic void nullo fucked around with this message at 13:47 on Sep 15, 2017

Dominoes
Sep 20, 2007

Thermo and public each gave you a valid answer; this is a pain point in Anaconda: You have two separate package managers, and packages may need to be installed in a mix of the two.

SurgicalOntologist
Jun 17, 2004

conda has a --file argument that is equivalent to the -r in pip. I tend to handle it by having one file conda-requirements.txt and another pip-requirements.txt. I've never had an issue with that setup.

I would not attempt to copy over the actual environment, rather the instructions to create it, e.g. the requirements files and maybe a bash script (it can even start from scratch and include curling the miniconda install script). It will be much easier to maintain, easier to transfer, and less prone to weird errors. Just remember to keep it up to date.

Regarding the MultiIndex question, the solution would be to just have the same value for all rows of that type. "both"?

Spans aren't really a thing in pandas, because spans are more about displaying data than organizing/manipulating it. If you do want to display the data using spreadsheet concepts, you may be able to convert to Excel and then continue to manipulate it from Python using one of the Excel libraries. openpyxl might be one, I don't recall.

SurgicalOntologist fucked around with this message at 15:37 on Sep 15, 2017

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

SurgicalOntologist posted:

conda has a --file argument that is equivalent to the -r in pip. I tend to handle it by having one file conda-requirements.txt and another pip-requirements.txt. I've never had an issue with that setup.

I would not attempt to copy over the actual environment, rather the instructions to create it, e.g. the requirements files and maybe a bash script (it can even start from scratch and include curling the miniconda install script). It will be much easier to maintain, easier to transfer, and less prone to weird errors. Just remember to keep it up to date.

I don't have internet access on some of these machines.

Thermopyle posted:

Just use pip inside your conda environment.

I noticed anaconda can manage things that I would normally compile myself, like petsc, so I would like to use conda to manage things like that too. pip doesn't work here.

pubic void nullo posted:

Can you use the instructions for building identical conda environments and target the 'root' environment on the target machines? Root is supposed to be the name of the default environment and I know you said no virtual envs.

Then you can copy the cache directory that conda already keeps from the first machine and they should install from cache, unless there is some URL that it has to hit before it will install local packages.

That seems like it should work. These machines are all identical so there should be no machine dependent differences. It seems rather clunky that there is no equivalent to pip download though.

accipter
Sep 12, 2003

Boris Galerkin posted:

I don't have internet access on some of these machines.

You could create a virtual environment on your host system, and then distribute that out.

Daviclond
May 20, 2006

Bad post sighted! Firing.
Why does this one-liner:

code:
return my_list.reverse()
return None, but over two lines this:
code:
my_list.reverse()
return my_list
returns the reversed list as intended?

I thought returning code which had to be evaluated (e.g. "return x == y") was fine, so I'm clearly missing something here. Is there a better way to do it?

SurgicalOntologist
Jun 17, 2004

Boris Galerkin posted:

I don't have internet access on some of these machines.

My bad, I missed that.

In that case I would say just copy over the entire miniconda folder. Everything you need is in there.

SurgicalOntologist
Jun 17, 2004

Daviclond posted:

Why does this one-liner:

code:
return my_list.reverse()
return None, but over two lines this:
code:
my_list.reverse()
return my_list
returns the reversed list as intended?

I thought returning code which had to be evaluated (e.g. "return x == y") was fine, so I'm clearly missing something here. Is there a better way to do it?

It's just because list.reverse is implemented as an in-place operation, changing an existing list (and returning nothing) rather than returning a new one. I would also prefer if it was the latter, but for many of the manipulation methods on built-in data structures, they are in-place operations.
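The same split shows up elsewhere on lists. A quick sketch using only built-ins: list.sort() mutates and returns None, while sorted() leaves the original alone and hands back a new list.

```python
numbers = [3, 1, 2]
result = numbers.sort()      # sorts in place, returns None
print(result)                # None
print(numbers)               # [1, 2, 3]

letters = ["b", "c", "a"]
ordered = sorted(letters)    # builds a new list, input untouched
print(ordered)               # ['a', 'b', 'c']
print(letters)               # ['b', 'c', 'a']
```

So "return my_list.sort()" would hand back None for the same reason "return my_list.reverse()" does.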

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

You'll find that pure functions lead to better programs.

Generally, do
Python code:
return reversed(my_list)
instead of
Python code:
my_list.reverse()
return my_list
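One caveat worth spelling out, as a small sketch: reversed() returns a one-shot iterator rather than a list, so if the caller needs an actual list it has to be wrapped in list(). The function name reversed_copy is made up for illustration.

```python
def reversed_copy(items):
    """Return a new reversed list; the argument is left untouched."""
    return list(reversed(items))


original = [1, 2, 3]
flipped = reversed_copy(original)
print(flipped)       # [3, 2, 1]
print(original)      # [1, 2, 3]

# A bare reversed() object can only be consumed once.
it = reversed(original)
print(list(it))      # [3, 2, 1]
print(list(it))      # []
```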

Eela6
May 25, 2007
Shredded Hen

Daviclond posted:

Why does this one-liner:

code:
return my_list.reverse()
return None, but over two lines this:
code:
my_list.reverse()
return my_list
returns the reversed list as intended?

I thought returning code which had to be evaluated (e.g. "return x == y") was fine, so I'm clearly missing something here. Is there a better way to do it?

Let's look at this in a little more detail.

list.reverse() is a method which reverses the list in place. It returns None.


IN:
Python code:
a = [1, 2, 3]
b = a.reverse()
print(a)
print(b)
OUT:
pre:
[3, 2, 1]
None
On the other hand, reversed() called on a list creates a new iterator, which walks through the elements of the list starting from the back. This is cool, because you don't have to create a new list to iterate through it backwards. The catch is that to get a list from it, you'll have to call list() to consume the iterator.

Python code:
a = [1, 2, 3]
b = reversed(a)
print(b)
print(a)
print(list(b))
pre:
<list_reverseiterator object at 0x2f46d9ef0f0>
[1, 2, 3]
[3, 2, 1]
The best, most terse way to simply create a reversed list or tuple is with a slice:
Python code:
a = [1, 2, 3]
b = a[::-1]
print(b)
print(a)
pre:
[3, 2, 1]
[1, 2, 3]

Eela6 fucked around with this message at 01:08 on Sep 16, 2017

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

I disagree that that is the best way if only because it's a little esoteric, whereas reversed is very explicit. I mean, I'm not going to puke when I see the slicing syntax, but it's going to be my third choice.

Daviclond
May 20, 2006

Bad post sighted! Firing.
Aha, thanks guys. I forgot that the in-place methods return None after doing their thing.

huhu
Feb 24, 2006
I'm looking to start a pretty large Python project with the hopes of actually releasing it. The last time I tried this I failed pretty badly and have learned a bit but would love any suggestions before jumping in again. My thoughts thus far are to try and diagram the entire project, use a virtualenv, and document everything as best I can as I go. Any other suggestions?

QuarkJets
Sep 8, 2008

Use github

cliffy
Apr 12, 2002

huhu posted:

I'm looking to start a pretty large Python project with the hopes of actually releasing it. The last time I tried this I failed pretty badly and have learned a bit but would love any suggestions before jumping in again. My thoughts thus far are to try and diagram the entire project, use a virtualenv, and document everything as best I can as I go. Any other suggestions?

Unit test everything. Don't merge anything that lacks testing or breaks existing tests.

Look into pytest and tox.

Learning how valuable automated tests are was a big breakthrough for me as a developer. They won't catch all bugs, but they will catch a ton if done properly.
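For anyone who hasn't used pytest before, here is roughly what that looks like. This is only a sketch: clamp() is a made-up function standing in for real project code. pytest collects any function whose name starts with test_ and reports every failing assert.

```python
def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def test_value_inside_range_is_unchanged():
    assert clamp(5, 0, 10) == 5


def test_value_below_range_is_raised_to_low():
    assert clamp(-3, 0, 10) == 0


def test_value_above_range_is_lowered_to_high():
    assert clamp(42, 0, 10) == 10
```

Normally the tests would live in their own file (something like test_clamp.py) and you would run "pytest" from the project directory.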

pubic void nullo
May 17, 2002


huhu posted:

I'm looking to start a pretty large Python project with the hopes of actually releasing it. The last time I tried this I failed pretty badly and have learned a bit but would love any suggestions before jumping in again. My thoughts thus far are to try and diagram the entire project, use a virtualenv, and document everything as best I can as I go. Any other suggestions?

This whole podcast episode is worth a listen, but something that you might be interested in is discussed from around 48:00 on. Basically, Glyph is talking about his Pycon talk and saying that people should start the deployment/continuous integration process earlier in the development timeline instead of starting it late and finding that their app needs major work in order to be deployable, esp. across multiple platforms.

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
So update on the whole offline anaconda thing:

It's a giant pain in the rear end.

First of all, there is an "--offline" flag available when you do "conda install x" except that it doesn't actually do anything other than raise errors saying that something is trying to use the internet.

I tried copying my entire miniconda folder instead except this doesn't work either because the path miniconda is installed to is hardcoded or something, so it can't be changed.

I saw that all of the bz2 archives for all the packages are kept in minicondaroot/pkgs, so I thought I'd just copy that entire pkgs folder to the new computer and drop them in there, thinking that the conda installer would find the cached files there. Except it doesn't. No amount of "--offline" or "--use-index-cache" or "--use-local" worked and the offline flag kept raising errors like I said above.

I found some random forum post about running "conda index /path/to/local/files" so I tried to do that, but the problem is that I would have needed to sort all of my bz2 files into their respective sub folders (noarch, linux-64, etc) according to each package's metadata. I wasn't going to do this for 100+ packages so I nixed that idea.

In the end what I did was export my configured root environment to know which packages I needed to install, and then write a bash script to iterate through them with "conda install file.bz2", and then upload all the bz2 files from my internet connected computer. This of course wasn't as easy as it sounds because some of the packages need to be installed in some certain order due to them depending on things from conda-forge.

In the end the handful of packages that had to recompile locally with different option flags didn't work when copied over to the other computer and installed this way. So I'll need to either figure out why or just compile those things locally for every computer, which would be the fastest thing.

Anyway that's my complaint right now.
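For reference, the "iterate through them in order" step could also be written in Python instead of bash. This is only a sketch under assumptions: the function names and the idea of passing an explicit, hand-ordered list of file names are made up for illustration, and it relies on nothing more than "conda install <local file>" as described above.

```python
import subprocess
from pathlib import Path


def build_install_commands(pkg_dir, ordered_names):
    """Return one `conda install <file>` command per package, in the given order."""
    pkg_dir = Path(pkg_dir)
    commands = []
    for name in ordered_names:
        path = pkg_dir / name
        if not path.is_file():
            # Fail early instead of halfway through the install.
            raise FileNotFoundError(path)
        commands.append(["conda", "install", str(path)])
    return commands


def run_commands(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling run_commands(build_install_commands(folder, names)) with the default dry_run=True just prints what would be run, which is a cheap way to check the ordering before touching the target machine.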

SurgicalOntologist
Jun 17, 2004

Boris Galerkin posted:

I tried copying my entire miniconda folder instead except this doesn't work either because the path miniconda is installed to is hardcoded or something, so it can't be changed.

I don't recall this being the case; the only exception I know of is the change to the PATH environment variable that the installer makes.

Even if I'm wrong on that, can't you just install it to where you want it to end up on the first machine? Or symlink? Of the roadblocks you encountered, this seems like the easiest to solve.

QuarkJets
Sep 8, 2008

Boris Galerkin posted:

So update on the whole offline anaconda thing:

It's a giant pain in the rear end.

First of all, there is an "--offline" flag available when you do "conda install x" except that it doesn't actually do anything other than raise errors saying that something is trying to use the internet.

I tried copying my entire miniconda folder instead except this doesn't work either because the path miniconda is installed to is hardcoded or something, so it can't be changed.

I saw that all of the bz2 archives for all the packages are kept in minicondaroot/pkgs, so I thought I'd just copy that entire pkgs folder to the new computer and drop them in there, thinking that the conda installer would find the cached files there. Except it doesn't. No amount of "--offline" or "--use-index-cache" or "--use-local" worked and the offline flag kept raising errors like I said above.

I found some random forum post about running "conda index /path/to/local/files" so I tried to do that, but the problem is that I would have needed to sort all of my bz2 files into their respective sub folders (noarch, linux-64, etc) according to each package's metadata. I wasn't going to do this for 100+ packages so I nixed that idea.

In the end what I did was export my configured root environment to know which packages I needed to install, and then write a bash script to iterate through them with "conda install file.bz2", and then upload all the bz2 files from my internet connected computer. This of course wasn't as easy as it sounds because some of the packages need to be installed in some certain order due to them depending on things from conda-forge.

In the end the handful of packages that had to recompile locally with different option flags didn't work when copied over to the other computer and installed this way. So I'll need to either figure out why or just compile those things locally for every computer, which would be the fastest thing.

Anyway that's my complaint right now.

The shebangs at the top of all of the miniconda (or anaconda) files are pointing to whatever directory that you installed to. So for instance if you move the folder and then try to run Spyder then it won't launch because it can't find Python (because Spyder thinks your Python lives in a specific place, but then you moved it)

You have two solutions to this dilemma

1) Write a script that goes through all of the files and modifies all of the shebangs to whatever you want
2) Install miniconda or anaconda to whatever directory path that you want to use on your target system (so the shebangs will be set correctly once moved)

Do either of those and then move miniconda to your target system, and it will simply work.
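Option 1 might look something like the sketch below. The function name and the example paths are made up, and it only touches shebang lines: anything else in the install that happens to embed the old path is left alone, so treat it as a starting point rather than a guaranteed fix.

```python
from pathlib import Path


def rewrite_shebangs(root, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix in the shebang line of every script under root."""
    old = old_prefix.encode()
    new = new_prefix.encode()
    changed = []
    for path in Path(root).rglob("*"):
        if path.is_symlink() or not path.is_file():
            continue
        # Peek at the first two bytes so large binaries are not read in full.
        with path.open("rb") as handle:
            if handle.read(2) != b"#!":
                continue
        data = path.read_bytes()
        first_line, newline, rest = data.partition(b"\n")
        if old not in first_line:
            continue
        path.write_bytes(first_line.replace(old, new) + newline + rest)
        changed.append(path)
    return changed
```

Usage would be along the lines of rewrite_shebangs("/opt/miniconda3", "/home/me/miniconda3", "/opt/miniconda3"), run on the target machine after copying the folder over. It returns the list of files it modified so you can eyeball the result.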

Slimchandi
May 13, 2005
That finger on your temple is the barrel of my raygun
I'm learning more about how to structure a program having written mostly scripts up until now. Please can someone check my understanding and point me towards useful resources if I'm fundamentally wrong.

I understand that if I write a separate.py file, and import it in my main program by using 'import separate' then (almost) all the code in that .py file is executed, and the functions are available in the main program as separate.func1(), separate.func2() etc. Objects that are created when separate.py runs (such as pulling in some data from a file) are accessible to separate's functions, but not in the main file. Any code under 'if __name__ == "__main__"' is not executed, as separate was imported, not run.

What happens if I do 'from separate import func1'? How much of the code in separate.py is executed? I want to do some fairly expensive curve fitting and dataframe building, but if I have to put this code in my func1 definition it seems like a complete waste to fit the same data every time the function runs.

Any general 'beyond python scripting' resources gladly received.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Slimchandi posted:

I'm learning more about how to structure a program having written mostly scripts up until now. Please can someone check my understanding and point me towards useful resources if I'm fundamentally wrong.

I understand that if I write a separate.py file, and import it in my main program by using 'import separate' then (almost) all the code in that .py file is executed, and the functions are available in the main program as separate.func1(), separate.func2() etc. Objects that are created when separate.py runs (such as pulling in some data from a file) are accessible to separate's functions, but not in the main file. Any code under 'if __name__ == "__main__"' is not executed, as separate was imported, not run.

What happens if I do 'from separate import func1'? How much of the code in separate.py is executed? I want to do some fairly expensive curve fitting and dataframe building, but if I have to put this code in my func1 definition it seems like a complete waste to fit the same data every time the function runs.

Any general 'beyond python scripting' resources gladly received.

Anything that is accessible in your "if main" code is also accessible to whatever imports your "separate" module.

If you import func1 from separate then your func1 still has access to everything in your separate module, but code in your module that imported separate does not have access to anything but func1.
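To make the "how much gets executed" part concrete, here is a small self-contained sketch. It writes a throwaway separate.py to a temp directory (the module contents are made up, with sum(range(1000)) standing in for the expensive curve fitting) and imports it both ways. The whole module body runs once on the first import, whichever form is used, and the result is cached in sys.modules.

```python
import sys
import tempfile
from pathlib import Path

MODULE_SOURCE = '''
print("separate.py: module-level code is running")
expensive_result = sum(range(1000))  # stand-in for the expensive fitting step


def func1():
    # Reuses the value computed once at import time.
    return expensive_result


if __name__ == "__main__":
    print("this only runs for `python separate.py`, never on import")
'''

# Write the throwaway module somewhere importable.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "separate.py").write_text(MODULE_SOURCE)
sys.path.insert(0, tmp_dir)

from separate import func1  # executes all module-level code, binds only func1
import separate             # already in sys.modules, so nothing runs again

print(func1())
print(separate.expensive_result)
```

The "module-level code is running" message shows up once, not twice. So the expensive work can sit at module level (or behind a function that caches its result) and func1 can use it without refitting on every call.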

Seventh Arrow
Jan 26, 2005

So I've got this script: https://pastebin.com/GLkh6z0T

If I use it to run "python3 scriptname.py > output.csv" it will pull a list of all the "tech" groups in Toronto from meetup.com's API. So far, so good. However, I want to narrow the list down to all data-related groups: data science, data analytics, data engineering, etc. I figured the best way to do this would be to filter the results with a regex, maybe something like this:

meetup_f = main.filter(lambda line: re.search(r'([Dd]ata\s\w+)', line))

However, from what I'm told, this won't quite work since "main" is just printing the results instead of returning any values. I'm not good enough at python to figure this out on my own. Is there any way (in dumb-person language) to get these specific results before it shoots everything out to csv?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

The easiest way to modify it to do what you want is to add this between line 30 and 31:

Python code:
if 'data' not in category.lower(): 
    continue

Seventh Arrow
Jan 26, 2005

Thanks for the suggestion. If I do "python meetup_data.py > meetup_data.csv", I just get a blank file. I think what you might be trying to do is pull "data" groups from the API, but the API only has a category for "tech." It's necessary - at least, as far as I can tell - to further filter the results from there.

If I try doing "python3 meetup_data.py > meetup_data.csv", then I get the error:

File "meetup_data.py", line 32
continue
^
TabError: inconsistent use of tabs and spaces in indentation

Which is strange, since the formatting of the indentation seems to be consistent: https://pastebin.com/aHvT5GfM

accipter
Sep 12, 2003

Seventh Arrow posted:

Thanks for the suggestion. If I do "python meetup_data.py > meetup_data.csv", I just get a blank file. I think what you might be trying to do is pull "data" groups from the API, but the API only has a category for "tech." It's necessary - at least, as far as I can tell - to further filter the results from there.

If I try doing "python3 meetup_data.py > meetup_data.csv", then I get the error:

File "meetup_data.py", line 32
continue
^
TabError: inconsistent use of tabs and spaces in indentation

Which is strange, since the formatting of the indentation seems to be consistent: https://pastebin.com/aHvT5GfM

You have 8 spaces indent on your function definitions, and 4 spaces in other parts of the file. Use the same indent throughout. Did you search the file for tabs?
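If your editor doesn't make tabs visible, a few lines of Python will find them. A minimal sketch, with a made-up function name and a made-up sample string in place of the real file:

```python
import re


def lines_with_tab_indent(source):
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = re.match(r"[ \t]*", line).group()
        if "\t" in indent:
            hits.append(number)
    return hits


sample = "def main():\n    for x in range(3):\n\tcontinue\n"
print(lines_with_tab_indent(sample))  # [3]
```

For the real script you would pass in the file contents, e.g. open("meetup_data.py").read().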

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Seventh Arrow posted:

Thanks for the suggestion. If I do "python meetup_data.py > meetup_data.csv", I just get a blank file. I think what you might be trying to do is pull "data" groups from the API, but the API only has a category for "tech." It's necessary - at least, as far as I can tell - to further filter the results from there.

Well you just need to find the part of the data structure the API returns that you want to filter on. I don't know what the API data structure looks like, but you want whatever part that says "data science", "data analytics", "data engineering", etc.

Seventh Arrow
Jan 26, 2005

Thermopyle posted:

Well you just need to find the part of the data structure the API returns that you want to filter on. I don't know what the API data structure looks like, but you want whatever part that says "data science", "data analytics", "data engineering", etc.

Unfortunately the API isn't very robust, so there's no way within it to narrow down the results. That's why I want to see if there's a way to do it directly through python. Of course I could just do a filter on the results within excel, but this is for a data science project and they want a more direct way to get the desired results (since I might need to someday do the same thing on a gargantuan csv). I appreciate the assistance, though!

What's interesting is that the API has a buttload of options if you're searching within a specific group, like "Toronto Python Hackers" or something. But for other stuff (like categories), not so much.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Seventh Arrow posted:

Unfortunately the API isn't very robust, so there's no way within it to narrow down the results. That's why I want to see if there's a way to do it directly through python. Of course I could just do a filter on the results within excel, but this is for a data science project and they want a more direct way to get the desired results (since I might need to someday do the same thing on a gargantuan csv). I appreciate the assistance, though!

What's interesting is that the API has a buttload of options if you're searching within a specific group, like "Toronto Python Hackers" or something. But for other stuff (like categories), not so much.

Ok, I looked a bit closer and this line is where it extracts the data for printing out:

Python code:
print "," .join(map(unicode, [city, group['name'].replace...blah blah blah
Assuming it's currently printing out the info you want (like if it's "data engineer" or whatever)...

So, I'd build this whole thing completely differently, but this should work...replace that line with:

Python code:
output = ",".join(map(unicode, [city, group['name'].replace(",", " "), group['created'], group['city'],
                                    group.get('state', ""), category, group['members'],
                                    group.get('who', "").replace(",", " ")]))

if 'data' in output.lower():
    print output
Just go through and make sure indentation is consistent...
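As for building it differently: one option is to have the code return rows instead of printing them, filter with the regex, and let the csv module handle the commas. The sketch below is Python 3 rather than the Python 2 of the original script, and the sample rows and field names are invented for illustration, not taken from the Meetup API.

```python
import csv
import io
import re

# "data" followed by another word, any capitalisation.
DATA_GROUP = re.compile(r"\bdata\s+\w+", re.IGNORECASE)


def data_rows(rows, column="name"):
    """Yield only the rows whose `column` mentions 'data <something>'."""
    for row in rows:
        if DATA_GROUP.search(row.get(column, "")):
            yield row


def write_csv(rows, out, fieldnames):
    """Write rows to an open text file; quoting of commas is handled by csv."""
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data standing in for the API results.
groups = [
    {"name": "Toronto Data Science Meetup", "members": 1200},
    {"name": "Toronto Python Hackers", "members": 900},
    {"name": "Big data engineering TO", "members": 300},
]

buffer = io.StringIO()
write_csv(data_rows(groups), buffer, ["name", "members"])
print(buffer.getvalue())
```

Because the filtering happens on rows rather than on printed text, the same data_rows() function would work on rows read back from a large CSV with csv.DictReader.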

Thermopyle fucked around with this message at 20:17 on Sep 20, 2017


Seventh Arrow
Jan 26, 2005

Excellent, thanks! I'll give that a try when I get home.
