accipter
Sep 12, 2003

Vivian Darkbloom posted:

I'm finding out how hard it is to make a cross-platform console interface that isn't horrible for a little strategy game project, so it's time to get a basic GUI framework set up. I just read the Python Wiki page on GUI programming and I am looking at pyjs as a possibility, since targeting web browsers might be easier than rolling my own GUI entirely. Not sure, though -- ideally I'd like something that's programmer-friendly and doesn't require a lot of setup for a user running my program. Any recommendations on what might be the best GUI choice?

What are you envisioning in terms of the GUI?

accipter
Sep 12, 2003
What's the preferred program to create a Windows executable from a Python script these days?

accipter
Sep 12, 2003

Hughmoris posted:

For those that work with Excel, what Python library do you use?

I typically use openpyxl, but if there are charts as sheets then I might have to use xlwings. Note that xlwings opens an Excel instance, so it can be slow. If you are working with tabular data, pandas also has read_excel. If you are mixing xls and xlsx, you might want to use pyexcel, which supports both formats.
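A toy round-trip with pandas (which goes through openpyxl under the hood for .xlsx files; the file name here is made up):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
df.to_excel(path, index=False)  # writing .xlsx requires openpyxl

back = pd.read_excel(path)  # reading .xlsx also goes through openpyxl
```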

accipter
Sep 12, 2003

Boris Galerkin posted:

My first instinct was that conda isn't in your $PATH or whatever the Windows equivalent is, but I just popped open a PowerShell (I don't have conda installed on this computer) and typed 'conda list' and it complained to me about conda not being a recognized command. So unless you trimmed that out of your post then I dunno.

By the way I think PyCharm is the best thing ever as well, coming from vim for approximately 10 years now (gently caress I'm old). I just started using it though and yeah I could definitely see where it might be confusing and intimidating.

10 years of vim isn't that impressive. I learned how to exit vim over 17 years ago and I still don't consider myself old. I too use PyCharm for most of my Python development, but also use Vim sometimes for standalone scripts.

accipter
Sep 12, 2003

baka kaba posted:

One day you'll reach a point where all the Google results are for 3.x docs no you won't

You can with py3redirect!

accipter
Sep 12, 2003
While we are on the subject of documentation, I also use http://devdocs.io from time to time. You can specify what documentation and version you want to search, and it even downloads the documentation for offline use.

accipter
Sep 12, 2003

Cingulate posted:

I think I need async/await ..? But I have never used either, nor understood the descriptions.

I am training a neural network on synthetic data, and I'm alternating data generation and training. So it looks like this:

code:
for epochs in range(number_of_epochs):
    X, y = generate_data(10000)  # generate training data
    model.train(X, y)  # train the model for one epoch
Now the data generation actually takes a significant fraction of the time it takes to train the model on the generated data (about 30%). But both run only on 1 CPU core (the model runs on the GPU). So it would save me some time if I could generate the next batch of data while the model is training. I.e., if I could do something like:

code:
X, y = generate_data(10000)  # generate training data
for epochs in range(number_of_epochs):
    model_finished = without_GIL(model.train(X, y))  # start training and release GIL
    if epochs < number_of_epochs:  # don't need for last epoch
        X, y = generate_data(10000)  # use CPU while model is running on GPU
        wait_for(model_finished)  # only proceed to next iteration once both training and data generation have finished
Is that something I could use await for?

Edit: both functions fully occupy 1 core. I think that means await is not enough? Maybe I should just abuse joblib heavily, that I could do.
Double Edit: joblib uses pickling and the model is compiled and sent to the GPU, so I think joblib wouldn't work.

What libraries are you using for this? Have you looked into using dask? Take a look at this and see if it will help:

http://matthewrocklin.com/blog/work/2016/07/12/dask-learn-part-1
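If you want to stay in the standard library, a single worker thread can overlap the two steps. This is a rough sketch with stand-ins for your generate_data and model.train; note that threads only help here if training releases the GIL, which GPU libraries typically do while waiting on the device:

```python
from concurrent.futures import ThreadPoolExecutor

def generate_data(n):
    # stand-in for the real CPU-bound data generator
    return list(range(n)), [2 * x for x in range(n)]

trained_batches = []
def train(X, y):
    # stand-in for model.train(X, y), which would run on the GPU
    trained_batches.append(len(X))

number_of_epochs = 3
with ThreadPoolExecutor(max_workers=1) as pool:
    X, y = generate_data(5)
    for epoch in range(number_of_epochs):
        future = pool.submit(train, X, y)   # kick off training in the worker
        if epoch < number_of_epochs - 1:
            X, y = generate_data(5)         # generate the next batch on the CPU meanwhile
        future.result()                     # wait for training before the next epoch
```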

accipter
Sep 12, 2003

VikingofRock posted:

So I've been doing more python lately, and while I feel like I have a decent grasp on the language itself, I struggle with a lot of the idioms for structuring a program. For example, how should I lay out the directory structure of my program, what is __init__.py, where do I list my dependencies, stuff like that. Does anyone have a good resource for learning that sort of stuff? Like the things that are necessary to use python effectively, but which are outside of the scope of the language itself.

edit: I often don't really know what various tools are, either, which falls into the same category. Like what is virtualenv / venv / pyvenv, etc? Is there a standard python code formatter? What other tools should I know about?

Here is the official documentation: https://docs.python.org/3/tutorial/modules.html#packages , and this is a pretty good guide on how to package a Python module: https://python-packaging.readthedocs.io/en/latest/index.html . I would also recommend the setuptools documentation: https://setuptools.readthedocs.io/en/latest/index.html , although it is more focused on setuptools itself than on the organization of a package. How to organize a package depends on its scale. If it is small enough, you can put everything in __init__.py.

PEP8 is the recommended standard format for Python code. I use it because it is a standard, but there are people that complain about it (then again, people always find something to complain about). I have also used YAPF. When I am developing a package, I use py.test with flake8. This checks the format of all of my code, and if it violates PEP8 then I fix the code by hand.

Depending on what you are working on and the OS, Python dependencies can conflict. A virtual environment is a way to create a project-specific collection of Python packages. I use miniconda to develop packages that simultaneously support Python 2.7 and 3.6. I do this by having conda environments for 2.7 and 3.6 installed side by side, and testing against each one.
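For a quick taste of virtual environments without conda, the stdlib venv module works too; a throwaway sketch (--without-pip just to keep it fast):

```python
import os
import subprocess
import sys
import tempfile

# Create a throwaway virtual environment with the stdlib venv module
target = os.path.join(tempfile.mkdtemp(), "env")
subprocess.run([sys.executable, "-m", "venv", "--without-pip", target], check=True)

# Every environment gets its own pyvenv.cfg and interpreter
print(os.path.exists(os.path.join(target, "pyvenv.cfg")))
```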

accipter
Sep 12, 2003

Hughmoris posted:

Are there any recommended articles/tutorials/blogs on working with sqlite in Python? I've just started learning a little bit about SQL and I'm trying to find best practices when incorporating it into a script.

I'd like to use it in a small script that parses an RSS feed and, if it's a new entry, inserts it into the DB.

Do you want to work with sqlite directly? Or indirectly? If you want to work with it indirectly, look at Object Relational Mappers such as peewee or SQLAlchemy. Peewee is simpler, while SQLAlchemy is the standard (?) ORM for Python.
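If you go the direct route, the stdlib sqlite3 module plus a primary key on the feed entry id covers the "insert only if new" part; a minimal sketch (table and column names made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (guid TEXT PRIMARY KEY, title TEXT)")

def add_if_new(guid, title):
    # INSERT OR IGNORE silently skips rows whose primary key already exists
    cur = conn.execute(
        "INSERT OR IGNORE INTO entries VALUES (?, ?)", (guid, title))
    conn.commit()
    return cur.rowcount == 1  # True only when the row was actually inserted

add_if_new("feed-item-1", "First post")
add_if_new("feed-item-1", "First post")  # duplicate from the next RSS poll, ignored
```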

accipter
Sep 12, 2003

Boris Galerkin posted:

Well I just had a frustrating time figuring out what was wrong with a part of my code when it turns out it was numpy being weird.

So I have a numpy array where some values are inf and some values are nan. I create a Boolean list of where these values in the array are inf/nan, and then I use these indices to do something to another array.

Like so:

code:
import numpy as np

a = np.array([1, np.sqrt(-1), np.nan, np.inf])  # [1, nan, nan, inf]

print(a == np.inf)
# F, F, F, T, as expected

print(a == np.nan)
# F, F, F, F, which is wrong

print(np.isnan(a))
# F, T, T, F
Is there a reason it does this? Does np.nan have an actual numerical value or something? I would have thought it would be treated in the same way as None where it just "is" or "isn't."

code:
In [6]: type(np.nan)
Out[6]: float
See this for more information: https://docs.scipy.org/doc/numpy-1.13.0/user/misc.html
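The short version is IEEE 754 semantics: NaN is an ordinary float value that compares unequal to everything, including itself, which is why a == np.nan is all False and np.isnan (or math.isnan) is the right test:

```python
import math

nan = float("nan")

# NaN never compares equal, not even to itself
print(nan == nan)           # False
print(nan == float("nan"))  # False

# So membership has to go through isnan, not ==
print(math.isnan(nan))      # True
```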

accipter
Sep 12, 2003

Boris Galerkin posted:

Is it ok to do this?

code:
class Foo:
    def __init__(self, parameters):
        self.parameters = parameters

    def __getattr__(self, key):
        try:
            return self.__dict__['_' + key]

        except KeyError:
            value = bar(self.parameters[key])
            setattr(self, '_' + key, value)
            return getattr(self, key)
parameters would be a dict, and bar() would do some calculations that only need to be done once and can be reused, but I might not always need said value. So my intention was to just evaluate and cache it iff I needed to.

I'm just not really sure if it's ok to use the __dict__ attribute like I am, and if I'm using __getattr__ properly.

What about this?

Python code:
class Foo:
    def __init__(self):
        self._bar = None

    @property
    def bar(self):
        if self._bar is None:
            self._bar = self._calculate_bar()
    
        return self._bar
Depending on what calculations you are trying to cache, you might consider https://pythonhosted.org/joblib/memory.html or http://code.activestate.com/recipes/52201/

accipter
Sep 12, 2003

Loezi posted:

Is there some reason @lru_cache doesn't work?

Thanks, I never knew that existed.

accipter
Sep 12, 2003

Eela6 posted:

I completely forgot about singledispatch. I don't think I've ever used it, but maybe I should have.

I feel like I need to review functools to re-learn all of its cool features.

accipter
Sep 12, 2003

duck monster posted:

Ok, so I have a bit of a puzzle.

I've been loving about with Unreal Engine and I found this very fun bit of library: https://github.com/20tab/UnrealEnginePython I've been coding in C++ on this thing but honestly I'm a bit poo poo at C++, something about that loving language just gives me the heebie-jeebies. If it was C, no probs, but something about C++ just pisses me off. Anyway, this library is awesome. It works, you get the GPU-crushing sexiness of UE4, but you can script the *poo poo* out of it with Python. There's a bit of a focus on editor automation, which is fine (makes pulling 3d-coat textures together into proper materials much saner). But it also seems to work pretty well with in-game scripting because it has some magic introspection auto-API-generating poo poo. I strongly recommend playing with this, if you fancy making games, because it turns UE4 into the first AAA-quality python engine.

Anyway, here's the problem. UE4 uses an object/component model. Each item in the game (and each light, camera, sound, and whatever) is an Actor object, and has a list of components attached, with the basic idea that you build up your objects by composition of components. It's kind of the gold-standard pattern in game dev. So the UE4 python plugin lets you use a python class as a component.

The life cycle of the thing is that when the Actor is instantiated, the PythonComponent instantiates the class; then at some point, very soon after, all the components (C++, Python, etc.) have a begin_play (or BeginPlay in the C++ classes) method which acts as the setup function when the scene starts winding into action. Of course, like any python class, __init__ is called at true instantiation, but this is not really when one should construct, as the UE4 python plugin appears to wait until AFTER __init__ to add the bits and pieces to the class that let it behave like a real UE4 object.

Anyway, I want to have a component called "Transmitter" that acts as a relay to a python server via rabbitMQ. The other objects in the game need to register with it using a register method. All good and easy, and I need to grab a reference to that transmitter so the registration can take place. Normally I'd just use a Singleton but my usual method of making a singleton in python involves the rest of the program agreeing not to call __init__() and instead using a static method that grabs an instance and everything else agrees to behave itself.

The problem is __init__ is called by the UE4 framework, and I have no control over the order of things being instantiated, so I can't be sure that BeginPlay won't fire on the objects wanting to register before the Transmitter has finished its __init__.

So the question is: is it possible to code a Singleton that can, if necessary, transmogrify its own instance into an existing instance if snarfed via __init__()?

Have you seen this: https://forums.unrealengine.com/showthread.php?54343-Communication-Between-UE4-and-a-Python-UDP-Server ?
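To the singleton question itself: overriding __new__ lets the class hand back the existing instance even when the framework calls the constructor directly. A sketch (the guard in __init__ matters because Python re-runs __init__ on every call):

```python
class Transmitter:
    _instance = None

    def __new__(cls, *args, **kwargs):
        # Hand back the existing instance, even when the framework
        # calls Transmitter() directly
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # __init__ runs on *every* Transmitter() call, so only set up once
        if getattr(self, "_initialized", False):
            return
        self._initialized = True
        self.registered = []

    def register(self, obj):
        self.registered.append(obj)

a = Transmitter()
a.register("some_actor")
b = Transmitter()  # same object, state preserved
```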

accipter
Sep 12, 2003

Seventh Arrow posted:

I'm looking for a python tutor and not really sure how to go about it. I'm taking a data science course and my lack of proficiency with python is my biggest weakness. So obviously I want to focus on data science/data analysis concepts, but also not-quite-so-directly related things like web scraping and working with APIs.

I've read "Python Crash Course" and "Automate the Boring Stuff with Python" so I'm familiar with the basics but I tend to struggle with coming up with code on my own, or analyzing existing scripts.

I live in Toronto, but this seems like the kind of thing that could be done via skype or discord or whatever. I guess(?)

The catch is that I'm unemployed and receiving employment insurance, so I don't have a lot of cash to throw around. I'll try to work out something reasonable, regardless.

Have you tried the IRC #python channel on Freenode?

accipter
Sep 12, 2003

Boris Galerkin posted:

I don't have internet access on some of these machines.

You could create a virtual environment on your host system, and then distribute that out.

accipter
Sep 12, 2003

Seventh Arrow posted:

Thanks for the suggestion. If I do "python meetup_data.py > meetup_data.csv", I just get a blank file. I think what you might be trying to do is pull "data" groups from the API, but the API only has a category for "tech." It's necessary - at least, as far as I can tell - to further filter the results from there.

If I try doing "python3 meetup_data.py > meetup_data.csv", then I get the error:

File "meetup_data.py", line 32
continue
^
TabError: inconsistent use of tabs and spaces in indentation

Which is strange, since the formatting of the indentation seems to be consistent: https://pastebin.com/aHvT5GfM

You have an 8-space indent on your function definitions, and 4 spaces in other parts of the file. Use the same indent throughout. Did you search the file for tabs?
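A quick way to hunt down the offending lines is to look at only the leading whitespace of each line; a small sketch on an inline sample (point it at your real file instead):

```python
# Sample source with a tab hiding in the indentation of line 2
source = "def f():\n\tcontinue\n        pass\n"

bad_lines = []
for lineno, line in enumerate(source.splitlines(), 1):
    indent = line[: len(line) - len(line.lstrip())]
    if "\t" in indent:
        bad_lines.append(lineno)

print(bad_lines)  # line numbers whose indentation contains a tab
```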

accipter
Sep 12, 2003

Linear Zoetrope posted:

What's the best tool for documenting Python 3 for an open source project (there are multiple languages in this project, but let's assume I'm gluing together the docs from different languages manually)? It looks like there are several options, and I'm not familiar enough with the Python open source world to know what's standard. I feel like I rarely see the builtin PyDoc, and see sphinx referenced more often, but I don't really know the tradeoffs and pros/cons of all the various tools.

I would suggest Sphinx and Read the Docs. I am not sure how well they work for documentation that isn't auto-generated, though.

accipter
Sep 12, 2003
I am the author of two Python packages. The first package (pyrvt) provides a set of tools for working with a type of motion, and provides a number of classes. The second package (pysra) uses pyrvt and extends its capabilities. I extend those capabilities by inheriting from classes in pyrvt, but it feels like I am doing this in a clunky way. Is there a better way to do this?

Python code:
# Within motion.py of pysra

class RvtMotion(pyrvt.motions.RvtMotion, Motion):
    def __init__(self,
                 osc_freqs,
                 osc_accels_target,
                 duration=None,
                 peak_calculator=None,
                 calc_kwds=None):
        # Is there a better way of injecting this?
        Motion.__init__(self)
        pyrvt.motions.RvtMotion.__init__(
            self,
            osc_freqs,
            osc_accels_target,
            duration=duration,
            peak_calculator=peak_calculator,
            calc_kwds=calc_kwds)


class CompatibleRvtMotion(pyrvt.motions.CompatibleRvtMotion, Motion):
    def __init__(self,
                 osc_freqs,
                 osc_accels_target,
                 duration=None,
                 osc_damping=0.05,
                 event_kwds=None,
                 window_len=None,
                 peak_calculator=None,
                 calc_kwds=None):
        # Is there a better way of injecting this?
        Motion.__init__(self)
        pyrvt.motions.CompatibleRvtMotion.__init__(
            self,
            osc_freqs,
            osc_accels_target,
            duration=duration,
            osc_damping=osc_damping,
            event_kwds=event_kwds,
            window_len=window_len,
            peak_calculator=peak_calculator,
            calc_kwds=calc_kwds)
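One alternative to the explicit double __init__ calls is cooperative multiple inheritance, where every class calls super().__init__(**kwargs) and forwards the keywords it doesn't consume. A minimal sketch with stand-in classes (not the real pyrvt API):

```python
class Motion:
    def __init__(self, **kwargs):
        self.motion_ready = True
        super().__init__(**kwargs)  # keep going along the MRO


class BaseRvtMotion:
    # stand-in for pyrvt.motions.RvtMotion
    def __init__(self, osc_freqs=None, duration=None, **kwargs):
        self.osc_freqs = osc_freqs
        self.duration = duration
        super().__init__(**kwargs)


class RvtMotion(BaseRvtMotion, Motion):
    # No __init__ needed: BaseRvtMotion consumes its keywords,
    # then super() continues on to Motion automatically
    pass


m = RvtMotion(osc_freqs=[1.0, 2.0], duration=10.0)
```

The catch is that every class in the chain has to play along and forward **kwargs, so this only works if you control (or wrap) the pyrvt classes.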

accipter
Sep 12, 2003

creatine posted:

Wondering if something like this exists in Python.

I have some data that gets analyzed and then put on a scatter plot. I was using matplotlib and bokeh but I've got a lot of data and multiple datasets on the same graph and was wondering if there was a module or way to save scatter plots into PSD or some other layered image type that I can use to edit.

You can create SVG files with matplotlib, which should work for you.
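For example (using the non-interactive Agg backend so no display is needed):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [4, 5, 6])

path = os.path.join(tempfile.mkdtemp(), "plot.svg")
fig.savefig(path)  # output format is inferred from the .svg extension
```

The resulting file is plain vector markup, so editors like Inkscape can rearrange the elements afterward.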

accipter
Sep 12, 2003

outlier posted:

So if I was looking to make a "fast" version of some Python code (in the sense of writing the details in some lower-level language but keeping the API the same, like cStringIO versus StringIO), what's the best way to go about it? Is it still using C? Use a helper library like Pyrex or Cython or Boost? Is all that Rust and Julia stuff viable?

What is your fast code going to do? I haven't tried Cython yet, but I have had very good luck with Numba. The nice thing about Numba is that the syntax is plain Python; it is just compiled to machine code at run time.
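A toy sketch of what that looks like (with a fallback decorator so the snippet still runs if numba isn't installed):

```python
try:
    from numba import njit
except ImportError:
    def njit(func):  # fallback: run as plain Python
        return func

@njit
def harmonic(n):
    # plain-Python loop; numba compiles it to machine code on first call
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / i
    return total

harmonic(10)  # first call triggers compilation when numba is present
```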

accipter
Sep 12, 2003

Cingulate posted:

"The Python default for dealing with tabular data."

Or, for people who know R: "Dataframes for Python."

Exactly. I like DataFrames for quickly working with data in properly formatted CSV files -- especially when I need to work on groups of the data.

accipter
Sep 12, 2003

Seventh Arrow posted:

I have a comma-separated spreadsheet with a bunch of information about condos in my city; most importantly, it has the latitude and longitude of these places. I want to be able to output these coordinates onto google maps but I'm not sure how to go about doing this. I looked at this link but none of the APIs seem to quite provide what I'm looking for (at least, not with python). Any suggestions?

Are you trying to create a map? Or load custom points on a Google Map? If you need to use Google, then I would create your own map (https://www.google.com/maps/d/), and then upload the CSV.

accipter
Sep 12, 2003

Seventh Arrow posted:

I need to put markers on a map, but since this is for a data science python course I need to find a ~*pythonic*~ way of doing it. The module that was linked to earlier was good, but I need to work out some of the details.

Okay. For nearly all of my mapping needs I use basemap or cartopy, but these create static maps. If you want a slippy interactive map you could also consider Bokeh (https://bokeh.pydata.org/en/latest/docs/user_guide/geo.html).

accipter
Sep 12, 2003

Do you know about getattr?

Python code:
class Container:
    def __init__(self):
        self.children = [Inner(), Inner(), Inner()]

    def get_card(self, card):
        return [getattr(inner, card)() for inner in self.children]

accipter fucked around with this message at 06:44 on Jan 13, 2018

accipter
Sep 12, 2003

Portland Sucks posted:

Got it, I guess this is pretty much how I was attempting to write my tests. I figured that Excel likely had precision loss in it as well and I know the guy who originally wrote the Excel workbook we're using put no thought into what he was doing regarding precision. I just want to make sure that what I'm producing isn't going to be wildly off the mark.

Also take a look at numpy.testing.assert_allclose for testing all values in an array.
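For example, comparing against values with a little floating-point noise:

```python
import numpy as np

expected = np.array([1.0, 2.0, 3.0])
computed = expected * (1 + 1e-9)  # tiny round-off error, e.g. from Excel

# Passes: every element is within the relative tolerance
np.testing.assert_allclose(computed, expected, rtol=1e-6)
```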

accipter
Sep 12, 2003

Mrenda posted:

Is there a particular setup I should have for developing with Python on windows?

I would recommend a miniconda installation.

https://conda.io/miniconda.html

accipter
Sep 12, 2003

SurgicalOntologist posted:

I'm doing a bunch of computations and saving them to a database. The following function is supposed to be committing to the database every 100 results (based on the variable commit_interval), and stopping after 10,000 (the value of total). But I'm not getting 10,000 results on each backtest_set, instead I'm getting 9,900. Can anyone spot the error? Am I making a basic off-by-one -ish mistake or do I have some weird race condition? I can't figure it out. Here's the code.
Python code:
    results = []
    for i, result in enumerate(tqdm(
            backtest_set.run(),
            unit='entry', desc='Backtesting', total=total, unit_scale=True,
    ), 1):
        results.append(dict(backtest_id=backtest_set.id, fantasy_points_hundredths=result))
        if not i % commit_interval or i == total:
            db.execute(models.BacktestResult.__table__.insert(), results)
            db.commit()
            results = []
        if i == total:
            break
Any ideas?

The actual work here is done inside the generator function backtest_set.run(). And if you haven't seen it before, tqdm is just a progress bar library (a pretty great one); as I'm using it, it just wraps an iterable. The only other things there are my models and my sqlalchemy session db.

You are testing that i == total, but on the last iteration i == (total - 1).

accipter
Sep 12, 2003

unpacked robinhood posted:

I'd like to "live-update" a geographical map with rectangular overlays, as soon as background processes bring up fresh data.
So far I have an ugly blocking thing in matplotlib/cartopy that requires manually closing the window to resume processing, until updated content pops up in a new window.
Can I make a non-blocking display in matplotlib, or is there something more suitable out there? I've seen a few SO posts on the issue but the solutions looked super clunky and confusing.

You might want to consider Folium. I think it should work for you.

accipter
Sep 12, 2003

Protocol7 posted:

So what's the deal with multithreading in Qt?

I basically have a long-running background task, that logs through a logger, and I want to capture the logging output and have it stream to a QTextEdit.

It sort of works, except that the main UI thread still gets destroyed and freezes up. In my research I came to the conclusion that the correct way to do it is with a QThread and a QTimer to periodically stream data from the StringIO that I have the log redirecting to, but I'm clearly doing something wrong.

I know that the logging redirection works since it no longer shows up in the standard output window of PyCharm, but my UI thread just freezes so I don't see the QTextEdit ever updating (and I have to force-quit python...).

Nvm, I guess I was using StringIO wrong. I ended up just redirecting the output to an actual file (which will come in handy when we need to debug) and that works for whatever reason.

I believe that you send status updates back from the QThread using a signal.

accipter
Sep 12, 2003
Would a relative import solve the issue? Instead of "import io", you would do "from . import io".

accipter
Sep 12, 2003
I have used xlwings for working with Excel, and it is quite nice when tools like pyexcel cannot handle the file.

accipter
Sep 12, 2003

QuarkJets posted:

The general rule is that Python will not create a copy unless asked. Since you expect the output list to have all of the elements of the input list, except modified, you can copy it at the start of the function and then modify the elements in place.


Python code:

completed = list(input_list) 
for entry in completed:
    # do something to entry
print(completed) 

This will just create a copy of the list instance; you might need to also copy the objects held by the list.
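For example, with mutable elements the shallow copy still shares them, while copy.deepcopy duplicates the elements as well (toy data):

```python
import copy

input_list = [{"n": 1}, {"n": 2}]

shallow = list(input_list)        # new list, but the *same* dict objects
deep = copy.deepcopy(input_list)  # new list and new dict objects

shallow[0]["n"] = 99  # also visible through input_list, but not through deep
```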

accipter
Sep 12, 2003
I have the structure below. Basically, p and m are both used by an instance of calc to perform a calculation. All of those components (calc, and p and m, which are stored within calc) are then used by outputs to compute a set of metrics of the calculation. This works great for a single-threaded process, but I would like to move to dask and convert the classes into immutable calculators. Or maybe just copy the calculator and return a new instance with the new parameters (p and m).

Python code:
count = 20
outputs.reset()
for i, p in enumerate(pysra.variation.iter_varied_profiles(
    profile,
    count,
    var_velocity=var_velocity,
)):
    # Here we auto-discretize the profile for wave propagation purposes
    p = p.auto_discretize()
    for j, m in enumerate(motions):
        name = (f'p{i}', f'm{j}')
        calc(m, p, p.location('outcrop', index=-1))
        outputs(calc, name=name)
A few questions. Is there a design pattern for this? Any suggestions on how to break everything into steps and then re-assemble it?

As I write this, I feel like the answer is to return a result from calc that is passed to outputs to be processed and turned into more results, which are eventually collected.

accipter
Sep 12, 2003

Hughmoris posted:

Anyone ever futz with extending Python using C or Rust to gain performance?

I've been reading some articles on it and the idea seems intriguing but I'm sure the reality can be a headache.

I have used C and numba. For my problem, using numba was just as fast and much easier to implement.

accipter
Sep 12, 2003
I recently discovered pyenv with virtual environments and it seems pretty amazing: you can configure directory-specific environments, and it can mix conda- and pip-based environments. I found these two webpages helpful:
- https://akrabat.com/creating-virtual-environments-with-pyenv/
- https://realpython.com/intro-to-pyenv/#virtual-environments-and-pyenv
I should note that I am using Linux; your success on Windows might differ.

accipter
Sep 12, 2003

SirPablo posted:

Not sure that is quite what I'm aiming at. Here's an example of one polygon that is rasterized at 0.01°x0.01°. I'd like to do this for hundreds of similar polygons, but the step I'm scratching my head on is counting them up grid cell by grid cell on a much larger domain, thus giving me a polygon density.



Make a raster of the entire area with a value of zero. Loop over each polygon, and increment all points in it by one. You should be able to do this with rasterio.
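The increment idea in miniature, with numpy only and made-up rectangular footprints standing in for the rasterized polygons (rasterio.features.rasterize would give you the real per-polygon footprints):

```python
import numpy as np

ny, nx = 10, 10
density = np.zeros((ny, nx), dtype=int)

# Stand-in "rasterized" polygons: (row_min, row_max, col_min, col_max)
footprints = [(0, 5, 0, 5), (3, 8, 3, 8)]

for r0, r1, c0, c1 in footprints:
    density[r0:r1, c0:c1] += 1  # bump every cell the polygon covers

# density.max() is the largest number of overlapping polygons in any cell
```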
