|
Your line: code:
What you are doing is like this: code:
code:
|
# ¿ Jan 24, 2018 02:19 |
|
Here's the documentation for the re module itself. And a HOWTO that's a bit more digestible.
|
# ¿ Jan 25, 2018 18:30 |
|
Pretty sure this would also work: Python code:
Dr Subterfuge fucked around with this message at 19:31 on Jan 25, 2018 |
# ¿ Jan 25, 2018 19:22 |
|
Baronash posted:Is there a good way to have the extra series added as columns, rather than additional rows? I've never actually used pandas, but I can tell you that you should be working with a DataFrame as your result. Series are one-dimensional arrays, which is why appending one to another only extends the length of full_list.
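A rough sketch of the difference, assuming pandas is installed. The series names and values here are made up for illustration and are not Baronash's data:

```python
import pandas as pd

# Hypothetical series standing in for the ones being collected.
s1 = pd.Series([1, 2, 3], name="first")
s2 = pd.Series([4, 5, 6], name="second")

# Default axis=0 stacks them end to end: still one-dimensional, just longer.
stacked = pd.concat([s1, s2])

# axis=1 lines them up side by side, giving a DataFrame with one column each.
side_by_side = pd.concat([s1, s2], axis=1)

print(stacked.shape)
print(side_by_side)
```

The column labels come from each Series' name, so naming the series up front saves a rename later.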
|
# ¿ Jan 25, 2018 20:50 |
|
Thermopyle posted:Here's a neat thing if you're one of the cool people like me who can mostly move most of their projects to keep up with the latest python versions. Coming in Python 3.7 are data classes! Say I have a program that uses a dictionary to pass data between functions. But at this point it's getting unwieldy, and attribute access sounds more appealing. Is there something in PyCharm that makes that refactoring less painful?
|
# ¿ Jan 25, 2018 22:41 |
|
Thermopyle posted:I can't think of anything specific to PyCharm that would make that easier other than maybe some fancy regex search/replace. Well you've given me the idea of just running a regex script on the files themselves that subs out a bracketed key in the code for the dotted equivalent. Shouldn't be too hard to make a pattern for, at least for the majority of cases. Hopefully. At least it gives me an excuse to mess with functional replacement in re.sub. Thanks!
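Something like this is what I have in mind, as a minimal sketch. It assumes the dict always lives in a variable called data (a made-up name) and that only keys which are valid identifiers should be converted:

```python
import re

# Hypothetical name of the dict variable being refactored.
DICT_NAME = "data"

# Matches data["key"] or data['key'] where key is a valid identifier.
PATTERN = re.compile(
    r"\b" + re.escape(DICT_NAME) + r"""\[(['"])([A-Za-z_]\w*)\1\]"""
)


def to_attribute(match):
    """Replacement function: re.sub calls this once per match."""
    return f"{DICT_NAME}.{match.group(2)}"


def convert(source):
    """Rewrite bracketed key access in a chunk of source text."""
    return PATTERN.sub(to_attribute, source)


before = 'total = data["price"] * data[\'count\'] + data["tax-rate"]'
after = convert(before)
print(after)
```

Keys that can't be attribute names (like "tax-rate") are left alone, so those are the cases that still need fixing by hand. It doesn't check for keys that are Python keywords, which would also need manual attention.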
|
# ¿ Jan 26, 2018 04:50 |
|
wasey posted:but I can't seem to properly assign the info. Thanks for helping out How have you been trying to do it so far?
|
# ¿ Feb 3, 2018 19:14 |
|
Wallet posted:If I have a list of numbers, like so: Unless that first term is always supposed to be ignored, it would seem that your output list should start with a [0]. e: No 62 either. Are you just ignoring all singleton lists? Dr Subterfuge fucked around with this message at 23:57 on Feb 8, 2018 |
# ¿ Feb 8, 2018 23:53 |
|
map sounds like what you're talking about?
|
# ¿ Feb 10, 2018 21:22 |
|
Cingulate posted:baka is, in fact, literally talking about map: Trying to python from my phone was a bad idea.
|
# ¿ Feb 12, 2018 03:56 |
|
json.loads is turning your pandas json file into a nested dict. You're getting yelled at because "collector_key" is in fact a key whose value is a dict, and you can't turn a dict into an int. E: It looks like your code is assuming that you have a collector key for each number, but what you actually have is one collector key with a bunch of numbers. E2: But that still wouldn't work with your code now, because you're accessing "collector_key" from the same outer dict that you are trying to iterate over. Dr Subterfuge fucked around with this message at 02:44 on Mar 3, 2018 |
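To make that concrete, here is a small sketch with a made-up payload. Only the "collector_key" name comes from the post above; the numbers are invented:

```python
import json

# Hypothetical payload: one "collector_key" entry whose value is itself
# a dict of numbers, which is the shape described above.
raw = '{"collector_key": {"0": 101, "1": 102, "2": 103}}'

parsed = json.loads(raw)
inner = parsed["collector_key"]
print(type(inner).__name__)

# This is the failure: int() has no way to convert a whole dict.
try:
    int(inner)
except TypeError as exc:
    print("int() failed:", exc)

# Iterate over the inner dict instead of the outer one.
numbers = [int(value) for value in inner.values()]
print(numbers)
```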
# ¿ Mar 3, 2018 02:34 |
|
It puts everything from the imported module into your global namespace, so you can do things like call shuffle() directly instead of calling random.shuffle(). Practically, its advantage is that it cuts down on typing. Maybe there are other reasons to do it that I am not aware of. It's generally not a good idea though, because it imports everything implicitly, which makes it harder to understand where something like shuffle is defined, and it could cause hidden conflicts if you have something else with the same name in your global namespace. You can get the same behavior more explicitly by doing "from random import shuffle" and you only get what you want.
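A quick sketch of both points. The star import is run inside a throwaway dict via exec so it doesn't pollute the example's own namespace, and the "choice" entry is a made-up stand-in for something of yours that happens to share a name:

```python
import random
from random import shuffle  # explicit: only this one name is imported

deck = list(range(10))
shuffle(deck)  # same function object as random.shuffle
print(shuffle is random.shuffle)

# Simulate a module that already has its own name "choice" defined.
namespace = {"choice": "my own choice"}

# A star import copies every public name from random into that namespace.
exec("from random import *", namespace)
star_names = [name for name in namespace if not name.startswith("__")]

print(len(star_names), "names now defined")
# The pre-existing "choice" was silently replaced by random.choice.
print(namespace["choice"] is random.choice)
```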
|
# ¿ Mar 3, 2018 23:06 |
|
There are also ways to automate Excel file creation from python if you haven't already gone that route.
|
# ¿ Mar 4, 2018 23:01 |
|
You'll want to put an __init__.py everywhere so you can import your folders as packages, which will allow you to do this: Python code:
|
# ¿ Mar 14, 2018 20:38 |
|
It's not clear to me that code:
This will work with an __init__.py in /scripts though: code:
Dr Subterfuge fucked around with this message at 22:26 on Mar 14, 2018 |
# ¿ Mar 14, 2018 22:09 |
|
__init__.py is basically a flag that tells python the folder that contains it is a package that can be imported. It can optionally contain some code to do some setup work (but it runs the first time anything in that package is imported, so things should only go there if they really are general for the whole package). Python files can also be imported directly and are known as modules. The lookup behavior for python is determined by sys.path, which is basically a list of places on your system to look through when python reaches an import statement. It starts with the directory of the script you ran, followed by anything listed in an environment variable called PYTHONPATH, followed by the standard library and site-packages. The search works down that list in order and stops as soon as it finds something, so if you have a local module called random, python will import that instead of the standard library one because the script's directory takes precedence over everything else. Any module that is in a folder pointed to in your list of search paths can be imported directly, which means, for example, if you have main.py in the same folder as foo.py, you can just import foo from main like this: code:
code:
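If you want to watch the lookup happen, the search list is visible at runtime as sys.path. This is only a sketch: it builds a throwaway folder with a hypothetical foo.py in it and adds that folder to sys.path by hand purely to show the mechanism, which is not something I'd recommend doing in a real project:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway folder containing a hypothetical module foo.py.
workdir = Path(tempfile.mkdtemp())
(workdir / "foo.py").write_text("def greet():\n    return 'hello from foo'\n")

# sys.path is the list Python walks, in order, when it hits an import.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import foo  # found because its folder is now first on sys.path

print(foo.greet())
print(foo.__file__)
```

When main.py and foo.py sit in the same folder, none of the sys.path fiddling is needed, because the folder of the script you run is already first on the list.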
Dr Subterfuge fucked around with this message at 19:32 on Mar 15, 2018 |
# ¿ Mar 15, 2018 19:26 |
|
Yeah, I was mostly just trying to present a more complete picture of what was going on. The simplest way is making /scripts into a package and... not doing whatever it is you think is making editing sys.path necessary.
|
# ¿ Mar 16, 2018 04:29 |
|
Say I have two dataframes: df1 code:
code:
code:
Python code:
Dr Subterfuge fucked around with this message at 10:31 on Mar 18, 2018 |
# ¿ Mar 18, 2018 09:38 |
|
Cingulate posted:You mean like this? That feels much better. Somehow I was fixated on directly building the df I wanted instead of relying on col assignment. On the other hand vikingstrike posted:That's a one liner with pd.merge(). Oh hell. So it is. Python code:
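For anyone following along later, the general shape of the merge approach looks roughly like this. The frames and column names below are invented for illustration, not the ones from my original post, and it assumes pandas is installed:

```python
import pandas as pd

# Hypothetical frames; the column names are assumptions for illustration.
df1 = pd.DataFrame({"key": ["a", "b", "c"], "left_val": [1, 2, 3]})
df2 = pd.DataFrame({"key": ["b", "c", "d"], "right_val": [20, 30, 40]})

# Inner join: keep only the rows whose key appears in both frames.
merged = pd.merge(df1, df2, on="key", how="inner")
print(merged)
```

Swapping how="inner" for "left", "right", or "outer" changes which unmatched rows are kept.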
|
# ¿ Mar 18, 2018 17:05 |
|
baka kaba posted:While we're on this, this is the best one of those visualisation videos I've seen Holy poo poo Radix is amazing.
|
# ¿ Mar 18, 2018 19:39 |
|
That's what json.loads has done. Printing it just displays message (which is now a python dict) as a string.
Dr Subterfuge fucked around with this message at 00:53 on Mar 30, 2018 |
# ¿ Mar 30, 2018 00:51 |
|
SnatchRabbit posted:Right, but what I need is message to be this portion, in JSON, so I can use individual values (strings) like resourceID, and put that in a statement later. Not sure that makes sense, like I said I'm terrible with python. Like Data Graham said, getting resourceID from message would just be message['resourceID']. If you need to turn it back into a JSON string, you would do json.dumps(message).
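Roughly like this, as a sketch. The payload is made up; only the resourceID field name comes from the thread:

```python
import json

# Hypothetical event body standing in for the real message.
raw_message = '{"resourceID": "i-0abc123", "status": "NON_COMPLIANT"}'

message = json.loads(raw_message)     # JSON text -> Python dict
resource_id = message["resourceID"]   # a plain str you can use anywhere
statement = f"Resource {resource_id} needs attention"
round_tripped = json.dumps(message)   # dict -> JSON text again

print(statement)
print(round_tripped)
```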
|
# ¿ Mar 30, 2018 04:09 |
|
You have a typo in image 2.
|
# ¿ Mar 30, 2018 21:05 |
|
Instead of explicitly searching intersections it would probably be easier (at least from a coding perspective) to use pandas and merge dataframes.
|
# ¿ Mar 31, 2018 22:49 |
|
Cingulate posted:Thread title I had this feeling I had read some joke about how often pandas was recommended in this thread. Somehow I actually hadn't known about it until I read about it here. Intersections between tabular data sounds like a pretty good use case at least!
|
# ¿ Apr 1, 2018 19:09 |
|
You can't open a list of file paths. You have to do it one at a time.
|
# ¿ Apr 3, 2018 03:55 |
|
Seventh Arrow posted:Yikes, really? So if it generated a hundred files, I'd be pretty hosed. I guess for now I'll just be thankful that the exercise only has five files. You can iterate over your urls and locations at the same time with zip. Python code:
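A sketch of the pattern, with the download swapped out for a stand-in function so it runs without a network. The fetch helper, the URLs, and the file names are all hypothetical:

```python
import tempfile
from pathlib import Path


def fetch(url):
    """Stand-in for a real download; returns fake content for the demo."""
    return f"contents of {url}\n"


out_dir = Path(tempfile.mkdtemp())
urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
locations = [out_dir / name for name in ("a.txt", "b.txt", "c.txt")]

# zip pairs the first url with the first location, and so on.
for url, location in zip(urls, locations):
    # Only one file is open at a time, however long the lists are.
    with open(location, "w") as handle:
        handle.write(fetch(url))

print(sorted(path.name for path in out_dir.iterdir()))
```

The loop is the same for five files or a hundred, since zip just keeps handing out pairs until one of the lists runs out.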
|
# ¿ Apr 3, 2018 04:18 |
|
You don't have to have any more infrastructure than what I posted. In your code there isn't actually any reason to want to open a bunch of files simultaneously anyway, since each url is accessed sequentially, and each url corresponds to one location.
|
# ¿ Apr 3, 2018 05:12 |
|
I've been trying to figure out decorators, but one thing that I've always had trouble with is composition of functions. Basically, I realized that I was creating a bunch of methods that were each supposed to modify a different class attribute. I was getting the attribute from self, doing whatever I needed to change the value, and then setting it back. This seems like an ideal use case for a decorator, but I'm struggling to write one that does this. It seems like I should be able to change this: code:
code:
code:
1. how to get arbitrary arguments out of a decorator call and 2. On the other hand, if there is a better pattern than this class structure to go about updating a bunch of fields (in different ways) from a given input, that would be good to know, too. I'm well past the point where my Python abilities have outstripped my design abilities. E: Well, getting the class is proving to be difficult. The object the decorator says it's wrapping is the function A.update_a, which breaks the linked class sniffer because the module __main__ doesn't have an attribute A. E2: I think this works. Classes to the rescue! Modified code from here. Python code:
E3: using functools.wraps(func) as a decorator on newf is probably superior to just making the __doc__ attributes equal? E4: If someone could explain how the args in the __init__ and __call__ methods are magically different (and why args in __init__ doesn't consume func) I'd be really interested to know. Because this still feels like witchcraft to me. Dr Subterfuge fucked around with this message at 23:57 on Apr 7, 2018 |
# ¿ Apr 7, 2018 19:50 |
|
breaks posted:I don't comprehend at all what you're trying to accomplish. What you've got there at the moment seems to be one hell of a way to write self.some_list.append(some_stuff). On the other hand I've probably never managed to successfully understand a post in this thread when reading it at 2AM (or maybe ever) so I'll just answer these two specific questions: Basically, I have some huge scraper functions that I'm trying to refactor into smaller components (with maybe the eventual goal of messing around with asyncio), and the structure I came up with was to turn each scraper into a class (or rather a subclass of a base scraper) and use attribute access in class methods. So I'm not really just appending things. It was mostly to illustrate that I want to be able to modify an existing value. I'm aware there are whole scraping packages like scrapy that have already solved most of these problems. I'm mostly just messing around with rolling something myself to see if I can learn anything in the process. (But one of those is getting better at designing things, so if there's a better way to go about updating a bunch of different fields in a data structure that would be cool to know.) Looking at my code again and seeing your examples, I can see now how the two sets of args are different. The first args come from the arguments of the class, and the second args come from the fact that newf replaces the decorated function and gets called on the decorated function's arguments. The big thing I was missing was how the call that modifies the decorated function works.
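To spell out the three separate calls, here is a stripped-down sketch of a class-based decorator. This is not my original code: the updates class and the method bodies are made up for illustration, and only the names newf and A.update_a carry over from the earlier post:

```python
import functools


class updates:
    """Decorator: load an attribute, let the method transform it, store it back."""

    def __init__(self, attr_name):
        # Call 1: runs when Python evaluates @updates("a").
        # These are the decorator's own arguments.
        self.attr_name = attr_name

    def __call__(self, func):
        # Call 2: runs right afterwards, receiving the function being decorated.
        @functools.wraps(func)
        def newf(instance, *args, **kwargs):
            # Call 3: runs on every method call, with that call's arguments.
            current = getattr(instance, self.attr_name)
            new_value = func(instance, current, *args, **kwargs)
            setattr(instance, self.attr_name, new_value)

        return newf


class A:
    def __init__(self):
        self.a = [1]
        self.b = 10

    @updates("a")
    def update_a(self, value, extra):
        """Return the new value for self.a."""
        return value + [extra]

    @updates("b")
    def update_b(self, value, factor):
        return value * factor


obj = A()
obj.update_a(2)
obj.update_b(3)
print(obj.a, obj.b)
```

__init__ never sees func because @updates("a") is evaluated first as an ordinary call that builds an instance, and only then is that instance called with the function. functools.wraps copies __name__ and __doc__ across, which covers what setting __doc__ by hand was doing.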
|
# ¿ Apr 8, 2018 16:26 |
|
It looks like Sharepoint doesn't like your authentication method. I don't know anything about Sharepoint, and there are conflicting pieces of information online about how to properly authenticate your session, but I can say that it doesn't look like basic authentication is the way to go here.
|
# ¿ May 14, 2018 17:58 |
|
Generally speaking it would seem like using query instead of loc doesn't give you a whole lot, since you lose pretty much all the niceties of your editor by moving everything into a string that needs to be parsed eventually anyway. Am I missing something? F-strings maybe? Which might not be a good idea if you care about security?
|
# ¿ Jun 28, 2018 17:15 |
|
vikingstrike posted:Sure. Glad it’s working! I'm thinking mostly about basic autocomplete and syntax checking. I make stupid typos way too often and sometimes get screwed over when I'm using strings because PyCharm doesn't know any better.
|
# ¿ Jun 28, 2018 18:22 |
|
Zero Gravitas posted:I'm trying to get an example of exactly how long it takes for the youtube algorithm to start pitching far right wing videos at a new user who views soft right or entryist political material. This is an interesting problem. Something to keep in mind is YouTube keeps track of how much of a video their users watch, so an agent that jumps quickly between videos isn't going to register the same way that an attentive listener (someone who watches the whole thing) would. Same thing with liking videos, of course. Probably comments as well. Probably neither of those are things that you want to be doing with the videos of the ever more deranged parts of the political spectrum, though.
|
# ¿ Jun 28, 2018 20:09 |
|
baka kaba posted:Another thing you could try is having a list containing the last 10 seconds' of IPs (filtered to only include the request types you're looking for). So when you get a new IP, add it to the list, then pop off the entries from the front of the list that have a timestamp > 10s earlier than your new entry. Then you can scan your list for matches on that new IP and see if you get 3+ This is basically the ideal use case for a deque, which is more efficient than a general-purpose list for popping entries off the front.
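A minimal sketch of that sliding window with a deque. It assumes timestamps are plain numbers of seconds arriving in order, and the IPs and events are made up:

```python
from collections import deque

WINDOW_SECONDS = 10
THRESHOLD = 3

recent = deque()  # (timestamp, ip) pairs, oldest on the left


def record(timestamp, ip):
    """Add a request and report whether ip has hit the threshold in the window."""
    recent.append((timestamp, ip))
    # Drop entries older than the window; popleft is O(1) on a deque,
    # while list.pop(0) has to shift every remaining element.
    while recent and timestamp - recent[0][0] > WINDOW_SECONDS:
        recent.popleft()
    return sum(1 for _, seen in recent if seen == ip) >= THRESHOLD


events = [
    (0, "10.0.0.1"),
    (2, "10.0.0.2"),
    (4, "10.0.0.1"),
    (9, "10.0.0.1"),
    (25, "10.0.0.1"),
]
flags = [record(timestamp, ip) for timestamp, ip in events]
print(flags)
```

Only the fourth event trips the check, since it is the third request from that IP inside ten seconds. By the fifth, everything older has been popped off the front.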
|
# ¿ Jul 16, 2018 19:52 |
|
I learned about deques from Fluent Python, which I picked up because of reading this thread. So cheers all around.
|
# ¿ Jul 16, 2018 23:21 |
|
Should be this: code:
|
# ¿ Aug 30, 2018 23:23 |
|
Whoops. Yeah. Guess I was too busy looking at the indents to notice the guess.
|
# ¿ Aug 31, 2018 22:56 |
|
Pretty sure this would work: Python code:
|
# ¿ Sep 1, 2018 01:45 |
|
And I just realized the print statement isn't a function, so this is python 2. I know some import behavior changed between 2 and 3, but I don't know if that's relevant here.
|
# ¿ Sep 1, 2018 02:49 |