|
Hey y'all I've got some hosed up data I'm working with and I fear if I keep at this on my own much longer the repeated stress of cranial impact is going to eventually shatter my desk to pieces. I (currently) quote co-delivery service contracts for a living. This involves me going into our vendor's website and putting together a quote using their pricing tool. Once I've got the vendor quote in-hand, I then have to take the guts of that quote and put it into a quote template that my company uses, and then present that to the customer. I can get the vendor quote as an xml file (some kind of weird-rear end xml file that needs some sort of remote auth key for me to even try and do anything in python with it, so that's out), as a live html page, a csv file, or an xlsx document. Since it's all just tables anyway, I usually pull down either the HTML version or the XLSX version. Easy-fuckin-peasy, copy paste from one workbook to the other, right? loving wrong. Look at this poo poo. Look at this absolute horseshit we get from our vendor: 20 columns of data, spread out across sixty-loving-six columns - with most of them loving hidden. What's in them, you ask? Absofuckinglutely nothing. They're blank. Not a loving thing in them, Just resized down to 0.0001 pixels wide, and merged and centered across the range. Of course, when I try and paste this data into a clean sheet, it's an abhorrent mess of empty cells that don't even line up properly (see row 1 as compared to the remaining rows) : I can, fairly efficiently, put in the manual work to unfuck it - punch row 1 over into column b, multi-select empty rows and columns, remove, blah blah blah. This is making me insane. It's a bug-gently caress-nutty time sink and it drags our entire team down. I don't want to do it manually anymore. So far: I've tried macros, which have helped to automate some of the process - the number of blank columns is always consistent so those are easy to account for in a macro. Unfortunately, the variable number of line items (and tables full of line items - there can be multiple sites in a quote, each with their own table of line items) make removing the rows an imprecise science at best. I've tried a little bit of python but I'm still very new to it, I thought about using a for loop but the inconsistent number of blank rows/columns that need to be removed is a bit beyond my level of understanding right now. Ultimately, I want the most simple, plain table of data you could ask for. What do you think would be the best way to approach something like this, save finding whomever designed the format for these quotes and holding their head underwater until that one last bubble "bloop"s its way to the surface?
|
# ¿ Dec 16, 2020 18:38 |
|
|
# ¿ Apr 28, 2024 01:20 |
|
Harminoff posted:Have you tried anything with Power Query yet? I adore Power Query. I use it when I don't feel like spinning up wamp and doing things with SQL - though I'll admit the way the raw data comes out had me thinking that Power Query would just...poo poo itself trying to do anything with it. I may give it a go DRINK ME posted:I get to deal with this sort of stuff for work every now and then and congrats - that’s some of the worst data I’ve seen. This is all MCode, yeah? Or is this VBA? I've really only used Power Query to emulate SQL (that's my real fuckin' jam) for purposes of like, customer inventories and the like so I'm not familiar really with the code - of course with data this lovely, I don't even want to begin trying to load it into a db until it makes something resembling sense. I do have a personal macro that removes the columns I don't need (I really am only concerned with 4 of them - Service Description, Service Code, Svc Quantity, and Total List Price) as well as the empties so, I have the start of something. I actually brought the CSV version of the export over to the python thread a while back and I got essentially the same reaction to that: 100% so, yeah. Fun. It looks a bit like this: code:
|
# ¿ Dec 16, 2020 20:37 |
|
Fun! I've never done VBA before. I was thinking of grabbing a Udemy class on it, I think I will next time they have a sale. You talked me into it.
|
# ¿ Dec 16, 2020 21:47 |
|
Wiggly Wayne DDS posted:do you have a rough example file? if i had to deal with that using only excel first thought would be to make a parser sheet pointing at the downloaded file as a named range then try match(1,index(column_name)) for each uniquely named column. i gather the data is structurally under the column but messes up when you're copy/pasting. if you threw together a naive parser for the first 100 rows it'd give you a more set table to copy out of and filter out the empty rows I definitely wanted to over-engineer a solution. If you open the csv in Excel like the vendor intends, it actually displays almost exactly the way I wanted to Rube Goldberg it to. Ain't that something. I was dead set on using it like this: Still probably going to take a VBA course, another handy skill to have I'm sure Lib and let die fucked around with this message at 22:51 on Dec 16, 2020 |
# ¿ Dec 16, 2020 22:10 |
|
Is there a way to get Excel to stop doing..."intelligent" things like trying to determine the Specific example: I'm pasting 4 values from a webpage into an excel sheet. Product name, version, quantity, and a 4th field that's useless that I end up deleting but have to copy over initially so the formatting stays right. The version field can contain a 7, an 8, or an "8/7" because the license works with both versions of the software. Excel thinks I'm trying to paste in 7-Aug. Ok, so format cells, general, no, that doesn't work, format cells, text, and...Excel converts it to a loving unicode date so I've got to go back in and manually enter "8/7" Yes I can find/replace after the fact, but, I don't want to. E: specifically a "now and for ever" solution, not something I have to pre-configure every time I make a new work book Lib and let die fucked around with this message at 20:30 on Dec 30, 2020 |
# ¿ Dec 30, 2020 20:26 |
|
Big Bad Beetleborg posted:A bunch of genes were renamed earlier this year because Excel kept reading them as dates. It was agreed to be easier to change a what a whole branch of science referred to things as than get Excel to stop doing that. Well it's nice to know this isn't a problem relegated to morons like me. Really highlights the extremes of the love/hate relationship I have with Excel.
|
# ¿ Dec 31, 2020 02:21 |
|
me your dad posted:Thank you both - I haven't yet had a chance to try these solutions. I grossly underestimated the amount of data per file. It's more like 45k to 55k records across six files. I considered today that our email platform provider (Salesforce) may be able to assist us with this from their back end of things. Before I try going forward with the unwieldy process of managing so much data in Excel, I am going to see what they come back with. I opened a case with their support team earlier. I know this is the Excel thread, but have you considered/are you familiar with using SQL? Having both SQL and Excel familiarity this is something I might try and figure out with PowerQuery, even. The Excel gurus here probably have a better idea how PowerQuery could pull it off, but if you wanted to approach it from SQL side, I'd probably create a table from each file and join each table on mailer1.emailaddy=mailer2.emailaddy and then count either all of the null values or all of the non-null values for each returned row.
|
# ¿ Jan 21, 2021 23:06 |
|
Whybird posted:Wouldn't it be better to create a single table with emailaddy and datereceived columns, and then just do Probably, but there's nothing I enjoy more than over engineering a solution.
|
# ¿ Jan 22, 2021 12:53 |
|
It's been a while, but I had a workbook with a power query in it that would update based on the value of a specific cell in the workbook (I'd enter a contract number, refresh data, and the query would refresh to the external data source of [contract number from cell].xlsx and display it in a format that was easier for me to copy and paste into the renewal quote). Perhaps something like that would work, where your variable would be the field where your value picker is?
|
# ¿ Feb 23, 2021 17:42 |
|
Kibayasu posted:Is there some way to remove trailing spaces from cell values without using the TRIM function? I'm going to be sent a monthly table and one of the columns I'm going to be using as a look up reference seems like it has to have 10 characters in it and it fills those characters with spaces if there wasn't 10 text characters entered. The problem is I'm generating the same references in a summary table on my end but without the trailing spaces. If all the references were the same length I could find/replace the number of spaces but they won't be. It wouldn't exactly be hard either to just insert my own column into the table, use TRIM from top to bottom, and reference the new column instead so I'm mostly just wondering if there's some easy method I don't know about. Are there spaces IN the data or just at the end? If there are only trailing spaces you can use text to cells under the data ribbon and set the delimiter character to space.
|
# ¿ Jan 13, 2022 00:52 |
|
I don't know that this is the best thread for it, but since Excel is so often used for working with csv files...is there a better tool? Or a specific set of options I can turn on to have Excel not do any of its allegedly helpful things with say, date and time stamps? I'm talking 500,000+ rows of data, so a notepad++/general IDE is out because I need the cell-based display, but good lord Excel really needs a very basic no-frills raw csv mode or something.
|
# ¿ Mar 3, 2022 17:04 |
|
Wandering Orange posted:Before pasting CSV data or otherwise importing, select the entire sheet and change the number formatting to Text? And then when I re-open the file to work with it, I have to convert it all to text again else it saves dates in whatever hosed up format it's decided is best. I suppose I could macrofy it, but then I have to have a workbook with my macros in it open when I'm working with the files too...It's absolutely infuriating. I'm close to giving up on Excel entirely and seeing if Google Sheets handles them any better - I've tried about everything I can think of with Excel (hence turning to my fellow goons for advice)
|
# ¿ Mar 3, 2022 17:29 |
|
It's just raw UTF-8 CSV files, opening them from explorer. My little machine with 16gb or ram would choke to death copying.... Lemme check real quick....225x24000 out of a csv file and pasting values...back into a csv file. If I had my way, we'd just be getting full SQL dumps to load and work with but.... Yes, that's 225 columns of data. Exceed Beyond is....silly.
|
# ¿ Mar 3, 2022 17:39 |
|
Wandering Orange posted:Oh god you're dealing with that sort of stuff. Yeah I would be using SSIS/SQL or even Access if I had to as Excel is not going to be friendly. It's both shocking and completely unsurprising that the companies that do business with Nonprofits will do anything and everything the can to prevent customers from leaving their jank-rear end systems, up to and including making the export process as difficult as possible.
|
# ¿ Mar 3, 2022 17:46 |
|
I don't understand lambdas in python, and I don't understand them in excel. why create a function that's just getting thrown away???????
|
# ¿ Jul 21, 2022 13:33 |
|
Harminoff posted:Is anyone watching Excel Esports on ESPN? there's no way this is a thing there's no way i'm going to watch this why am i watching this
|
# ¿ Aug 10, 2022 18:01 |
|
Regex and Xlookup? https://www.youtube.com/watch?v=kCzSiOHj_lw
|
# ¿ Aug 25, 2022 21:48 |
|
This is super weird - I can't get XLOOKUP to work at all ('Unknown function', but if I just do XLOOKUP() it tells me it's expecting between 3 and 6 parameters) but LOOKUP seems to function like an XLOOKUP does? Am I missing something? e: Lib and let die fucked around with this message at 22:43 on Aug 25, 2022 |
# ¿ Aug 25, 2022 22:36 |
|
Strong Sauce posted:https://workspaceupdates.googleblog.com/2022/08/named-functions-google-sheets.html
|
# ¿ Aug 25, 2022 23:08 |
|
Jack the Lad posted:Is there a way to write the name of a named range as text in a cell, then pass the contents of that cell to a formula that parses it as the named range? INDIRECT is probably the function you want, it can pass the value of a cell along as part of a formula
|
# ¿ Nov 8, 2022 01:25 |
|
Jack the Lad posted:Is there a way to write the name of a named range as text in a cell, then pass the contents of that cell to a formula that parses it as the named range? a little more info on the INDIRECT function: I set up a basic sheet and A1 to BobsFaveFoods and set the named range to B2:B10, the original formula functions as intended, but it also works as such: code:
You can even get really fancy and create formulas from multiple indirect cell references like this: e: in your case you might want to use something like code:
Lib and let die fucked around with this message at 03:02 on Nov 8, 2022 |
# ¿ Nov 8, 2022 02:43 |
|
kumba posted:word to the wise: indirect is a volatile function which means it has to recalculate every time any cell in the entire workbook is changed Yeah INDIRECT is one of those functions where if you find yourself using it a lot, you may want to consider using like...an actual database. I work with a data conversion workbook that uses INDIRECT in a lot of places and it's really slick in the sense that holy poo poo a google sheets can do that but I also have to force some pages to update manually by deleting a row, hitting undo, and then sheets will recalculate everything (I don't know if F9 would behave the same way in Excel, in Sheets there doesn't seem to be a manual "refresh all my formulas NOW" option)
|
# ¿ Nov 8, 2022 15:09 |
|
Here's an incredibly dumb question: Can I resize all my columns by just the length of the column header? I seem to recall doing that somehow in the past but I can't figure it out now. grumble grumble loving salesforce export files
|
# ¿ Nov 8, 2022 20:13 |
|
esquilax posted:Highlight the cells you want to fit the columns to. oh thank gently caress, i knew it was something incredibly simple but i couldn't even mangle a decent google search to find it lmao
|
# ¿ Nov 8, 2022 21:26 |
|
Small Excel Questions Not Worth Being Answered by a Human
|
# ¿ Jan 24, 2023 22:52 |
|
|
# ¿ Apr 28, 2024 01:20 |
|
Hughmoris posted:XLOOKUP broke the meta and cost my boy MAKRO a chance at the gold. Does native python support in Excel going to be considered some sort of PED for this type of stuff?
|
# ¿ Sep 9, 2023 19:36 |