|
couchbase should be able to handle that just fine
|
# ? Feb 12, 2020 19:29 |
|
MongoDB Atlas has a pretty big free tier
|
# ? Feb 12, 2020 21:49 |
|
write a tiny bit of code to import the repetitive, implicit-schema JSON bullshit into a real database, even just SQLite, and do your queries against that

it’ll take you barely any time at all to pull the data in, then you can create some indexes and go to town
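
A minimal sketch of that import step, in Python with the standard-library sqlite3 module. It assumes the data is newline-delimited JSON (one object per line); the table name and the business_id / name / city / stars keys are illustrative assumptions, not a statement of what the dataset actually contains.

```python
import json
import sqlite3
from typing import Iterable

# Hypothetical column set; adjust to whatever keys the JSON objects really carry.
COLUMNS = ("business_id", "name", "city", "stars")


def load_json_lines(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Insert one row per non-empty JSON line; return the row count of the table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS business ("
        "business_id TEXT PRIMARY KEY, name TEXT, city TEXT, stars REAL)"
    )
    # Lazily turn each line into a tuple so a multi-gigabyte file is never held in memory.
    rows = (
        tuple(json.loads(line).get(col) for col in COLUMNS)
        for line in lines
        if line.strip()
    )
    with conn:  # a single transaction keeps the bulk insert fast
        conn.executemany("INSERT OR REPLACE INTO business VALUES (?, ?, ?, ?)", rows)
        # Index the column you expect to filter on most.
        conn.execute("CREATE INDEX IF NOT EXISTS idx_business_city ON business (city)")
    return conn.execute("SELECT COUNT(*) FROM business").fetchone()[0]
```

Against a real file this would be called with an open file handle and an on-disk database, for example `load_json_lines(sqlite3.connect("yelp.db"), open("business.json", encoding="utf-8"))`, where both file names are placeholders.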
|
# ? Feb 14, 2020 09:04 |
|
LOL the Yelp dataset isn’t even implicit schema

I should write something to slurp this into SQLite just to show how terrible tossing data like this around as JSON is
|
# ? Feb 14, 2020 09:10 |
|
oh my god
code:
gently caress, there are legit things that are difficult to structure in a relational database, but only one thing in this schema gets anywhere close to that
|
# ? Feb 14, 2020 09:12 |
|
eschaton posted:
write a tiny bit of code to import the repetitive, implicit-schema JSON bullshit into a real database, even just SQLite, and do your queries against that

Would I be able to do those queries online (not locally)? How hard would this be compared to just hosting the JSON somewhere, if I am completely unfamiliar with the types of software used for SQL, and don't remember anything at all from my databases class?

Even with those limitations I've otherwise managed to host small websites using mlab (free mongodb host) and storing nothing but JSON strings in it. But I'm not so sure that works for a 5 gig file. At least, mlab won't go that big.

Happy Thread fucked around with this message at 09:24 on Feb 14, 2020 |
# ? Feb 14, 2020 09:21 |
|
eschaton posted:
write a tiny bit of code to import the repetitive, implicit-schema JSON bullshit into a real database, even just SQLite, and do your queries against that

Please don’t be that guy itt, thanks
|
# ? Feb 14, 2020 15:17 |
|
Dumb Lowtax posted:
Would I be able to do those queries online (not locally)? How hard would this be compared to just hosting the JSON somewhere, if I am completely unfamiliar with the types of software used for SQL, and don't remember anything at all from my databases class?

Yeah, 5 GB is pushing it if you’re looking for free hosting. The only thing I can think of is DynamoDB, which has 25 GB of free hosting, but I also have zero experience with that. Honestly, since you’re *just* above what the free tiers offer, you may just want to bite the bullet and pay for something. I doubt it will cost you more than a few bucks.
|
# ? Feb 14, 2020 15:58 |
|
Oh, that's good too! As long as I can cancel when needed and don't get rounded up to some plan tier that assumes and charges for significant traffic. I'll check the pricing pages
|
# ? Feb 14, 2020 20:30 |
|
I know this shouldn’t be anything new to anyone itt but here’s a pretty thorough takedown of Mongo if anyone needs to talk their management out of sticking their dick in that mousetrap: http://jepsen.io/analyses/mongodb-4.2.6
|
# ? May 15, 2020 21:48 |
|
So this is the closest thread I can find that has to do with Elasticsearch. I have a couple of questions that I can't find a definitive answer to regarding arrays and searchability.

I'm looking to offload some invoice data to Elasticsearch for searching purposes, and I have a couple of fields that I want to represent as a simple array. One would be a payment confirmation #, and a customer could potentially pay an invoice off in more than one payment. So my plan for the field would be for it to look like this:

"confirmation_numbers": [12345, etc]

Is that a searchable field like that or do I need to mark it nested?

Related to that, I then want an array field with all the payment dates, so something like:

"payment_dates": ["date1", "date2", etc]

A) is that field able to be defined as a date type, and B) is that field searchable using "from" and "to" with that layout?
|
# ? Nov 17, 2020 18:04 |
|
Just-In-Timeberlake posted:
So this is the closest thread I can find that has to do with elastic search, I have a couple of questions that I can't find a definitive answer to regarding arrays and searchability.

Elasticsearch doesn't make a distinction between single-value and array fields, so "conf_num": 1234 and "conf_num": [1234, 5678] both have the same mapping/schema, and you can have some docs with a single value and some docs with an array in the same index.

The main thing to be aware of is that with array-valued fields, order is not considered when searching, so you could query for "conf_num:5678 AND conf_num:1234", but you can't specify in the search that 1234 has to come before 5678 in the array.

You'll want to use a range query for from/to queries, but keeping in mind the above, a range query will return all docs where any of the values in the "payment_dates" array falls into that range.
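
A rough sketch of what that looks like, written as Python dicts that could be sent as JSON request bodies. The field names come from the question; the `long` type for confirmation numbers and the helper names are assumptions for illustration. `matches_any` is not an Elasticsearch call, just a plain-Python model of the "any value in the array" semantics described above.

```python
from datetime import date
from typing import List, Union

# Assumed mapping: an array field is declared exactly like a single-value field.
INVOICE_MAPPING = {
    "mappings": {
        "properties": {
            "confirmation_numbers": {"type": "long"},
            "payment_dates": {"type": "date"},
        }
    }
}


def payment_date_range_query(start: date, end: date) -> dict:
    """Build a range query body covering the inclusive interval [start, end]."""
    return {
        "query": {
            "range": {
                "payment_dates": {"gte": start.isoformat(), "lte": end.isoformat()}
            }
        }
    }


def matches_any(values: Union[str, List[str]], start: date, end: date) -> bool:
    """Model of the array semantics: one in-range value is enough for the doc to match."""
    if not isinstance(values, list):
        values = [values]  # a single value behaves like a one-element array
    return any(start <= date.fromisoformat(v) <= end for v in values)
```

So an invoice paid on 2020-10-15 and 2020-11-17 would come back for a November range query even though one of its payments falls outside the range.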
|
# ? Nov 17, 2020 18:21 |
|
Arcsech posted:
Elasticsearch doesn't make a distinction between single-value and array fields, so "conf_num": 1234 and "conf_num": [1234, 5678] both have the same mapping/schema and you can have some docs with a single value and some docs with an array in the same index.

thanks, just what I was looking for.
|
# ? Nov 19, 2020 21:50 |
|
Happy Thread posted:
What's an easy free way to host a 5 gig JSON file for one person at a time to query? It's just the Yelp Academic Dataset for a personal demo for a school project.

Normalize the gently caress out of that thing. A 5 gig searchable data blob is going to play havoc on any architecture. Redesign that thing, ruthlessly.
|
# ? Nov 21, 2020 12:58 |
|
Hey all, hoping you can help me. I'm working through setting up my first project with mongodb using mongoose and node. My project is a very simple website that lists Triple-A baseball players. I'm having trouble figuring out what I need to do to get the following to work. I realize I could just be dumb and did not set up the schemas right.

Anyways, Player has a schema of:
Name
Team (objectid ref to TripleATeam or DoubleATeam)

DoubleATeam has a schema of:
Team name
MajorLeagueClub (objectid ref to MajorLeagueTeam)

TripleATeam has a schema of:
Team name
MajorLeagueClub (objectid ref to MajorLeagueTeam)

MajorLeagueTeam has a schema of:
Team name

My problem is I can't for the life of me figure out how to query MajorLeagueTeam and get every player for all three teams listed. Is this possible? Did I mess up by not allowing each player to also select the major league team? I feel like there should be a way to programmatically reference the top level team.

Thanks!
|
# ? Dec 30, 2020 03:53 |
|
Empress Brosephine posted:
I can't for the life of me figure out how to query MajorLeagueTeam and get every player for all three teams listed. Is this possible?

I was halfway towards writing out a SELECT statement when I realized what thread I was reading.
|
# ? Dec 30, 2020 04:01 |
|
I mean if you have that I'll take it also, I have some knowledge of postgres
|
# ? Dec 30, 2020 04:12 |
|
Empress Brosephine posted:
I mean if you have that I'll take it also, I have some knowledge of postgres

Cloudflare said "gently caress you, that looks like a SQL injection attempt".
|
# ? Dec 30, 2020 04:26 |
|
Rip
|
# ? Dec 30, 2020 04:26 |
|
Assuming I could just query on Team and ignoring the object ref part of the NOSQL setup, how I’d do it in SQL is here: https://pastebin.com/xrnGBhw1 ...which is straightforward, two sub-selects leading to Team IN (all matching AA) or Team IN (all matching AAA).
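
For anyone who can't open the link, here is a minimal sketch of that shape (two sub-selects feeding Team IN ...), run through SQLite from Python. It is not a copy of the pastebin; the table and column names are assumptions chosen to mirror the schema described earlier.

```python
import sqlite3

# Assumed relational version of the four schemas; names are illustrative only.
SCHEMA = """
CREATE TABLE aa_team  (name TEXT PRIMARY KEY, mlb_club TEXT);
CREATE TABLE aaa_team (name TEXT PRIMARY KEY, mlb_club TEXT);
CREATE TABLE player   (name TEXT, team TEXT);
"""

# A player belongs to the club if their team is one of its AA or AAA affiliates.
PLAYERS_FOR_CLUB = """
SELECT name
FROM player
WHERE team IN (SELECT name FROM aa_team  WHERE mlb_club = ?)
   OR team IN (SELECT name FROM aaa_team WHERE mlb_club = ?)
ORDER BY name
"""


def players_for_club(conn: sqlite3.Connection, club: str) -> list:
    """Return the names of every player on any affiliate of the given club."""
    return [row[0] for row in conn.execute(PLAYERS_FOR_CLUB, (club, club))]
```

The same statement should carry over to postgres once the `?` placeholders are swapped for whatever parameter style the driver expects.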
|
# ? Dec 30, 2020 16:50 |
|
That seems so easy. I asked this question on reddit and even they are like "not really possible with nosql". Guess I should convert to postgres. Thanks.
|
# ? Dec 30, 2020 17:29 |
|
This would be pretty straightforward with nosql DBs that have triples. You should be able to get what you're looking for with the $lookup feature though. Don't know how it works under the hood or its performance implications, but it'll let you write the query as a left outer join.
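
A hedged sketch of what such a pipeline might look like, written as plain Python data so it is the same stage objects you would hand to an aggregate call from any driver. The collection names (`tripleateams`, `doubleateams`) and field names (`team`, `majorLeagueClub`) are guesses based on the schema in the question, not something confirmed in this thread.

```python
from typing import Any, Dict, List


def players_for_club_pipeline(club_id: Any) -> List[Dict[str, Any]]:
    """Stages meant to run on the players collection for one major league club id."""
    return [
        # Left outer join against each minor league collection; for any given
        # player one of the two result arrays will simply be empty.
        {"$lookup": {"from": "tripleateams", "localField": "team",
                     "foreignField": "_id", "as": "aaa"}},
        {"$lookup": {"from": "doubleateams", "localField": "team",
                     "foreignField": "_id", "as": "aa"}},
        # Keep players whose joined team points at the requested club.
        {"$match": {"$or": [{"aaa.majorLeagueClub": club_id},
                            {"aa.majorLeagueClub": club_id}]}},
        # Drop the helper arrays from the output documents.
        {"$project": {"aaa": 0, "aa": 0}},
    ]
```

The idea is to start from the players and join upward, since each player only stores a reference to its own minor league team.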
PIZZA.BAT fucked around with this message at 17:42 on Dec 30, 2020 |
# ? Dec 30, 2020 17:39 |
|
I had no idea about $lookup, I'll have to look into it. Here's the code I typed so far also:
code:
|
# ? Dec 30, 2020 18:49 |
|
I kinda figured it out but now I'm running into scope issues! Which I guess is a step forward. Here's how I "solved" it:
code:
Empress Brosephine fucked around with this message at 19:51 on Dec 30, 2020 |
# ? Dec 30, 2020 19:46 |
|
I FIGURED IT OUT YAY!!! Thanks for the help all.
|
# ? Dec 30, 2020 20:14 |
|
I hope I'm not reviving this thread for nothing...

I'm building something where I have some devices at different sites running node, which need to keep in sync with each other at the same site as well as a cloud backend. The solution we've come up with is to use PouchDB on each device, syncing with each other as well as a CouchDB instance in the cloud.

We've got the devices syncing with each other, and using a selector filter on the CouchDB sync, the devices only replicate documents with a matching site id. I will then have a service on the backend keeping a continuous _changes feed open on the CouchDB, so that records from the devices can get processed into the actual backend.

The reason for the devices syncing via a selector filter instead of each site just having its own db is that we expect to have tens of thousands of sites, and I'm expecting that keeping that many connections open, one to the _changes feed on each db, is going to be not great.

So:
- am I right in choosing to have a single db instead of a db per site, given that I need to be watching for changes in real time?
- is there a better way to do this instead of using the _changes feed? can I configure some sort of callback function in a design document that can post the document data onto a queue or something, so I don't need to watch _changes?
- if the single db is the way to go, then is it possible to configure CouchDB to block attempts to replicate without a selector filter specified? I'm assigning unique credentials to each device, but I want to prevent a rogue device from syncing the entire DB and seeing data from other sites that it should not be able to see

Ideally I'd rather have a unique database per site so credentials prevent a rogue device from being able to pull down data from another site's db, and have Couch drop the incoming documents onto a queue that my service can watch instead of the _changes feed, but a) I'm not sure if that's possible or how to do it, b) I'm not sure if that would be better than just monitoring a single _changes feed, and c) I'm not sure if monitoring a single _changes feed instead of tens of thousands of _changes feeds will have a performance impact in the first place.

Please guide me, dear goons
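
For reference, a small sketch of the kind of selector-filtered, continuous _changes request described above, built in Python without actually opening a connection. The `site_id` field name, the base URL, and the database name are placeholders rather than the real setup.

```python
import json
from typing import Tuple
from urllib.parse import urlencode


def changes_request(base_url: str, db: str, site_id: str, since: str = "now") -> Tuple[str, str]:
    """Return (url, json_body) for a POST to a selector-filtered continuous _changes feed."""
    params = urlencode({
        "feed": "continuous",    # keep the connection open and stream changes
        "filter": "_selector",   # filter using the selector sent in the POST body
        "since": since,
        "include_docs": "true",
        "heartbeat": 30000,      # milliseconds between keep-alive newlines
    })
    url = f"{base_url.rstrip('/')}/{db}/_changes?{params}"
    # Only documents whose (assumed) site_id field matches are emitted.
    body = json.dumps({"selector": {"site_id": site_id}})
    return url, body
```

The backend service would use the same endpoint with no filter at all, which is where the worry about one feed versus tens of thousands comes from.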
|
# ? Jun 14, 2021 16:28 |