8-bit Miniboss
May 24, 2005

CORPO COPS CAME FOR MY :filez:

Strike Anywhere posted:

If you tracert to news.supernews.com, are any hops on Comcast's network? I wonder if the traffic shaping is on their side and they're just doing a blanket nntp throttle during their peak hours. I'd be surprised if that was the case, but it makes me wonder.

Couple hops on their network.

code:
|------------------------------------------------------------------------------------------|
|                                      WinMTR statistics                                   |
|                       Host              -   %  | Sent | Recv | Best | Avrg | Wrst | Last |
|------------------------------------------------|------|------|------|------|------|------|
|                             192.168.2.1 -    0 |  110 |  110 |    0 |    0 |    1 |    0 |
|        cpe-xx-xx-xx-xx.socal.res.rr.com -   80 |   25 |    5 |    0 |   26 |   31 |   29 |
|      tge7-2.grgvca65-cer01.socal.rr.com -    0 |  110 |  110 |    9 |   14 |   52 |   11 |
| tge0-8-0-12.grgvcabt-ccr01.socal.rr.com -   80 |   25 |    5 |    0 |   17 |   18 |   16 |
|       agg23.lsancarc-ccr01.socal.rr.com -   80 |   25 |    5 |    0 |   23 |   37 |   16 |
|           ae-6-0.cr0.lax00.tbone.rr.com -   80 |   25 |    5 |    0 |   21 |   38 |   38 |
|           ae-0-0.pr0.lax10.tbone.rr.com -   19 |   61 |   50 |    0 |   18 |   44 |   15 |
|be-10-303-pe01.600wseventh.ca.ibone.comcast.net -   41 |   42 |   25 |    0 |   19 |   47 |   15 |
|pos-2-3-0-0-cr01.losangeles.ca.ibone.comcast.net -   66 |   29 |   10 |    0 |   19 |   34 |   17 |
|                   No response from host -  100 |   21 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |   21 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |   21 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |   21 |    0 |    0 |    0 |    0 |    0 |
|                   No response from host -  100 |   21 |    0 |    0 |    0 |    0 |    0 |
|                      news.supernews.com -    1 |  106 |  105 |  117 |  123 |  173 |  118 |
|________________________________________________|______|______|______|______|______|______|
   WinMTR v0.92 GPL V2 by Appnor MSP - Fully Managed Hosting & Cloud Provider
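
The "which hops are on whose network" question can also be answered mechanically. This is a minimal sketch, assuming the WinMTR text export layout shown above; `parse_winmtr` and `hops_on` are hypothetical helper names, not part of WinMTR.

```python
def parse_winmtr(text):
    """Parse WinMTR text-export rows into (host, loss_percent, avg_ms) tuples."""
    hops = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|") or " - " not in line:
            continue  # borders, title and footer lines carry no hop data
        host, _, rest = line.strip("|").rpartition(" - ")
        cells = [c.strip() for c in rest.split("|")]
        if not cells[0].isdigit():
            continue  # the column header row ("%", "Sent", ...)
        loss, _sent, _recv, _best, avg = (int(c) for c in cells[:5])
        hops.append((host.strip(), loss, avg))
    return hops


def hops_on(hops, domain):
    """Return only the hops whose hostname ends with the given domain."""
    return [h for h in hops if h[0].endswith(domain)]


sample = """\
|      tge7-2.grgvca65-cer01.socal.rr.com -    0 |  110 |  110 |    9 |   14 |   52 |   11 |
|be-10-303-pe01.600wseventh.ca.ibone.comcast.net -   41 |   42 |   25 |    0 |   19 |   47 |   15 |
|                      news.supernews.com -    1 |  106 |  105 |  117 |  123 |  173 |  118 |
"""

hops = parse_winmtr(sample)
for host, loss, avg in hops_on(hops, "comcast.net"):
    print(f"{host}: {loss}% loss, {avg} ms avg")
```

Keep in mind that loss reported at intermediate hops often just means the router rate-limits ICMP replies; the loss and latency at the final hop are usually the numbers that matter.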

Softcox
Jul 13, 2004

But I will not hesitate.
Not for a second.

No it isn't

neurotech
Apr 22, 2004

Deep in my dreams and I still hear her callin'
If you're alone, I'll come home.

Evil Trout posted:

I was curious about writing my own NZB indexer after a few of my friends got boned by the NZBMatrix closing. I spent a few hours writing a multithreaded Ruby app to crawl newsgroups and threw it on a VPS overnight.

I was quite surprised to see about 7GB of storage was used for 24 hours of headers. Now that could definitely be lower as I'd group together the NZBs from the headers as they were complete, but it still takes quite a bit of disk space.

I was kind of hoping to run my own indexer on a VPS, but since most of them come with like 25GB of space total, I'm not sure I could swing a good retention without going dedicated. So that kind of sucks!

I tried to do something like this with Ruby a few days ago and had serious issues with the few nntp gems available. Care to share how you pulled it off?

longview
Dec 25, 2006

heh.

copperblue posted:

Install was seriously 5 minutes on Win7, and the back loading was done in a couple hours. It'll automagically check for new spots every hour.

I'm assuming this'll be the preferred route in the future. You're already paying for usenet, why not get your nzbs from there too. Get more people submitting and commenting entries and it'll be better than having an indexer.

Not to mention you'll subconsciously learn Dutch.

No joke, even setting it up on a bunch of VMs with reverse proxying it only took me an hour.

I already had to skim a bunch of Dutch documentation and it's surprisingly readable. These are just teething problems, like SpotWeb's English translation having weird grammar and translation issues (Image category example). I expect usage to increase a lot, since it's a pretty good distributed system that doesn't involve gigabytes per day and hours of CPU time.

SpotWeb has the worst search engine though, it seems like it ORs together every word by default, so making a search more detailed usually brings back even more useless results that match one odd word.

isaboo
Nov 11, 2002

Muay Buok
ขอให้โชคดี
Spotweb was easy to set up and looks pretty cool, but Giganews locked my account for several hours because of "probable account sharing". My Spotweb installation is on my VPS and I was also trying to download an NZB via Sabnzbd on my home PC; Giganews only allows connections from one IP at a time for personal accounts.

I also got an alert, not really a warning, from my VPS provider for high disk IO while Spotweb was doing its thing.

loose-fish
Apr 1, 2005

Thermopyle posted:

You can do some pretty complex category filtering, but it's all done via manipulating the url...someone would have to edit the web interface I guess to actually have a UI for it.

Actually that's not necessary, you can use the search window to construct filters (using the categories checkbox thing, so you don't have to look up cryptic poo poo like cat0_z0_c10) and save them. You can also download and upload filters as xml files, though that's also pretty crappy. Filters can be rearranged/deleted/etc. in the settings (I think user settings, I can't actually check right now).

crm
Oct 24, 2004

Can somebody give me a rundown on how Spotweb works? The docs don't really explain what it is.

I've got it installed and had it pull everything, but I'm not exactly sure how it's different than newznab.

Evil Trout
Nov 16, 2004

The evilest trout of them all

neurotech posted:

I tried to do something like this with Ruby a few days ago and had serious issues with the few nntp gems available. Care to share how you pulled it off?

You're right, those gems are awful. And they don't support the extension commands that make an indexer easier to write. So I rolled my own NNTP class. I'm not fully ready to release my indexer's source code, but here's the NNTP Connection class I used. It's pretty simple but works!

https://gist.github.com/4337577
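
For anyone following along: an indexer mostly works from overview data (XOVER/OVER responses) rather than full articles. This is a minimal sketch of parsing one overview line, assuming the default tab-separated field order from RFC 3977 (article number, subject, from, date, message-id, references, bytes, lines). It is not the code from the gist above, and `Overview` and `parse_overview_line` are hypothetical names.

```python
from collections import namedtuple

Overview = namedtuple(
    "Overview", "number subject sender date message_id references bytes lines"
)


def parse_overview_line(line):
    """Split one tab-separated overview line into its eight standard fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    number, subject, sender, date, msgid, refs, size, lines = fields[:8]
    # Byte and line counts may be empty on some servers; treat that as 0.
    return Overview(int(number), subject, sender, date, msgid, refs,
                    int(size or 0), int(lines or 0))


# Made-up example line; the poster, message-id and sizes are placeholders.
line = ('1234\tSome.Release [01/20] - "file.part01.rar" yEnc (1/50)\t'
        "poster@example.invalid\tThu, 20 Dec 2012 00:14:00 GMT\t"
        "<part1of50.abc@example.invalid>\t\t396000\t3045\r\n")
ov = parse_overview_line(line)
print(ov.subject, ov.message_id, ov.bytes)
```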

longview
Dec 25, 2006

heh.

crm posted:

Can somebody give me a rundown on how Spotweb works? The docs don't really explain what it is.

I've got it installed and had it pull everything, but I'm not exactly sure how it's different than newznab.

Executive Summary: Instead of putting user submitted data in a central database (like Newzbin, Matrix etc.) they post the data back to usenet in a standard format.

SpotWeb checks for new posts in a specific group and adds them to the database. Comments are also implemented by posting them back to usenet.

The protocol is known as SpotNet, if you want to search for more information. The real advantage is that it can be very high quality, and the hardware and bandwidth requirements are much, much lower. The disadvantage is that new posts have to be added by users instead of automatically popping up as soon as they're posted.

Lemons
Jul 18, 2003

So is the whole Dutch thing an issue?

Like are there enough reports for English stuff that it won't be an issue?

longview
Dec 25, 2006

heh.
It looks like usage has increased a lot in the last few weeks, a lot of the content is still Dutch but I have been able to find English content easily enough. Often the content is English but the description is Dutch.

Telex
Feb 11, 2003

Lemons posted:

So is the whole Dutch thing an issue?

Like are there enough reports for English stuff that it won't be an issue?

I personally dislike the Dutch thing, a lot. It makes the comments portion a bit useless, at least right now. The way the results are filed with mixed content makes it tough to browse too. Like "HD" should not be a main category, it should be a subcategory, and that's just unfortunate. Plus I want to filter out any non-English results by default, so maybe newznab is the way to go anyway.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

How big of a problem the Dutch thing is is contingent on what you're going to use it for.

If you're going to use it heavily for browsing around and finding stuff, it's going to be more irritating. If you're going to use it mainly as a search provider for Sickbeard/Headphones/Couchpotato then it's much less of an issue.

Moist von Lipwig
Oct 28, 2006

by FactsAreUseless
Tortured By Flan
SO assuming I have unlimited bandwidth and at least a couple hundred gigs of diskspace but I want to run something locally, is newznab or spotweb the better deal?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Moist von Lipwig posted:

SO assuming I have unlimited bandwidth and at least a couple hundred gigs of diskspace but I want to run something locally, is newznab or spotweb the better deal?

Depends on what you're using it for. Newznab will give you something just like nzbs.org or nzb.su...meaning totally automated nzb generation. SpotWeb will give you human-curated content.

Personally, I'd run both.

TraderStav
May 19, 2006

It feels like I was standing my entire life and I just sat down

Thermopyle posted:

Depends on what you're using it for. Newznab will give you something just like nzbs.org or nzb.su...meaning totally automated nzb generation. SpotWeb will give you human-curated content.

Personally, I'd run both.

Who are the humans that are curating the content if I'm running it locally?

Telex
Feb 11, 2003

TraderStav posted:

Who are the humans that are curating the content if I'm running it locally?

Imagine that you were uploading .nzb's back to usenet and attaching comments.

That's what spotweb is basically doing.

Crackbone
May 23, 2003

Vlaada is my co-pilot.

Is there any way MPAA/etc could shut down the spotweb method? It seems pretty ingenious and I have to imagine it's the logical progression of usenet indexing rather than trying to scurry into these little private indexers.

longview
Dec 25, 2006

heh.
They *might* be able to DMCA the report posts, but that's a little questionable since I don't think the reports themselves are their content, even if they contain information on the location of their content.

More likely is that since automated services like Newznab can match posts to content, they could automatically send DMCA requests based on that. Otherwise watching spotnet wouldn't be a problem and sending takedowns based on the nzbs in the reports would probably be doable.

TraderStav
May 19, 2006

It feels like I was standing my entire life and I just sat down

longview posted:

They *might* be able to DMCA the report posts, but that's a little questionable since I don't think the reports themselves are their content, even if they contain information on the location of their content.

More likely is that since automated services like Newznab can match posts to content, they could automatically send DMCA requests based on that. Otherwise watching spotnet wouldn't be a problem and sending takedowns based on the nzbs in the reports would probably be doable.

So if I'm reading this correctly, I can host these on my HTPC (C2D 2.4ghz ish) running Win7 with several hundred GB of free space (or even my low-powered NAS with even more space??) relatively easily and use it as a replacement for these indexers?

How much technical/server knowledge do I need? If I managed getting sickbeard/sab working just fine in the early days, will I be fine or do I need to set up environments and such?

longview
Dec 25, 2006

heh.
All you need to get either working is some kind of webserver+a database+php, though I recommend using SW as a supplement to a site like nzbs.org since that will pretty much always have more content, being automatic and all.

NewzNab has very detailed guides, SpotWeb has detailed Dutch guides so I'll try to summarize what you need to do here.

I don't have my Newznab setup running but it was pretty simple to get working in Ubuntu Server at least, literally cut and paste commands. Filling a backlog takes forever, it requires lots of disk space and is fairly CPU intensive regexping all the headers to form releases. Since it's pretty much purely header based you can run it for free off an Astraweb block account, headers don't count as usage.

SpotWeb was a bit harder but I set it up in CentOS with nginx as a front-end, php-fpm for CGI and PostgreSQL for a database, it will run about as well with a LAMP or WAMP stack.
All you need is to make a database called spotweb beforehand, preferably with a separate user for it, this is pretty easy to do on the SQL command line.
Download the latest source from the github site, extract it to your www-root.
Open install.php in a browser and it asks for database credentials, walks you through connecting your usenet server, then you make a user and it's done.

Importantly, you should probably have an unlimited account for this. None of the unlimited accounts will let you share an account, but in my case I use the same account for indexing and downloading; since both machines are behind NAT they still count as one external IP. I think SW actually downloads release info and nzbs from usenet on the fly when you open a release. There is an option to pre-fetch, but it literally makes the process about 50x slower, so I set it to fetch full reports, spots and comments, but not to fetch nzbs, images or full comments.

The only other thing needed is to set up a scheduled task/cronjob to run "php (or path to php.exe etc.) (path to)retrieve.php --force" something like once per hour. I add --force so that if a job is killed or exits with a warning, retrieval doesn't stay blocked until a stale .pid file is deleted. Filling takes at most a few hours for SW the first time, a few minutes if you do it often, and my database still fits on an 8 GB partition.

SW runs very well on my VM with 4 cores (Q9300) and 1 GB ram allocated, it's fairly CPU intensive to load the front page or search, but it doesn't use a lot of memory at all.

Telex
Feb 11, 2003

For anyone trying to get Newznab running on a nas4free box, here are the URL rewrite rules you need to put in the webserver's additional config options:

http://pastebin.com/2gKUrMby

Since it doesn't use apache, the .htaccess it comes with is useless, but you can put those rewrite rules in the base config for the webserver and it all works out. (at least so far).

Telex fucked around with this message at 00:14 on Dec 20, 2012

Potassium Problems
Sep 28, 2001

Evil Trout posted:

You're right, those gems are awful. And they don't support the extension commands that make an indexer easier to write. So I rolled my own NNTP class. I'm not fully ready to release my indexer's source code, but here's the NNTP Connection class I used. It's pretty simple but works!

https://gist.github.com/4337577
I did nearly the same thing you did, and I incorporated rolling my own yenc decoding so I could use XZVER/XZHDR commands to pull compressed overview & header info. I'll post the code if you're interested.
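
For reference, the yEnc decoding step itself is small. This is a minimal sketch and not the poster's code: every payload byte is the original byte plus 42 (mod 256), and `=` marks an escaped byte that has had a further 64 added. It does no CRC or size checking, and any decompression that XZVER/XZHDR responses would still need afterwards is left out. `yenc_decode` and `yenc_encode` are hypothetical names; the encoder exists only to round-trip the decoder.

```python
def yenc_decode(data: bytes) -> bytes:
    """Decode yEnc payload lines, skipping =ybegin/=ypart/=yend control lines."""
    out = bytearray()
    for line in data.splitlines():
        if line.startswith((b"=ybegin", b"=ypart", b"=yend")):
            continue  # control lines carry metadata, not payload
        escaped = False
        for byte in line:
            if escaped:
                out.append((byte - 64 - 42) % 256)
                escaped = False
            elif byte == 0x3D:  # '=' introduces an escaped byte
                escaped = True
            else:
                out.append((byte - 42) % 256)
    return bytes(out)


def yenc_encode(data: bytes, line_len: int = 128) -> bytes:
    """Minimal encoder, only here to round-trip the decoder."""
    lines, line = [], bytearray()
    for byte in data:
        enc = (byte + 42) % 256
        if enc in (0x00, 0x0A, 0x0D, 0x3D):  # NUL, LF, CR and '=' must be escaped
            line += bytes((0x3D, (enc + 64) % 256))
        else:
            line.append(enc)
        if len(line) >= line_len:
            lines.append(bytes(line))
            line = bytearray()
    if line:
        lines.append(bytes(line))
    return b"\r\n".join(lines)


payload = bytes(range(256)) * 3
assert yenc_decode(yenc_encode(payload)) == payload
```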

Ashex
Jun 25, 2007

These pipes are cleeeean!!!

Evil Trout posted:


Nope because I was just storing strings in rows in postgres. I'm sure I could do a lot better though. First of all I'd skip any rows that don't have xxx/yyy in them (i.e., that aren't multipart binaries). Then, instead of storing the entire subject every time, I'd store the subject once and a row for each part with just its message id.


I'm not very familiar with postgres (I primarily dealt with DB compression on hadoop/big data stuff) but you may be able to shrink the db size by compressing the tables with snappy/LZO; it's fast enough that you won't see any performance impact when retrieving articles.
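
Independent of compression, the storage scheme described in the quote (keep each subject once, then one small row per part) could look roughly like this. It is a sketch under assumptions: the part counter is taken to be a trailing "(part/total)" in the subject, and `group_parts` and `complete` are hypothetical names, not anything from the indexer being discussed.

```python
import re
from collections import defaultdict

# Matches a trailing "(part/total)" counter, e.g. "(12/50)".
PART_RE = re.compile(r"\((\d+)/(\d+)\)\s*$")


def group_parts(headers):
    """Group (subject, message_id) pairs by subject with the counter removed."""
    files = defaultdict(dict)
    for subject, message_id in headers:
        match = PART_RE.search(subject)
        if match is None:
            continue  # no part counter: not a multipart binary, skip it
        part, total = int(match.group(1)), int(match.group(2))
        base = subject[:match.start()].rstrip()
        files[(base, total)][part] = message_id  # subject stored once, per file
    return files


def complete(files):
    """Keep only files for which every part from 1 to total has been seen."""
    return {key: parts for key, parts in files.items()
            if all(n in parts for n in range(1, key[1] + 1))}


headers = [
    ("file.rar yEnc (1/2)", "<a@example.invalid>"),
    ("file.rar yEnc (2/2)", "<b@example.invalid>"),
    ("hello world", "<c@example.invalid>"),
    ("other.rar yEnc (1/3)", "<d@example.invalid>"),
]
files = group_parts(headers)
print(sorted(complete(files)))
```

In a database this maps to one table of subjects and one narrow table of (subject id, part number, message id) rows, which is where the space saving would come from.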

Telex
Feb 11, 2003

anyone know if it's possible to set up an RSS feed type thing with newznab like you could do back in the day with nzbs.org before they disabled it and nzbmatrix before it died?

I had a few custom searches set up and I really miss them. Totally want to re-do them if I've got free rein to ruin my own CPU.

B-Nasty
May 25, 2005

Telex posted:

anyone know if it's possible to set up an RSS feed type thing with newznab like you could do back in the day with nzbs.org before they disabled it and nzbmatrix before it died?

I had a few custom searches set up and I really miss them. Totally want to re-do them if I've got free rein to ruin my own CPU.

You can still do RSS feeds with nzbs.org and .su through the API interface. My SAB autodownloads from about 30 custom searches in RSS.
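
A saved search then boils down to a URL you hand to your downloader as an RSS feed. This is a minimal sketch, assuming a newznab-style API endpoint at `/api` with the `t=search`, `q`, `cat` and `apikey` parameters; the host, API key and category id below are placeholders, and `search_feed_url` is a hypothetical helper. Check your indexer's API page for what it actually supports.

```python
from urllib.parse import urlencode


def search_feed_url(base_url, api_key, query, categories=None):
    """Build a newznab-style search URL that returns results as an RSS feed."""
    params = {"t": "search", "q": query, "apikey": api_key}
    if categories:
        # Multiple category ids are passed as one comma-separated value.
        params["cat"] = ",".join(str(c) for c in categories)
    return f"{base_url.rstrip('/')}/api?{urlencode(params)}"


# Placeholder host, key and category id.
print(search_feed_url("https://indexer.example", "YOUR_API_KEY",
                      "ubuntu 12.10", [4000]))
```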

Telex
Feb 11, 2003

B-Nasty posted:

You can still do RSS feeds with nzbs.org and .su through the API interface. My SAB autodownloads from about 30 custom searches in RSS.

Cool. Now I need to figure out how in the world to get PHP to show me error messages, because I get nothing but a white page when I try any searches.

I'm starting to think that doing this on nas4free is a stupid idea, so if anyone else has successfully done this any advice would be super. Everything seems to work but the search I guess, but I sorta need that part to work for sabnzbd don't I?

Police Academy 6
Jul 12, 2006

clever

eames
May 9, 2009

Then wouldn’t it make more sense to make a goon-run Spotweb server instead of the project that may or may not exist?
Is it even possible to run Spotweb on a server for, say, 20 concurrent users?

ClassH
Mar 18, 2008
For the people thinking of setting up spotweb, google spotweb easy. It is an automated installer thing that takes about 1 minute to get going (Windows). I rolled up my own newznab server that is working great so far, but I think I will hold off on spotweb till the English community gets rolling behind it.

kiresays
Aug 14, 2012

What's the difference between spotweb/newznab and an indexer site?

ihatepants
Nov 5, 2011

Let the burning of pants commence. These things drive me nuts.



So my 2 month old account was deleted from nzb.su today. I didn't download much on it but I'd been logging on daily the past few weeks just to check. I also didn't have VIP, but it never gave me the option to upgrade.

How difficult is it for someone without much computing knowledge to set up and run spotweb?

ihatepants fucked around with this message at 16:45 on Dec 20, 2012

astr0man
Feb 21, 2007

hollyeo deuroga

kiresays posted:

What's the difference between spotweb/newznab and an indexer site?

Spotweb and newznab are ways to run your own indexer site. A lot of active indexer sites (like nzb.su) are newznab based.

Spotweb is a little bit different because it indexes releases that are manually curated by actual people. My understanding is that basically someone re-uploads a release nzb to a Spotweb-specific newsgroup, and then all of those releases will show up in spotweb.

GhostSeven
Apr 23, 2005

Yesterday Was A Million Years Ago

longview posted:

SpotWeb has the worst search engine though, it seems like it ORs together every word by default, so making a search more detailed usually brings back even more useless results that match one odd word.

I have just fixed that as it was winding me up!

Assuming you are using mysql this is the hacky fix to essentially default to an AND search if you have not specified any other search modifiers such as + - etc.

It also works for newznab api so improves Sickbeard et al.

It is hacky so be warned!

Edit /spotweb-install-dir/lib/dbeng/dbfts_mysql.php

BACK UP THE FILE!

Replace / Comment Out :
code:
if (($searchMode == 'match-natural') || ($searchMode == 'both-match-natural')) {
  /* Natural language mode altijd default in MySQL 5.0 en 5.1, but cannot be explicitly defined in MySQL 5.0 */
  $queryPart = " MATCH(" . $field . ") AGAINST ('" . $this->_db->safe($searchValue) . "')";
  $filterValueSql[] = $queryPart;
} # if

With :
code:
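# Default to an AND search: in boolean mode a leading '+' makes a word required.
# Leave the query alone if the user already used operators such as + or -.
$queryAdj = trim($searchValue);
if (($queryAdj != '') && !preg_match('/(^|\s)[+\-~<>("]/', $queryAdj)) {
  $queryAdj = '+' . preg_replace('/\s+/', ' +', $queryAdj);
}
if (($searchMode == 'match-natural') || ($searchMode == 'both-match-natural')) {
  $queryPart = " MATCH(" . $field . ") AGAINST ('" . $this->_db->safe($queryAdj) . "' IN BOOLEAN MODE)";
  $filterValueSql[] = $queryPart;
} # if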
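
To see what the intended rewrite does to a query without touching a live install, here is the same idea in Python. It is an illustration only, not SpotWeb code: in MySQL boolean mode a word with a leading `+` must be present, while a word with no operator is optional, so an AND-by-default search needs a `+` on every word. `and_by_default` is a hypothetical name.

```python
import re

# A word that already starts with a boolean-mode operator or a quote.
BOOLEAN_OPERATORS = re.compile(r'(^|\s)[+\-~<>("]')


def and_by_default(query):
    """Prefix each word with '+' unless the query already uses operators."""
    query = query.strip()
    if not query or BOOLEAN_OPERATORS.search(query):
        return query  # empty, or the user asked for something specific
    return " ".join("+" + word for word in query.split())


for example in ("linux mint iso", "+linux -mint", '"linux mint"'):
    print(repr(example), "->", repr(and_by_default(example)))
```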

Dude With Pants
May 29, 2010
I think the best alternatives at the moment that have a similar layout to the matrix are nzbs.co.uk and newztown.co.za.

nzbs.org is also good but they don't have top 10 movies, or games, or tv.

GhostSeven
Apr 23, 2005

Yesterday Was A Million Years Ago

Dude With Pants posted:

nzbs.org is also good but they don't have top 10 movies, or games, or tv.

?? nzbs.org has games both console and pc / mac, TV and movies...

GhostSeven fucked around with this message at 18:01 on Dec 20, 2012

Evil Trout
Nov 16, 2004

The evilest trout of them all

Lone_Strider posted:

I did near the same thing you did, and I incorporated rolling my own yenc decoding so I could use XZVER/XZHDR commands to pull compressed overview & header info. I'll post the code if you're interested

Would absolutely love this!

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

eames posted:

Then wouldn’t it make more sense to make a goon-run Spotweb server instead of the project that may or may not exist?

More sense in what way? Do you mean just for people who can't be arsed to install it themselves?

eames posted:

Is it even possible to run Spotweb on a server for, say, 20 concurrent users?

Sure. I've got several close friends using mine.

GhostSeven posted:

I have just fixed that as it was winding me up!

Oh, nice. Thanks. I was dreading looking through the code to fix this myself. You might want to send them a pull request on github.

Thermopyle fucked around with this message at 18:41 on Dec 20, 2012

Shy
Mar 20, 2010

What's a good software to browse text? Specifically big groups, Thunderbird seemed too clumsy.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Shy posted:

What's a good software to browse text? Specifically big groups, Thunderbird seemed too clumsy.

Google Groups
