  • Locked thread
Cryolite
Oct 2, 2006
sodium aluminum fluoride
I'm building some Twitter bots in Scala and thinking of trying to use akka for pub/sub so I only have one app sampling the Twitter stream publishing it to multiple subscribers of that stream. Is this a bad idea or should I use a real message queue?

I have a few ideas for Twitter bots and they mostly all involve consuming a stream of tweets. I’m halfway towards implementing one of them, but right now I have a single Scala app that both samples the Twitter stream and then processes/saves tweets. If I eventually get to the point of multiple bots I’d like to separate things out so I have only a single app that samples tweets and publishes them onto a queue of some kind, and then each app subscribes to that queue and acts on each tweet in its own way.

I’ve looked at zeromq and a few other message queues but it looks like akka might be really simple to use for this kind of thing. I don’t care if everything has to be written in Scala/Java, so I don't care if using something like zeromq means I can write the publisher in Java and subscribers in Python or something like that. It looks like I could publish tweets using an akka Event Bus and each bot would consume the event stream. This is all going to be on a single machine for now – some cheap Digital Ocean instance maybe – so I don’t need any of the distributed-ness that I think akka could provide. At least not for a while.

Has anyone used akka in this way? Is akka a bad fit for something like this? Should I stick to an actual queue like zeromq? Or if it’s OK that everything is in Scala would it actually be really simple to use akka for this?


Steve French
Sep 8, 2003

I can't really speak to much regarding Akka, as I have limited experience with it.

However, consuming Twitter streams is probably the most common example project that I've seen for Storm, so maybe take a look at that.

Sedro
Dec 31, 2008
Do you need persistence? Reliable delivery? Throttling/backpressure in case of slow consumers? If you don't need the features of a message queue product, don't use one. You can always add it later.

There's an example project using akka's distributed pub/sub and sources on github.
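
For the "no persistence, no reliable delivery, no backpressure" case, the idea can be sketched without any library at all. This is not Akka and not the linked example project: it's a minimal in-process publish/subscribe sketch using only the standard library, and `Tweet`, `TweetBus`, and the two handlers are hypothetical names made up for illustration. It only works inside a single JVM, so separate bot processes would still need something like Akka's distributed pub/sub or a real broker.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical message type; the thread does not define one.
final case class Tweet(user: String, text: String)

// Hypothetical in-process bus: every subscriber sees every published tweet.
final class TweetBus {
  private val subscribers = ListBuffer.empty[Tweet => Unit]

  def subscribe(handler: Tweet => Unit): Unit =
    synchronized { subscribers += handler }

  def publish(tweet: Tweet): Unit = {
    // Take a snapshot under the lock so handlers run without holding it.
    val snapshot = synchronized { subscribers.toList }
    snapshot.foreach(handler => handler(tweet))
  }
}

val bus = new TweetBus
val saved = ListBuffer.empty[String]
val shouted = ListBuffer.empty[String]

// Two "bots" consuming the same stream, each in its own way.
bus.subscribe(t => saved += t.text)
bus.subscribe(t => shouted += t.text.toUpperCase)

bus.publish(Tweet("cryolite", "hello goons"))
```

A slow handler blocks the publisher here, which is exactly the sort of thing a queue product would handle for you.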

Hughlander
May 11, 2005

Not sure about the Scala component but that also sounds like a prime Kafka use case.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Hughlander posted:

Not sure about the Scala component but that also sounds like a prime Kafka use case.

Kafka's the queue, not a language in which to write the consumer. If the goal is to be scalable at some point, I'd vote Spark Streaming as the consumer framework.

Cryolite
Oct 2, 2006
sodium aluminum fluoride

Sedro posted:

Do you need persistence? Reliable delivery? Throttling/backpressure in case of slow consumers? If you don't need the features of a message queue product, don't use one. You can always add it later.

There's an example project using akka's distributed pub/sub and sources on github.

Thanks. I don't need those things, and that looks like a good starting point for what I need. I think I had looked at this specific example before but after looking a little more closely after your suggestion I think it'll work out.

I spent a few hours trying to get it working just now but can't seem to get my actors talking to each other across JVMs. Very frustrating. I oscillate back and forth between thinking Scala is great and thinking it's not for me because I'm a loving idiot who can't get anything working. I'll have to put this aside and try more later once I have a better understanding of akka. I'm mostly copying and pasting incantations without knowing what's going on.

sink
Sep 10, 2005

gerby gerb gerb in my mouf
how embarrassing for them: https://www.lightbend.com/blog/typesafe-changes-name-to-lightbend

i understand this is about surviving as a company and thus catering to enterprise clients who are Java first, but stepping back from Scala is a little disheartening. the copy on that lovely website has also become completely impenetrable

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

sink posted:

how embarrassing for them: https://www.lightbend.com/blog/typesafe-changes-name-to-lightbend

i understand this is about surviving as a company and thus catering to enterprise clients who are Java first, but stepping back from Scala is a little disheartening. the copy on that lovely website has also become completely impenetrable

This was always going to be a problem for them. The companies that hire consulting firms to build bespoke business software for them are not the ones pushing the adoption of new technologies.

canned from the band
Sep 13, 2007

I'm a man of intensity. Of cool, and youth, and passionately
I've just recently inherited a project in scala, what are the best resources to get going nowadays? I did a bit of the twitter scala school.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Scala School from Twitter is great, but getting quite dated now. There's a scala for java programmers tutorial on the scala-lang web site that's pretty good even if you aren't a java programmer.

WINNINGHARD
Oct 4, 2014

If you're interested in messing around with apache spark and scala, databricks has a free community edition.

https://databricks.com/try-databricks

Unfortunately, databricks configures the spark cluster to their own specs, so optimizing lower level stuff is out of the question :(

I use it at work and I wish we'd bit the bullet and configured it ourselves. The shell access is a little borked and exploring the filesystem is a pain too. That being said, getting a handle on spark pays dividends - it's really fascinating all the things you can do.

Does anyone here have ideas for scala projects for learning the language better? I do webdev and ETLs at work and I really did not like the Play framework.

Steve French
Sep 8, 2003

WINNINGHARD posted:

If you're interested in messing around with apache spark and scala, databricks has a free community edition.

https://databricks.com/try-databricks

Unfortunately, databricks configures the spark cluster to their own specs, so optimizing lower level stuff is out of the question :(

I use it at work and I wish we'd bit the bullet and configured it ourselves. The shell access is a little borked and exploring the filesystem is a pain too. That being said, getting a handle on spark pays dividends - it's really fascinating all the things you can do.

Does anyone here have ideas for scala projects for learning the language better? I do webdev and ETLs at work and I really did not like the Play framework.

If you didn't like Play but would like to do app/service development in Scala, I'd take a look at the Twitter stack stuff. I've been working with it for just about 3 years now and have been pretty happy with it. Check out finagle/finatra.

sink
Sep 10, 2005

gerby gerb gerb in my mouf
Akka-HTTP is superior to Play. It's not great, but at least it's a true library that makes reasonable design decisions so that you can build anything you need on top of it.

There's some stuff in it that is insanely obtuse and I blame the influence of akka-streams, which I have mixed feelings about.

I like the idea of typed actors with backpressure and a bunch of prebuilt shapes. The Akka streams design just seems obtuse.

the talent deficit
Dec 20, 2003

self-deprecation is a very british trait, and problems can arise when the british attempt to do so with a foreign culture





akka http is technically solid but has a terrible api. getting a req body as a stream sucks and is made way worse by the inability to return a stream or future. if i have to Await on it anyways just give me the body as a chunk of bytes

InAndOutBrennan
Dec 11, 2008
We're using Spark at work, and though I usually write Spark jobs in Java (where I'm decently competent) I thought I'd play around a bit with Scala, because it seems really nice. I'm very new and have dived right in without reading too much, so I'm happily shooting myself in the foot every day. I do have some grounding in functional programming though.

Say I have an Iterator[Tuple2[String, String]] and I want to get to a Map[String, TreeSet[String]] where every ._1 (first item in the tuple) is the key and the set for every key contains every ._2 for that key.

Best I came up with is, pseudocodish:
code:
Iterator[Tuple2[String, String]]
	.toList // Iterator has no group by
	.groupBy(t => t._1) // Now we have a Map[String, List[Tuple2[String, String]]]
	.mapValues(tupleList => tupleList.map(t => t._2)) // Now we have a Map[String, List[String]]
	.mapValues(stringList => new TreeSet[String] ++ stringList) // Create a new TreeSet and add everything from the list to it
For code that actually runs, here's an example (starting with a List, though; call a.iterator first to get the exact starting point I have):
code:
import scala.collection.immutable.TreeSet

val a = List(Tuple2("a", "1"), Tuple2("a", "2"), Tuple2("b", "1"), Tuple2("b", "2"), Tuple2("b", "3"))
val b = a.groupBy(t => t._1).mapValues(v => v.map(v => v._2)).mapValues(v => TreeSet.empty[String] ++ v)
Which works, but is horribly slow compared to bringing in a java.util.HashMap and java.util.TreeSet and doing it much less elegantly. For comparison, the Scala approach hadn't finished a single task out of ~200 after 30 minutes, while the Java-based approach finished in about 6 minutes.

So I'm guessing I'm missing something obvious here and I'm creating way too many objects/maps/lists or something.

Cabbage Disrespect
Apr 24, 2009

ROBUST COMBAT
Leonard Riflepiss
Soiled Meat

InAndOutBrennan posted:

Best I came up with is, pseudocodish:
code:
Iterator[Tuple2[String, String]]
	.toList // Iterator has no group by
	.groupBy(t => t._1) // Now we have a Map[String, List[Tuple2[String, String]]]
	.mapValues(tupleList => tupleList.map(t => t._2)) // Now we have a Map[String, List[String]]
	.mapValues(stringList => new TreeSet[String] ++ stringList) // Create a new TreeSet and add everything from the list to it
For code that actually runs, here's an example (starting with a List, though; call a.iterator first to get the exact starting point I have):
code:
import scala.collection.immutable.TreeSet

val a = List(Tuple2("a", "1"), Tuple2("a", "2"), Tuple2("b", "1"), Tuple2("b", "2"), Tuple2("b", "3"))
val b = a.groupBy(t => t._1).mapValues(v => v.map(v => v._2)).mapValues(v => TreeSet.empty[String] ++ v)
Which works, but is horribly slow compared to bringing in a java.util.HashMap and java.util.TreeSet and doing it much less elegantly. For comparison, the Scala approach hadn't finished a single task out of ~200 after 30 minutes, while the Java-based approach finished in about 6 minutes.

So I'm guessing I'm missing something obvious here and I'm creating way too many objects/maps/lists or something.

You can:
code:
val brennansMap = a.foldLeft(Map.empty[String, TreeSet[String]].withDefaultValue(TreeSet.empty[String])) { build yo map up by matching in here }
but I make no guarantees about that being faster (and am on my phone anyway, so in my goony laziness I've decided that checking it is too hard).

For the general pattern of "gee I have to transform my collection into these intermediate collections and I wish that I didn't", check out collection.breakOut (but as you're doing that to get groupBy here, it's not really helpful, just good to know). When you're doing really performance-critical stuff, sometimes the easiest option is to bite the bullet, do the disgusting mutable thing, then wrap it in immutable functional ivory towers so that nobody else has to see what you've done.
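
The "do the disgusting mutable thing, then wrap it" advice could look something like the sketch below. It's a guess at the shape, not anything benchmarked against the Spark job in question; `groupIntoTreeSets` is a hypothetical helper name, and the input type is taken from the question above.

```scala
import scala.collection.immutable.TreeSet
import scala.collection.mutable

// Hypothetical helper: groups each pair's second element under its first element
// in a single pass over the iterator, with no intermediate List or Map.
def groupIntoTreeSets(pairs: Iterator[(String, String)]): Map[String, TreeSet[String]] = {
  val acc = mutable.HashMap.empty[String, mutable.TreeSet[String]]
  for ((key, value) <- pairs)
    acc.getOrElseUpdate(key, mutable.TreeSet.empty[String]) += value
  // Freeze everything into immutable collections so callers never see the mutable state.
  acc.iterator.map { case (key, values) => key -> (TreeSet.empty[String] ++ values) }.toMap
}

val grouped = groupIntoTreeSets(Iterator("a" -> "2", "a" -> "1", "b" -> "1"))
```

One possible contributor to the slowness of the groupBy version: in Scala 2.11 and 2.12, `mapValues` returns a lazy view that re-applies its function on every lookup, so the chained `mapValues` calls can end up rebuilding the TreeSet each time a value is read.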

InAndOutBrennan
Dec 11, 2008

Thanks!

I'm having some trouble with the "build yo map up by matching in here" part, but you've pointed me in a couple of interesting directions. breakOut seems very interesting if I can get my head around it.

Edit
What I ended up with:
code:
a.iterator.foldLeft(Map[String, TreeSet[String]]().withDefaultValue(TreeSet[String]().empty))((m, s) => m + (s._1 -> (m(s._1) + s._2)))
Initial tests show this runs reasonably fast (10ish minutes), though I haven't been able to compare the two head to head yet. Need an empty cluster for that. Wonder what makes the huge difference, but thanks again.

InAndOutBrennan fucked around with this message at 14:06 on Jun 16, 2016

fritz
Jul 26, 2003

I've started trying to pick up scala. What's the praxis for choosing

code:
    val filename: String = getClass.getResource("/sample.html").getFile
vs

code:
  val filename: String = getClass getResource "/sample.html" getFile
? (The latter gives warnings unless I "import scala.language.postfixOps")

kugutsu
Dec 31, 2008
Never use postfix ops (which is what calling getFile there without the dot is). They're ambiguous and don't really improve readability in any circumstance. If you want to use a trailing method call with infix invocation, you can group the infix part with parentheses.

code:
(listA ++ listB).tail
For infix notation in general, I would recommend against it except for operators or when using higher order functions. It's fashionable (and recommended by the style guide) to chain higher order functions like so:

code:
things filter (_.someCondition) map (_.toOtherThing)
The reasons they give for this are dubious, but it's pretty popular. From what I've seen, overuse of infix method invocation makes scala code difficult to read, so I tend to be pretty conservative about using it.

Sedro
Dec 31, 2008
I would avoid infix notation in general and use it only for simple binary expressions where it improves readability:
code:
val myTuple = "foo".->("bar")
val myTuple = "foo" -> "bar" // better

if (myList.contains(3)) { ... }
if (myList contains 3) { ... } // better

if (a.==(b)) { ... }
if (a == b) { ... } // better

if (!myList.contains(3)) { ... }
if (!(myList contains 3)) { ... } // better?
I would also go against the style guide in that map/filter example. I find that infix notation makes the code harder to edit. If you want to break the expression onto multiple lines you need to surround it with parentheses:
code:
val result = (things
  filter (_.someCondition) 
  map (_.toOtherThing))

val filename: String = (getClass
  getResource "/sample.html"
  getFile)

fritz
Jul 26, 2003

Cool, thanks all.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
I would have:

code:
things
  .filter(_.someCondition)
  .map(_.toOtherThing)
Save infix for where it really adds clarity such as operators or DSLs.
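
To make the comparison concrete, here's a small runnable sketch of both styles side by side. `Thing` and its members are hypothetical stand-ins for the `things`, `someCondition`, and `toOtherThing` placeholders used in the posts above; both forms should produce the same list.

```scala
// Hypothetical stand-in for the element type used in the examples above.
final case class Thing(n: Int) {
  def someCondition: Boolean = n % 2 == 0
  def toOtherThing: String = s"thing-$n"
}

val things = List(Thing(1), Thing(2), Thing(3), Thing(4))

// Dot notation: breaks across lines without any wrapping parentheses.
val dotted = things
  .filter(_.someCondition)
  .map(_.toOtherThing)

// Infix notation: the whole expression has to be parenthesized to span lines.
val infixed = (things
  filter (_.someCondition)
  map (_.toOtherThing))
```

The dot version also lets you comment out or reorder a single step without touching the surrounding parentheses.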

Hypnobeard
Sep 15, 2004

Obey the Beard



I've got this code:

code:
def fun(q: List[List[Elem]]): List[String] = {
  if (q.isEmpty) List(Nil)
  else {
    for {
      foo <- b            // foo: List[Elem], b: List[List[Elem]]
      bar <- d(a)         // bar: String, d: Map[List[Elem], String]
      baz <- fun(q - foo) // baz: List[String]
    } yield bar :: baz
  }
}
This consistently returns just List().

I'm not sure why the for comprehension isn't actually yielding what I think it should be yielding. Any suggestions for what to look at? What am I missing?

mutantmell
Oct 14, 2012

Hypnobeard posted:

I'm not sure why the for comprehension isn't actually yielding what I think it should be yielding. Any suggestions for what to look at? What am I missing?

It's hard to tell what exactly is going on here without knowing what b, d, a, and fun are.

That said, if this for comprehension always returns List(), then one of b, d(a), or fun(...) must be returning List() as well; since there are no elements in that list, it doesn't have anything to yield.

Hypnobeard
Sep 15, 2004

Obey the Beard



mutantmell posted:

It's hard to tell what exactly is going on here without knowing what b, d, a, and fun are.

That said, if this for comprehension always returns List(), then one of b, d(a), or fun(...) must be returning List() as well; since there are no elements in that list, it doesn't have anything to yield.

Well, that's why I included the types for those in the code sample. The first two elements all produce something when tried independently, so there's something off in the recursion, but I'm not sure where the flat List() is coming from--I'd expect to see List(List()) if there was no data. Debugging in the scala ide isn't helping, either.

mutantmell
Oct 14, 2012

Hypnobeard posted:

Well, that's why I included the types for those in the code sample. The first two elements all produce something when tried independently, so there's something off in the recursion, but I'm not sure where the flat List() is coming from--I'd expect to see List(List()) if there was no data. Debugging in the scala ide isn't helping, either.

If any of the lists on the right of the '<-' is empty, then the result will be 'List()'. The for-comprehension can be read as "for each element foo in b, for each element bar in d(a), for each element baz in fun(q - foo), transform it into bar :: baz". If any of those lists is empty, there is nothing to iterate over, so nothing gets yielded.

It may help the debugger if you manually do the de-sugaring of the for comprehension. I'll put in a spoiler in case you want to do it yourself:


b.flatMap { foo =>
  d(a).flatMap { bar =>
    fun(q - foo).map { baz =>
      bar :: baz
    }
  }
}
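
A self-contained illustration of the same point, with made-up lists rather than the `b`, `d`, and `fun` from the question: one empty generator empties the whole result, and `List(Nil)` behaves differently from `Nil` because it contains one element.

```scala
val letters = List("a", "b")
val numbers = List(1, 2)

// Two non-empty generators: one result per (letter, number) pair.
val pairs = for {
  l <- letters
  n <- numbers
} yield s"$l$n"

// The same thing written with flatMap/map, which is what the for-comprehension expands to.
val desugared = letters.flatMap { l => numbers.map { n => s"$l$n" } }

// One empty generator anywhere means the innermost body never runs.
val emptied = for {
  l <- letters
  n <- List.empty[Int]
} yield s"$l$n"

// List(Nil) has one element (an empty list), so the body runs once...
val oneEmptyTail = for { rest <- List(List.empty[String]) } yield "x" :: rest
// ...whereas Nil has no elements, so nothing is yielded at all.
val noTails = for { rest <- List.empty[List[String]] } yield "x" :: rest
```

So a recursive call that returns `List()` instead of `List(List())` at the bottom wipes out every result above it.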

Hypnobeard
Sep 15, 2004

Obey the Beard



mutantmell posted:

If any of the lists on the right of the '<-' is empty, then the result will be 'List()'. The for-comprehension can be read as "for each element foo in b, for each element bar in d(a), for each element baz in fun(q - foo), transform it into bar :: baz". If any of those lists is empty, there is nothing to iterate over, so nothing gets yielded.

It may help the debugger if you manually do the de-sugaring of the for comprehension. I'll put in a spoiler in case you want to do it yourself:


b.flatMap { foo =>
  d(a).flatMap { bar =>
    fun(q - foo).map { baz =>
      bar :: baz
    }
  }
}


Thanks! I'll try this out today.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Is baz a List[String] or is fun(q - foo) a List[String]?

Hypnobeard
Sep 15, 2004

Obey the Beard



KernelSlanders posted:

Is baz a List[String] or is fun(q - foo) a List[String]?

Turned out the method I was using to calculate (q - foo) was the culprit--fixing that resolved everything else.

krnhotwings
May 7, 2009
Grimey Drawer
Alrighty, I'm trying to gently caress around with generics and reflection, via TypeTag, to try and retain type information at runtime:

http://pastebin.com/Cj6z2Ez4
(It's best to download this file and feed it into the 'scala' command. If you copy and paste it into the REPL, it behaves differently.. I'm on 2.11.8)

As commented, line 11 loses type information because of the generic type declaration in line 6. If I remove that declaration everything works as expected, but for various reasons, let's say that I have to have it there and that I can't manually declare types on line 11 (because realistically, I wouldn't know what kind of type I'd be getting.)

Is there some way for me to use the underscore/wildcard type in conjunction with TypeTag to keep the type information down that chain of calls?
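
Without seeing the pastebin, one general pattern that may apply is to capture the tag at the point where the static type is still known and carry it alongside the value, so it survives a later wildcard. The sketch below swaps in `ClassTag` (standard library, erased class only) rather than `TypeTag`, and `Tagged` and `describe` are hypothetical names; none of this is taken from the linked code.

```scala
import scala.reflect.ClassTag

// Hypothetical wrapper: the implicit tag is resolved at construction time,
// while the compiler still knows what A is, and stored as a field.
final case class Tagged[A](value: A)(implicit val tag: ClassTag[A])

// The wildcard hides A from the compiler, but the stored tag is still there at runtime.
def describe(t: Tagged[_]): String = t.tag.runtimeClass.getName

val text = Tagged("hello")
val nums = Tagged(List(1, 2, 3))

val items: List[Tagged[_]] = List(text, nums)
val names = items.map(describe)
```

A `ClassTag` only remembers the erased class, so the `List[Int]` above comes back as plain `List`. On Scala 2 with scala-reflect available, a `TypeTag` can be stored in the same position to keep the type arguments as well.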

krnhotwings fucked around with this message at 06:21 on Aug 25, 2016

Good Will Hrunting
Oct 8, 2012

I changed my mind.
I'm not sorry.
So I have a list of tuples that looks like (a, (b, x)). I want as a result a map of a to (b, x) where x is the greatest x for that given a and b combo. Basically, if the x I'm examining is greater than the greatest x so far for that a as well as that b, update the result map with the a, (b, x) combo. I have legitimately no idea how to approach this. It seems like I should be able to do this with foldLeft, am I crazy?

Sedro
Dec 31, 2008

Good Will Hrunting posted:

So I have a list of tuples that looks like (a, (b, x)). I want as a result a map of a to (b, x) where x is the greatest x for that given a and b combo. Basically, if the x I'm examining is greater than the greatest x so far for that a as well as that b, update the result map with the a, (b, x) combo. I have legitimately no idea how to approach this. It seems like I should be able to do this with foldLeft, am I crazy?

foldLeft is a good approach. You can use a map as an accumulator, keyed by (a, b) tuples. You would start with an empty map, and compare the value of X for each element in your list. If X is greater, add it to the map, otherwise continue.

code:
val list: List[(A, (B, X))] = ???

list.foldLeft(Map.empty[(A, B), X]) { case (accum, (a, (b, x))) =>
  if (accum.get((a, b)).forall(_ < x)) {
    // found a greater value for X
    accum + ((a, b) -> x)
  } else {
    accum
  }
}
This will get you a map of (A, B) -> X, which is not quite what you asked for. Since you want a map of A -> (B, X) you will need to decide how to resolve a conflict where two (A, B) pairs have the same A and a different B.

Good Will Hrunting
Oct 8, 2012

I changed my mind.
I'm not sorry.

Thanks for this. Basically, my input is something like [(A, (B, X)), (A, (C, X)), (A, (D, X)), (A, (C, Y)), (A, (C, Z))] and so on. It's a mapping of an Id to tuples containing a value and a timestamp for that value. I have no choice as to how that's represented, but I do have some leeway over how I process it and reach my desired result of Map[A, (C, maximum of X, Y, Z)].

This is one of the harder things I've done in Scala so far (I'm using Spark) but it's really neat and I'm enjoying it so far.
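
Assuming the goal is "for each id, keep the (value, timestamp) pair with the latest timestamp", the fold can be keyed by the id alone. The concrete types and sample data below are made up for illustration; ties on timestamp keep whichever pair was seen first.

```scala
// Hypothetical concrete types: id -> (value, timestamp).
val readings: List[(String, (String, Long))] = List(
  ("id1", ("B", 10L)),
  ("id1", ("C", 30L)),
  ("id1", ("D", 20L)),
  ("id2", ("C", 5L))
)

// Keep, for each id, the (value, timestamp) pair with the greatest timestamp.
val latest: Map[String, (String, Long)] =
  readings.foldLeft(Map.empty[String, (String, Long)]) {
    case (acc, (id, candidate @ (_, ts))) =>
      // forall is true when the id is not in the map yet, so the first pair always goes in.
      if (acc.get(id).forall { case (_, bestTs) => bestTs < ts }) acc + (id -> candidate)
      else acc
  }
```

Since this is running on Spark, the same "pick the pair with the larger timestamp" comparison could probably be written as a two-argument merge function and handed to `reduceByKey` on a pair RDD instead of folding a local list.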


Ganondork
Dec 26, 2012

Ganondork
On my tablet at the moment, and don't have my laptop handy, but you could try something like:

code:
(a, (b,c)).zipped.toMap
