  • Locked thread
Analytic Engine
May 18, 2009

not the analytical engine
Is there enough interest here for a dedicated Data Visualization thread?
EDIT: Yes and here it is. PM me if you want your message edited or removed.


Stuff that isn't Software

Data Stories is the best (and possibly only) dedicated data vis podcast.

Visualising Data is an insanely dense collection of project links with interesting commentary.
Check out their yearly reviews:
January to June 2011, July to December 2011
January to June 2012, July to December 2012
January to June 2013, July to December 2013
January to June 2014

Data-vis-jobs is a relatively small Google group that attracts great job/freelance/fellowship opportunities.

The work of designer/statistician Edward Tufte is regularly hyped. I can recommend his first book.

ItBurns posted:

I'll second Tufte's books, not that they need it. I have the one in the OP next to me right now. If nothing else they're beautifully illustrated and you can put them on your coffee table.

MrMoo posted:

I like to think Reuters Graphics has awesome charting for all its TV, magazine and digital work, but I'm not aware of any of it, and certainly, working at Reuters, we only have access to F/OSS solutions like D3 :shrug:

Reuters, Bloomberg, and The Times all have job openings for data visualisation experts to munge JavaScript, HTML, and CSS for politics coverage, markets, and anything else.

Software

D3.js is a powerful web framework for interactive visualizations using JavaScript/HTML5/CSS.
It covers almost all of jQuery's functionality, though the jQuery thread suggests learning D3 second and warns that D3 has more browser restrictions.
This book is a very approachable D3/web design guide hosted for free. It was so good I bought the Kindle version.
bl.ocksplorer.org is a great search engine for D3 functions that crawls the site where people host their D3 projects.
These two videos give a good intro to D3. The first is from a design perspective. The second goes into detail on the point of D3 and is by a guy that absolutely has to be a Goon.

MrMoo posted:

The spiffiest software for data visualisation is Tableau with hosted and deployed models.

For some lame reason a competitor, Datawatch, exists; there is a video attempting to explain its advantages, generally around RAD or "exploratory analytics", which is hilarious as the former CTO announced they don't support that.

Lumpy posted:

Bokeh: http://bokeh.pydata.org is a data vis library that at least one goon works on.

ItBurns posted:

I'm using a couple things not mentioned. The first is Highcharts/Highstock. It seems to be super common as I've noticed it on a ton of websites in lieu of static images of simpler bar/line charts. In some ways it's a less flexible D3, but it's still highly customizable and the built-in types cover a lot of ground with new stuff being added pretty regularly.

http://www.highcharts.com/
Example: http://www.gw2spidy.com/item/46741

Some other options for people who are accustomed to R might be Shiny or ggVis. Neither are very mature, but they're good for simple stuff if you aren't a web designer as they take the guesswork out of making things presentable. ggVis is really similar to ggPlot if you've used that package.

http://shiny.rstudio.com/
http://ggvis.rstudio.com/

Leaflet is a mapping program similar to what you might get from Google maps, or exactly what you'd get on Craigslist (because they use it). Setting it up is really simple, but the actual business of mapping data is probably where it's going to get tricky for some people if you aren't familiar with projections. On the other hand, if you've ever said 'gently caress shape files forever' then this is going to be a breath of fresh air.

http://leafletjs.com/

Open Questions

Ahz posted:

I'm working on integrating data viz into my app and was thinking about the current climate for client-side processing vs. server side.

I can order and group my data as needed server-side. But I'm wondering if it's worth it to save some processing on my end and put it on the client, since most clients these days can handle the calcs (averaging datasets of maybe 10,000 rows by 5-10 columns). But then I'm also increasing my transfer, so I just don't know.

Also, are some types of visualizations more/better suited to mobile vs. desktop?

ItBurns posted:

I'm most curious about what people are doing to generate/serve data. Being an R guy I'm doing 99% of my analytics in R and producing JSON/CSV files at regular intervals. It's not real-time and it's a real pain for anything where the user has a lot of freedom.

Plugs

Post your website or portfolio and show off to other Goons

Analytic Engine fucked around with this message at 05:35 on Aug 20, 2014


YO MAMA HEAD
Sep 11, 2007

I guess we'll find out!

Ahz
Jun 17, 2001
PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!
I'd be interested in some good resources.

Analytic Engine
May 18, 2009

not the analytical engine

Ahz posted:

I'd be interested in some good resources.

Edit: I added stuff to the OP.

Analytic Engine fucked around with this message at 07:09 on Aug 14, 2014

Ahz
Jun 17, 2001
PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!
I'm working on integrating data viz into my app and was thinking about the current climate for client-side processing vs. server side.

I can order and group my data as needed server-side. But I'm wondering if it's worth it to save some processing on my end and put it on the client, since most clients these days can handle the calcs (averaging datasets of maybe 10,000 rows by 5-10 columns). But then I'm also increasing my transfer, so I just don't know.

Also, are some types of visualizations more/better suited to mobile vs. desktop?

Analytic Engine
May 18, 2009

not the analytical engine

Ahz posted:

I'm working on integrating data viz into my app and was thinking about the current climate for client-side processing vs. server side.

I can order and group my data as needed server-side. But I'm wondering if it's worth it to save some processing on my end and put it on the client, since most clients these days can handle the calcs (averaging datasets of maybe 10,000 rows by 5-10 columns). But then I'm also increasing my transfer, so I just don't know.

Also, are some types of visualizations more/better suited to mobile vs. desktop?

These are great questions and I don't know the answers, hopefully more Goons will check out the thread.

I've found interactivity to be harder on mobile since there's no hovering and finger gestures already navigate the browser.
I had a fiddly context-sensitive menu that was unusable on mobile so I added a separate set of big touchable buttons.
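For a rough feel of the tradeoff Ahz describes, here is a hypothetical, stdlib-only sketch comparing the JSON payload for 10,000 raw rows against the same data pre-aggregated server-side. The dataset and column names are entirely made up.

```python
import json
import random
from collections import defaultdict

# Hypothetical dataset: 10,000 rows x a few columns, as in the question.
random.seed(0)
rows = [
    {"region": random.choice("NESW"), "month": m % 12, "value": random.random()}
    for m in range(10_000)
]

# Server-side option: group and average before sending anything.
sums = defaultdict(lambda: [0.0, 0])
for r in rows:
    key = (r["region"], r["month"])
    sums[key][0] += r["value"]
    sums[key][1] += 1
aggregated = [
    {"region": reg, "month": mo, "avg": s / n}
    for (reg, mo), (s, n) in sorted(sums.items())
]

raw_bytes = len(json.dumps(rows))         # payload if the client does the math
agg_bytes = len(json.dumps(aggregated))   # payload if the server does it

print(f"raw: {raw_bytes} bytes, aggregated: {agg_bytes} bytes")
```

In this toy case the aggregated payload is a small fraction of the raw one; at ~10k rows either side can do the arithmetic cheaply, so transfer size (and whether the client needs the raw rows for interactivity) usually decides it.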

MrMoo
Sep 14, 2000

The spiffiest software for data visualisation is Tableau with hosted and deployed models.

For some lame reason a competitor, Datawatch, exists; there is a video attempting to explain its advantages, generally around RAD or "exploratory analytics", which is hilarious as the former CTO announced they don't support that.

I like to think Reuters Graphics has awesome charting for all its TV, magazine and digital work, but I'm not aware of any of it, and certainly, working at Reuters, we only have access to F/OSS solutions like D3 :shrug:

Reuters, Bloomberg, and The Times all have job openings for data visualisation experts to munge JavaScript, HTML, and CSS for politics coverage, markets, and anything else.

MrMoo fucked around with this message at 00:10 on Aug 16, 2014

Lumpy
Apr 26, 2002

La! La! La! Laaaa!



College Slice
Bokeh: http://bokeh.pydata.org is a data vis library that at least one goon works on. Probably OP worthy.

the
Jul 18, 2004

by Cowcaster
I love visualizations.

However, my knowledge extends no further than using Python (pandas, seaborn, etc.), and I'm not very good at it.

Would love to learn more in this thread.

ItBurns
Jul 24, 2007
I'm using a couple things not mentioned. The first is Highcharts/Highstock. It seems to be super common as I've noticed it on a ton of websites in lieu of static images of simpler bar/line charts. In some ways it's a less flexible D3, but it's still highly customizable and the built-in types cover a lot of ground with new stuff being added pretty regularly.

http://www.highcharts.com/
Example: http://www.gw2spidy.com/item/46741

Some other options for people who are accustomed to R might be Shiny or ggVis. Neither are very mature, but they're good for simple stuff if you aren't a web designer as they take the guesswork out of making things presentable. ggVis is really similar to ggPlot if you've used that package.

http://shiny.rstudio.com/
http://ggvis.rstudio.com/

Leaflet is a mapping program similar to what you might get from Google maps, or exactly what you'd get on Craigslist (because they use it). Setting it up is really simple, but the actual business of mapping data is probably where it's going to get tricky for some people if you aren't familiar with projections. On the other hand, if you've ever said 'gently caress shape files forever' then this is going to be a breath of fresh air.

http://leafletjs.com/


I'm most curious about what people are doing to generate/serve data. Being an R guy I'm doing 99% of my analytics in R and producing JSON/CSV files at regular intervals. It's not real-time and it's a real pain for anything where the user has a lot of freedom.

Edit: I'll second Tufte's books, not that they need it. I have the one in the OP next to me right now. If nothing else they're beautifully illustrated and you can put them on your coffee table.

ItBurns fucked around with this message at 21:44 on Aug 16, 2014

mortarr
Apr 28, 2005

frozen meat at high speed
Good idea for a thread!

quote:

Post your website or portfolio and show off to other Goons

I've got nothing public to show, but I just threw this together as a tech demo for management last week. It's an example of call data for water-related faults logged over an 18-month period, to show how we might visualise non-financial data. The slider at the bottom changes the selected month, and mouse-over changes the info up top.

Tools are pure js + telerik/kendo ui slider and treemap components, which were fairly easy to get going.

YO MAMA HEAD
Sep 11, 2007

This is only half as impressive without sound, but I've been developing an annotation/basic audio editing tool and I use d3 for drawing waveforms.


(click for gfy)

This sits a little outside of true data visualization, but having the capability to zoom, scroll, and mute the waveform in a meaningful visual way is super helpful.

duck monster
Dec 15, 2004

I should make a note about Highcharts, and in fact many JS chart libraries.

We've had to abandon using them at work (a government science department in Australia) because of the license. Highcharts is licensed under a "Creative Commons non-commercial" license, which can't be used with the GPL or in fact most share-alike licenses, because it restricts people from selling the end result, a restriction the GPL actually forbids.

If you're using GPL-type libraries, don't combine them with Highcharts or other non-free libraries. Instead, stick with either GPL or permissive MIT/BSD-licensed libraries.

edit: LeafletJS is awesome. We do some really seriously complicated GIS stuff here, like hundreds of layers with thousands of points and polygons being shat out of an Oracle backend, and it handles it flawlessly. It's really good stuff.

duck monster fucked around with this message at 04:47 on Aug 19, 2014

Analytic Engine
May 18, 2009

not the analytical engine
Visualising Data is an insanely dense collection of project links with interesting commentary.

Check out their yearly reviews of the data vis world:
January to June 2011, July to December 2011
January to June 2012, July to December 2012
January to June 2013, July to December 2013
January to June 2014

MrMoo
Sep 14, 2000

I like the history of tree maps; I never knew they were so complicated. The first problem when looking at a tree map is that the squares for equal values are not always the same size.

ATM Machine
Aug 20, 2007

I paid $5 for this

YO MAMA HEAD posted:

This is only half as impressive without sound, but I've been developing an annotation/basic audio editing tool and I use d3 for drawing waveforms.


(click for gfy)

This sits a little outside of true data visualization, but having the capability to zoom, scroll, and mute the waveform in a meaningful visual way is super helpful.

This is actually interesting to me, as I do a fair amount of work tooling around with the Web Audio API, but I'd never thought about combining D3.js and FFT data before, so there goes my next free weekend.

YO MAMA HEAD
Sep 11, 2007

Web Audio unfortunately wasn't quite viable when I started the project. Amplitude measurements are made on the server by SoX and stored in the database, audio playback is done with SoundJS, and d3.js manages the waveform. It's all kind of fake, but works pretty convincingly—muting is done client-side by temporarily killing the SoundJS playback volume and setting the d3 bars to 0, and then gets sent to Gearman on the server for a re-render in SoX (becoming a "true" muted waveform).
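This isn't the actual SoX pipeline, but the server-side amplitude-measurement step can be sketched: bucket the raw samples and keep one peak amplitude per drawn bar, which is the kind of summary d3 would then render. All names and numbers here are invented for illustration.

```python
import math

def waveform_peaks(samples, buckets):
    """Reduce a long list of audio samples to `buckets` peak amplitudes,
    one per drawn bar - the kind of summary you'd store server-side."""
    size = math.ceil(len(samples) / buckets)
    return [
        max(abs(s) for s in samples[i:i + size])
        for i in range(0, len(samples), size)
    ]

# A fake sine burst standing in for real audio samples.
samples = [math.sin(2 * math.pi * 5 * t / 4800) for t in range(4800)]
peaks = waveform_peaks(samples, buckets=120)
print(len(peaks), round(max(peaks), 3))
```

Muting a region client-side is then just zeroing that slice of peaks before re-rendering, which matches the "set the d3 bars to 0" trick above.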

Analytic Engine
May 18, 2009

not the analytical engine

mortarr posted:

I've got nothing public to show, but I just threw this together as a tech demo for management last week. It's an example of call data for water-related faults logged over an 18-month period, to show how we might visualise non-financial data. The slider at the bottom changes the selected month, and mouse-over changes the info up top.

Tools are pure js + telerik/kendo ui slider and treemap components, which were fairly easy to get going.

MrMoo posted:

I like the history of tree maps; I never knew they were so complicated. The first problem when looking at a tree map is that the squares for equal values are not always the same size.
If you're interested in tree maps, check out Data Stories episode 29 with Ben Shneiderman.

Nybble
Jun 28, 2008

praise chuck, raise heck
After starting my career in Business Intelligence and moving into programming, hopefully I can be of some help in this thread!

Qlikview and Tableau are the front runners right now for amateur data analysis. MicroStrategy is lagging behind, which is unfortunate but not surprising. Qlikview is a bit more complex in what it can do, including having an internal scripting language that can be run at execution time to load in data from databases, scrape websites, etc. to update your dashboard. Tableau is better for on-demand stuff where your data has already been cleaned and you'd like something better than Excel for displaying it. The public version is nice; it allows you to link and embed your dashboards online. Very exciting stuff there.

If you're familiar with web development, I would say it might be better to use Qlikview and Tableau for prototyping, and then roll your own dashboards. Any time you get to the point where you'd like to display up-to-date data to the public, unfortunately their license costs get very prohibitive for the average person. (If you're running an organization and have access to a large budget, then go ahead!) Highcharts and D3 are great for this. I haven't played with http://www.chartjs.org/ yet, but it might be something to check out if you want something simple and don't want to deal with Highcharts licensing.

ItBurns posted:

I'm most curious about what people are doing to generate/serve data. Being an R guy I'm doing 99% of my analytics in R and producing JSON/CSV files at regular intervals. It's not real-time and it's a real pain for anything where the user has a lot of freedom.

I have the most experience using Python and SQL to generate data, and then transforming it into JSON for Highcharts. There was a task that ran every hour that aggregated data in a stats database I built. Each page in the website would request a certain filter of that aggregated data, and then populate the charts with it. Strangely enough for all the stats work I do, I'm very naive about R, but I wonder if Shiny might be good for what you're trying to do?
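A minimal sketch of that Python + SQL to JSON hop, with an in-memory sqlite3 database standing in for the real stats database and invented table/column names; the output is shaped like a Highcharts-style series array.

```python
import json
import sqlite3

# Invented stand-in for the aggregated stats database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (page TEXT, hour INTEGER, count INTEGER)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [("home", 0, 40), ("home", 1, 55), ("about", 0, 7), ("about", 1, 9)],
)

# The hourly roll-up: one series per page, ready for the chart library.
series = [
    {"name": page, "data": [row[0] for row in conn.execute(
        "SELECT count FROM hits WHERE page = ? ORDER BY hour", (page,))]}
    for (page,) in conn.execute("SELECT DISTINCT page FROM hits ORDER BY page")
]
payload = json.dumps(series)
print(payload)
```

The hourly task would write `payload` to a file or cache; each page then requests its filtered slice and hands it to the charting code.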

ItBurns
Jul 24, 2007

Nybble posted:

I have the most experience using Python and SQL to generate data, and then transforming it into JSON for Highcharts. There was a task that ran every hour that aggregated data in a stats database I built. Each page in the website would request a certain filter of that aggregated data, and then populate the charts with it. Strangely enough for all the stats work I do, I'm very naive about R, but I wonder if Shiny might be good for what you're trying to do?

This isn't all that different from what I have now, sans Shiny. The main issue with that approach is that I end up generating a lot of stuff that nobody will ever use, and that it's restricted to being x-hours old.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender
nth-ing the Tableau love. We just started using it a couple months ago and it's been awesome for rapid development.

What we use for most of our day-to-day stuff, though, is Splunk. The biggest downside is that it's expensive (the license is per gig of data indexed per day), but if you can afford it and have the resources to do dev work against it, it's extremely powerful. It might not be as flashy as some of the other solutions out there, but when anybody in your organization can run ad hoc queries against your indexed data, suddenly you have a lot more perspectives about what's important and what should be readily available, which turns into new dashboard ideas.

Edit: I've been thinking about having our Splunk instance process data and dump it somewhere that I can pick it up with Tableau for modeling, but I haven't had time to start looking into it. It'd probably work OK, though, since I know it can output CSVs from scheduled jobs, and I imagine there's probably some way to have it write to a DB.

Kreeblah fucked around with this message at 06:44 on Aug 26, 2014

mortarr
Apr 28, 2005

frozen meat at high speed
Yeah, I just tried Tableau for the first time today with the data I used in this example I posted earlier:


That took me something like a day of data-wrangling and writing JavaScript, but gently caress me, it only took about five minutes to get roughly the same result in Tableau. Dunno if I'd give Tableau to all our users, but for the more clued-up ones and for prototyping it looks pretty sweet.

Are there any chart type add-ins you can get for it? I'd like something that could generate a streamgraph rather than just stacked area.

Nam Taf
Jun 25, 2005

I am Fat Man, hear me roar!

Tableau has won over a few of the guys at work, so I guess that's another vote for it. I was really hoping Bokeh or something like that in the Python world would be able to do interactive, linked-axis graphs (that is to say, be able to drag on an 'index' graph along the bottom to select an area on a main plot, like Google's finance display in Flash), but I haven't seen a solid example of it. I would imagine D3 could, but that's JavaScript and I do Python, though if it's what we need then maybe I'll have to jump across.

Anyone have any advice?

Nam Taf fucked around with this message at 09:24 on Aug 27, 2014

Sagacity
May 2, 2003
Hopefully my epitaph will be funnier than my custom title.

Kreeblah posted:

What we use for most of our day-to-day stuff, though, is Splunk. The biggest downside is that it's expensive (the license is per gig of data indexed per day), but if you can afford it and have the resources to do dev work against it, it's extremely powerful.
If you like Splunk you should probably also take a look at Kibana in combination with Elasticsearch. It's really quite powerful (outside of log-file processing, too) but it doesn't have the crazy pricing of Splunk.

Mniot
May 22, 2003
Not the one you know

Kreeblah posted:

nth-ing the Tableau love. We just started using it a couple months ago and it's been awesome for rapid development.

What we use for most of our day-to-day stuff, though, is Splunk. The biggest downside is that it's expensive (the license is per gig of data indexed per day), but if you can afford it and have the resources to do dev work against it, it's extremely powerful. It might not be as flashy as some of the other solutions out there, but when anybody in your organization can run ad hoc queries against your indexed data, suddenly you have a lot more perspectives about what's important and what should be readily available, which turns into new dashboard ideas.

Edit: I've been thinking about having our Splunk instance process data and dump it somewhere that I can pick it up with Tableau for modeling, but I haven't had time to start looking into it. It'd probably work OK, though, since I know it can output CSVs from scheduled jobs, and I imagine there's probably some way to have it write to a DB.

Splunk is awesome. Sumo Logic is basically hosted Splunk (not as powerful, but affordable) and does some pretty visualizations.

Grafana is a nice front-end for Graphite. Very pretty and very easy to use.

Datadog is a hosted service for StatsD data, graphs, and some other things. I've been liking it.

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
Tableau is pretty excellent, but the one major knock against it is that it's not great for working with data in context. You can build beautiful dashboards and visualizations, but the second you need to integrate it into a web page it becomes an exercise in futility, especially if the data is sensitive and/or you're dealing with some flavor of Tableau that's non-public.

That said, we run our own Tableau server at work and I manage it. I'm not the best with Tableau, but I'm passable so I can lend my expertise if anyone needs it.

Knyteguy
Jul 6, 2005

YES to love
NO to shirts


Toilet Rascal
Here's some amazing looking data visualization software one of the clients I work for may be implementing: http://www.datazen.com/

Pretty much priced for enterprise only though.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

Knyteguy posted:

Here's some amazing looking data visualization software one of the clients I work for may be implementing: http://www.datazen.com/

Pretty much priced for enterprise only though.

Could you post a trip report if they end up going with it? The Win8 requirement sucks, but those dashboards look really nice.

Knyteguy
Jul 6, 2005

YES to love
NO to shirts


Toilet Rascal

Kreeblah posted:

Could you post a trip report if they end up going with it? The Win8 requirement sucks, but those dashboards look really nice.

You bet. I'd say it's about a 90% chance they'll go with it.

Kobayashi
Aug 13, 2004

by Nyc_Tattoo

Ahz posted:

I'm working on integrating data viz into my app and was thinking about the current climate for client-side processing vs. server side.

I can order and group my data as needed server-side. But I'm wondering if it's worth it to save some processing on my end and put it on the client, since most clients these days can handle the calcs (averaging datasets of maybe 10,000 rows by 5-10 columns). But then I'm also increasing my transfer, so I just don't know.

Also, are some types of visualizations more/better suited to mobile vs. desktop?

I'm by no means an expert, but I've been using DC.js (D3 + Crossfilter). My current working prototype sanitizes data across monthly spreadsheets for the year to date, outputting a ~15,000-record, 7-column JSON file (just over 2 MB). I then pull that down to the client, which does all the analysis. Currently that involves computing a half-dozen dimensions and producing interactive graphs. I very roughly benchmarked it at around a million various map-reduce calls. I've been really surprised by how quickly rendering completes. Testing on desktop and an iPhone 5 over 4G shows no noticeable processing overhead once the initial JSON file has been downloaded.

I expect my app to grow by roughly an order of magnitude by the end, to somewhere in the order of 100k records. At this point I'm convinced that I'm far more likely to introduce memory leaks or use an inefficient algorithm before I reach the limits of the JavaScript libraries I'm using. For comparison, the linked Crossfilter example uses a 230k-record, 5.3 MB dataset. It's amazing what you can do with JavaScript these days...
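Stripped of the D3 rendering, the Crossfilter dimension/group machinery being leaned on here is incremental group-by with add/remove reducers, so brushing out records never triggers a full rescan. A rough Python analogue (class and field names invented):

```python
from collections import defaultdict

class Group:
    """Incrementally maintained sums per key, loosely like a Crossfilter
    group with a sum reducer: records can be added or filtered out one
    at a time without recomputing the whole aggregate."""
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.totals = defaultdict(float)

    def add(self, record):
        self.totals[self.key(record)] += self.value(record)

    def remove(self, record):
        self.totals[self.key(record)] -= self.value(record)

records = [
    {"month": "Jan", "amount": 10.0},
    {"month": "Jan", "amount": 5.0},
    {"month": "Feb", "amount": 7.5},
]
by_month = Group(key=lambda r: r["month"], value=lambda r: r["amount"])
for r in records:
    by_month.add(r)

by_month.remove(records[1])   # simulate one record being brushed out
print(dict(by_month.totals))
```

With 15k records and a handful of dimensions, each brush interaction touches only the records crossing the filter boundary, which is why the rendering feels instant even on a phone.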

Kallikrates
Jul 7, 2002
Pro Lurker
Maybe this is the wrong thread, but I want to visualize some data; the visualization isn't the problem (yet), the problem is modeling the data.

I want to visualize accelerometer data as positions (this data can be cleaned and improved with on-device magnetometer, barometer, and gyroscope data) by integrating twice and doing more math. Normally this wouldn't be possible (or remotely accurate), because the small percentage errors at each tick cause huge errors when integrated twice. However, I know several facts about the movement I am trying to capture that might help me in modeling it. For example, the approximate distance and target location will probably be known ahead of time, and once moved to a location the device will return near to the starting point. The starting point is also 0,0,0. Is anyone familiar with work like this, or could point to resources?

ohgodwhat
Aug 6, 2005

A Kalman filter, perhaps?

Kallikrates
Jul 7, 2002
Pro Lurker

ohgodwhat posted:

A Kalman filter, perhaps?

Kalman came up, and I have started doing the research. A normal 2D free-body model might be a good start, with an extra dimension for z added, and there are some good resources for that.
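To see why naive double integration drifts, here is a toy 1-D sketch with a small constant accelerometer bias: the position error grows roughly quadratically with time, which is exactly the error a Kalman filter plus the known constraints (start at 0,0,0, return near the start) would help correct. The bias and sample rate are made-up numbers.

```python
# Toy 1-D example: a device that truly never moves, read by an
# accelerometer with a tiny constant bias of 0.01 m/s^2.
dt = 0.01        # assumed 100 Hz sample rate
bias = 0.01      # assumed constant sensor bias, m/s^2
vel = pos = 0.0
positions = []
for _ in range(1000):        # 10 seconds of samples
    accel = 0.0 + bias       # true acceleration is zero; we read the bias
    vel += accel * dt        # first integration: velocity accumulates bias
    pos += vel * dt          # second integration: position drifts quadratically
    positions.append(pos)

# After 10 s the "position" is roughly 0.5 * bias * t^2 = 0.5 m of pure drift.
print(f"drift after 10 s: {positions[-1]:.3f} m")
```

Even this tiny bias produces half a metre of phantom movement in ten seconds, which is why the known endpoint constraints matter so much.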

minidracula
Dec 22, 2007

boo woo boo
I'm an idiot who usually uses code or code-esque things to make charts, graphs, plots, and the like. However, I want to like easier-to-use tools for (at least!) less complex visualizations of data, tools like Tableau. My stupid question is: mechanically, how do you use this thing?

I mean, I must be dumb if I can't figure this out, but to be fair (to other people), I haven't really watched all of the videos and whatnot they've made (though I distinctly recall watching at least one or two). I've opened up and poked at a few of the sample workbooks, but I guess I'm ideally looking for a HOWTO-type set of instructions as text. I have data in a variety of formats, but the de facto format I can almost always distill things down to is some sort of CSV/TSV, or fairly basic SQLite databases.

For whatever it's worth, this is me using Tableau Desktop 8.2.

EDIT: Oh, I should perhaps add, while this isn't the main thing I'm looking for, I'm also very interested in the R integration available in Tableau (8.1 and later, including the 8.2 version I'm tinkering with), if anyone else here has used that at all.

minidracula fucked around with this message at 05:47 on Oct 14, 2014

Flash Gordon
May 8, 2006
Death To Ming

minidracula posted:

I'm an idiot who usually uses code or code-esque things to make charts, graphs, plots, and the like. However, I want to like easier-to-use tools for (at least!) less complex visualizations of data, tools like Tableau. My stupid question is: mechanically, how do you use this thing?

I mean, I must be dumb if I can't figure this out, but to be fair (to other people), I haven't really watched all of the videos and whatnot they've made (though I distinctly recall watching at least one or two). I've opened up and poked at a few of the sample workbooks, but I guess I'm ideally looking for a HOWTO-type set of instructions as text. I have data in a variety of formats, but the de facto format I can almost always distill things down to is some sort of CSV/TSV, or fairly basic SQLite databases.

For whatever it's worth, this is me using Tableau Desktop 8.2.

EDIT: Oh, I should perhaps add, while this isn't the main thing I'm looking for, I'm also very interested in the R integration available in Tableau (8.1 and later, including the 8.2 version I'm tinkering with), if anyone else here has used that at all.

In my experience, the R integration was fairly disappointing. One of my old coworkers used it to make a fairly interesting dashboard but it seemed like a lot more trouble than it's worth. If I were to try and do what she did, I think I would just start from the R direction and use Shiny.

Where Tableau really shone in my organization was the integration with Active Directory. We had a fairly complex set of security requirements for who could see what data and by using row-level permissions and the USERNAME (or whatever) calculated field as a filter, it was a fast and cheap way to implement it in the visualizations.

Here is the basic workflow, based on my experience:
1. Connect to your data. I used CSV files for rapid stuff or connected to SQL Server if it needed to be live. If you're going to be using the connection a lot or sharing it amongst people, publish it to the Tableau Server (if you have access to one) and extract it - you'll get huge performance gains although the extracts can take forever to refresh if you try and put too much data in them.
2. Start laying out individual worksheets. These are individual displays that make up the components of a dashboard - line charts, histograms, whatever. See below for more on this.
3. Once you have your worksheets created, start assembling them into dashboards. These will be what you publish and what your users will actually interact with. I usually wait until this point to start tying together the worksheets with action filters or anything that needs to operate across worksheets/dashboards.
4. Publish to the server - if you have the ability to have an admin create separate 'sites' it's super useful to have a testing server and a live one.

Creating individual worksheets:
I honestly think the best way to learn this is just to watch videos and mess around until you get results. But basically, the interface consists of shelves. You can place fields from your data (or calculations) onto these shelves. Once on there, they inform Tableau about how to display the data. So if you had a categorical variable on the column shelf, it would divide your worksheet into columns based on those values. Similarly, if you placed it instead on the color shelf, it would change the color. I find it really helpful to think about the underlying SQL queries that Tableau is running (you can actually find these in a log file if you really want): the values you're displaying are the SELECT clause and various aggregations, and the structural shelves are what the data is being grouped by.
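That SELECT/GROUP BY analogy can be made concrete. This is not Tableau's actual generated SQL, just the shape of query that a categorical field on a shelf plus an aggregated measure corresponds to (the table and columns are invented, run here against an in-memory sqlite3 database):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 80.0)],
)

# Region on the column shelf + SUM(Sales) in the view is roughly:
result = conn.execute(
    "SELECT region, SUM(sales) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(result)
```

Each structural shelf adds a GROUP BY column; each measure adds an aggregate to the SELECT list, which is why moving a field between shelves changes the chart so drastically.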

If people would find it useful, I will post my strategy for developing table calculations when I have time to type it up. I think that Tableau is pretty awesome and intuitive until you start using TCs and then the learning curve goes vertical and your life sucks. I was in a situation where we had limited ability to change the structure of the data we were visualizing so I spent way too much time wrangling data into the proper shape using TCs so I think I'm fairly good at it. And having a basic template for setting them up is enormously helpful, in my opinion.

Kreeblah
May 17, 2004

INSERT QUACK TO CONTINUE


Taco Defender

Flash Gordon posted:

If people would find it useful, I will post my strategy for developing table calculations when I have time to type it up. I think that Tableau is pretty awesome and intuitive until you start using TCs and then the learning curve goes vertical and your life sucks. I was in a situation where we had limited ability to change the structure of the data we were visualizing so I spent way too much time wrangling data into the proper shape using TCs so I think I'm fairly good at it. And having a basic template for setting them up is enormously helpful, in my opinion.

I dunno about anybody else, but I'd really appreciate this. I ran into a roadblock I couldn't figure out when trying to use it with our JIRA DB: for a particular project, there are two custom datetime fields that I need to use to calculate a duration. The duration calculation itself would be doable, but the DB is normalized enough that figuring out how to define my data source and calculated fields just to get the datetime data in the first place was kinda brain-breaking. I think I'm just going to write an app to extract data from their API and turn it into a TDE, but it'd be nice to have a better idea of what the alternatives would be.
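For what it's worth, once the raw issue JSON is out of the API, the duration part is tiny. A sketch in Python - the customfield IDs, their meanings, and the issue shape here are all hypothetical, so check yours against what the REST API actually returns for your project:

```python
from datetime import datetime

# Hypothetical shape of one issue as returned by JIRA's REST API;
# the customfield IDs below are invented placeholders.
issue = {
    "key": "PROJ-123",
    "fields": {
        "customfield_10001": "2014-11-03T09:15:00.000+0000",  # e.g. "work started"
        "customfield_10002": "2014-11-05T17:45:00.000+0000",  # e.g. "work finished"
    },
}

def duration_hours(issue, start_field="customfield_10001", end_field="customfield_10002"):
    """Parse the two datetime custom fields and return the gap in hours."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    start = datetime.strptime(issue["fields"][start_field], fmt)
    end = datetime.strptime(issue["fields"][end_field], fmt)
    return (end - start).total_seconds() / 3600

print(duration_hours(issue))  # 56.5
```

Run that over every issue you pull down, write the results out flat, and the TDE (or even a CSV) ends up with the duration pre-computed so Tableau never has to see the normalized tables.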

v1nce
Sep 19, 2004

Plant your brassicas in may and cover them in mulch.
Loving all this stats and graphing stuff; I've had good times with Highcharts in the past, and would love to use D3 at work, if only IE8 wasn't such a pressing requirement.

Not stats related, but a new system requirement crossed my desk for a workflow builder, where you start at Point A and basically work through a series of defined tasks until you hit Point X. The guy doing the UI obviously wasn't feeling too confident about what we could do, because what he drew up was basically an HTML table with vertical/horizontal arrows between cells. You click a cell to edit it, and you pick the "parent" activity from a drop-down list in that cell's settings. Bleh.

A bit of Googling and a few hours (days) later, I was able to put out a prototype using jsPlumb within AngularJS. Now we can drag-and-drop node types from a palette onto a workflow canvas, and drag connections between nodes to establish a workflow between them. I basically ripped off Unreal Engine 4's Blueprint and I'm loving the results. As far as visualising the execution path goes, it's miles ahead of what we could have ended up with.
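For anyone curious, under the pretty canvas this kind of builder boils down to a DAG of nodes plus parent connections, and the execution path is just a topological sort. A toy sketch (node names and graph shape are invented for illustration, nothing to do with the actual product code):

```python
from collections import deque

def execution_order(edges):
    """edges: list of (parent, child) connections dragged between nodes.
    Returns node names ordered so every parent precedes its children
    (Kahn's algorithm); raises if someone draws a cycle on the canvas."""
    children, indegree = {}, {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        indegree[child] = indegree.get(child, 0) + 1
        indegree.setdefault(parent, 0)
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children.get(node, []):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(indegree):
        raise ValueError("cycle in workflow")
    return order

edges = [("Start", "Review"), ("Start", "Triage"),
         ("Review", "Approve"), ("Triage", "Approve")]
print(execution_order(edges))  # ['Start', 'Review', 'Triage', 'Approve']
```

The cycle check is the important bit in practice: the UI lets users draw anything, so the model layer has to reject loops before anything tries to execute the workflow.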

Pollyanna
Mar 5, 2005

Milk's on them.


Yay, cool! This is what I want to do my upcoming project in. I have a few questions on doing dataviz:

How do I know that a problem I'm thinking of can have dataviz applied to it? Like, how can I pick up on something possibly being well-explained using data visualization? I don't know if that makes much sense - basically, how can I "detect" that a dataviz application would be helpful?

Where does everybody get their data? I've found data.gov, and that seems to be the best for public, governmental data, but I've had trouble finding other sources for stuff like Twitter or video games.

Speaking of Twitter, how does everyone do their data collection on it? One of my project ideas involved analyzing #GamerGate tweets and comparing the word usage of the trolls vs. their targets. However, I've had trouble getting a large enough data set to sample - 1500 tweets max :(

Where else can I go to read more about dataviz practices/how to get better at it?

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


I saw a link to http://www.reddit.com/r/dataisbeautiful/ today, and there are some nice visualizations there.

minidracula
Dec 22, 2007

boo woo boo
Possibly bizarre question that might be better posted elsewhere, but: does anyone have any recommendations for graphing/plotting/visualization libraries for Fortran that aren't DSLIN? Ideally something free for commercial use as well (which DSLIN isn't), e.g. MIT or BSD licensed.

Hrm, now that I think of it, maybe PLplot is worth using for this... written in C, but with a Fortran interface (among others)... Still, willing to entertain other suggestions.
