(You can find the complete source code for the example below here and an interactive version here.)

I’ve written before that I think Tableau is the best data exploration / visualization tool on the market. I still think that, but Tableau does suffer from a few drawbacks:
- Cost. The last time I purchased a Tableau license (which was a while ago), the price was $1000 per seat. That’s expensive — prohibitively so for most start-ups.
- Scalability. One Tableau license does not a BI platform make. Tableau worksheets are great for presentations to large groups, but if PMs / Product Owners / business stiffs need to check data every day, looking over an analyst’s shoulder at a Tableau worksheet becomes unrealistic. Enter Tableau Server, which allows Tableau worksheets to be served and interacted with via the web, at a considerable cost: on top of a server license, one Tableau license is needed for every active connection that must be maintained.
- Ease of use. Creating simple visualizations in Tableau can be done easily by someone not familiar with the nuances of the software, but achieving the more sophisticated solutions on which Tableau’s reputation was built requires expertise that can only be acquired through ascending Tableau’s steep learning curve. Tableau’s customer support is almost non-existent — it takes the form of a user forum, in which a handful of experts help an endless throng of newbies style their charts and connect to non-standard databases. Tableau isn’t intuitive, and developing deep proficiency in the product is — again — expensive: Tableau offers courses all around the world which purport to teach sophisticated visualization techniques for an exorbitant fee. If you work for Microsoft, this is no problem; just don’t forget your gold card at home. If you work for a start-up, learning Tableau this way is not a viable option.
- Flexibility. Tableau is good at visualizing data in a database, but it’s not good at transforming that data. Its Calculated Field functionality works great for simple manipulations (say, converting a number into a percentage), but it’s incredibly cumbersome to use for more advanced calculations. This means that any statistical analysis has to be done at the data layer — which presents de-aggregation problems for some calculations (like LCV).
The metrics div will hold our visualization object. Most of the style elements should be self-explanatory.

The next step in creating the dashboard is to import some data. Your functional dashboard will probably retrieve data from your database via an AJAX call to a helper script (which will query the data and format it as JSON). For the purposes of this tutorial, I’ll generate objects and populate them with random data. Each object will have two elements: Date and DAU (daily active users). Add this bit of code to the document.ready() function:
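A minimal sketch of that data-generation step follows. The function name `generateData`, the lowercase `date` field name, and the 1000 to 5000 value range are assumptions made for illustration, not details taken from the original listing.

```javascript
// Hypothetical helper: builds `days` consecutive daily records ending today.
// Each record has a `date` (Date object) and a `DAU` (random integer).
function generateData(days, minDau, maxDau) {
  var data = [];
  var today = new Date();
  for (var i = days - 1; i >= 0; i--) {
    // The Date constructor normalizes out-of-range day values (0, -3, ...),
    // so subtracting from the day of the month walks back across month ends.
    var date = new Date(today.getFullYear(), today.getMonth(), today.getDate() - i);
    var dau = Math.floor(Math.random() * (maxDau - minDau + 1)) + minDau;
    data.push({ date: date, DAU: dau });
  }
  return data; // oldest record first, newest last
}

var data = generateData(30, 1000, 5000);
```

In the page itself, the `generateData` call would sit inside the document.ready() handler so that it runs once the metrics div exists.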
We have enough of a framework to start visualizing the data we generated. But first we need to create some helper functions: since our data is stored as an array of objects, we’ll need functions that return the minimum and maximum values of a given element across those objects in order to create the axes of the graph. D3.js provides d3.min() and d3.max() for this, but called with just an array they compare the array’s elements directly, which doesn’t work when each element is an object (they need an accessor function to pick out the field). The two helper functions are defined below:
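One way such helpers might look is sketched here. The names `getMaxValue` and `getMinValue` and the `key` parameter are hypothetical; the original functions may have been written differently.

```javascript
// Returns the largest value stored under `key` across an array of objects.
// For an empty array this returns -Infinity.
function getMaxValue(data, key) {
  var max = -Infinity;
  for (var i = 0; i < data.length; i++) {
    if (data[i][key] > max) max = data[i][key];
  }
  return max;
}

// Returns the smallest value stored under `key` across an array of objects.
// For an empty array this returns Infinity.
function getMinValue(data, key) {
  var min = Infinity;
  for (var i = 0; i < data.length; i++) {
    if (data[i][key] < min) min = data[i][key];
  }
  return min;
}
```

Calling `getMaxValue(data, "DAU")` on the generated records would then give the upper bound for the DAU axis.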
Now we have everything we need to build our chart object:
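A minimal sketch of this step, assuming the D3 v3-era API (d3.time.scale(), d3.scale.linear()) and a div with id "metrics". The function name `buildChart`, the pixel sizes, and the "viz" class name are placeholders, and `d3` is passed in as a parameter only so the sketch does not depend on a global.

```javascript
// Assumes: D3 v3 passed in as `d3`, a <div id="metrics"> in the page, and
// `data` as an array of { date, DAU } records sorted by ascending date.
function buildChart(d3, data, yMin, yMax) {
  var w = 600, h = 300, margin = 40;   // illustrative sizes in pixels

  // x scale: first and last records give the date extent.
  var x = d3.time.scale()
    .domain([data[0].date, data[data.length - 1].date])
    .range([0, w - margin * 2]);

  // y scale: the range is inverted because SVG y coordinates grow downward.
  var y = d3.scale.linear()
    .domain([yMin, yMax])
    .range([h - margin * 2, 0]);

  // The SVG element, plus a group shifted inward by the margin.
  var vis = d3.select("#metrics")
    .append("svg:svg")
    .attr("width", w)
    .attr("height", h)
    .attr("class", "viz")              // placeholder class name
    .append("svg:g")
    .attr("transform", "translate(" + margin + "," + margin + ")");

  return { vis: vis, x: x, y: y, width: w - margin * 2, height: h - margin * 2 };
}
```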
A couple of things to note:
- When we create the visualization, we specify the data to use in the data() method. This expects an array.
- Since our objects were created in order of ascending date, we can pull the first and last elements of the array and use the date data fields as the minimum and maximum values for the date axis.
- We create a visualization object (SVG), style it with the classes we defined earlier, then append a grouping element to group the visualizations we’ll create later.
- We create two axis scales using the d3.scale.linear() and d3.time.scale() methods. d3.time.scale() allows us to scale the x axis by date. domain() sets the extent of the data values each scale accepts; range() sets the extent of the output, so the data is scaled to fit the physical size of the graph.
- Our visualization object is called vis.
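To make the domain and range idea in the notes above concrete, here is a hand-rolled linear scale. It is an illustration only, not D3's implementation, and the name `linearScale` and the sample numbers are made up for the example.

```javascript
// Maps a value from the data extent (domain) to a pixel extent (range)
// by linear interpolation. Works for numbers and for Date objects.
function linearScale(domain, range) {
  var d0 = +domain[0], d1 = +domain[1];   // unary + turns a Date into milliseconds
  var r0 = range[0], r1 = range[1];
  return function (value) {
    var t = (+value - d0) / (d1 - d0);    // position within the domain, 0 to 1
    return r0 + t * (r1 - r0);
  };
}

// A y scale for a chart 220 pixels tall: larger values sit nearer the top.
var yScale = linearScale([0, 5000], [220, 0]);
yScale(0);     // 220 (bottom of the chart)
yScale(2500);  // 110 (halfway up)
yScale(5000);  // 0   (top of the chart)
```

Passing two Date objects as the domain gives the same behavior along the x axis, which is roughly what a time scale does for positioning.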
We have now created a visualization object and attached it to the DOM. If you were to load this HTML file in a browser, you’d see this:
Now we’ll add the axes and axis labels with this code:
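A rough sketch of this step, assuming the D3 v3-era d3.svg.axis() and d3.svg.line() generators. It draws the axes and a y axis label, then builds the line that the next paragraph describes. The function name `drawAxesAndLine`, the class names, and the label offsets are placeholders; `vis`, `x`, and `y` are the group element and scales created earlier.

```javascript
// Assumes: D3 v3 passed in as `d3`; `vis` is the margin-shifted group element;
// `x` and `y` are the time and linear scales; `height` is the inner chart height.
function drawAxesAndLine(d3, vis, data, x, y, height) {
  var xAxis = d3.svg.axis().scale(x).orient("bottom");
  var yAxis = d3.svg.axis().scale(y).orient("left");

  // The x axis is moved to the bottom edge of the chart area.
  vis.append("svg:g")
    .attr("class", "x axis")
    .attr("transform", "translate(0," + height + ")")
    .call(xAxis);

  vis.append("svg:g")
    .attr("class", "y axis")
    .call(yAxis);

  // Rotated text label for the y axis (offsets are illustrative).
  vis.append("svg:text")
    .attr("class", "axis-label")
    .attr("transform", "rotate(-90)")
    .attr("x", -height / 2)
    .attr("y", -30)
    .style("text-anchor", "middle")
    .text("DAU");

  // Line generator: x from the date element, y from the DAU element.
  var line = d3.svg.line()
    .x(function (d) { return x(d.date); })
    .y(function (d) { return y(d.DAU); });

  // fill "none" keeps the browser from filling the area under the path.
  vis.append("svg:path")
    .attr("d", line(data))
    .attr("fill", "none")
    .attr("stroke", "black");

  return line;
}
```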
This creates a line object, gives it values (x values from the date element and y values from the DAU element of the data objects), appends a path object to the visualization object (vis) based on the line’s data, and gives that path object a “stroke” (color) of black. You should now see a line graph on your chart:

We can now add some data points to the path by creating circle objects and appending those to the visualization object (vis). We will also include some mouseover effects for browsing through the data points:
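A minimal sketch of the data points step, assuming D3 v3-style selections and transitions. The function name `addDataPoints`, the hover radius of 6, and the 100 ms duration are assumptions; the "dot" class, the red fill, the radius of 3, and the dataCirclesGroup name come from the post.

```javascript
// Assumes: D3 v3 passed in as `d3`; `vis` is the chart's group element;
// `x` and `y` are the scales; `data` is an array of { date, DAU } records.
function addDataPoints(d3, vis, data, x, y) {
  // One group element holds every circle.
  var dataCirclesGroup = vis.append("svg:g");

  dataCirclesGroup.selectAll(".dot")
    .data(data)
    .enter()
    .append("svg:circle")
    .attr("class", "dot")
    .attr("fill", "red")
    .attr("cx", function (d) { return x(d.date); })
    .attr("cy", function (d) { return y(d.DAU); })
    .attr("r", 3)
    .on("mouseover", function () {
      // `this` is the circle under the pointer; grow it.
      d3.select(this).transition().duration(100).attr("r", 6);
    })
    .on("mouseout", function () {
      // Shrink back to the resting radius.
      d3.select(this).transition().duration(100).attr("r", 3);
    });

  return dataCirclesGroup;
}
```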
A few notes:
- We first create the dataCirclesGroup group element to group all circles together. This will be more important in the next blog post, in which some graphs will contain multiple line charts.
- We append circle objects, style them with the “dot” style defined earlier, give them a “fill” (color) of red and a radius (r) of 3. Note that D3 passes these values straight through as SVG attributes, and unitless SVG lengths are interpreted as user units, which are pixels unless the SVG is scaled.
- We then create some “mouseover” and “mouseout” functions to increase and decrease the size of the circles.
At this point, we have a fully functional line chart with circles superimposed on each point that animate on mouseover. The next post in this series will extend this basic functionality into a framework that allows for dynamic chart creation and multiple graphs on each chart.