Deriving information from data can be a big challenge. In this post, we will learn how to create a simple data visualization using d3.js. Along the way, we will cover the basic concepts behind this amazing library and briefly review JavaScript, CSS, and SVG.

An old and common approach to retrieving information from data is to use report files such as PDF, XLS, or static charts. However, computers today are far more powerful than they were 15 years ago, which opens up many new ways to look at data. Screen sizes, on the other hand, have not kept pace with improvements in other components such as processors, memory, and storage. In other words, putting all of that computing power to use inside a small, fixed screen area can prove to be a real challenge.

D3.js is a JavaScript library for manipulating documents based on data. Here is how its creator, Mike Bostock, describes it:

D3.js is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation.

DOM Manipulation.png

So, let's get started!

For this post, we are using a public dataset from the 2014 World Cup, which has information about the players that participated in the tournament. You can download it here. Basically, the dataset has the following information on each player:

Player (String): Player’s name.
Age (Int): Player’s age.
Captain (Int): Value (1 or 0) indicating whether the player is a captain.
Club (String): The player’s club when not playing for the national team in the World Cup.
Country (String): The country the player represents in the World Cup.
Group (String): The World Cup group the player’s national team belongs to.
Jersey (Int): The player’s jersey number.
Position (String): The player’s position.
Rank (Int): The ranking of the country the player represents.
Selections (Int): The number of World Cup appearances for this player.

Before we start, we need to clean up our data set since the entire set of data would be an overload for our little 1200x900 screen. In order to clean our dataset, we can remove countries that weren't top performers, so that the remaining countries are: Brazil (Vai Brasil!), Spain, Colombia, Uruguay, France, Argentina, Germany and Belgium. The cleaned dataset can be downloaded here.

The first step to understanding what a data visualization is and how it works on the web is to understand how HTML, JavaScript, CSS, and SVG work together to collect data and show it properly. Here is a quick review of each:

Web browsers receive HTML documents from a webserver or from local storage and render them into multimedia web pages. HTML describes the structure of a web page semantically and includes cues for the appearance of the document.

JavaScript is a high-level, dynamic, weakly typed, object-based, multi-paradigm, and interpreted client-side programming language.

Cascading Style Sheets (CSS) is a style sheet language used for describing the presentation of a document written in a markup language.

Scalable Vector Graphics (SVG) is an XML-based vector image format for two-dimensional graphics with support for interactivity and animation. The SVG specification is an open standard developed by the World Wide Web Consortium (W3C) since 1999.

Basically, we need to get data using javascript functions and create an svg graphic inside the html. We can also utilize css to make it clear and stylish. Fortunately d3.js already has a lot of functions to help us do all of this.

So, the first step is to create an HTML file from scratch and include the d3.js library like this:

<script src="https://d3js.org/d3.v3.min.js"></script>
Once it's already loaded on our page, let's create our javascript file and then define the area of our data visualization. It's a good approach to create the width, height, and margins at the top of the file. For this World Cup example, let's create something in this range:
var margin = {top: 20, right: 20, bottom: 30, left: 40},
width = 960 - margin.left - margin.right,
height = 500 - margin.top - margin.bottom;
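
To make the arithmetic behind these numbers explicit, here is a minimal sketch in plain JavaScript. The helper name innerSize and the variables fullWidth and fullHeight are our own for illustration and are not part of d3.js:

```javascript
// Full size of the svg element and the margins reserved for axes and labels.
var fullWidth = 960;
var fullHeight = 500;
var margin = {top: 20, right: 20, bottom: 30, left: 40};

// innerSize is a hypothetical helper, not part of d3.js.
// It returns the drawing area that is left once the margins are removed.
function innerSize(fullWidth, fullHeight, margin) {
  return {
    width: fullWidth - margin.left - margin.right,
    height: fullHeight - margin.top - margin.bottom
  };
}

var inner = innerSize(fullWidth, fullHeight, margin);
console.log(inner.width, inner.height); // 900 450
```

So the scales and circles live in a 900x450 area, while the margins leave room for the axis ticks and labels.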
The next step is to define how our visualization will scale the data. Most of the time, the raw values don't match the pixel dimensions we want to draw in, so d3.js provides scales to map one onto the other. As we will have 2 axes, x and y, we need to create one scale for each axis:
var x = d3.scale.linear()
.range([0, width]);

var y = d3.scale.linear()
.range([height, 0]);
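
To get a feel for what a linear scale does with a value, here is a rough sketch in plain JavaScript. The function linearScale is a hypothetical stand-in written for this post, not d3's implementation, and the domains [0, 100] and [20, 40] are made-up numbers:

```javascript
// linearScale is a hypothetical stand-in for d3.scale.linear(), for illustration only.
function linearScale(domain, range) {
  return function(value) {
    // Fraction of the way through the domain (0 at the start, 1 at the end).
    var t = (value - domain[0]) / (domain[1] - domain[0]);
    // The same fraction of the way through the range.
    return range[0] + t * (range[1] - range[0]);
  };
}

var width = 900, height = 450;

// x grows from left to right; y is flipped because SVG's origin is the top-left corner.
var x = linearScale([0, 100], [0, width]);
var y = linearScale([20, 40], [height, 0]);

console.log(x(50)); // 450
console.log(y(20)); // 450 (bottom of the plot)
console.log(y(40)); // 0 (top of the plot)
```

This is why the y range is written as [height, 0]: the smallest value should land at the bottom of the drawing area, which is the largest pixel coordinate.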

We just defined a linear scale for x and y, each with a range based on our predefined width and height. d3.js has several types of scales, which you can check out here. Once our scales are done, let's create the axes with the proper orientation:

var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");

var yAxis = d3.svg.axis()
.scale(y)
.orient("left");

Last but not least, we need to create the svg and attach it to the body. d3.js has a method for attaching elements to the DOM. We need to select the part of the page we want to attach our element to, and then create the element. The code to do that is:

var svg = d3.select("body")
.append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");

We are selecting the element "body" and attaching a "svg" element inside it with width and height as attributes. We are also attaching a "g" element with a transform attribute into the "svg". If you open the page now, you will be able to see something like this:

svg element.png

The svg was successfully attached to our body and the g element is there under the svg. So far so good! 
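
As a cross-check of what should appear in the inspector, here is a small sketch that builds the same markup as a string. The helper svgSkeleton is hypothetical and exists only for this illustration; d3.js creates real DOM nodes rather than strings:

```javascript
// svgSkeleton is a hypothetical helper that builds the markup as a string;
// d3.js creates real DOM nodes instead.
function svgSkeleton(width, height, margin) {
  var fullWidth = width + margin.left + margin.right;
  var fullHeight = height + margin.top + margin.bottom;
  return '<svg width="' + fullWidth + '" height="' + fullHeight + '">' +
    '<g transform="translate(' + margin.left + ',' + margin.top + ')"></g>' +
    '</svg>';
}

var margin = {top: 20, right: 20, bottom: 30, left: 40};
var markup = svgSkeleton(900, 450, margin);
console.log(markup);
// <svg width="960" height="500"><g transform="translate(40,20)"></g></svg>
```

One detail worth noting: because .append("g") is the last element created in the chain, the variable named svg actually refers to the inner g element, so everything we append later is already shifted by the margins.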

Before we continue, let's create another scale, this time for colors. It will let us identify each country by color in our graphic. d3.scale.category10() returns an ordinal scale with a fixed palette of 10 categorical colors and assigns the next color to each new value it sees, so just add this line above the svg creation:

var color = d3.scale.category10();
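
To illustrate how such a scale hands out colors, here is a simplified sketch in plain JavaScript. The function ordinalColor is a hypothetical stand-in rather than d3's code, and the order in which the countries are passed in is made up:

```javascript
// The ten colors of d3.scale.category10(), in order.
var CATEGORY10 = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
                  "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"];

// ordinalColor is a hypothetical stand-in for an ordinal scale, for illustration only.
function ordinalColor(palette) {
  var domain = []; // keys in order of first use
  function scale(key) {
    var index = domain.indexOf(key);
    if (index === -1) {
      // Unseen key: remember it and give it the next color.
      index = domain.push(key) - 1;
    }
    // Wrap around if there are more keys than colors.
    return palette[index % palette.length];
  }
  // Return a copy so callers cannot modify the internal list.
  scale.domain = function() { return domain.slice(); };
  return scale;
}

var color = ordinalColor(CATEGORY10);
console.log(color("Brazil")); // #1f77b4
console.log(color("Spain"));  // #ff7f0e
console.log(color("Brazil")); // #1f77b4 (same key, same color)
console.log(color.domain());  // [ 'Brazil', 'Spain' ]
```

The colors are fixed, not random: the same country always gets the same color, and the scale's domain fills up in the order the countries are first seen.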

The next step is to get the data and add our visualization inside the svg element.  The d3.js library has a lot of functions that allow the consumption of data from a lot of sources, like csv, json, and tsv. For this post, we are using tsv and can properly load the data like so: 

d3.tsv("worldcup.tsv", function(error, data) {
if (error) throw error;
});
So we are telling d3 to load this file from the root of the project. If the file can't be loaded, the callback throws the error so that it shows up in the browser console. All of the code that uses the data from here on goes inside this callback. Assuming the right file is in the right place, we can continue to understand how to render our data visualization. It's time to go back to our raw data and think about what we want to do with it. After carefully looking at the data, a scatter plot contrasting the age of players with their number of selections (official games played for their country) seems to be a good option.
 
However, if we put a console.log(data);  inside the tsv function, we will see that "selections" and "age" are strings and not integers, which is not desirable. So let's do a loop over all data in order to convert it to an integer. We just need to input:
data.forEach(function(d) {
d.Age = +d.Age;
d.Selections = +d.Selections;
});
So now we have "age" and "selections" as actual numbers and we can move on to creating our data visualization.
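
The conversion can be tried out without a browser. The sketch below uses a made-up two-row sample and a hypothetical, heavily simplified parser called parseTsv (it is not d3.tsv and handles no quoting or escaping), just to show that parsed values start out as strings:

```javascript
// A made-up two-row sample in the same column layout as the dataset (values are hypothetical).
var sample = "Player\tAge\tSelections\n" +
             "Player A\t27\t45\n" +
             "Player B\t31\t102";

// parseTsv is a hypothetical, simplified parser: no quoting or escaping support.
function parseTsv(text) {
  var lines = text.split("\n");
  var header = lines[0].split("\t");
  return lines.slice(1).map(function(line) {
    var cells = line.split("\t");
    var row = {};
    header.forEach(function(name, i) { row[name] = cells[i]; });
    return row;
  });
}

var data = parseTsv(sample);
console.log(typeof data[0].Age); // string

// Unary plus converts numeric strings to numbers.
data.forEach(function(d) {
  d.Age = +d.Age;
  d.Selections = +d.Selections;
});
console.log(typeof data[0].Age, data[1].Selections); // number 102
```

Keep in mind that unary plus turns an empty string into 0 and a non-numeric string into NaN, so a dataset with missing values may need an extra check.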
 
The next step is to define the domains for each axis. d3.js already has a function to do so; we just need to decide which property of the data we want to assign to x and which to y. Here's the code:
x.domain(d3.extent(data, function(d) { return d.Selections; }));
y.domain(d3.extent(data, function(d) { return d.Age; }));
As you can see, the domain of x is assigned to show the "selections" and y is assigned to show "ages". Pretty neat, isn't it?
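
For readers curious about what d3.extent returns, here is a rough re-implementation in plain JavaScript. It is a sketch for illustration, not d3's source, and the three rows are made up:

```javascript
// extent is a hypothetical re-implementation for illustration; d3.extent is the real function.
function extent(array, accessor) {
  var min, max;
  array.forEach(function(d) {
    var value = accessor(d);
    if (min === undefined || value < min) min = value;
    if (max === undefined || value > max) max = value;
  });
  return [min, max];
}

// Made-up rows, already converted to numbers.
var data = [
  {Age: 27, Selections: 45},
  {Age: 19, Selections: 3},
  {Age: 34, Selections: 110}
];

console.log(extent(data, function(d) { return d.Selections; })); // [ 3, 110 ]
console.log(extent(data, function(d) { return d.Age; }));        // [ 19, 34 ]
```

Using the extent means each axis starts at the smallest value in the data rather than at zero. If an axis should start at zero instead, a domain such as [0, d3.max(data, accessor)] is one alternative.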
 
Now it's time to create the axes in the DOM. To do that, let's append a "g" element with a class to the svg for each axis and call the xAxis and yAxis that we previously defined. We also add a text label to each axis to show that we are looking at "selections" and "age".
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis)
.append("text")
.attr("class", "label")
.attr("x", width)
.attr("y", -6)
.style("text-anchor", "end")
.text("Selections");

svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("class", "label")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text("Age");
After we do that, let's refresh the page and see what our axes look like. If you have followed everything up to here, then you should see something like this:
 
graphic.png
Great, it works! Now we need to apply some css to create a better experience for the user:
body {
font: 10px sans-serif;
}

.axis path,
.axis line {
fill: none;
stroke: #000;
shape-rendering: crispEdges;
}
This changes the font and draws the axis lines as thin, crisp black strokes:
 
graphic font style.png
Much better right? Let's move on.
 
To create the circles inside the svg, we follow the same steps as for attaching any other element. To create one element per data item, we bind the data with .data(data) and then call .enter(); inside the attribute functions, each item is available as d.
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 6.5)
.attr("cx", function(d) { return x(d.Selections); })
.attr("cy", function(d) { return y(d.Age); })
.attr("fill", function(d, i) {
return color(d.Country);
});
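
The idea behind .data() and .enter() can be sketched in plain JavaScript. The helper enterData is hypothetical and ignores most of what d3.js really does; it only models the default behavior of matching data to existing elements by index:

```javascript
// enterData is a hypothetical helper for illustration only.
// With no key function, d3.js matches data to elements by index, so the
// "enter" set is every datum whose index has no existing element.
function enterData(existingCount, data) {
  return data.slice(existingCount);
}

var data = ["a", "b", "c"];

// Empty page: selectAll(".dot") matches nothing, so every datum enters.
console.log(enterData(0, data)); // [ 'a', 'b', 'c' ]

// If two .dot circles already existed, only the third datum would enter.
console.log(enterData(2, data)); // [ 'c' ]
```

This is why selecting ".dot" before any circle exists is not a mistake: the empty selection is what makes every row of the data produce a new circle.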
And a bit more css:
.dot {
stroke: #000;
}
Here we go:
 
  colored static graphic.png

We now have a colored static graphic which gives us a visual interpretation of the players. However, we don't know much about this graphic because we have no idea what each color represents. So, we need to create a legend. We can do so using the same approach as before. We can select and apply the data and then create the box and the legend text like this:

var legend = svg.selectAll(".legend")
.data(color.domain())
.enter().append("g")
.attr("class", "legend")
.attr("transform", function(d, i) { return "translate(30," + i * 20 + ")"; });

legend.append("rect")
.attr("x", width - 18)
.attr("width", 18)
.attr("height", 18)
.style("fill", color);

legend.append("text")
.attr("x", width - 24)
.attr("y", 9)
.attr("dy", ".35em")
.style("text-anchor", "end")
.text(function(d) { return d; });
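
To see where each legend row ends up, here is a small sketch that computes the same values as the attributes above. The helper legendRows is hypothetical, and the country order and the width of 900 are assumptions for the example:

```javascript
// legendRows is a hypothetical helper; d3.js sets the same values on real elements.
function legendRows(countries, width) {
  return countries.map(function(country, i) {
    return {
      label: country,
      transform: "translate(30," + i * 20 + ")", // one row every 20px
      rectX: width - 18, // colored box, 18px wide
      textX: width - 24  // label ends 6px to the left of the box
    };
  });
}

var rows = legendRows(["Brazil", "Spain", "Germany"], 900);
console.log(rows[2].transform);            // translate(30,40)
console.log(rows[0].rectX, rows[0].textX); // 882 876
```

Note that color.domain() only contains the countries that have already been passed through the color scale, so the legend code needs to run after the circles have been created.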
And now we have a legend to identify the countries:
 
colored static graphic with legend.png
So let's start by doing some analysis. The first observation is that the majority of players are less than 32 years old, and most have fewer than 40 selections (that is, 40 official games). Also, we can see that the orange players (who are from Spain) have a lot of selections and are spread towards the right side of the graphic, which suggests that many of them are nearing the end of their international careers.
 
Finally, let's add a mouseover event to each circle to show each player. To do that, let's create an absolutely positioned div to showcase the country and the name of the player. Let's put it just above the .dot creation.
var div = d3.select("body").append("div")
.attr("class", "tooltip")
.style("opacity", 0);
And the css for it: 
div.tooltip {  
position: absolute;
text-align: center;
width: 80px;
height: 48px;
padding: 2px;
font: 12px sans-serif;
background: lightsteelblue;
border: 0px;
border-radius: 8px;
pointer-events: none;
}
Finally, let's attach the mouseover and mouseout events to the .dot elements. The block below replaces the earlier circle code and adds the two handlers:
 
svg.selectAll(".dot")
.data(data)
.enter().append("circle")
.attr("class", "dot")
.attr("r", 6.5)
.attr("cx", function(d) { return x(d.Selections); })
.attr("cy", function(d) { return y(d.Age); })
.attr("fill", function(d, i) {
return color(d.Country);
})
.on("mouseover", function(d, i) {
d3.select(this)
  .attr("fill", "red");

div.transition()
.duration(200)
.style("opacity", .9);

div.html("<b>" + d.Country + "</b><br/>" + d.Player)
.style("left", (d3.event.pageX) + "px")
.style("top", (d3.event.pageY - 28) + "px");
})
.on("mouseout", function(d, i) {
d3.select(this)
.attr("fill", function() {
return color(d.Country);
});

div.transition()
.duration(500)
.style("opacity", 0);
});    
 And that's it! Now when we mouseover the circle, it shows the current player.
graphic mouseover.png
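
The content and position of the tooltip can be checked on their own. The two helpers below are hypothetical and only mirror the values set in the handlers above; the row and the cursor coordinates are made up:

```javascript
// Both helpers are hypothetical and only mirror the values set in the handlers above.
function tooltipHtml(d) {
  return "<b>" + d.Country + "</b><br/>" + d.Player;
}

function tooltipPosition(pageX, pageY) {
  // Align with the cursor horizontally and sit 28px above it.
  return {left: pageX + "px", top: (pageY - 28) + "px"};
}

var d = {Country: "Brazil", Player: "Player A"}; // made-up row
console.log(tooltipHtml(d));            // <b>Brazil</b><br/>Player A
console.log(tooltipPosition(300, 200)); // { left: '300px', top: '172px' }
```

The handlers in this post read the cursor position from d3.event, which matches the v3 library loaded at the top. In D3 version 6 and later, d3.event was removed and listeners receive the event as their first argument, so the handlers would need adjusting there.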
The whole file can be viewed here.
 
Thanks for reading! 

Author

André Gonzaga

André Gonzaga is a Software Engineer at Avenue Code and has been working in the field since 2009. His primary expertise is in web programming, where he works with different languages on both the backend and the frontend. He holds a degree in Information Systems from UFMG and a Data Science specialization from Johns Hopkins University. He is currently pursuing his master's degree in Knowledge Graphs at UFMG as well. André is a technology/data enthusiast who believes we can always improve our way of thinking.

