TL;DR: If you want to skip the detailed mumbo jumbo, go straight to the code example in the conclusion.
Introduction
When doing data science in a Jupyter notebook, there are plenty of options for the standard data visualization needs: matplotlib, pandas, seaborn, bokeh, etc. Occasionally you might be stuck in a situation where you cannot easily express the desired visualization with the standard vocabulary provided by these tools. In these cases I like to leverage the flexibility of D3.js to build a custom graph or diagram.
In this article, I'll discuss an approach to implementing a custom do-it-yourself D3.js visualization in a Jupyter Notebook. This topic is covered in some other places around the web, but I couldn't find a complete approach that connects all the dots and isn't too hackish.
In particular, I'll try to keep these things in mind here:
- Leverage the existing Jupyter functionality, like a RequireJS environment, to have clean dependency handling and to avoid <script src="..."> loading/order headaches.
- No additional packages or dependencies to install.
- The visualisation should work both in an interactive notebook context and in the exported HTML version.
- Avoid the typical D3.js boilerplate to insert the drawing at the desired location (insert a <div id="..."> in the markup and use the corresponding id in the JavaScript code). Keeping these ids properly in sync is annoying to maintain, especially if you iterate a lot or want multiple drawings in the same notebook.
The basics with inline %%javascript snippets
To cover the basics, let's start simple with just inline %%javascript snippet cells.
First, we tell the RequireJS environment where to find the version of D3.js we want. Note that the .js extension is omitted in the URL.
%%javascript
require.config({
    paths: {
        d3: 'https://d3js.org/d3.v5.min'
    }
});
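If you would rather generate this configuration from Python (for example to switch D3.js versions in one place), a minimal sketch could look like the following. The helper name require_config_js is hypothetical and not part of IPython or RequireJS; it only builds the JavaScript source as a string and strips a trailing .js, since RequireJS appends the extension itself.

```python
import json


def require_config_js(paths):
    """Return a require.config(...) call for a module-name -> URL mapping.

    Hypothetical helper for illustration; it only builds a string.
    """
    cleaned = {}
    for name, url in paths.items():
        # RequireJS adds ".js" on its own, so drop the extension if present.
        cleaned[name] = url[:-3] if url.endswith(".js") else url
    # A JSON object is also a valid JavaScript object literal.
    return "require.config({paths: %s});" % json.dumps(cleaned)


config_js = require_config_js({"d3": "https://d3js.org/d3.v5.min.js"})
print(config_js)
```

In a notebook, the resulting string could then be handed to display(Javascript(config_js)), with display and Javascript imported from IPython.display.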
We can now create a D3.js powered SVG drawing, for example as follows:
%%javascript
(function(element) {
    require(['d3'], function(d3) {
        var data = [1, 2, 4, 8, 16, 8, 4, 2, 1];
        var svg = d3.select(element.get(0)).append('svg')
            .attr('width', 400)
            .attr('height', 200);
        svg.selectAll('circle')
            .data(data)
            .enter()
            .append('circle')
            .attr("cx", function(d, i) {return 40 * (i + 1);})
            .attr("cy", function(d, i) {return 100 + 30 * (i % 3 - 1);})
            .style("fill", "#1570a4")
            .transition().duration(2000)
            .attr("r", function(d) {return 2*d;});
    });
})(element);
Note:
- We leverage RequireJS to hand us a loaded d3 library in a closure.
- The SVG drawing is appended to the current output cell:
  - element is the jQuery powered wrapper for this output cell
  - element.get(0) is the DOM node itself that can be handed to d3.select()
- The element variable is a global variable and overwritten on each rendering of a Javascript cell, so to make sure we capture the correct element inside our D3.js code (which could be executed in a different context), we wrap the whole thing in a closure.
And now, the real work
Unless you're just toying around with simple visualizations, these D3.js scripts can get very extensive, which makes them far from ideal to work on directly in an interactive notebook. You probably want to develop the D3.js script in an editor or IDE that gives you a bit more code intelligence.
Let's cover how to get things working with a separate .js script and, additionally, with data that is initially defined or constructed in Python.
First, we'll need these imports:
from IPython.display import display, Javascript, HTML
import json
and we set up the RequireJS search path (repeated here for completeness)
%%javascript
require.config({
    paths: {
        d3: 'https://d3js.org/d3.v5.min'
    }
});
Let's say we have a script circles.js that implements a certain visualisation, roughly with this structure:
define('circles', ['d3'], function (d3) {
    function draw(container, data) {
        var svg = d3.select(container).append("svg");
        // D3.js drawing stuff here ...
    }
    return draw;
});
In addition to declaring a dependency on the d3 library like before, we now define a "module" called circles. This explicit naming is not the standard RequireJS way, but we have to do it because we will embed the Javascript code in the HTML document directly, instead of loading the file with a separate request. For simplicity, the defined module is just a single function (internally called draw), which expects a container to append the SVG element to and a data object.
Assuming that this script circles.js lives alongside the notebook file, we can inject the javascript code in the notebook like this:
Javascript(filename='circles.js')
Now, we define some data in Python space:
data = [5, 10, 20, 40, 50, 30, 10, 20, 40, 10, 5]
And we draw the circles like this:
Javascript("""
(function(element){
    require(['circles'], function(circles) {
        circles(element.get(0), %s);
    });
})(element);
""" % json.dumps(data))
Note:
- We use RequireJS again to get our circles "module", which is just our drawing function.
- The container to add the drawing to is element.get(0), as discussed above.
- We convert the data to JSON and inject it in the circles function call in Javascript. For simplicity, I used basic Python string formatting with %, but other templating solutions are possible of course.
- If you want multiple drawings for different data sets, you probably want to put the Javascript thing in a reusable function. Don't forget to return the Javascript object so it is rendered properly.
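As one example of such an alternative templating solution: %-formatting gets awkward as soon as the JavaScript itself contains a % (like the modulo in the first snippet), because every literal % then has to be doubled. A minimal sketch using string.Template from the Python standard library avoids that. The names DRAW_TEMPLATE and draw_call_js are hypothetical, and the function only returns the JavaScript source as a string.

```python
import json
from string import Template

# "$module" and "$data" are placeholders; a literal "$" in the
# JavaScript would have to be written as "$$".
DRAW_TEMPLATE = Template("""
(function(element) {
    require(['$module'], function(draw) {
        draw(element.get(0), $data);
    });
})(element);
""")


def draw_call_js(module, data):
    """Return JavaScript source that draws `data` with the named module."""
    return DRAW_TEMPLATE.substitute(module=module, data=json.dumps(data))


js_source = draw_call_js("circles", [5, 10, 20])
print(js_source)
```

In a notebook you would wrap the result as Javascript(draw_call_js("circles", data)) and make sure that object is the value of the cell, or pass it to display().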
Some additional practical details
Add CSS
Apart from your D3.js script, you usually also want to add some custom CSS to the mix, preferably in a separate file as well. Define the CSS in an HTML file, e.g. circles.css.html, as follows
<style>
svg circle {
    stroke: #16527b;
    stroke-width: 1px;
}
</style>
and load it like this:
HTML(filename='circles.css.html')
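If you prefer to keep a plain .css file (so that your editor treats it as CSS), one option is to add the <style> wrapper in Python instead. This is just a sketch: the helper css_to_style_html and the file name circles.css used in the note below are hypothetical.

```python
def css_to_style_html(css_text):
    """Wrap plain CSS rules in a <style> element (hypothetical helper)."""
    return "<style>\n%s\n</style>" % css_text.strip()


# Same rules as in circles.css.html, but without the <style> tags.
css = """
svg circle {
    stroke: #16527b;
    stroke-width: 1px;
}
"""
style_html = css_to_style_html(css)
print(style_html)
```

With the rules stored in circles.css, the notebook cell would then read the file and pass the wrapped text to HTML(), for example HTML(css_to_style_html(open('circles.css').read())).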
Easy reloading
Reloading the code from the separate script while developing can be very cumbersome. RequireJS will not automatically "reload" a module that is already defined. For a full hard refresh you should: clear the output of the Javascript(filename=..) cell (or clear all outputs of the whole notebook), save the notebook and refresh the page in your browser.
Luckily there is a much easier way using require.undef. Put it at the top of the script file, before the define, to "unload" the module before redefining it again. In our example:
require.undef('circles');
define('circles', ['d3'], function (d3) {
    // ...
});
Now you just have to re-execute the Javascript(filename=..) cell and the code will be reloaded, which is a lot more intuitive during development.
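If you would rather not keep the require.undef line in the script file itself, the same effect can be had by prepending it from Python while loading the script. The following is a minimal sketch under that assumption; reloadable_module_js is a hypothetical helper that only manipulates the source text.

```python
def reloadable_module_js(source, module_name):
    """Prefix module source with require.undef() so re-running redefines it.

    Hypothetical helper for illustration; it only edits the source string.
    """
    undef_call = "require.undef('%s');" % module_name
    if source.lstrip().startswith(undef_call):
        # The script already unloads itself; leave it untouched.
        return source
    return undef_call + "\n" + source


module_source = "define('circles', ['d3'], function (d3) {\n    // ...\n});"
reloadable = reloadable_module_js(module_source, "circles")
print(reloadable)
```

In the notebook, the file contents would be read first and the result passed on as a string, for example Javascript(reloadable_module_js(open('circles.js').read(), 'circles')).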
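All together now
To conclude, here are all the bits together, in a more compact form. Note that draw_circles also passes width and height on to the circles function, so the draw function in circles.js needs to accept these two extra arguments.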
from IPython.display import display, Javascript, HTML
import json

display(Javascript("require.config({paths: {d3: 'https://d3js.org/d3.v5.min'}});"))
display(Javascript(filename="circles.js"))
display(HTML(filename="circles.css.html"))

def draw_circles(data, width=600, height=400):
    display(Javascript("""
        (function(element){
            require(['circles'], function(circles) {
                circles(element.get(0), %s, %d, %d);
            });
        })(element);
    """ % (json.dumps(data), width, height)))

draw_circles([10, 60, 40, 5, 30, 10], width=500, height=200)