This Is The Tool We Have

I'm trying out fastpages in hopes that I won't have to spend the time building out my own website toolset. I've been slowly building something out of Wagtail, which is really just Django with some bells. The real allure, though, is going to be the Notebook conversions - specifically the Data Visualizations. The library for the interactive versions is Altair, and we're going to explore some data!

Let's Explore!

So, the tutorial for Scatter Plot uses the cars data, but I figured we might as well do the classic Iris dataset. We'll start by importing the dataset from vega_datasets - a Python package that bundles the example datasets from the Vega project (Altair itself is built on top of Vega-Lite, a JavaScript library) - using iris = data.iris().

import altair as alt
from vega_datasets import data
iris = data.iris()
iris
     sepalLength  sepalWidth  petalLength  petalWidth    species
0            5.1         3.5          1.4         0.2     setosa
1            4.9         3.0          1.4         0.2     setosa
2            4.7         3.2          1.3         0.2     setosa
3            4.6         3.1          1.5         0.2     setosa
4            5.0         3.6          1.4         0.2     setosa
...          ...         ...          ...         ...        ...
145          6.7         3.0          5.2         2.3  virginica
146          6.3         2.5          5.0         1.9  virginica
147          6.5         3.0          5.2         2.0  virginica
148          6.2         3.4          5.4         2.3  virginica
149          5.9         3.0          5.1         1.8  virginica

150 rows × 5 columns

First, let's see the graph - and then we'll discuss the functions.

(
    alt.Chart(iris)
    .mark_point()
    .encode(
        x='sepalLength',
        y='sepalWidth',
        color='species',
        tooltip=['species', 'petalLength', 'petalWidth']
    )
    .interactive()
)

It is interesting to note that, per the docs for Chart:

Create a basic Altair/Vega-Lite chart.

Although it is possible to set all Chart properties as constructor attributes, it is more idiomatic to use methods such as mark_point(), encode(), transform_filter(), properties(), etc.

... which means that Altair has found a way to do something similar to the R programming language's pipe operator: each method returns a chart, so the calls can be chained one after another. For reference, the R version would look something like this:

mtcars %>% ggplot(aes(wt, mpg)) + geom_point(aes(colour = factor(cyl)))

Example R Graph

First, we tell Altair to make a chart from our dataset:

.Chart(iris)

... which is then followed by the kind of graph that we're after - in this case we're after a scatter plot:

.mark_point()

... and then we tell it where everything belongs, mapping columns to the x-axis, the y-axis, the color, and the tooltip.

.encode(
    x='sepalLength',
    y='sepalWidth',
    color='species',
    tooltip=['species', 'petalLength', 'petalWidth']
)

Of interest is that you can easily surface data from the other columns through the tooltip, without adding anything extra to the plot itself. Layering in information that is relevant but lacks a meaningful graphic representation is a nice touch.

Then, of course, you allow users to interact with it (panning and zooming) via:

.interactive()

Let's see what this post looks like on the blog!