Interactive plots with Python

When working with large datasets, exploratory analysis proves to be of high importance, both for the client and the data scientist. A visual and potentially interactive approach can greatly help with the development of research hypotheses and is a good starting point for the data cleaning and preparation process. In Python, there are great packages available to visualize data and create interactive plots.

This blog post gives a small overview of available tools, development, and deployment.

The vibrant and ever-developing landscape of Python libraries offers a variety of possibilities in 2021.

Interactive Python visualization libraries

There are low-code alternatives such as Tableau. I personally love Tableau, especially in client interactions and meetings. In a research-focused setting, however, we work more at the code level. Any list at this point can merely be an inspiration to play:

Plotly (-Express) / Plotly-ChartStudio

The weapon of choice featured in this post is Plotly with its wrapper Plotly Express and the hosting extension Chart Studio. We love this mature library, which allows creating interactive graphs with just a few lines of code. Deployment is easy, as it works directly from a Jupyter notebook.

Example – Show me the code…

I mount the Google Drive that stores my dataset for this example directly from the notebook and import the relevant libraries. The credentials and the api_key for the hosting solution are also set in the script.

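from google.colab import drive
drive.mount('/content/drive')

# Inside a notebook, install packages with the %pip magic
%pip install dash
%pip install chart_studio

import os
import pandas as pd
import numpy as np
import plotly.express as px
import dash
# Dash 2.0+ bundles these; they replace dash_core_components / dash_html_components
from dash import dcc, html
from dash.dependencies import Input, Output
import chart_studio
import chart_studio.plotly as py  # provides py.plot(), used below to upload figures
chart_studio.tools.set_credentials_file(username='XXXXXXX', api_key='XXXXXXXXXXXXXXXXXXX')

# The plots below expect the dataset in a DataFrame named df;
# loading it from the mounted drive is not shown in this post.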
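If you would rather not paste the api_key into a notebook you might share, one option is to read it from environment variables. This is only a sketch under my own assumptions: the helper `read_chart_studio_credentials` and the variable names `CHART_STUDIO_USERNAME` and `CHART_STUDIO_API_KEY` are hypothetical and not something Chart Studio itself defines.

```python
import os


def read_chart_studio_credentials(env=os.environ):
    """Return (username, api_key) taken from environment variables.

    The variable names are assumptions made for this sketch,
    not names that Chart Studio requires.
    """
    try:
        return env["CHART_STUDIO_USERNAME"], env["CHART_STUDIO_API_KEY"]
    except KeyError as missing:
        # Fail early with a readable message instead of uploading anonymously
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None


# Usage (commented out because it writes a credentials file):
# username, api_key = read_chart_studio_credentials()
# chart_studio.tools.set_credentials_file(username=username, api_key=api_key)
```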

The data was gathered from the three largest crowdfunding sites in Germany. In this example, I want to explore several variables at once. First, the success of the campaign, that is, whether the funding limit was reached. Second, the presentation format, represented by the video length of the pitch and the number of pictures used. Third, whether the start-up has already registered a patent. Furthermore, the graph should illustrate the equity offered by the start-up on the platform.

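# Box plot of the equity offered, grouped by platform
fig2 = px.box(df, x="plat", y="equityoffer")
fig2.show()
# Upload to Chart Studio and open the hosted plot in the browser
py.plot(fig2, filename='boxplot', auto_open=True)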
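A box plot is easier to read when you know the numbers behind it. The following sketch computes the quartiles per platform with pandas. The values in `sample` are made up by me for illustration and are not the crowdfunding data; only the column names `plat` and `equityoffer` come from the code above. Note that Plotly's default quartile method may produce slightly different box edges than the pandas interpolation used here.

```python
import pandas as pd

# Made-up values for illustration only; not the crowdfunding data
sample = pd.DataFrame({
    "plat": ["A", "A", "A", "B", "B", "B"],
    "equityoffer": [5.0, 10.0, 15.0, 2.0, 4.0, 12.0],
})

# Quartiles per platform: one row per platform, one column per quantile
summary = (
    sample.groupby("plat")["equityoffer"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
summary.columns = ["q1", "median", "q3"]
```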

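# 3D scatter: word count, video length, and picture count on the axes;
# color = funding limit reached, marker size = equity offered, symbol = patent registered
fig = px.scatter_3d(df, x="n_words", y="n_vidlength", z="n_picgraph",
                    color="d_fundlim", size="equityoffer",
                    hover_name="location", symbol="d_Patents_registered")
fig.show()
py.plot(fig, filename='basic-line', auto_open=True)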

Play with the graphs and have fun 🙂
