For this project, our goal is to retrieve data from an API and transform it into a Tableau Hyper file, a consumable format for analytics. We’ll be accessing national holiday data from Calendarific, but the same process works for any API data source with some modifications to the Python script.
You can add any dummy table as the input to begin a Tableau Prep flow, but you’ll need to include the get_output_schema function in your Python script to define the structure of the script’s output. I included this function in my script as an example of what it can look like.
You’ll need an API key for this one. Thankfully, you can sign up for free and receive a key within a couple of minutes.
The API returns nested JSON, so to get it into a pandas DataFrame you’ll need to flatten it first, using a package like flatten_json.
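To make the flattening step concrete, here is a minimal sketch. The flatten_json package provides a flatten function for this; the helper below, flatten_record, is a hand-written stand-in that follows the same idea so the example runs with the standard library alone. The sample record is illustrative only, and its field names are an assumption about what a single holiday entry looks like.

```python
def flatten_record(record, parent_key="", sep="_"):
    """Collapse nested dicts and lists into a single-level dict.

    Nested keys are joined with `sep`; list items use their index as the key.
    """
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)
    else:
        # Scalar value: nothing left to flatten.
        return {parent_key: record}

    flat = {}
    for key, value in items:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        flat.update(flatten_record(value, new_key, sep))
    return flat


# Illustrative record (assumed shape of one holiday entry, not real API output).
holiday = {
    "name": "New Year's Day",
    "date": {"iso": "2019-01-01", "datetime": {"year": 2019, "month": 1, "day": 1}},
    "type": ["National holiday"],
}

row = flatten_record(holiday)
print(sorted(row))
```

Each flattened record is a plain dictionary with one key per column, so a list of them can be passed straight to pd.DataFrame.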
Here’s what the holiday data from the Calendarific API looks like:
After we create the script to transform the data into a pandas dataframe, we can use the .py script in Tableau Prep to generate a Tableau Hyper file.
Here’s our python script:
The data() function loops through each country in the input data to generate an API call, flatten the JSON response and append the data to a dataframe.
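As a rough sketch of that loop (not the exact script from this post), the code below assumes the input table has a `country` column of country codes, that the endpoint is `https://calendarific.com/api/v2/holidays`, and that holidays sit under `response.holidays` in the JSON. The names `get_holidays`, `fetch_holidays`, `flatten_record`, `API_KEY` and `YEAR` are hypothetical. It uses urllib from the standard library for the request; the requests package would work just as well.

```python
import json
import urllib.parse
import urllib.request

import pandas as pd

API_URL = "https://calendarific.com/api/v2/holidays"  # assumed endpoint
API_KEY = "YOUR_API_KEY"  # placeholder: substitute your own key
YEAR = 2019  # hypothetical year to request


def fetch_holidays(country_code, year=YEAR):
    """Call the API for one country and return its list of holiday records."""
    query = urllib.parse.urlencode(
        {"api_key": API_KEY, "country": country_code, "year": year}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}") as resp:
        payload = json.load(resp)
    # Assumed response layout: {"response": {"holidays": [...]}}
    return payload["response"]["holidays"]


def flatten_record(record, parent_key="", sep="_"):
    """Collapse nested dicts and lists into a single-level dict."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)
    else:
        return {parent_key: record}
    flat = {}
    for key, value in items:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        flat.update(flatten_record(value, new_key, sep))
    return flat


def get_holidays(df, fetch=fetch_holidays):
    """Take one input row per country and return one output row per holiday."""
    rows = []
    for country_code in df["country"].dropna().unique():
        for holiday in fetch(country_code):
            flat = flatten_record(holiday)
            rows.append(
                {
                    "country": country_code,
                    "name": flat.get("name"),
                    "date_iso": flat.get("date_iso"),
                }
            )
    # Fix the column order so the output matches the declared schema.
    return pd.DataFrame(rows, columns=["country", "name", "date_iso"])
```

The `fetch` parameter is there so the loop can be tried out with a fake fetcher before spending real API calls. Tableau Prep passes only the input DataFrame, so the default is used when the flow runs.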
The get_output_schema function is required because the output has columns that don’t exist in the input (holiday name, holiday date, etc.), and Tableau Prep needs to know their names and types.
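Here is a minimal sketch of what such a schema function can look like. The column names (`country`, `name`, `date_iso`) are hypothetical. Tableau Prep supplies type helpers such as prep_string() when the flow runs; they don't exist in a plain Python session, so the sketch defines stand-ins only when they are missing. The stand-ins' return values are my own assumption, chosen just so the code runs outside Prep.

```python
import pandas as pd

try:
    prep_string  # supplied by Tableau Prep when the flow runs
except NameError:
    # Hypothetical stand-in so the sketch can run outside Tableau Prep.
    def prep_string():
        return ["string"]


def get_output_schema():
    """Declare the output columns and their types for Tableau Prep."""
    return pd.DataFrame(
        {
            "country": prep_string(),
            "name": prep_string(),
            "date_iso": prep_string(),
        }
    )
```

Every column is declared as a string here to keep the sketch simple. Prep offers other helpers, such as prep_date() and prep_int(), for other types.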
Once we run the flow, we’ll have a clean Hyper file ready for analytics and visualization.
And a quick visualization of holidays in Japan.
The flow can be automated using the Tableau Server Data Management Add-on, or you can manually rerun the flow in Tableau Prep. This is holiday data, so rerunning it by hand once a year won’t be an issue.