R1 Data Explorer API

Creating an API with Flask on DigitalOcean

Improving the Data Explorer by using a real database in the backend, instead of just downloading two CSV files.

Problem

In my script to scrape the radioeins summer Sunday results I create one table with aggregated charts and one table with individual votes from the jury. A combination of both tables will be created as a view in the database. So now I have important results stored in a database, but no online database connected to my app. Right now, I work around this by importing these tables as CSV files, but that is not a suitable solution.

The database will be hosted as a managed database at DigitalOcean. I don’t want to care about how to host the database for now; I just want it up and running.

After some very quick research, Flask seems like a good solution for an API to that database. In the following I want to describe how I built the API, mainly for documentation purposes, but also to have a reason for a little more in-depth research.

Python Environment: pipenv

I haven’t used much Python on my own local machine, but I know the xkcd comic about Python environments. So first of all, let’s think briefly about how I should handle Python environments.

  1. Make sure I use Python 3 as the default, as Python 2.7 still ships with macOS for compatibility reasons.
  2. I want an environment that is fixed to my project.
  3. I want to export my environment, so it can be rebuilt in the cloud at DigitalOcean.

Pipenv looks like a very convenient way to handle dependencies. There are many more ways out there to handle Python environments, but for now pipenv seems like one I feel comfortable with.

Pipenv is installed with the --user flag, so it is not available system-wide, but only for my user. For this to work, I have to add the user base to my $PATH as follows, as described on Stack Overflow.

PYTHON_BIN_PATH="$(python -m site --user-base)/bin"
PATH="$PATH:$PYTHON_BIN_PATH"
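This only fixes the current shell session. A sketch of making it permanent, assuming zsh as the default macOS shell (the profile path ~/.zshrc is an assumption; bash users would use ~/.bash_profile):

```shell
# Append the export to the zsh profile so every new shell picks it up.
# Single quotes keep the command substitution literal until the shell starts.
echo 'export PATH="$PATH:$(python -m site --user-base)/bin"' >> ~/.zshrc
```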

Afterwards, I’ll be able to install packages in my new environment.

pipenv install flask flask-restful pandas psycopg2-binary
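pipenv records these dependencies in a Pipfile next to the project. After the install above it looks roughly like this (versions, Python version, and the psycopg2-binary package name are illustrative; the latter matches the psycopg2 import used in the app):

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
flask = "*"
flask-restful = "*"
pandas = "*"
psycopg2-binary = "*"

[requires]
python_version = "3.9"
```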

To run the app inside my environment, I have to explicitly open a shell with a pipenv command.

pipenv shell # opens custom shell
python app.py

The app runs and I can save my environment in a requirements.txt. More advanced features are well documented.

pipenv lock -r > requirements.txt
pipenv install --ignore-pipfile

Flask - What is it?

Flask is a web framework that builds on Werkzeug and Jinja to make it easy to build web applications. It apparently started as an April Fools’ joke, because it was only a small wrapper around the mentioned tools, but it resonated well with the community.

I chose it because great tutorials were created by the community and I prefer lightweight tools. Maybe it would make sense to skip Flask and use Werkzeug directly, as I currently don’t use Jinja at all.
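To illustrate how lightweight Flask is, here is a minimal self-contained example (the route and message are made up for illustration, not part of my app):

```python
from flask import Flask

app = Flask(__name__)

@app.route('/hello')
def hello():
    # Flask serialises a returned dict to JSON automatically (since 1.1)
    return {'message': 'hello'}

# Flask's built-in test client lets us hit the route without running a server
client = app.test_client()
response = client.get('/hello')
print(response.status_code, response.get_json())  # → 200 {'message': 'hello'}
```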

Flask App

The app’s task is simple: create two GET endpoints, which establish a connection to the database with psycopg2 and return the complete table. Of course there is room for improvement here. I could fetch only the data that is immediately needed, but for my purpose this is OK for now.

The code can be found below.

# Python script for the Flask API
# Target: simple GET request API
# Fetch two tables from my database and return them to the requester

from flask import Flask
from flask_restful import Resource, Api
import pandas as pd
import psycopg2

app = Flask(__name__)
api = Api(app)


def get_connection():
    # Connection to the managed PostgreSQL database on DigitalOcean
    return psycopg2.connect(database="radioeins-dataexplorer",
                            user="app_user",
                            password="",
                            host="db-postgresql.ondigitalocean.com",
                            port="5432")


class ChartsExport(Resource):
    def get(self):
        # SQL query to receive the aggregated chart results
        conn = get_connection()
        data = pd.read_sql_query("SELECT * FROM charts_export;", conn)
        conn.close()
        # return result
        return {'data': data.to_json()}, 200


class VoteCharts(Resource):
    def get(self):
        # SQL query to receive the individual jury votes
        conn = get_connection()
        data = pd.read_sql_query('SELECT * FROM "Vote_charts";', conn)
        conn.close()
        # return result
        return {'data': data.to_json()}, 200


# add endpoints
api.add_resource(ChartsExport, '/charts')
api.add_resource(VoteCharts, '/votes')

if __name__ == '__main__':
    # host='0.0.0.0' makes the app reachable from outside the container
    app.run(host='0.0.0.0', port=8080)
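Each endpoint wraps data.to_json() in a dict, so the client receives a JSON string inside the response body. A sketch of the round trip on the client side (the sample DataFrame is made up):

```python
from io import StringIO

import pandas as pd

# What the API builds server-side
df = pd.DataFrame({'artist': ['A', 'B'], 'votes': [10, 7]})
payload = {'data': df.to_json()}

# What a client does with the {'data': ...} dict from the response
restored = pd.read_json(StringIO(payload['data']))
print(restored.equals(df))  # → True
```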

Gunicorn

OK, here I stumble into completely new territory. WSGI means Web Server Gateway Interface and is an interface between web server and application. For my application to speak with the web server I need to configure a WSGI server. I found a tutorial from DigitalOcean where gunicorn is used for this purpose.

I’m still not entirely sure how it works on the App Platform. On a self-configured server, I would start my application with gunicorn.

gunicorn --worker-tmp-dir /dev/shm --config gunicorn_config.py --bind 0.0.0.0:8080 main:app

On DigitalOcean I create a configuration file as given in the tutorial. I definitely need to look deeper into this topic to really grasp what is happening here.

bind = "0.0.0.0:8080"
workers = 2

Deployment

Deployment on the DigitalOcean cloud is straightforward. I connect my GitLab repo to DigitalOcean and the build and deployment process starts afterwards. My database is hosted there as well, and I can connect both directly on DigitalOcean.

Sweet, the API is up and running and I could use it directly in my Shiny app. I will probably do that as a prototype, but won’t keep the server at DO running, as it doesn’t really pay off at the moment.
