{ "cells": [ { "cell_type": "markdown", "id": "59d19f73", "metadata": {}, "source": [ "# Python API Example - Arb Breakevens Data Import and Storage in Dataframe\n", "\n", "This guide is designed to provide an example of how to call the Arb Breakevens API endpoint, and store the data accordingly.\n", "\n", "__N.B. This guide is just for Arb Breakevens data. If you're looking for other API data products (such as Netbacks, Price releases or Freight routes), please refer to their according code example files.__ " ] }, { "cell_type": "markdown", "id": "b0a05be4", "metadata": {}, "source": [ "### Have any questions?\n", "\n", "If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:\n", "\n", "__data@sparkcommodities.com__" ] }, { "cell_type": "markdown", "id": "9e00ae34", "metadata": {}, "source": [ "## 1. Importing Data\n", "\n", "Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.\n", "\n", "__This section can remain unchanged for most Spark API users.__" ] }, { "cell_type": "code", "execution_count": null, "id": "1161e807", "metadata": {}, "outputs": [], "source": [ "import json\n", "import os\n", "import sys\n", "import pandas as pd\n", "import numpy as np\n", "from base64 import b64encode\n", "from pprint import pprint\n", "from urllib.parse import urljoin\n", "import datetime\n", "from io import StringIO\n", "\n", "\n", "try:\n", " from urllib import request, parse\n", " from urllib.error import HTTPError\n", "except ImportError:\n", " raise RuntimeError(\"Python 3 required\")\n", "\n", "\n", "API_BASE_URL = \"https://api.sparkcommodities.com\"\n", "\n", "\n", "def retrieve_credentials(file_path=None):\n", " \"\"\"\n", " Find credentials either by reading the client_credentials file or reading\n", " environment variables\n", " \"\"\"\n", " if file_path is None:\n", " client_id = os.getenv(\"SPARK_CLIENT_ID\")\n", " client_secret = 
os.getenv(\"SPARK_CLIENT_SECRET\")\n", " if not client_id or not client_secret:\n", " raise RuntimeError(\n", " \"SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required\"\n", " )\n", " else:\n", " # Parse the file\n", " if not os.path.isfile(file_path):\n", " raise RuntimeError(\"The file {} doesn't exist\".format(file_path))\n", "\n", " with open(file_path) as fp:\n", " lines = [l.replace(\"\\n\", \"\") for l in fp.readlines()]\n", "\n", " if lines[0] in (\"clientId,clientSecret\", \"client_id,client_secret\"):\n", " client_id, client_secret = lines[1].split(\",\")\n", " else:\n", " print(\"First line read: '{}'\".format(lines[0]))\n", " raise RuntimeError(\n", " \"The specified file {} doesn't look like to be a Spark API client \"\n", " \"credentials file\".format(file_path)\n", " )\n", "\n", " print(\">>>> Found credentials!\")\n", " print(\n", " \">>>> Client_id={}, client_secret={}****\".format(client_id, client_secret[:5])\n", " )\n", "\n", " return client_id, client_secret\n", "\n", "\n", "def do_api_post_query(uri, body, headers):\n", " url = urljoin(API_BASE_URL, uri)\n", "\n", " data = json.dumps(body).encode(\"utf-8\")\n", "\n", " # HTTP POST request\n", " req = request.Request(url, data=data, headers=headers)\n", " try:\n", " response = request.urlopen(req)\n", " except HTTPError as e:\n", " print(\"HTTP Error: \", e.code)\n", " print(e.read())\n", " sys.exit(1)\n", "\n", " resp_content = response.read()\n", "\n", " # The server must return HTTP 201. 
Raise an error if this is not the case\n", " assert response.status == 201, resp_content\n", "\n", " # The server returned a JSON response\n", " content = json.loads(resp_content)\n", "\n", " return content\n", "\n", "\n", "def do_api_get_query(uri, access_token, format='json'):\n", " \"\"\"\n", " After receiving an Access Token, we can request information from the API.\n", " \"\"\"\n", " url = urljoin(API_BASE_URL, uri)\n", "\n", " if format == 'json':\n", " headers = {\n", " \"Authorization\": \"Bearer {}\".format(access_token),\n", " \"Accept\": \"application/json\",\n", " }\n", " elif format == 'csv':\n", " headers = {\n", " \"Authorization\": \"Bearer {}\".format(access_token),\n", " \"Accept\": \"text/csv\"\n", " }\n", " else:\n", " raise ValueError('The format parameter only takes `csv` or `json` as inputs')\n", "\n", " # HTTP GET request\n", " req = request.Request(url, headers=headers)\n", " try:\n", " response = request.urlopen(req)\n", " except HTTPError as e:\n", " print(\"HTTP Error: \", e.code)\n", " print(e.read())\n", " sys.exit(1)\n", "\n", " resp_content = response.read()\n", "\n", " # The server must return HTTP 200. Raise an error if this is not the case\n", " assert response.status == 200, resp_content\n", "\n", " # Storing response based on requested format\n", " if format == 'json':\n", " content = json.loads(resp_content)\n", " elif format == 'csv':\n", " content = resp_content\n", "\n", " return content\n", "\n", "\n", "def get_access_token(client_id, client_secret):\n", " \"\"\"\n", " Get a new access_token. Access tokens are what applications use to make\n", " API requests. Access tokens must be kept confidential in storage.\n", "\n", " # Procedure:\n", "\n", " Do a POST query with `grantType` in the body. A basic authorization\n", " HTTP header is required. 
The \"Basic\" HTTP authentication scheme is defined in\n", " RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded\n", " using base64.\n", " \"\"\"\n", "\n", " # Note: for the sake of this example, we choose to use the Python urllib from the\n", " # standard lib. One should consider using https://requests.readthedocs.io/\n", "\n", " payload = \"{}:{}\".format(client_id, client_secret).encode()\n", " headers = {\n", " \"Authorization\": \"Basic {}\".format(b64encode(payload).decode()),\n", " \"Accept\": \"application/json\",\n", " \"Content-Type\": \"application/json\",\n", " }\n", " body = {\n", " \"grantType\": \"clientCredentials\",\n", " }\n", "\n", " content = do_api_post_query(uri=\"/oauth/token/\", body=body, headers=headers)\n", "\n", " print(\n", " \">>>> Successfully fetched an access token {}****, valid {} seconds.\".format(\n", " content[\"accessToken\"][:5], content[\"expiresIn\"]\n", " )\n", " )\n", "\n", " return content[\"accessToken\"]\n" ] }, { "cell_type": "markdown", "id": "9c527e40", "metadata": {}, "source": [ "## Reference Data fetching\n", "\n", "In the fetch request, we use the URL:\n", "\n", "__uri=\"/v1.0/netbacks/reference-data/\"__\n", "\n", "This query shows an overview on all available netbacks and according arb breakevens, showing all available ports and possible routes to/from these destinations (i.e. via Suez, Panama etc.)." ] }, { "cell_type": "code", "execution_count": null, "id": "ada4f167", "metadata": {}, "outputs": [], "source": [ "# Define the function for listing all netbacks\n", "def list_netbacks(access_token):\n", " \"\"\"\n", " Fetch available routes. 
Return contract ticker symbols\n", "\n", " # Procedure:\n", "\n", " Do a GET query to /v1.0/netbacks/reference-data/ with a Bearer token authorization HTTP header.\n", " \"\"\"\n", " content = do_api_get_query(\n", " uri=\"/v1.0/netbacks/reference-data/\", access_token=access_token\n", " )\n", "\n", " print(\">>>> All the routes you can fetch\")\n", " tickers = []\n", " fobPort_names = []\n", "\n", " availablevia = []\n", "\n", " for contract in content[\"data\"][\"staticData\"][\"fobPorts\"]:\n", " tickers.append(contract[\"uuid\"])\n", " fobPort_names.append(contract[\"name\"])\n", "\n", " availablevia.append(contract[\"availableViaPoints\"])\n", "\n", " reldates = content[\"data\"][\"staticData\"][\"sparkReleases\"]\n", "\n", " dicto1 = content[\"data\"]\n", "\n", " return tickers, fobPort_names, availablevia, reldates, dicto1" ] }, { "cell_type": "markdown", "id": "1e890e9e", "metadata": {}, "source": [ "## N.B. Credentials\n", "\n", "Here we call the above functions, and input the file path to our credentials.\n", "\n", "N.B. You must have downloaded your client credentials CSV file before proceeding. Please refer to the API documentation if you have not downloaded them already. 
Instructions for downloading your credentials can be found here:\n", "\n", "https://www.sparkcommodities.com/api/request/authentication.html\n" ] }, { "cell_type": "code", "execution_count": null, "id": "2b010f83", "metadata": {}, "outputs": [], "source": [ "# Input the path to your client credentials here\n", "client_id, client_secret = retrieve_credentials(file_path=\"/tmp/client_credentials.csv\")\n", "\n", "# Authenticate:\n", "access_token = get_access_token(client_id, client_secret)\n", "print(access_token)\n", "\n", "# Fetch all contracts:\n", "tickers, fobPort_names, availablevia, reldates, dicto1 = list_netbacks(access_token)" ] }, { "cell_type": "code", "execution_count": null, "id": "5d456a0a", "metadata": {}, "outputs": [], "source": [ "# Prints the callable route options, corresponding to each Route ID number shown above\n", "# E.g. availablevia[2] shows the available route options for tickers[2]\n", "\n", "print(availablevia)" ] }, { "cell_type": "code", "execution_count": null, "id": "7bcee4d3", "metadata": {}, "outputs": [], "source": [ "# Print the names of each of the ports, corresponding to Route ID and availablevia details shown above\n", "# Some of these options are currently unavailable. \n", "# Please refer to the Netbacks tool on the Spark Platform to check which Netbacks are currently available\n", "\n", "print(fobPort_names)" ] }, { "cell_type": "code", "execution_count": null, "id": "4003cf3b", "metadata": {}, "outputs": [], "source": [ "# Shows the structure of the raw dictionary returned by the call\n", "dicto1" ] }, { "cell_type": "markdown", "id": "91809a58", "metadata": {}, "source": [ "### Reformatting\n", "\n", "For a more accessible data format, we filter the data to only retrieve ports that have available Netbacks data. We then reformat this into a DataFrame." 
] }, { "cell_type": "code", "execution_count": null, "id": "aaefce45", "metadata": {}, "outputs": [], "source": [ "# Define formatting data function\n", "def format_store(available_via, fob_names, tickrs):\n", " dict_store = {\n", " \"Index\": [],\n", " \"Ports\": [],\n", " \"Ticker\": [],\n", " \"Available Via\": []\n", " }\n", " \n", " c = 0\n", " for a in available_via:\n", " ## Check which routes have non-empty Netbacks data and save indices\n", " if len(a) != 0:\n", " dict_store['Index'].append(c)\n", "\n", " # Use these indices to retrive the corresponding Netbacks info\n", " dict_store['Ports'].append(fob_names[c])\n", " dict_store['Ticker'].append(tickrs[c])\n", " dict_store['Available Via'].append(available_via[c])\n", " c += 1\n", " # Show available Netbacks ports in a DataFrame (with corresponding indices)\n", " dict_df = pd.DataFrame(dict_store)\n", " return dict_df\n", "\n", "\n", "# Run formatting data function\n", "available_df = format_store(availablevia,fobPort_names,tickers)\n", "\n", "# View some of the dataframe\n", "available_df.head()" ] }, { "cell_type": "markdown", "id": "e447d6b2", "metadata": {}, "source": [ "## Fetching Arb Breakevens Data specific to one port\n", "\n", "Now that we can see all the available Netbacks data available to us, we can start to define what ports we want to call Arb Breakevens data for (by referring to 'available_df' above).\n", "\n", "The first step is to choose which port ID ('my_ticker') we want. 
We check what possible routes are available for this port ('possible_via') and then choose one ('my_via').\n", "\n", "__This is where you should input the specific Netbacks parameters you want to see__" ] }, { "cell_type": "code", "execution_count": null, "id": "a4480909", "metadata": {}, "outputs": [], "source": [ "# Choose route ID and price release date\n", "\n", "# Here we define which port we want\n", "port = \"Sabine Pass\"\n", "ti = int(available_df.loc[available_df[\"Ports\"] == port, \"Index\"].iloc[0])\n", "my_ticker = tickers[ti]\n", "\n", "print(my_ticker)" ] }, { "cell_type": "code", "execution_count": null, "id": "cae8ec64", "metadata": {}, "outputs": [], "source": [ "# See possible route passage options\n", "possible_via = availablevia[tickers.index(my_ticker)]\n", "print(possible_via)" ] }, { "cell_type": "code", "execution_count": null, "id": "c21d6991", "metadata": {}, "outputs": [], "source": [ "# Choose route passage\n", "my_via = possible_via[0]\n", "print(my_via)" ] }, { "cell_type": "markdown", "id": "2d1bf4c0", "metadata": {}, "source": [ "## Data Import Function\n", "\n", "Here we define the function to fetch Arb Breakevens data, as well as the data format of choice. In the fetch request, we use the URL:\n", "\n", "__uri=\"/v1.0/netbacks/arb-breakevens/\"__\n", "\n", "We then print the output. The data function takes 6 inputs:\n", "- __breakeven -__ which breakeven you're looking to pull, 'jkm-ttf' or 'freight'\n", "- __ticker -__ which FoB port ticker to use\n", "- __via -__ which via point to use ('cogh', 'panama' etc.)\n", "- __start -__ what date you want the historical data to start from. Format: yyyy-mm-dd\n", "- __end -__ what date you want the historical data to end at (inclusive). Format: yyyy-mm-dd\n", "- __format -__ which format to output the data in, 'json' or 'csv'. 
Metadata is only available via JSON\n", "\n", "__This function does not need to be altered by the user.__" ] }, { "cell_type": "code", "execution_count": null, "id": "eb563eb4", "metadata": {}, "outputs": [], "source": [ "## Defining the function\n", "def fetch_breakevens(access_token, ticker, via=None, breakeven='jkm-ttf', start=None, end=None, format='json'):\n", "\n", " # Fetch the Arb Breakevens for a route from:\n", " # https://api.sparkcommodities.com/v1.0/netbacks/arb-breakevens/\n", "\n", " query_params = breakeven + '/' + \"?fob-port={}\".format(ticker)\n", "\n", " if via is not None:\n", " query_params += \"&via-point={}\".format(via)\n", " if start is not None:\n", " query_params += \"&start={}\".format(start)\n", " if end is not None:\n", " query_params += \"&end={}\".format(end)\n", "\n", " uri = \"/v1.0/netbacks/arb-breakevens/{}\".format(query_params)\n", " print(uri)\n", " content = do_api_get_query(\n", " uri=uri,\n", " access_token=access_token, format=format,\n", " )\n", "\n", " if format == 'json':\n", " my_dict = content['data']\n", " else:\n", " my_dict = content.decode('utf-8')\n", " my_dict = pd.read_csv(StringIO(my_dict))\n", "\n", " return my_dict\n", "\n", "## Calling that function and storing the output - JSON version\n", "my_dict = fetch_breakevens(access_token, my_ticker, via='cogh', breakeven='jkm-ttf', start='2025-07-20', end='2025-07-30')\n" ] }, { "cell_type": "code", "execution_count": null, "id": "3c9e519c", "metadata": {}, "outputs": [], "source": [ "# JSON data sample\n", "my_dict" ] }, { "cell_type": "markdown", "id": "ff498957", "metadata": {}, "source": [ "## CSV example" ] }, { "cell_type": "code", "execution_count": null, "id": "30d317eb", "metadata": {}, "outputs": [], "source": [ "# Calling the same data in CSV format. 
This option automatically converts the data to a pandas DataFrame.\n", "df = fetch_breakevens(access_token, my_ticker, via='cogh', breakeven='jkm-ttf', start='2025-07-20', end='2025-07-30', format='csv')" ] }, { "cell_type": "code", "execution_count": null, "id": "1816891e", "metadata": {}, "outputs": [], "source": [ "# CSV data sample as a Pandas DataFrame\n", "df" ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }